ZPICS.tar.gz (59.35 MB)

Example comparisons for random partition algorithms

Version 2 2013-02-05, 16:04
Version 1 2012-10-05, 02:48
Dataset posted on 2013-02-05, 16:04, authored by Kenneth Locey

Topic: generating uniform random samples from the set of all integer partitions for a given total N and a number of parts S.

Problem: the current random integer partitioning functions in mathematical software can take a long time to generate a single partition for a given N (regardless of S), and an untenable amount of time to generate partitions of N with S parts.

Currently, no function in mathematical software, the peer-reviewed literature, or on Stack Exchange and arXiv.org generates random partitions with respect to both N and S. Consequently, anyone interested in generating random integer partitions of N having S parts must usually waste time generating random partitions of N and rejecting those that do not have S parts.
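As a sketch of what that rejection workflow entails (function names are mine, and the uniform generator below uses the standard partition-counting recurrence rather than Sage's implementation):

```python
import random

def count_partitions(n, k, cache={}):
    """Number of partitions of n with largest part at most k."""
    if n == 0:
        return 1
    if n < 0 or k == 0:
        return 0
    if (n, k) not in cache:
        cache[(n, k)] = count_partitions(n - k, k, cache) + count_partitions(n, k - 1, cache)
    return cache[(n, k)]

def random_partition(n):
    """Uniform random partition of n, built part by part from the counting recurrence."""
    parts = []
    remaining, k = n, n
    while remaining > 0:
        r = random.randrange(count_partitions(remaining, k))
        # Choose largest part j with probability proportional to the number
        # of partitions of remaining - j whose parts are all at most j.
        for j in range(k, 0, -1):
            c = count_partitions(remaining - j, j)
            if r < c:
                parts.append(j)
                remaining -= j
                k = j
                break
            r -= c
    return parts

def random_partition_with_s_parts(n, s):
    """Rejection sampling: draw uniform partitions of n until one has exactly s parts."""
    while True:
        p = random_partition(n)
        if len(p) == s:
            return p
```

For large N the inner loop almost never hits a given S, which is exactly the wasted time described above.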

Note! I have since solved this problem definitively:

The question is asked and the solution is presented here: stackoverflow.com/questions/10287021/an-algorithm-for-randomly-generating-integer-partitions-of-a-particular-length/12742508#12742508

I've recently published a preprint of a manuscript on figshare outlining a simple and unbiased solution to this question. figshare.com/articles/Random_integer_partitions_with_restricted_numbers_of_parts/156290

However, below is an approach I tried but could not rid of sampling bias while keeping reasonable speed. This page is perhaps most useful as an example of how NOT to go about generating random integer partitions for N and S.

Deprecated alternative approach (often biased and slower than the above solution): Generate a single random partition of N and randomly manipulate it until its number of parts equals S. Why? Because randomly perturbing a partition of N until it satisfies S can be faster than generating random partitions based solely on N and rejecting those without S parts.
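A minimal sketch of this kind of perturbation, using random merges and splits rather than the conjugate-based manipulation itself (my illustration, not the exact procedure used for the figures):

```python
import random

def perturb_to_s_parts(partition, s):
    """Randomly merge or split parts until the partition has exactly s parts.

    Illustrative only: this sort of local manipulation is what produced the
    biases described on this page; it does NOT sample uniformly."""
    p = list(partition)
    while len(p) != s:
        if len(p) > s:
            # Merge two randomly chosen parts into one.
            i, j = random.sample(range(len(p)), 2)
            merged = p[i] + p[j]
            p = [x for idx, x in enumerate(p) if idx not in (i, j)]
            p.append(merged)
        else:
            # Split a randomly chosen part greater than 1 into two parts.
            candidates = [idx for idx, x in enumerate(p) if x > 1]
            x = p.pop(random.choice(candidates))
            cut = random.randrange(1, x)
            p.extend([cut, x - cut])
    return sorted(p, reverse=True)
```

Each step preserves the total N while moving the number of parts toward S, which is why it can be fast; the catch, as noted below, is that the resulting sample is not uniform.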

Contents (results of deprecated algorithm):

Visual comparisons of 500 random samples generated with the new function I derived (red curves) against 500 random samples generated with the random partition function in the Sage mathematical environment (black curves).

Kernel density curves (red and black) show statistical evenness across the partition. Kernel density curves that overlap nearly completely indicate that the random samples of partitions generated by the two approaches share a similar structure.

Evenness is estimated using Evar, a transform of the variance of log summand values. Evar is standardized to take values between 0.0 (no evenness) and 1.0 (perfect evenness). 
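Assuming the standard Smith and Wilson (1996) form of Evar, the calculation applied to the parts of a partition can be sketched as:

```python
import math

def Evar(parts):
    """Smith and Wilson's evenness index: 1 - (2/pi) * arctan of the
    variance of the log-transformed values. Returns 1.0 for a perfectly
    even partition (variance of logs is 0) and approaches 0.0 as the
    parts become maximally uneven."""
    logs = [math.log(x) for x in parts]
    mean = sum(logs) / len(logs)
    var = sum((v - mean) ** 2 for v in logs) / len(logs)
    return 1.0 - (2.0 / math.pi) * math.atan(var)
```

The arctan standardization is what bounds the index between 0.0 and 1.0 regardless of how large the variance of the log-transformed parts gets.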

Close agreement between the random-manipulation approach and the Sage function (which suffers very high rejection rates, since most partitions of N do not have S parts) was also found for other statistical characteristics (e.g., the median summand and the relative size of the largest summand). That is, statistical evenness and related properties agree closely between Sage's function and the alternative of randomly manipulating integer partitions using conjugates.

Note: I have found biases in skewness and in the median summand value with this type of method (randomly manipulating an integer partition to arrive at a uniform random sample based on N and S), and would not recommend this approach.

 
