Ranking and Unranking of Combinations and Permutations
by Derrick Stolee
Now that we’ve discussed a nice standard example of canonical deletion, I want to get into more complicated uses. These techniques use more complicated augmentations and deletions, and hence require a little more delicate description of orbit calculations. In order to get to those, we need to be able to store and manipulate subsets of things. Another way to think of “subsets of things” is combinations, especially when we know the size of the subset.
Given a set , we will discuss the power set of all subsets of , the -subsets , the -tuples , and the -permutations . There are multiple ways to store a given subset, tuple, or permutation, but what is more important is how to store a list of information over the entire collection of these objects. My favorite method is to use a ranking function, which assigns each object an integer which I can use as an array position. Then, an unranking function will invert this process.
This area has a long history, with several different methods of ranking and unranking the objects competing for dominance in several measures of quality. My favorite measure is “readability” and/or “ease of programming” but definitely “computation speed” is important, too. However, sometimes ranking is fast and unranking is slow, sometimes the ranking and unranking is fixed to a certain order, such as lexicographic order.
Today we’ll talk about several ranking and unranking procedures, and the next post will discuss how to use these ranking/unranking functions to compute orbits of these objects for a graph with symmetry.
Since we will be dealing with actual software implementations of these combinatorial objects, we shall use the following conventions:
- For , is the th element of , for .
Subsets are some of the easiest objects to rank, since there is a very simple bijection between the subsets of and the first integers. For a subset , let be the -bit integer whose th coordinate is the indicator with value 1 if and only if . The best part about this ranking is that the integer rank is the set, stored directly. To test membership of in using , simply test if .
Of course, actually ranking all subsets is not recommended for large , but particularly when is near the number of bits in a standard-length integer (32 or 64, depending on your processor/language). Even using that last bit can cause sign errors if you’re not careful!
To rank tuples, consider the simplest way to count -tuples in : there are ways to fix the th coordinate and ways to extend the tuple to the previous positions (recall ). Therefore, we can rank a tuple by . To unrank the tuple, iteratively pull out the residue modulo and then divide by (using integer division) to find the rank of the remaining tuple of shorter length.
In the special case that , we can assign each block of bits to be the value with the same ranking function but a faster unranking procedure (it avoids integer division). In fact, you could use a similar approach for all , except there will be integers which do not correspond to tuples. This may be trivial when both and are small, but the effect grows exponentially with with base .
The most natural way to order combinations of is lexicographically. This is the same way we order words in a dictionary. So, if for all and . Hence the first -set is and the last is . However, the set is before the set .
This ordering provides a natural ranking procedure:
The idea of this ranking is that we go through all of the positions and compute how many -sets start with the first positions of , then have the value , then any -set of elements from . To unrank, we find the positions in increasing order by subtracting these inner sum values, when possible.
unranklex(): while : if : else return
This ranking function is unfortunate that it requires the nested sums (and the binomial requires a loop to calculate, as well). By changing the order slightly, we can improve performance.
Using the co-lexicographic ordering of the combinations can simplify a few things. First, the dependence on will be completely removed. Second, the ranking and unranking methods are much simpler. The co-lex order is given by comparing the elements of the sets from largest to smallest, until there is a difference. So, and .
To rank a set in co-lex, we count how many subsets appear before by using the following fact: the value can appear in the th position of only after all -subsets of appear as the first positions of a set, followed by .
To un-rank an index , we must find the most extremal way to write as a sum of binomial coefficients, placing the largest value in the top of the coefficient of largest value.
unrankco-lex(): while : while : return
You know what? I started this post two weeks ago and delayed writing because I’ve never ranked or unranked a permutation. Apparently, it’s not a simple task. See the references below by Er and Myrvold & Ruskey who present a very detailed bibliography on the techniques here. Also, see Glenn Tesler’s presentation on ranking, unranking, and finding sequential permutations.
A signed object is usually a pairing of a standard (unsigned) object (such as the objects above) and associating a positive or negative value to each element or position in that object. For example, a signed permutation is really a pair where is a bijection and is a function. One simple way to rank such objects is to take a tensor product of the rank of the unsigned object with the rank of the set of elements with negative sign. It only matters which object is “first” in this order. So one option is and another is .
J. Arndt, Matters Computational: Ideas, Algorithms, Source Code. Available on the FXT page.
T. Johnston, Sorting Signed Permutations Using Cut-And-Paste Operations, UNL Honors Thesis (2009).
M. C. Er, Lexicographic Enumeration, Ranking, and Unranking of Permutations of out of Objects, International Journal of Computer Mathematics 23(1) (1987).
W. Myrvold, F. Ruskey, Ranking and Unranking Permutations in Linear Time, Information Processing Letters (2000).