Finding Cliques and Independent Sets in Circulant Graphs

by Derrick Stolee

We recently discussed cliquer, a “fast” algorithm for finding and counting maximum cliques in a graph. Today, I would like to discuss a modification of this algorithm when applied to graphs that have a very restrictive form of symmetry. Specifically, we will discuss circulant graphs and interval subgraphs of distance graphs. These graphs are particularly important for discovering the independence ratio of distance graphs, as discussed in the recently submitted paper, On the independence ratio of distance graphs with James Carraher, David Galvin, Stephen Hartke, and Jamie Radcliffe.

All code and data discussed is available at the distance graph page.

The Problem

Let S be a set of positive integers. For now, we will assume that S is finite. The distance graph G(S) is the graph whose vertices are the integers, and two integers i, j are adjacent if and only if |i-j| \in S. For an integer n, the circulant graph G(n,S) is the graph whose vertices are the residue classes modulo n and two classes are adjacent if and only if |i - j| \in S for some pair of representatives i, j of the two classes. This is really not that complicated: Line up n vertices, and for every distance d \in S, add “chords” between every pair of vertices that are d positions apart.

Observe that the distance graph G(S) can be considered to be the limit object of G(n, S) as n goes to infinity.

Taking the limit of circulant graphs creates a distance graph.

Figure 1. Taking the limit of circulant graphs creates a distance graph.

We consider an independent set X \subseteq \mathbb Z in G(S) that is as dense as possible. That is, we define \delta(X) = \limsup_{n \to \infty} \frac{|X \cap [-N,N]|}{2N+1} and try to maximize \delta(X).

If we take an independent set X_n in a circulant graph G(n,S) and let X be the set of integers that are equivalent to an element of X_n modulo n, then X is an independent set in G(S). Further, it is periodic with period at most n! One of the main results in the paper is the following.

Theorem (Carraher, Galvin, Hartke, Radcliffe, Stolee). Let S be a finite set of integers and s = \max S. There exists a periodic set X \subset \mathbb Z that is independent in G(S) of maximum density and the minimum period of X is at most s2^s.

Thus, periodic sets achieve the optimal value, called the independence ratio \overline{\alpha}(S). Our goal is to discover this value algorithmically using finite computations. Thus, \alpha(G(n,S)) / n is a lower bound on \overline{\alpha}(S) for all n \geq 2\max S+1 and these can be computed. However, unless we extend to n = s2^s where s = \max S, then we cannot be sure our best such is also an upper bound! Thus, we also consider the graph G(S)[m], which is the subgraph of the distance graph that is induced by the integers 0,\dots, m-1. To find an upper bound on the density over the whole line, that density cannot exceed what we can fit in the interval of length m! Thus, we have the following bounds:

\frac{\alpha(G(n,S))}{n} \leq \overline{\alpha}(S) \leq \frac{\alpha(G(S)[m])}{m}.

Thus, we want to compute \alpha(G(n,S)) and G(S)[m]. Luckily, finding large independent sets is equivalent to finding large cliques in the complement. We will use the ideas of cliquer in order to solve this problem!

The Algorithm for Intervals

A quick notational note: We use [m] = \{0,\dots, m-1\} and [i,j)= \{i, \dots, j-1\}.

We begin by considering the values \alpha(G(S)[m]). Observe that for all m_1 \leq m_2, we have G(S)[m_1] \subseteq G(S)[m_2]. In fact, for all i \in \{0,\dots, m_2 - m_1-1\}, we have that G(S)[i,i+m_1) \subseteq G(S)[m_2]. We will use both facts to our advantage, where the standard cliquer algorithm can only use the first.

For each i \geq 0, let c[i] = \alpha( G(S)[i] ), where c[0] = 0, c[1] = 1. We shall determine each c[i] in order, using previous calculations to help. Observe that c[i+1] \in \{ c[i], c[i]+1\}, so we will attempt to build an independent set using the element i+1 that has size at least c[i]+1. If we succeed, then we terminate immediately. If all attempts fail, then we know that c[i+1] = c[i].

Start by setting U = [i+1] and X = \varnothing. When an element x is added to X, then N(x)\cup \{x\} is removed from U. Thus, we first add i to X and remove i and N(i) = \{ i - s : s\in S\} from U.

The recursive algorithm takes the sets X, U, and selects j = \max U. The algorithm attempts the two choices: place j in X or leave j out of X. This is done by removing j from U in either case, and placing j in X only for the first. This will enumerate all possible independent sets in G(S)[i+1].

However, a branching algorithm is only as effective as its pruning method! We have a few ways to detect if a search node contains no possible solution. Recall that j = \max U.

  1. If |X| + c[j] \leq c[i], then no extension of X using elements in U will create an independent set of size c[i]+1.
  2. Let (a_t, b_t)_{t=1}^k be pairs such that U = \cup_{t =1}^k [a_t, b_t). Then if |X| + \sum_{t=1}^k c[b_t-a_t] \leq c[i], then no extension of X using elements in U will create an independent set of size c[i]+1.

The first prune above is the pruning mechanism directly from cliquer. It is the second prune that is special to this restriction to G(S)[m]. If the generators from S are sufficiently “spread out,” then the set U will become split into several intervals, creating small subgraphs G(S)[a_t, b_t) that are used to select the rest of the independent set. While there may be edges between these intervals, perhaps most of the difficulties in creating a new independent set come from within these intervals.

These two pruning mechanisms are incomparable: there exist situations where one works and the other does not. However, the two together work to very quickly determine the clique number of intervals containing even 200 elements.

In this way, we can compute c[i] for each i \geq 0 one at a time until we are satisfied. As we go, we can track the minimum value of c[i]/i = \alpha(G(S)[i])/i for all i \geq 1. We cannot be sure we have computed the exact value of \overline{\alpha}(S) until we have a corresponding lower bound, which requires considering the circulant setting.

The Algorithm for Circulants

The main way that circulant graphs G(n,S) differ from the intervals G(S)[m] is the “wrap around” effect. The intervals simply do not have the same difficulties with the edges, and thus are simpler to handle. Further complicating the issue is that we do not know which value n will present the best lower bound of \alpha(G(n,S))/n \leq \overline{\alpha}(S). Thus, we need to try each as we go.

For 0 \leq i < n, let c[n,i] = \alpha(G(n,S)[i]) : the independence number of the first i points in the circulant graph G(n,S). Note that c[n,i] \leq c[i] always, and we still have that c[n,i+1] \in \{c[n,i], c[n,i]+1\}. Thus, for each n, we can compute all values c[n,i] for i \leq n and report \alpha(G(n,S)) = c[n,n].

We use the same recursive algorithm as in the interval case, except we are careful of the wrap-around effect when removing elements from U. Also, we use both pruning mechanisms, except the first uses c[n,j] and the second uses c[b_t - a_t]. Thus, it is important to compute the interval numbers before computing the circulant numbers.

Computing \overline{\alpha}(S)

By the previous algorithms, we can compute both lower and upper bounds on \overline{\alpha}(S). However, since computing $latexalpha(G(n,S))$ requires n instances of the algorithm with no help from c[n-1,i], it is somewhat slower to compute each of the circulant graph ratios than to compute the interval ratios. Further, we found that the minimum n such that \frac{\alpha(G(n,S))}{n} =\overline{\alpha}(S) is quite a bit lower than the minimum value m such that \overline{\alpha}(S)= \frac{\alpha(G(S)[m])}{m}.

Thus, we followed the pattern of computing c[5n+j] for all j \in \{0,\dots,4\} before computing c[n,n]. Continuing until our best lower bound matches our best upper bound, we determined \overline{\alpha}(S).

Future Work

The strength of this algorithm came from the fact that there is a single automorphism \sigma such that there is a single \sigma-orbit of vertices. In fact, a graph with such an automorphism is in fact circulant, as you can relabel the vertices as u_i = \sigma^i(v) for some vertex v.

The concepts used by this algorithm could be extended to graphs with high-order symmetries. By selecting an automorphism \sigma that has large orbits, few orbits, or one large orbit, then perhaps there could be some improvements over cliquer over these classes of highly-symmetric graphs.

However, (and this is a big “however”) finding such a “large” automorphism \sigma may be incredibly hard. While nauty can find a set of generating automorphisms, I could not find many algorithms for finding automorphsims with certain properties. The best I could find is Linear optimization over permutation groups, which essentially uses branch-and-bound for a related problem. I would be very interested to see more work on optimizing more orbit-type properties of a single automorphism in the automorphism group.

References

C. Buchheim, M. Jünger, Linear optimization over permutation groups, Discrete Optimization 2 (2005) 308-319.

J. Carraher, D. Galvin, S. Hartke, J. Radcliffe, D. Stolee, On the independence ratio of distance graphs arxiv:1401.7183 [math.CO] (2014).

Distance Graph Homepage

Cliquer Homepage

Sampo Niskanen and Patric R.J. Östergård, Cliquer User’s Guide.

Patric R. J. Östergård, A fast algorithm for the maximum clique problem, Discrete Applied Mathematics 120 (2002) 197-207.

Advertisements