### Shannon Capacity of Odd Cycles

Breaking news: Mathew and Östergård have improved lower bounds on the Shannon capacity of some odd cycles. These improvements over a very old and difficult problem come via some sophisticated algorithms.

### Shannon Capacity of a Graph

For a full description of Shannon capacity, see the article on Wikipedia.

Let $G$ be a graph. An independent set is a set of vertices where no two are adjacent. Let $\alpha(G)$, called the independence number of $G$, is the maximum size of an independent set. If $k \geq 1$ is an integer, then the $k$th strong power of $G$, denoted $G^k$, is the graph where every vertex is a $k$-tuple where every coordinate is a vertex of $G$ and two tuples are adjacent in $G_k$ if and only if for every position the vertices of those two coordinates are equal or are adjacent in $G$. The Shannon capacity of $G$, denoted $\Theta(G)$, is defined as

$\Theta(G) = \sup_{k\geq 1} \sqrt[k]{G^k} = \lim_{k\to\infty} \sqrt[k]{G^k}.$

Many things in information theory are named after its founder, Claude Shannon. This is no exception, and while we defined $\Theta(G)$ in graph theory terms, there is a meaningful definition when discussion zero-error communication across a channel. For simplicity, we will focus on the case of a cycle, $G = C_n$.

Suppose that there is a channel that can send signals where each signal corresponds to an element of the cyclic group $\mathbb{Z}_n$, but sometimes a signal gets corrupted and when we send $i$ through the channel the output signal of the channel can be “halfway between” $i$ and $i+1$, or between $i-1$ and $i$. Then the cycle graph $C_n$ has adjacencies between elements that can be confused by the receiver of a message. When a message is received, each symbol of the message has two possible initial values.

One way to make sure your message is properly communicated is to agree up front to only use an independent set of $C_n$. Then, when the output signal is “halfway between” two vertices, exactly one of those vertices is in the independent set, and you can reconstruct the original symbol. For instance, if $n$ is even then we can use the “odd” letters to form an independent set of size $n/2$. When $n$ is odd, then we cannot use all “odd” letters, as both $1$ and $n$ are odd and adjacent. However, we could use all “even” letters, resulting in $(n-1)/2$ possible letters. Using a maximum-size independent set gives you the most options for your symbols, so you can encode messages using fewer symbols. Thus, there is a small bit of inefficiency when $n$ is odd compared to $n$ being even.

A founding principle of coding theory is that when messages get longer you can more carefully encode details by carefully arranging words of multiple symbols to have some dependence between the letters. That is, if we consider blocks of two symbols instead of individual symbols, we are considering vertices in $G^2$, and words of length two can be confused if their first or second symbols are confusable. Thus we should use an independent set $S$ in $G^2$, and our efficiency should be measured against the “effective number” of symbols per position: that is $\sqrt{|S|}$ (as there are $|S| = \sqrt{|S|} \cdot \sqrt{|S|}$ possible messages for every two positions). This generalizes to using an independent set in $G^k$ when using messages of length $k$ and the effective number of letters is $\sqrt[k]{\alpha(G^k)}$. If we allow longer and longer messages, we may become more efficient, so we should see what happens in the limit.

### Shannon Capacity of Odd Cycles

The Shannon capacity of an even cycle $C_n$ is $n/2$, and it is not hard to convince yourself this is the case. The only odd cycle whose Shannon capacity is known is $C_5$, where Lovász proved that $\Theta(C_5) = \sqrt{5}$ (and this is achieved by $\sqrt{5} = \sqrt{\alpha( C_5^2 )}$). All other odd cycles have a gap between known lower and upper bounds. This is due in part to the fact that it is hard to compute $\alpha(C_n^k)$ for even small $k$ and $n$ (as $|V(C_n^k)| = n^k$, so standard exponential-time algorithms take $\Omega(2^{n^k})$ time).

Mathew and Östergård succeed in increasing the lower bounds on $\Theta(C_7)$ and $\Theta(C_{15})$ by discovering new, larger independent sets in $C_7^5$ and $C_{15}^4$ (they also find new, larger independent sets in $C_{11}^4$ and $C_{13}^4$, but these do not improve the known lower bounds on the Shannon capacities). They use stochastic methods (with some added symmetry) so these are now known to be upper bounds even for these specific graphs.

The end result of their stochastic algorithm is a set of lower bounds on $\alpha(C_n^k)$, which can be translated to lower bounds on $\Theta(C_n)$. The table below presents the current-best bounds (up to four decimal places) on $\Theta(C_n)$ for $n \in \{5,7,9,11,13,15\}$. It is interesting that all upper bounds are found using the Lovász $\vartheta$ function.

 2.236 $<$ $\Theta(C_{5})$ $<$ 2.2361 3.2271 $<$ $\Theta(C_{7})$ $<$ 3.3177 4.3267 $<$ $\Theta(C_{9})$ $<$ 4.3601 5.2895 $<$ $\Theta(C_{11})$ $<$ 5.3864 6.2743 $<$ $\Theta(C_{13})$ $<$ 6.4042 7.2431 $<$ $\Theta(C_{15})$ $<$ 7.4172

One interesting viewpoint brought by the authors is that $C_n^k$ is a circulant graph under a certain ordering of the vertices. We discussed finding cliques and independent sets in circulant graphs before, but those exact methods become intractable somewhere between 100 and 400 vertices (depending on the number of generators used) but the graphs considered here have $7^5 = 16807$ or even $15^4= 50625$ vertices!

Circulant graphs are very nice because they are very symmetric: they have at least the dihedral group as a subgroup of their automorphism group. Frequently (and in this case) there are many other automorphisms. The authors restricted their algorithm to look for symmetric independent sets, a set $S \subseteq V(C_n^k)$ where $S$ is preserved under some subgroup of automorphisms. Given one automorphism, a cyclic subgroup is generated and then the vertices partition into orbits. Careful selection of an automorphism can present many orbits that are themselves independent sets. Then, we can build our independent set by selecting some number of orbits to include (keeping in mind that no adjacencies can exist between the orbits).

Thus, given $G =C_n^k$ and an automorphism $\pi$, we can build a reduced graph $G_\pi$ where the vertices of $G_\pi$ are the orbits of $V(C_n^k)$ under $\pi$ and two orbits are adjacent in $G_\pi$ if and only if there are elements of those orbits that are adjacent in $G$. If $\pi$ is chosen carefully, then $G_\pi$ has many fewer vertices and does not have too many edges. Now we want to build an independent set in $G_\pi$, but the orbits under $\pi$ may have different sizes, so we need to weight the vertices of $G_\pi$ by the number of vertices in each orbit.

In the case of the cycles considered here, the reduced graphs are still quite large (and we have lost the structure of a circulant graph) so it is difficult to compute $\alpha(G_\pi)$. Instead of exactly computing these values, they use a randomized algorithm to search for a large weighted independent set. The algorithm is based on an algorithm by Montemanni and Smith with some extra heuristics by Östergård. Let’s briefly discuss the basics of such an algorithm.

#### Stochastic Search

One can easily build a maximal independent set by randomly adding a vertex that is not adjacent to any vertices currently in your set. You can define heuristics for weighting the vertices to skew your probability distribution such that you are likely to add “good” vertices (those with high weight, but also not too many edges to the other choices). It is best if this heuristic uses your current set and the structure of the remaining vertices.

When you have a maximal independent set, you should not just accept it, but try to improve it! You cannot add vertices, but perhaps you could “swap” vertices by deleting a small number and again performing your randomized addition algorithm. You can again make a heuristic weighting of which vertex you should delete (those with low weight, but also many edges to high-weight vertices).

One difficulty in describing their method is that much of the description is left to an in-preparation manuscript by Östergård. For now, we will reconstruct what we can and leave the rest to future investigations.

Always we will have an independent set $S$ in our vertex-weighted graph $G$, and a value $w_{\max}$ corresponding to the current-best weight of a known independent set. We must describe two heuristic functions (based on $G$, $S$, and $w$): the adding weight for vertices in $V(G) \setminus S$ and the deleting weight of vertices in $S$. Then the probability distribution defining our next action will be given by dividing all weights by the total weight sum.

Adding Weight: Let $v \in V(G) \setminus S$. Let $d(v,S)= \min_{u \in S} d(v,u)$ be the distance from $v$ to the set $S$. If $d(v,S)=1$, then we cannot add $v$ to $S$ and maintain that $S$ is an independent set, so we must give such vertices an adding weight of zero. Otherwise, it may be better to add vertices that are closer to the set (such as $d(v,S)=2$) than far from the set because that may make a better packing. However, we cannot expect $G$ to be bipartite, so we should not restrict ourselves to vertices at even distance. We should also think about how many (or what weight of) vertices will become adjacent to $S$ when adding $v$ to $S$. Thus, a potential adding weight is

$a(v) = \displaystyle\frac{w(v)}{\sqrt{d(v,S)} \cdot \sum_{u \in N(v) \setminus (S \cup N(S))} w(u)}.$

Maximizing this weight function $a(v)$ tries to maximize the weight of $v$ while minimizing the distance to $S$ and the weight of the vertices adjacent to $v$ not currently in $S$.

Deleting Weight: Let $v \in S$. We will want to remove $v$ from $S$ if it will increase the number (or weight) of new additions and also will not decrease the weight of $S$ by too much.

$d(v) = \displaystyle\frac{\sum_{u : N(u) \cap S = \{v\}} w(u)}{w(v)}.$

The heuristics above are crude and have not been tested for their performance. They are included only for the sake of concretely proposing a heuristic that may work for such a problem.

#### Stochastic and Exact Methods

In their paper, Mathew and Östergård make mention of the latter author’s Cliquer software and claims
After removing a set of vertices, an exact algorithm (like Cliquer) can be used to
find a set of vertices to add that have the largest possible weight. In some sense,
this approach lies in between basic stochastic search and exact algorithms.
Based on this small quote, I can only assume that after constructing a maximal independent set $S$ in $G$, the algorithm uses the deleting weight heuristic to randomly delete some number of vertices from $S$. Instead of using a heuristic such as the adding weight to add vertices, they consider the subgraph of $G$ induced by the vertices that could be added to $S$ and then exactly compute a maximum-weight independent set in that graph, giving an optimal way to augment the independent set.

With this approach, the randomness is used to perturb the independent set into a non-maximal set by deleting its “worst” members then adding new members in the best possible way. Combining this process with Tabu search may help to avoid adding the same set of vertices back to the set so this local search process will continue to generate new independent sets the longer the algorithm runs.

### References

Shannon capacity of a graph, Wikipedia.

L. Lovász, On the Shannon capacity of a graph, IEEE Trans. Inform. Theory
25 (1979) 1-7.

K.A. Mathew, P.R.J. Östergård, New Lower Bounds for the Shannon Capacity of Odd Cycles, arXiv:1504.01472 [cs.IT]

R. Montemanni and D. H. Smith, Heuristic algorithms for constructing binary constant weight codes, IEEE Trans. Inform. Theory 55 (2009) 4651–4656.

P.R.J. Östergård, Partial neighborhood search for the maximum weight clique problem, in preparation.

### A Postscript

You know the feeling where you haven’t done an important task and the longer you take to do it the more it seems daunting? My lack of posting here has felt like that, and every time I start a post it is never good enough to finish. Hopefully, that will improve. Also: apparently my email address never got updated to my Iowa State email, so if you tried to email me using the blog links in the past two years, then you did not reach me! Sorry. I hope that everything is fixed now.