Introduction to Orbital Branching

by Derrick Stolee

Orbital branching is an example of a method that started in combinatorial optimization but has been applied to computations in pure combinatorics. In this post, we introduce the method in the context of generating combinatorial objects. The major difference from the original setting is that we generate all feasible solutions (which are probably rare), while the original applications sought optimal solutions (and feasible solutions were plentiful).

Disclaimer: This post is adapted from a chapter of my dissertation.

A property P on a combinatorial object can be expressed as a set \{ C_i \} of constraints on a set of 0/1-valued variables. For example, a graph can be stored as indicator variables x_{u,v} with value 1 if and only if uv is an edge. Then, a constraint C_i(\mathbf{x}) is satisfied when the equality \sum_{j \neq i} x_{i,j} = 4 holds. If all constraints \{C_i\}_{i=1}^n are satisfied, then the graph corresponding to these variables is 4-regular. Thus, a property P can be expressed as the conjunction of several constraints, where P holds if and only if all constraints hold.
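To make the encoding concrete, here is a minimal Python sketch of this idea (the function name and data layout are my own, not from the post): a graph is stored as 0/1 indicator variables on unordered pairs, and each k-regularity constraint is checked directly.

```python
from itertools import combinations

def degree_constraints_satisfied(n, x, k):
    """Check the k-regularity constraints C_i: for each vertex i,
    the sum of x over all pairs containing i must equal k.
    x maps each unordered pair (as a frozenset) to 0 or 1."""
    for i in range(n):
        degree = sum(x[frozenset((i, j))] for j in range(n) if j != i)
        if degree != k:
            return False
    return True

# The 5-cycle C_5 is 2-regular: set its indicator variables and check.
n = 5
edges = {frozenset((i, (i + 1) % n)) for i in range(n)}
x = {frozenset(p): int(frozenset(p) in edges) for p in combinations(range(n), 2)}
print(degree_constraints_satisfied(n, x, 2))  # True: every vertex has degree 2
```

Each constraint C_i here is the equality for a single vertex; the property "is 2-regular" holds exactly when all n constraints hold.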

Orbital branching is a technique to solve symmetric constraint systems using branch-and-bound techniques by selecting an orbit of variables for the branch instead of a single variable. Ostrowski, Linderoth, Rossi, and Smriglio introduced orbital branching and applied the technique to covering and packing problems. The authors implemented the branching procedure in the optimization framework MINTO. This integration allowed interaction with existing heuristics such as clique cuts and the commercial optimization software IBM ILOG CPLEX. They also generalized the technique to include knowledge of subproblems within the context of Steiner systems.

There are three main components to orbital branching. First, a description of the constraint system must be implemented. Tightly integrated subcomponents of the system include the symmetry model of the constraints and the presolver. The symmetry model consists of a graph G with vertices corresponding to the variables along with vertices encoding the structure of the system. The automorphism group of G is computed using nauty. The presolver reduces the number of variables and constraints by recognizing dependencies. Second, the constraint propagation algorithm takes existing choices from the branch-and-bound and fixes variables with forced values. Third, a branching rule decides which orbit of variables is selected for the branch.

Variable Assignments

We shall consider a constraint problem P where P : \{0,1\}^m \to \{0,1\} is a function on m variables which take values in \{0,1\}. A vector \mathbf{x} = (x_1,\dots,x_m) \in \{0,1, *\}^m is a variable assignment. If x_i = *, we say that variable is unassigned. We can compare two assignments \mathbf{x}, \mathbf{y} by \mathbf{x} \preceq \mathbf{y} if and only if for all i \in \{1,\dots, m\} either x_i = * or x_i = y_i.

Example. An (n, k, \lambda, \mu) strongly regular graph is a k-regular graph on n vertices so that if uv is an edge, then u and v have \lambda common neighbors and if uv is not an edge, then u and v have \mu common neighbors. Let \mathbf{x} be a variable assignment on m = {n\choose 2} variables corresponding to x_i = 1 if and only if the ith unordered pair of vertices is an edge. (See ranking and unranking of sets.)

Thus, a variable assignment \mathbf{x} maps to a trigraph G, given by a vertex set together with a partition of the pairs into three sets: black edges, white edges, and gray edges, where black edges are “definitely edges”, white edges are “definitely nonedges”, and gray edges are “undecided pairs”. Two trigraphs G and H can be compared as G \preceq H if the corresponding variable assignments \mathbf{x} and \mathbf{y} have \mathbf{x} \preceq \mathbf{y}. Trigraphs and variable assignments are useful data structures when searching for strongly regular graphs because nonedges are fundamental to the structure of the graph.

Let \mathsf{SRG}_{n,k}^{\lambda,\mu}(\mathbf{x}) be the boolean function that encodes whether or not the given variable assignment corresponds to an (n,k,\lambda,\mu) strongly regular graph. For later examples, we shall consider n, k, \lambda, and \mu to be fixed and define \mathsf{SRG}(\mathbf{x}) = \mathsf{SRG}_{n,k}^{\lambda,\mu}(\mathbf{x}). End Example.
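A minimal Python sketch of these data structures (all names are illustrative assumptions, not from the post): the values 1, 0, and * play the roles of black, white, and gray pairs, and extends implements the \preceq comparison.

```python
from itertools import combinations

BLACK, WHITE, GRAY = 1, 0, "*"  # edge, nonedge, undecided pair

def extends(x, y):
    """x <= y in the partial order: every assigned entry of x agrees with y."""
    return all(xi == GRAY or xi == yi for xi, yi in zip(x, y))

def trigraph_colors(n, x):
    """Map an assignment on C(n,2) variables to pair -> color; pairs are
    taken in lexicographic order (one possible ranking of the pairs)."""
    return dict(zip(combinations(range(n), 2), x))

x = [BLACK, GRAY, WHITE]   # n = 3: (0,1) is an edge, (0,2) undecided, (1,2) a nonedge
y = [BLACK, WHITE, WHITE]  # resolves the gray pair to a nonedge
print(extends(x, y), extends(y, x))  # True False
```

Here y is a full assignment extending x, so \mathbf{x} \preceq \mathbf{y} holds but not the reverse.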

Our goal is to generate all possible vectors \mathbf{x} so that P(\mathbf{x}) = 1. We shall build these vectors starting at the empty assignment \mathbf{x} = (*, *, \dots, *) and assign values to the variables. A naïve approach would be to brute-force check every possible assignment. Instead, we shall employ constraint detection and constraint propagation techniques.

  1. Let \mathsf{Detect}_P(\mathbf{x}) be an algorithm that returns \mathsf{True} only if for all \mathbf{y} \in \{0,1\}^m where \mathbf{x} \preceq \mathbf{y} we have P(\mathbf{y}) = 0. (That is, no extension of \mathbf{x} has the property P.)
  2. Let \mathsf{Propagate}_P(\mathbf{x}) be an algorithm that on input \mathbf{x} \in \{0,1,*\}^m returns a variable assignment \mathbf{y} \in \{0,1,*\}^m so that for all \mathbf{z} where \mathbf{x} \preceq \mathbf{z} and P(\mathbf{z}) = 1, we have \mathbf{x} \preceq \mathbf{y} \preceq \mathbf{z}. (That is, all possible vectors \mathbf{z} \succeq \mathbf{x} that have the property P also have \mathbf{z} \succeq \mathbf{y}.)

The efficiency and strength of \mathsf{Detect}_P and \mathsf{Propagate}_P will depend on the property P (and the user’s knowledge about the property P). At minimum, \mathsf{Detect}_P(\mathbf{x}) could return \mathsf{True} if and only if \mathbf{x} has no unassigned variables and P(\mathbf{x}) = 0. Further, \mathsf{Propagate}_P(\mathbf{x}) could simply return \mathbf{x}. Exploiting the structure of the function P allows more sophisticated (and more helpful) implementations.

Example. When searching for an (n,k,\lambda,\mu) strongly regular graph, we can guarantee a few properties. Let \mathbf{x} be a variable assignment and G_{\mathbf{x}} be the associated trigraph. Note that the maximum degree of a graph (\Delta(G)) and neighborhoods (N(v_i)) are monotone functions on the edge set of trigraphs with respect to \preceq. The following implications are then immediate:

  1. If \mathsf{SRG}(G) = 1, then \Delta(G) = k. Therefore, if \Delta(G_{\mathbf{x}}) > k, then G_{\mathbf{x}} cannot extend to a strongly regular graph.
  2. If \mathsf{SRG}(G) = 1, then \Delta(\overline{G}) = n - k - 1. Therefore, if \Delta(\overline{G_{\mathbf{x}}}) > n - k - 1, then G_{\mathbf{x}} cannot extend to a strongly regular graph.
  3. If \mathsf{SRG}(G) = 1, then every pair v_iv_j \in E(G) has |N(v_i)\cap N(v_j)| = \lambda. Therefore, if v_iv_j is an edge in G_{\mathbf{x}} and |N(v_i)\cap N(v_j)| > \lambda then G_{\mathbf{x}} cannot extend to a strongly regular graph.
  4. If \mathsf{SRG}(G) = 1, then every pair v_iv_j \in E(\overline{G}) has |N(v_i)\cap N(v_j)| = \mu. Therefore, if v_iv_j is a nonedge in G_{\mathbf{x}} and |N(v_i)\cap N(v_j)| > \mu then G_{\mathbf{x}} cannot extend to a strongly regular graph.

Thus, a possible algorithm for \mathsf{Detect}_{\mathsf{SRG}}(\mathbf{x}) is to return \mathsf{True} whenever \Delta(G_{\mathbf{x}}) > k, \Delta(\overline{G_{\mathbf{x}}}) > n - k - 1, |N(v_i)\cap N(v_j)| > \lambda for some edge v_iv_j, |N(v_i)\cap N(v_j)| > \mu for some nonedge v_iv_j, or |N(v_i)\cap N(v_j)| > \max\{\lambda,\mu\} for some unassigned pair v_iv_j. Further, a possible algorithm for \mathsf{Propagate}_{\mathsf{SRG}}(\mathbf{x}) is to place assignments on unassigned variables whenever these constraints are sharp:

  1. If v_i has degree k, then for any vertex v_j where the pair v_iv_j is unassigned, make v_iv_j be a nonedge.
  2. If v_i has non-degree n-k-1, where the non-degree of a vertex is the number of nonedge pairs containing it, then for any vertex v_j where the pair v_iv_j is unassigned, make v_iv_j be an edge.
  3. If v_iv_j is an edge in G_{\mathbf{x}} and |N(v_i)\cap N(v_j)| = \lambda then whenever v_\ell is a vertex with v_iv_\ell an edge and v_\ell v_j unassigned, make v_\ell v_j be a nonedge. Similarly, if v_iv_\ell is unassigned and v_\ell v_j is an edge, make v_i v_\ell be a nonedge.
  4. If v_iv_j is a nonedge in G_{\mathbf{x}} and |N(v_i)\cap N(v_j)| = \mu then whenever v_\ell is a vertex with v_iv_\ell an edge and v_\ell v_j unassigned, make v_\ell v_j be a nonedge. Similarly, if v_iv_\ell is unassigned and v_\ell v_j is an edge, make v_i v_\ell be a nonedge.

End Example.
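The detection rules above can be sketched in Python (a minimal, unoptimized sketch; the function name and trigraph representation are mine). Black, white, and gray pairs are stored as 1, 0, and "*", and N(v) counts only black (definite) edges, matching the monotonicity argument.

```python
from itertools import combinations

def detect_srg_violation(n, k, lam, mu, color):
    """Return True only if the partial trigraph cannot extend to an
    (n, k, lam, mu) strongly regular graph, using rules 1-4 above.
    color maps each pair (i, j) with i < j to 1 (edge), 0 (nonedge),
    or '*' (undecided)."""
    def c(i, j):
        return color[(min(i, j), max(i, j))]
    def deg(v):       # number of definite edges at v
        return sum(1 for u in range(n) if u != v and c(u, v) == 1)
    def nondeg(v):    # number of definite nonedges at v
        return sum(1 for u in range(n) if u != v and c(u, v) == 0)
    def common(u, v): # common neighbors via definite edges only
        return sum(1 for w in range(n)
                   if w != u and w != v and c(u, w) == 1 and c(v, w) == 1)
    for v in range(n):
        if deg(v) > k or nondeg(v) > n - k - 1:
            return True
    for u, v in combinations(range(n), 2):
        if c(u, v) == 1 and common(u, v) > lam:
            return True
        if c(u, v) == 0 and common(u, v) > mu:
            return True
    return False

# Partial trigraph on 5 vertices: a black triangle 0-1-2, all else gray.
color = {p: "*" for p in combinations(range(5), 2)}
for pair in [(0, 1), (0, 2), (1, 2)]:
    color[pair] = 1
# With (n,k,lambda,mu) = (5,2,0,1), edge (0,1) already has a common
# neighbor, but lambda = 0, so no extension can be strongly regular.
print(detect_srg_violation(5, 2, 0, 1, color))  # True
```

By contrast, the complete trigraph of the 5-cycle (the (5,2,0,1) strongly regular graph) triggers no rule.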

Given algorithms for \mathsf{Detect}_P(\mathbf{x}) and \mathsf{Propagate}_P(\mathbf{x}), we can build a full search algorithm using branch-and-bound. One missing ingredient is the \mathsf{Unassigned}_P(\mathbf{x}) function which selects an index i of an unassigned variable (with x_i = *). Such a selection could be arbitrary, but it could also be carefully selected. This algorithm allows for dynamic variable ordering, which changes which variable we shall assign next in hopes that certain constraints will become sharp and the \mathsf{Propagate}_P(\mathbf{x}) algorithm can assign several new values in one step. The full branch-and-bound algorithm is given as \mathsf{BranchAndBound}_P(\mathbf{x}) in Algorithm 1.

Algorithm 1. \mathsf{BranchAndBound}_P(\mathbf{x})
if \mathsf{Detect}_P(\mathbf{x})
     return
end if
\mathbf{y} \leftarrow \mathsf{Propagate}_P(\mathbf{x})
if \mathbf{y} \in \{0,1\}^m
    if P(\mathbf{y}) = 1
        output \mathbf{y}
    end if
    return
end if
i \leftarrow \mathsf{Unassigned}_P(\mathbf{y})
y_i \leftarrow 0
call \mathsf{BranchAndBound}_P(\mathbf{y})
y_i \leftarrow 1
call \mathsf{BranchAndBound}_P(\mathbf{y})
return

By initializing this recursive algorithm on the empty assignment \mathbf{x} = (*,\dots,*), all possible feasible solutions \mathbf{x} with P(\mathbf{x}) = 1 will be written to output.
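Algorithm 1 can be sketched in runnable Python on a toy instance (all names are mine; the detector and propagator are the trivial ones discussed earlier, slightly strengthened for the detector).

```python
def branch_and_bound(x, P, detect, propagate, out):
    """Algorithm 1 on Python lists, with '*' marking unassigned variables.
    Appends every feasible full assignment to out."""
    if detect(x):
        return
    y = propagate(x)
    if "*" not in y:
        if P(y):
            out.append(tuple(y))
        return
    i = y.index("*")              # Unassigned_P: first unassigned variable
    for value in (0, 1):
        y2 = list(y)
        y2[i] = value
        branch_and_bound(y2, P, detect, propagate, out)

# Toy property: exactly two 1s among four variables.
P = lambda x: sum(x) == 2
detect = lambda x: sum(1 for v in x if v == 1) > 2  # too many 1s already
propagate = lambda x: list(x)                       # trivial propagator
out = []
branch_and_bound(["*"] * 4, P, detect, propagate, out)
print(len(out))  # 6: the 2-subsets of 4 positions
```

Starting from the empty assignment, all six feasible vectors are written to output, as the text describes.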

While branch-and-bound is a complete algorithm, it has no concern for the symmetries of the objects represented by the variable assignments. That is, perhaps the property P is invariant under a certain set of permutations and so we should only generate variable assignments up to isomorphism. The next section defines these symmetries.

Constraint Symmetries

Let \sigma \in S_m be a permutation of the index set \{1,\dots,m\}. Given a vector \mathbf{x} = (x_1,\dots, x_m), the vector \mathbf{x}_\sigma has value \mathbf{x}_{\sigma} = (x_{\sigma(1)}, x_{\sigma(2)}, \dots, x_{\sigma(m)}). That is, \sigma acts on the variables. An automorphism of a constraint problem P is a permutation \sigma \in S_m so that for every vector \mathbf{x} \in \{0,1\}^m, P(\mathbf{x}) = P(\mathbf{x}_\sigma). The set of automorphisms of P forms a group, denoted \mathsf{Aut}(P).
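This action and the automorphism condition can be checked directly in Python (a small sketch with 0-indexed permutations; the names are mine):

```python
def act(x, sigma):
    """The vector x_sigma from the definition: entry i becomes
    x_{sigma(i)} (0-indexed here)."""
    return tuple(x[sigma[i]] for i in range(len(x)))

# P = "exactly one 1 among the first two variables" is invariant under
# swapping those two variables, so that swap is an automorphism of P.
P = lambda x: (x[0] + x[1]) == 1
swap = (1, 0, 2)
all_vectors = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
print(all(P(x) == P(act(x, swap)) for x in all_vectors))  # True
```

Exchanging the third variable with the first, by contrast, would not be an automorphism of this P.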

Example. When the variable assignment \mathbf{x} corresponds to a trigraph G_{\mathbf{x}} (as in the case of strongly regular graphs), we consider a permutation \tau \in S_n to be a colored isomorphism between two trigraphs G and H if for all pairs v_iv_j, the color of v_{\tau(i)}v_{\tau(j)} in H matches the color of v_iv_j in G. A property P is invariant under colored isomorphisms if P(G) = P(H) for any two isomorphic trigraphs. Therefore, for a property P that is invariant under colored isomorphisms, \mathsf{Aut}(P) \supseteq \{ \sigma_\tau : \tau \in S_n \}, where \sigma_\tau \in S_{n\choose 2} maps the pair v_iv_j to v_{\tau(i)}v_{\tau(j)}. That is, any permutation of the vertices corresponds to a map of the pairs, and P is invariant under these permutations. End Example.

Since we shall be making changes to the variable assignment \mathbf{x}, we want to track the symmetries of the system with respect to the current variable assignment. Thus, let \mathsf{Aut}_{\mathbf{x}}(P) be the set of permutations of the variable set so that P is invariant under all extensions of the given variable assignment, and a variable x_i is mapped to a variable with the same value: \mathsf{Aut}_{\mathbf{x}}(P) = \{ \sigma \in \mathsf{Aut}(P) : \forall i\in \{1,\dots,m\}, x_i = x_{\sigma(i)}\}.

Example. Recall that the automorphism group of a trigraph can be computed (for instance, with nauty, as in the symmetry model above). End Example.

We shall focus on the action of this group on the variables. The orbit of a variable x_i is the set {\mathcal{O}}_i = \{ x_j : \exists \sigma\in \mathsf{Aut}_{\mathbf{x}}(P), j = \sigma(i)\}. Using these symmetries and considering orbits of unassigned variables instead of single variables, we can extend branch-and-bound to be symmetry-aware and reduce the number of isomorphic duplicates.
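Given generators of the automorphism group (for instance, as returned by nauty), the orbits of the variables can be computed by a closure search. A small Python sketch (names mine; permutations are 0-indexed tuples mapping index to index):

```python
def orbits(m, generators):
    """Orbits of {0, ..., m-1} under the group generated by the given
    permutations. Forward closure under the generators suffices, since a
    permutation of a finite set restricted to a forward-closed set is a
    bijection on it."""
    seen, result = set(), []
    for i in range(m):
        if i in seen:
            continue
        orbit, frontier = {i}, [i]
        while frontier:
            j = frontier.pop()
            for g in generators:
                k = g[j]
                if k not in orbit:
                    orbit.add(k)
                    frontier.append(k)
        seen |= orbit
        result.append(sorted(orbit))
    return result

# A cyclic rotation of 4 variables puts them all in one orbit.
print(orbits(4, [(1, 2, 3, 0)]))  # [[0, 1, 2, 3]]
# A swap of the first two variables leaves the rest in singleton orbits.
print(orbits(4, [(1, 0, 2, 3)]))  # [[0, 1], [2], [3]]
```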

Orbital Branching

Given a partial variable assignment \mathbf{x}, we compute the colored automorphism group \mathsf{Aut}_{\mathbf{x}}(P) and use this action on the variables to compute orbits. Orbital branching extends the standard branch-and-bound technique by selecting an orbit of unassigned variables. From a selected orbit {\mathcal{O}}, there are two branches:

  1. Select a representative i_1 \in {\mathcal{O}} and assign the variable x_{i_1} = 0.
  2. For all representatives i \in {\mathcal{O}}, assign the variable x_i = 1.

Notice that if there are non-trivial automorphisms, then there exists an orbit of size at least two, so the second branch assigns more than one variable at a time. The reason this process works is that assigning x_i = 0 for any representative i \in {\mathcal{O}} leads to the same variable assignment up to isomorphism. Since setting any one variable to zero results in an isomorphic variable assignment, we try just one, and in the second branch we may set all of the variables in the orbit to the other value. Figure 1 shows the difference between branch-and-bound and orbital branching.

Figure 1. Comparing standard branch-and-bound to orbital branching.

Algorithm 2 gives a full description of the orbital branching method. The algorithm follows a similar pattern to Algorithm 1 but uses a subroutine, denoted \mathsf{UnassignedOrbit}_P(\mathbf{x}), to select an orbit of unassigned variables instead of just a single variable.

Algorithm 2. \mathsf{OrbitalBranching}_P(\mathbf{x})
if \mathsf{Detect}_P(\mathbf{x})
     return
end if
\mathbf{y} \leftarrow \mathsf{Propagate}_P(\mathbf{x})
if \mathbf{y} \in \{0,1\}^m
    if P(\mathbf{y}) = 1
        output \mathbf{y}
    end if
    return
end if
{\mathcal{O}} \leftarrow \mathsf{UnassignedOrbit}_P(\mathbf{y})
i_1 \leftarrow \min \mathcal{O}
y_{i_1} \leftarrow 0
call \mathsf{OrbitalBranching}_P(\mathbf{y})
for all i \in {\mathcal{O}}
    y_i \leftarrow 1
call \mathsf{OrbitalBranching}_P(\mathbf{y})
return

Example. In the case of variable assignments corresponding to trigraphs, the first branch in the orbital branching algorithm selects a pair from the orbit and assigns that pair to be a nonedge. No matter which representative pair is selected from the orbit, the resulting trigraph is isomorphic to the result of any other choice of representative, so an arbitrary representative suffices. Moreover, any assignment of edges and nonedges in which some pair of the orbit becomes a nonedge is isomorphic to a trigraph where our selected representative is a nonedge. Therefore, after checking all extensions where this representative is a nonedge, we may assume that every pair from the orbit is an edge in the remaining extensions. End Example.

Note. The orbital branching procedure could easily change to allow that in one branch the variable y_{i_1} is set to 1 instead of 0 and in the other branch all variables of the orbit are assigned the value 0. Also, the choice of which branch has which value can be a dynamic decision. In some problems, the variables are not binary-valued, and in this case we must select a single value to send to the whole orbit and the other values must be individually tested on the given variable.
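On a toy instance the savings are easy to see. The sketch below (names mine; trivial propagator) uses the branching scheme described above: the representative gets 0 in the first branch and the whole orbit gets 1 in the second. The property "exactly two 1s among four variables" is invariant under every permutation, so in this instance all unassigned variables always form one orbit.

```python
def orbital_branching(x, P, detect, unassigned_orbit, out):
    """Algorithm 2 with a trivial propagator; '*' marks unassigned."""
    if detect(x):
        return
    if "*" not in x:
        if P(x):
            out.append(tuple(x))
        return
    O = unassigned_orbit(x)
    y = list(x)
    y[min(O)] = 0              # branch 1: one representative becomes a 0
    orbital_branching(y, P, detect, unassigned_orbit, out)
    y = list(x)
    for i in O:                # branch 2: the whole orbit becomes 1
        y[i] = 1
    orbital_branching(y, P, detect, unassigned_orbit, out)

P = lambda x: sum(x) == 2
detect = lambda x: sum(1 for v in x if v == 1) > 2
orbit = lambda x: [i for i, v in enumerate(x) if v == "*"]
out = []
orbital_branching(["*"] * 4, P, detect, orbit, out)
print(out)  # [(0, 0, 1, 1)]
```

Plain branch-and-bound outputs all six 2-subsets of the four positions; orbital branching outputs a single representative of the one isomorphism class.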

Branching Rules

In the previous description of the orbital branching algorithm, we left \mathsf{UnassignedOrbit}_P(\mathbf{x}) as an arbitrary algorithm. An instance of this algorithm is called a branching rule, as it dictates which variables are assigned in the current branch. In the original work by Ostrowski et al., considerable attention was given to these branching rules. One reason is that they were working with combinatorial optimization problems where the “bounding” part of branch-and-bound is particularly useful. Hence, several of the branching rules used a continuous relaxation of the problem as advice for choosing an orbit.

Since we mostly consider exhaustively generating all feasible solutions to a combinatorial problem where the constraints are rarely simple to describe in a continuous setting, we ignore the original branching rules. Instead, we focus on the example of searching over trigraphs for our rules.

Example. Suppose we wish to select an orbit of unassigned pairs from a trigraph G. The following list contains a few potential branching rules along with positive and negative effects.

  • LargestOrbit: Select an orbit of highest order.
    • Pro: Maximizes the amount of change in the second branch.
    • Con: Ignores the constraints. Can quickly remove symmetry from the graph.
  • LargestConnOrbit: Maintain a set S \subseteq V(G) of vertices where every vertex in S is contained in at least one assigned pair. Select an orbit where all pairs have both endpoints in S (if possible) or at least one endpoint in S (otherwise). From these orbits, select the one of largest order.
    • Pro: Builds graphs by filling in all pairs from a given set first. Attempts to maximize change in second branch.
    • Con: Some of the orbits may be small when S is nearly full, resulting in low symmetry for later orbit calculations.
  • MaxDegreeSumOrbit: Select an orbit of pairs uv where the degree and non-degree sums of u and v are maximum. Break ties by orbit size.
    • Pro: Attempts to maximize tightness of constraints by fully specifying the edges and nonedges at vertices of high degree and nondegree.
    • Con: The first branch leads to a single vertex with high nondegree, which leads to the orbit sizes decreasing rapidly.
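The simplest of these rules, LargestOrbit, amounts to one line once the orbits are in hand. A sketch (names mine), with ties broken by the smallest member for determinism — an assumption, not a rule from the post:

```python
def largest_orbit_rule(unassigned_orbits):
    """LargestOrbit: pick an orbit of maximum size, maximizing the
    number of variables fixed in the second branch."""
    return max(unassigned_orbits, key=lambda O: (len(O), -min(O)))

print(largest_orbit_rule([[0, 1], [2, 3, 4], [5]]))  # [2, 3, 4]
```

The other rules differ only in the key function: LargestConnOrbit filters by endpoints in S before maximizing, and MaxDegreeSumOrbit keys on degree and non-degree sums with orbit size as a tie-break.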

Orbital Branching and Canonical Deletion

From our example of searching for strongly regular graphs, we see that orbital branching can be used to search for combinatorial objects. This brings the technique into the combinatorial realm, where it can compete with canonical deletion. Orbital branching may seem weaker than canonical deletion, since we have no guarantee that every object is visited at most once. Even worse, when orbital branching generates an object with no automorphisms, the technique becomes no better than brute-force search. However, there are situations where orbital branching is significantly more efficient than canonical deletion.

One reason the orbital branching technique may work better than canonical deletion is that the techniques have strengths in two opposite areas. Canonical deletion focuses on removing all isomorphic duplicates, but there is no known method to incorporate constraint propagation with canonical deletion. Orbital branching is built to naturally allow constraint propagation, but it only reduces the number of isomorphic duplicates.

Another reason is that the augmentation step for orbital branching involves exactly one automorphism calculation and two augmentations. In canonical deletion, every possible augmentation must be attempted (up to isomorphism) and checked to see if it is a canonical augmentation. Depending on the augmentation step, this can be a very costly operation.

In the next post, we will investigate an extension of the orbital branching technique with a customized augmentation for a specific problem. This augmentation integrates well with orbital branching, but our attempt to integrate it with canonical deletion was less efficient. Our implementation with orbital branching is a very efficient algorithm; by executing the algorithm, we discovered several new graphs with the desired properties.

References

G. L. Nemhauser, M. W. P. Savelsbergh, and G. C. Sigismondi. MINTO, a Mixed INTeger Optimizer. Operations Research Letters, 15:47–58, 1994.

J. Ostrowski, J. Linderoth, F. Rossi, and S. Smriglio. Orbital branching. In IPCO ’07: Proceedings of the 12th International Conference on Integer Programming and Combinatorial Optimization, volume 4513 of Lecture Notes in Computer Science, pages 104–118, Berlin, Heidelberg, 2007. Springer-Verlag.

J. Ostrowski, J. Linderoth, F. Rossi, and S. Smriglio. Constraint orbital branching. In A. Lodi, A. Panconesi, and G. Rinaldi, editors, Integer Programming and Combinatorial Optimization, 13th International Conference, IPCO 2008, Bertinoro, Italy, May 26-28, 2008, Proceedings, volume 5035 of Lecture Notes in Computer Science. Springer, 2008.

J. Ostrowski, J. Linderoth, F. Rossi, and S. Smriglio. Solving large Steiner triple covering problems. Technical Reports of the Computer Sciences Department, University of Wisconsin-Madison, (1663), 2009.

S. G. Hartke and D. Stolee. Uniquely K_r-saturated graphs. The Electronic Journal of Combinatorics, 19(4), #P6, 41 pp., 2012.
