### A Visual Guide to Combinatorial Search

#### by Derrick Stolee

Today, we investigate a way to visualize *combinatorial search* in a way that touches on most of the computational and mathematical concerns. Throughout the description, we shall refer to a common example of generating graphs using edge augmentations. Since we’ve already seen some combinatorial searches before (see Introduction to Canonical Deletion and Introduction to Orbital Branching for examples), some of this discussion will seem familiar.

*Disclaimer: I adapted this post from a chapter of my thesis.*

Suppose a combinatorial search is defined by searching for combinatorial objects from a family of *labeled objects*. Under the appropriate definition of isomorphism for those objects, let be the isomorphism relation and be the family of *unlabeled objects*: the equivalence classes under . Let be a *property*, and we wish to generate all objects in where . We shall assume the property is *invariant* under isomorphism (): for all unlabeled objects and labeled objects , . In this case, we can define for an unlabeled object to be equal to for any labeled object .

**Example.** Let be the set of graphs of order . Then is the family of unlabeled graphs where the standard relation of isomorphism () is used between graphs. The property could be if and only if is 4-regular and has chromatic number three. **End Example.**

### Base Objects and Augmentations

A combinatorial search consists of a set of *base objects* and an *augmentation*.

Let be a set of labeled objects For a labeled object , the augmentation defines a set of *augmented objects*. Let be the set of *deleted objects*, defined as We shall consider our objects as being built *up*, so the set of augmented objects can be called the *above* objects while the deleted objects are the *downward* objects.

For isomorphism concerns, we shall assume that for an unlabeled object , any two labeled objects have a bijection so that for all , . This allows us to define the augmented and deleted objects and for unlabeled objects as well.

**Example.** For enumerating all graphs of order , the empty graph of order can serve as a base object. The augmentation step from a graph may be adding an edge to . Therefore, and . **End Example.**

### Search as a Poset

This relationship between augmented and deleted objects defines a partial order on objects in . For , let be a cover relation if and only if (equivalently, ). Extending these cover relations by transitivity makes a partial order over . By our assumptions on unlabeled objects, defines a partial order over .

The combinatorial search is *complete* if every unlabeled object with , there is a base object so that for some . A complete search ensures that every unlabeled object is visited at least once.

Figure 1 visualizes the poset and shows the unlabeled objects with property as dots.

**Example.** When generating graphs by edge augmentations, two unlabeled graphs and have if and only if is isomorphic to a subgraph of . Since the empty graph is a subgraph of every graph, this search is complete. **End Example.**

### Algorithm Structure

Consider the search space as a directed graph with vertex set (every vertex is a labeled object) and edge set given by an edge from every to every . This graph is also the Hasse diagram of the poset .

The combinatorial search algorithm is a depth-first search on starting at every base object in . This very basic view of the process is given as Algorithm 1.

**Algorithm 1.** CombinatorialSearch1()

ifthenoutputend iffor allcallCombinatorialSearch1()end for

**Example.** Algorithm 2 demonstrates a specific instance of Algorithm 1 where the objects are graphs, the augmentation involves adding edges, and the property if and only if is 4-regular and has chromatic number three. **End Example.**

**Algorithm 2.** GraphSearch()

ifand}thenoutputend iffor alledgescallGraphSearch1()end for

Note that this process generates all labeled objects regardless of whether they can eventually lead to solutions. Further, the algorithm must operate on labeled graphs, so it currently does not remove multiple representatives of the same unlabeled object.

### Sub-solutions and Pruning

A labeled object (or unlabeled object ) is a *sub-solution* if there exists a labeled object (or unlabeled object ) with and (or and ). In the poset, the sub-solutions form a *down-set* generated by the objects with property . Since these objects have some sequence of augmentations which lead to a solution, we must generate every unlabeled sub-solution at least once.

We also hope that we could immediately detect when our current object is not a sub-solution (there is no sequence of augmentations which leads to a solution). If we could immediately determine if the object is not a sub-solution, we could ignore these objects and generate only the sub-solutions. Unfortunately, the space of objects where we can efficiently detect that the object is not a sub-solution is not the complement of the sub-solution space. Figure 2 shows the regions of sub-solutions and those that are detectably not sub-solutions and there is a gap.

Suppose we create a procedure called Detect() which takes a labeled object and returns only if is not a sub-solution. We can modify the combinatorial search to utilize this procedure, as in Algorithm 3.

**Algorithm 3.** CombinatorialSearch2()

ifthenoutputend ififDetect()thenreturnend iffor allcallCombinatorialSearch2()end for

**Example.** For our example of generating 4-regular graphs with chromatic number three, a graph with maximum degree at least five cannot extend to a 4-regular graph by adding edges. Similarly, a graph with chromatic number at least four cannot extend to a three-chromatic graph. Algorithm 4 demonstrates a specific instance of Algorithm 3 by detecting non-sub-solutions using and . **End Example.**

**Algorithm 4.** GraphSearch2()

ifandthenoutputend ififorthenreturnend iffor alledgescallGraphSearch1()end for

An ideal path through the search space is to always remain within the sub-solutions and always hit a solution no matter what path is taken. However, this is not always possible. Sometime the augmentation will lead to an object which is not a sub-solution, but it takes a few more steps before reaching an object which is detectably not a sub-solution. Figure 3 demonstrates this issue.

In a non-ideal path, the number of steps between reaching a non-sub-solution and actually detecting that there is no solution greatly changes the run time of an algorithm. In this region, the algorithm is *thrashing*: augmenting in all possible ways for several steps before backtracking, with no hope of finding a solution. To reduce the run time, the gap between sub-solutions and detectably non-sub-solutions must be reduced. This step requires modifying the augmentation step or proving a theorem.

### Number of Paths to Each Unlabeled Object

Consider an unlabeled object and a base object . Let be the unlabeled object for . The interval between and contains all unlabeled object so that (Figure 4 shows such an interval). As stated before, every object in this interval must be generated at least once in order to guarantee that the search is complete.

However, if we are not careful, the Hasse diagram on this interval can be a tightly woven network. The number of times a labeled object is generated is equal to the number of paths in this graph.

**Example.** When generating a 4-regular graph on vertices by edge augmentations, there are a total of total edges in . If we assume that almost every subgraph of is distinct up to isomorphism (which is not that much of an assumption, since almost all graphs have no non-trivial automorphisms), then there are objects in the down-set generated by . Not only is that a large number of subobjects, but there are up to different ways to order the edges and build by adding edges in that order. Since , this is asymptotically worse than just generating all subobjects. Further, this is based on the number of ways to build as a *labeled* object. If is a rigid graph, there are different labeled graphs isomorphic to , and hence possible ways to generate this isomorphism class. To avoid such drastic costs, something must be done to reduce isomorphic copies of . **End Example.**

In an ideal world, every unlabeled object is generated at most once. Thus, there is at most one path in the Hasse diagram from any base object to any unlabeled object. This creates a tree structure to the poset, as seen in Figure 5.

### Count and Cost Tradeoff

There is an important tradeoff to consider when designing computational searches. The total computation time can be modeled as the amount of computation per object: . Thus, **Total Time** = **Number of Objects** **Avg. Computation per Object}** Depending on the problem considered, different augmentation steps can change either the number of objects in the space or the average computation per object. Figure 6 models computation time as the amount of black in the figure. In Figure 6.a, there are many nodes but they require very little cost per node. In contrast, Figure 6.b has fewer nodes but each requires more cost. Which has less computation time? How could you tell before implementing and executing the search?

Similar tradeoffs occur with different methods to reduce isomorphic duplicates. One technique may remove all duplicates, but be overwhelmingly expensive to perform the computations. Another technique may be very quick per object, but suffers from combinatorial explosion in the number of isomorphic duplicates that appear. Finding the balance between these costs requires experience and experimentation.

**Example.** When generating graphs, it is almost always the case that generating isomorphic duplicates leads to combinatorial explosion. Using a technique such as canonical deletion to remove isomorphic duplicates is a well-studied technique. However, this techniques works for augmenting by edges *or* augmenting by vertices. It is almost always the case that augmenting by edges requires at least as much computation per node as augmenting by vertices, except the number of objects that are visited is significantly more for edge augmentations. This is because edge augmentations visit every subgraph while vertex augmentations only visit *induced* subgraphs. Since induced subgraphs include knowledge about *non-edges*, more is known about the supergraphs that can be generated later, and non-sub-solutions may be detected earlier.

However, there are always counterexamples to the common theme: In a later post, we will find that augmenting by *ears* is very similar to augmenting by edges, but it allows a certain invariant to be monotonic in these augmentations where it would not necessarily be for vertex augmentations. Further, in this example the ear augmentations are intimately tied to the structure of the target graphs and greatly assists the detection of non-sub-solutions.

When we generate uniquely -satuated graphs, we also go against the typical pattern by allowing isomorphic duplicates in favor of a shorter computation time per object. While this increases the number of labeled objects in total, the reduced amount of computation is significant and allows the search to expand to graphs of order while other techniques became intractable around vertices. **End Example.**

### Partitioning the Search Space and Parallelization

If we are given a tree-like search space, we can partition the search space by sub-trees.

Given a base object , the objects at the th level (denoted ) are those which are generated from by performing augmentations. The objects in then partition the space above the th level: for , let . That is, the up-sets are disjoint and hence partition the space.

From this partition, since the recursive algorithm does not preserve information between recursive calls at the same level, these parts of the search space can be run independently. With appropriate communication protocols, every part can be run in a different process or even a different computation node. This allows for arbitrary parallelism by generating all objects in for some then running the search starting at every .

In a later post, I will discuss the *TreeSearch* library, which is my own abstract implementation of a combinatorial search tree. The library handles all of the parallelization of the subtrees, so any implementation using *TreeSearch* has access to these tools.

#### See Also

Introduction to Canonical Deletion

Small Graphs are Reconstructible

Introduction to Orbital Branching

Uniquely -Saturated Graphs: Experiment #1

#### References

D. Stolee, *TreeSearch* User Guide. Software available with the *SearchLib* collection.

In your definition of D(X), do you mean to write D(X) = {Y in L : X in A(Y)} instead of Y in A(X)?

Absolutely, I do. Thanks!