### Canonical Labelings with Nauty

Here, I’ll describe how to actually compute a canonical labeling using nauty. The objects we will consider are graphs, directed graphs, and (vertex or edge) colored versions of those objects. This will lead into the next post where we will use nauty to compute orbits in these objects.

Most of what I discuss here is more thoroughly described in the nauty User’s Guide. However, some of the constructions are found in hidden corners of the guide. This is simply an attempt to present another perspective on the ways to interact with nauty.

### Graphs

Included in the nauty library are two data structures: graph and sparsegraph. The graph structure is a bit-packed adjacency matrix while the sparsegraph structure is a single-dimensional-array version of an adjacency list. I prefer using the sparsegraph because it is easier to work with and I do not find a large performance loss using this structure. (One of my personal goals is to have readable code, and sparsegraph makes that goal easier, too.) Every call to nauty has a graph pointer in the function definition, but passing a sparsegraph pointer will also work.

A sparsegraph $g$ has three main arrays: $g.v$, $g.d$, and $g.e$. The array $g.e$ stores the actual links for the adjacency list, $g.v[i]$ stores the index within $g.e$ to begin the adjacencies corresponding to the vertex $i$, and $g.d[i]$ stores the degree of the vertex $i$ (and hence the number of consecutive positions within $g.e$ that correspond to the vertex $i$). Variables $g.nv$ and $g.nde$ store the number of vertices and number of directed edges (so, twice the number of undirected edges), respectively.

The array $g.e$ does not have to be compressed. In fact, since we are typically augmenting a graph to add vertices and edges, I let $N$ be the maximum order of a generated graph and store $g.v[i] = N * i$ and have the array $g.e$ have $N^2$ positions.

#### The call to nauty

To call nauty, we need a few bookkeeping items. The first is an options structure, which can be initialized with the line
static DEFAULTOPTIONS_SPARSEGRAPH(options);
when using sparsegraph. Then, the options can give several important options to the nauty function. Most importantly, this controls whether the graph is directed and whether we want a canonically-labeled graph in the result.

Also, we’ll need what is called a workspace. This is a set of memory that the nauty function will use for all of its data. It will not allocate heap memory other than this workspace. See the code to see how this is set up.

It is best to see the actual code to see how everything is set up, so I have created a source code file nautyexamples.c. This file contains the canonical labeling methods descibed here, as well as the constructions described below. See the comments at the top of the page for compilation instructions and the comments above the main() method to see how to send arguments and input to the program. (You can disable the main() method and use it as a library if you want.)

### Directed Graphs

To store a directed graph, simply use the graph or sparsegraph strucutres without bidirectional edges, the same way an adjacency matrix or adjacency list loses symmetry. Keep in mind, now, that $g.nde$ is actually equal to the number of edges. (In general, these structures consider an undirected graph simply as a directed graph with the property that the edge $uv$ exists if and only if the edge $vu$ exists.)

To run nauty with a directed graph, simply change the option opt.directed to TRUE. This tells the nauty algorithm to be aware of direction when building the canonical labeling. This is the only change required, including in the following two colored versions of the canonical labeling.

### Vertex-Colored Graphs

Consider pairs $(G,c)$ where $G$ is a graph and $c: V(G) \to \{0,\dots,k-1\}$ is a (possibly improper) $k$-coloring of $G$. Over these pairs, we can define a color-preserving isomorphism from $(G, c)$ to $(H, c')$ as a bijection $\pi : V(G) \to V(H)$ such that $\pi$ is an isomorphism from $G$ to $H$ and $c(v) = c'(\pi(v))$ for all $v \in V(G)$. We will wish to compute canonical labelings for these isomorphism classes.

This type of coloring is exactly what the lab and ptn arrays are used for as input to nauty. The ptn array marks the positions of a partition of the vertices, and the vertices of each part are given by the order in lab.

To reflect a coloring $c : V(G) \to \{0,\dots,k-1\}$, iteratively place the vertices in $c^{-1}(i)$ to the beginning of lab, for all $i \in \{0,\dots,k-1\}$ in increasing order. Let $s_i = \sum_{j=0}^{i} |c^{-1}(i)|$ and let ptn have positive value everywhere except value 0 at the positions $s_i - 1$ for each $i \in \{0,\dots,k-1\}$. The 0’s dictate the end of a part. The ordering of $c^{-1}(i)$ doesn’t matter, only that $c^{-1}(i)$ appears before $c^{-1}(i+1)$.

To have nauty pay attention to your partition, set options.defaultptn = FALSE.

Now, running nauty with these changes will return a canonical labeling under the conditions that vertices with color $i$ are within the positions occupied by $c^{-1}(i)$. Therefore, the order of your colors is important for the canonical labeling.

#### Permuting the Colors

Sometimes we think of “non-isomorphic colorings” as having no order to the color set and hence the colors can be permuted. This type of “isomorphic coloring” can be tested by adding a new vertex for each color adjacent to its color class. Then, use the current method to 2-color the graph as “regular vertices” and as “color vertices” to avoid these two types of vertices from interacting. (See “Exchangeable vertex colours” in Section 12 of the Users’ Guide for more.)

### Edge-Colored Graphs

Suppose we instead have pairs $(G, c')$ where $G$ is a graph and $c'$ is an edge-coloring $c' : E(G) \to \{0,\dots,k-1\}$ and wish to canonically label these pairs under the same kind of isomorphism as vertex-colored graphs. We cannot simply partition the edges of the graph, so we instead build a layered graph.

#### The $k$-Layered Graph

If we use $k$ colors, the simplest thing to do is to replace each vertex $v_j \in V(G)$ with some fixed connected graph of $k$ vertices $v_j^{(0)},\dots,v_j^{(k-1)}$. (A path is easy, but a cycle is even simpler to implement.) Then, if an edge $v_jv_{j'}$ has color $i$, we add an edge between $v_{j}^{(i)}$ and $v_{j'}^{(i)}$. Finally, we partition these vertices by the superscripts: $V_i = \{ v_0^{(i)},\dots, v_{n-1}^{(i)}\}$. Now, any permutation of $V_0$ carries over to a permutation to all $V_i$ by the path that replaced each vertex. By looking only at the order on $V_0$, we have a canonical label of $G$.

#### The Binary Layered Graph

The $k$-layered graph has one big issue: when $k$ is large, we are making many layers and this is wasteful. Instead, let $\ell = \lceil \log_2 (k+1)\rceil$ and use $\ell$ layers. We will associate the color $i$ with the $\ell$-bit binary representation ${\bf x}$ of the number $i+1$ and for every edge $v_jv_{j'}$ colored $i$ we will place an edge between the copies of $v_j$ and $v_{j'}$ in all layers with a 1 in ${\bf x}$. Now, we have only a logarithmic number of layers, but still a layer-preserving isomorphism between two binary layered graphs must have an edge appear in exactly the right arrangement of layers, so color is preserved.

#### Trigraphs

Most of the previous constructions were given in a very hypothetical sense, so let’s see a very important application of these ideas.

Suppose we are generating graphs by starting with $n$ vertices and adding edges and non-edges. What I mean is this: every pair of vertices starts as “undecided” and we will augment the structure by setting some pairs to “edges” and some pairs to “non-edges”. Essentially, we are starting with an uncolored copy of $K_n$ and coloring the edges with 1 (edge) and 0 (non-edge).

A trigraph is a 3-edge-colored copy of $K_n$ where all pairs have color black (edge), white (non-edge), or gray (undecided). As combinatorial objects, these have been thought of as partial realizations of actual graphs and the “full” realizations change the gray edges to black or white to create a traditional graph. For this view, see Chudnovsky (2006) or Martin and Smith (2012) for some applications.

In the graph generation case, Hartke and I (2012+) used this to generate uniquely $K_r$-saturated graphs. In the implementation, we used a 2-layered graph to store the trigraphs thinking of the gray edges as non-edges and black and white edges as a 2-coloring of a certain graph. Letting the gray edges be non-edges in the layered graph is possible only because every pair has one of three colors. Specifically, the gray edges start as the majority of edges, so to keep the structure spare in early steps we leave out the gray edges.

#### Permuting the Colors

Again, if we with to allow the colors to be permuted when testing isomorphism of two edge-colorings, we must use a different construction. First, for $k$ colors we must use the $k$-layer construction, where every layer corresponds to a color. Also, all copies of a vertex $v_i$ must be pairwise adjacent to allow for any permutation of the layers. Finally, instead of partitioning the layers, make every vertex in layer $i$ adjacent to some new vertex $u_i$ and place $u_0,\dots,u_{k-1}$ into a second partition (separating “normal vertices” from “color vertices”).

### Exercise: Totally-Colored Graphs

A total coloring is a coloring of both the vertices and edges. Try to think of a construction that allows a canonical labeling of a totally-colored graph. What happens when we wish to permute the colors? Be careful: if the color $i$ changes, it should change to the same color for both the vertices and edges!

### References

#### nauty

B. McKay, nauty User’s Guide.

#### Trigraphs

M. Chudnovsky. Berge trigraphs. J. Graph Theory, 53(1), pages 1-55, (2006).

R. Martin, J. Smith, Induced saturation number, Discrete Mathematics, 312(21), pages 3096-3106 (2012).

S. Hartke, D. Stolee, Uniquely $K_r$-Saturated Graphs, preprint.