Tips and Tricks: Using gtools

by Derrick Stolee

If you want to get down and dirty with computational methods but don’t want to do a lot of programming, you can get started now using Brendan McKay‘s gtools software, part of the nauty library. It’s free software, but you’ll need to compile it yourself. This software allows you to generate graphs in several different ways, usually finding all graphs with certain properties up to isomorphism. The output is not human-readable, but you can use the output to create graphs in Sage or just count the number of graphs with that property. Everything I write below is available in the user manual, but read ahead for a slightly different treatment.

Suppose we want to check a conjecture on all graphs of a given order, say graphs on 8 vertices. The number 8 is usually big enough to find strange examples but small enough to not have too many graphs (10 vertices is even better for strange examples, like the Petersen graph, and is still reasonable, time-wise, where reasonable is “about a day”).  With geng, we can quickly generate these graphs. Run:

geng 8

What follows is a list of all 12,346 graphs of order 8. The output is difficult to read. Here are some lines of the output:

GTzvv[
GTzvv{
GTzvn[
GTzvn{
GTzv^[

Clearly, this does not look like a list of graphs to us humans. However, this is a very compressed representation of the adjacency matrix of a graph, called graph6 format. You can also ask geng for the output in sparse6 format (a compressed version of the adjacency list representation) by using the -s flag. One way to understand the graphs that are output is to use the Sage graph library, such as:

G = Graph("GTzvv["); G.show()

and it would output a picture:

Clearly, going through all 12,346 graphs by hand is not an option. It may first help to reduce the number of graphs we are checking. For example, perhaps you really only care about connected graphs. Then, add the -c flag to only output the list of connected graphs. If you know other restrictions, consider the following list of parameters which restrict the properties of the output graphs:

  • #:$ : at least # edges and at most $ edges
  • -d# : minimum degree at least #
  • -D# : maximum degree at least #
  • -c : connected
  • -C : 2-connected
  • -t : triangle-free
  • -f : C_4-free
  • -b : bipartite

Finally, perhaps you don’t even care about the graphs themselves, but how many there are. Then, use the -u flag to skip the graph output and simply output the number of graphs. So, if I wanted to find out how many connected, triangle-free, cubic graphs there are of order 10, I could type

geng 10 -d3D3 -ctu

and the output would be

>A ./geng -ctd3D3 n=10 e=15
>Z 6 graphs generated in 0.00 sec

If you do require a further analysis of the graphs output by geng, I recommend enlisting the computer to output a list of graphs with desired properties. There are two ways to do this: the easy way and the fast way.

The Easy Way

First, we output all graphs to a file using the > shell operator, such as:

geng 8 > graphs8.txt

Then, we can read every line of that file into Sage:

f = open(basefolder + "graphs8.txt", "r");
for line in f.xreadlines():
    G = Graph(line);
    if TestMyProperty(G):
        print line;
        G.show();

If you only find a small number of counterexamples, this will show you a small list of graphs. One drawback to this method is that you must store all of the graphs in a file, which can quickly become a drain on your storage. Also, Sage uses the Python interpreter to do all of its computation, which can be slow at times. I call this the “easy” way because it is much faster to write a testing function in Sage than in C/C++.

The Fast Way

If using Sage is too slow for your needs, I recommend piping the output of geng to your own executable which reads graphs in graph6 (or sparse6) and checks the property. Luckily, the nauty package includes a few graph structures along with the methods to read graph6/sparse6 into one of these structures. I prefer using the sparsegraph structure found in nausparse.h, so that’s what I will be talking about. In order to code in this way, you’ll have to get your hands dirty. However, let me briefly describe a few elements of the programming.

A sparsegraph object g represents a graph using a “packed” adjacency list. That is, there is a single array g.e which stores all adjacency pointers. The array g.v has the value g.v[i] storing the index j such that the adjacencies from the vertex i begins at g.e[j]. Then, g.d[i] stores the degree of the vertex i, so the adjacencies from i are given by g.e[g.v[i]], ..., g.e[g.v[i] + g.d[i] - 1]. While this may look complicated, I find it much easier to deal with than the graph object, which is a bit-packed version of the adjacency matrix.

To load a sparsegraph object, you can use the stringtosparsegraph method. Here is a quick segment of C code to get you started:

char buffer[1000];
while ( scanf("%s", buffer) != EOF )
{
    sparsegraph g;
    SG_INIT(g);
    int num_loops;
    stringtosparsegraph(buffer, &g, &num_loops);
    performCheck(&g);
    SG_FREE(g);
}

What Now?

I leave you now with this super-short introduction to using geng. I recommend that you try out other similar generation algorithms given in gtools:

  • genbg generates bipartite graphs.
  • gentourg generates tournaments.
  • genrang generates random graphs.

But first, I will discuss the algorithm behind geng and how it has been used in recent research. Look forward to that post next week.

Advertisements