On the arXiv

by Derrick Stolee

Whenever I write a paper, I put it on the arXiv. The arXiv is an open-access paper repository run by Cornell University. It’s pretty fantastic to know that almost anyone can have an immediate international audience by posting a paper to the arXiv. My use is two-fold: I upload papers and I look forward to the evening paper dump in my RSS feed five times a week. It is a great way to be actively connected to the research world. As such, I try to convince my coauthors to upload papers to the arXiv and have never had one complaint.

That is, until something interesting happened. I’ll tell the story very quickly, then talk about pros and cons for using the arXiv. This will entirely skip any copyright issues with journals, and focus on the benefits of public/private preprints. Please add your comments!

The Story

I recently uploaded the paper, Extending Precolorings to Distinguish Group Actions, with my coauthors Michael Ferrara, Ellen Gethner, Stephen G. Hartke, and Paul S. Wenger. We had worked hard on this paper, carefully defining the new “distinguishing extension number” and pushing the details of our new proof technique as far as we could. In fact, our proof technique was not much different than when we first started the project over two years ago, but it took a while to fully and carefully articulate the nuances of how we were using group actions to find contradictions. This concept also led to some interesting figures, as seen below.
[Figure: A toroidal lattice.]
The interesting part is that this last week, another paper was uploaded by Alex Lombardi, an undergraduate at Harvard. Alex participated in Joseph Gallian’s undergraduate research program and used our paper as a starting point. Not only did he quickly solve more than we did, but he solved all but one conjecture and crafted a 20+ page paper that seems particularly well-written. I want to make clear that I am impressed, to say the least.

While Alex’s paper surprised me, what surprised me more was the reaction of a colleague: “Well that’s why you shouldn’t put papers on the arXiv.” This person’s reasoning is simply that now that another paper has better results than ours, our paper is less worthy of publication. Of course, our paper also has a guaranteed citation, and Alex appears to use our proof technique significantly. (I should also note that the person’s comment was specific to the case when you introduce a problem and have not completely solved it.)

What is clear is that the situation has made me rethink the benefits of the arXiv. Let’s go through them together. I want to specify how posting papers to the arXiv affects researchers at different times in their careers.

For Senior Researchers

In my opinion, there is no excuse for a senior researcher to not post to the arXiv except “I don’t know how to use a computer.” Senior researchers see only benefits here, since even if they are completely scooped, there is little chance that losing one paper every now and then will cause serious harm to their career. In fact, a highly respected researcher (Terry Tao, for example) can change the status of mathematical research by posting a big result openly, claiming credit, and openly talking about the methods before they are “checked” and “published.”

For Graduate Students

Graduate students are in a more precarious position, but it still seems like a no-brainer to post to the arXiv. It is rare for a graduate student to have more published papers than submitted papers, so it is important that their papers are freely available when they go on the job market. Sure, they could post the papers to their web pages (and they should do that, too) but Google Scholar will find them and then “everyone” knows about them. One benefit of the arXiv is that it counts as a true timestamp on a result. You can definitely say “I finished this work on such-and-such date, not just put a bunch of stuff online before I applied for your job.”

In particular, Alex Lombardi can say that he solved several conjectures and wrote the paper in the span between May 21st and August 25th. That’s quite the turn-around!

Another anecdote: This paper on rainbow matchings came two weeks before this other paper on rainbow matchings. The papers both ended up having novel, not-properly-contained contributions, and so the papers were merged into one mega-paper.

The caveat for graduate students is this: if you have one paper, and that paper happens to get “scooped” the same way mine did, then that may be the difference between having a publication and not having a publication when you graduate. This cannot be overstated. However, having no publicly available papers is almost as bad as not having any papers at all!

For Junior Faculty

This is the trickiest scenario, and it is my own current situation. Junior faculty are expected to build a strong research portfolio. Having a paper or two be completely scooped will show up on that portfolio. What questions does that raise? How good of a mathematician am I if an (incredibly talented) undergrad can kill my paper in a few months? Here’s a table of Pros and Cons.

| Cons | Pros |
| --- | --- |
| Your work might be superseded before publication. | You can put a public timestamp on your work, preventing a complete “scoop.” |
| | People have early access to your articles. |
| | Others may cite or build on your work. |
| | Others may notice your frequency of paper submissions and quality of work. |
| | Someone may find an error in your paper and notify you. |
| | Someone may find your work interesting and ask you to speak at their seminar. |
| | Someone may need to write a reference letter for you, and needs to look at your list of papers. |

You may notice that my “cons” are quite few. In fact, I think that most of the scenarios in this MathOverflow question are not cons for an author legitimately using the arXiv, but instead problems with how other authors use the arXiv. (Specifically, the situation where a Ph.D. defense is cancelled because an arXiv preprint solves the candidate’s main theorem is not something that can be blamed on anyone other than the thesis advisor.)

Closing Remarks

My biased views aside, one thing every researcher should do is subscribe to the arXiv RSS feeds for their favorite research areas. Personally, I use math.CO (Combinatorics), cs.DM (Discrete Mathematics), and cs.DS (Data Structures and Algorithms). There is no downside to having a daily dump of interesting papers in your research area! You can find all sorts of new things and find out how to stay ahead of the game, instead of waiting 18+ months for a paper to actually appear in a journal!
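
For readers who would rather script this than use a feed reader, here is a minimal Python sketch of the idea. It rests on assumptions that are not from this post: that arXiv serves one RSS 2.0 feed per category under the base URL below (check arXiv's help pages for the current address), and that each entry is a plain `item` element with a `title`. The sample feed is made up so the example runs without a network connection.

```python
import xml.etree.ElementTree as ET

# Assumption: per-category feeds live under this base URL. Verify it against
# arXiv's own documentation before relying on it.
ARXIV_RSS_BASE = "https://rss.arxiv.org/rss/"

# The three categories mentioned in the post.
CATEGORIES = ["math.CO", "cs.DM", "cs.DS"]


def feed_url(category):
    """Build the (assumed) feed URL for one arXiv category."""
    return ARXIV_RSS_BASE + category


def item_titles(rss_xml):
    """Return the title of every <item> in an RSS 2.0 document given as a string."""
    root = ET.fromstring(rss_xml)
    return [item.findtext("title", default="").strip() for item in root.iter("item")]


# A hypothetical two-entry feed, used only to exercise the parser offline.
SAMPLE_FEED = """<rss version="2.0"><channel><title>sample feed</title>
<item><title>A hypothetical preprint on graph colorings</title></item>
<item><title>Another hypothetical preprint</title></item>
</channel></rss>"""


if __name__ == "__main__":
    for category in CATEGORIES:
        print(feed_url(category))
    for title in item_titles(SAMPLE_FEED):
        print(title)
```

To read a live feed, you could download the text at `feed_url("math.CO")` with any HTTP client and pass it to `item_titles`, provided the feed really is in the format assumed above.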

Now excuse me, my RSS feed reader has just been populated with a bunch of new preprints!

Other Resources

AstroBetter: To Post or Not

Academia.StackExchange: Why post to the arXiv?

MathOverflow: Downsides of using the arXiv?