Update: A new version of this diagram can be found at the bottom of the article.
On October 25th, Terry Tao wrote about the idea of creating a display visualizing the applications of mathematics. Soon after, there was a lot of activity on MathOverflow, collecting possible uses of different fields of mathematics. The original idea came from a visualization that illustrated the uses of the elements of the periodic table.
I started writing this blog because of an overlap which I perceived between Tufte’s view of information display and mathematical thought. Tufte’s insight was mainly that effective information display requires an exacting form of clarity of expression where what is extraneous is identified as deleterious, and is therefore stripped away. In this view there is little room for ornamentation for the sake of ornamentation. Whether in computer code or in good writing, mathematical or otherwise, there is always something to be said for concise and clear expression.
Therefore, the idea of creating an informative visualization of mathematics was inherently appealing, and it was completely in keeping with the original mission of the blog.
The periodic table, although somewhat innocuous to those of us who are used to it, is — in my opinion — one of the greatest achievements of chemistry. It is also an arrangement that is uniquely useful to chemistry. In order to display relationships between the fields of mathematics, we need a different underlying paradigm.
Naturally, the first question is: if not a table, then what type of two-dimensional structure might suitably display the relationship between mathematical fields? The idea of a graph seemed natural to me. The only issue was how it should be constructed. While more subjective ways of doing this initially came to mind, lacking the manpower, I decided to go with a more objective means of construction.
The arXiv is a repository of electronic preprints, many of which are mathematical. These mathematical papers are tagged according to a classification system, which I decided to use in my diagram. I collected the number of papers in each pair of fields published in the last five complete years (2004-2008). I defined two fields as contributing ideas to each other if the author of a paper tagged it as belonging to both fields. For each field, I looked at the relative contributions of other fields. The diagram is based on a directed graph in which an arrow goes from field A to field B if field A is listed as being relevant to more than 12.4% of the papers in field B. I have listed the exact percentage of papers on the edges of the graph. Note that 12.4% is to a large degree an arbitrary number, although, speaking in its favor, it is the largest number to three significant figures that produces a graph with only one connected component. Here I was balancing having a connected graph against minimizing the number of edges.
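To make the construction concrete, here is a minimal sketch in Python rather than the Mathematica I actually used. The counts in `PAIR_COUNTS` are made up for illustration and are not the real data set, and all function names are hypothetical. Two assumptions are worth stating: the number of papers in a field is taken to be the sum over the pairs it appears in, and "one connected component" is read as connected when arrow directions are ignored.

```python
from collections import defaultdict

# Hypothetical co-tagging counts (not the article's data set):
# number of papers tagged with both fields of each pair.
PAIR_COUNTS = {
    ("math.AG", "math.NT"): 40,
    ("math.AG", "math.RT"): 30,
    ("math.NT", "math.RT"): 5,
    ("math.PR", "math.ST"): 50,
    ("math.PR", "math.NT"): 10,
}


def field_totals(pair_counts):
    """Papers per field, counted from pairs only (single-field papers are unseen)."""
    totals = defaultdict(int)
    for (a, b), n in pair_counts.items():
        totals[a] += n
        totals[b] += n
    return dict(totals)


def build_edges(pair_counts, per_mille):
    """Map (src, dst) -> share for each share strictly above per_mille / 1000."""
    totals = field_totals(pair_counts)
    edges = {}
    for (a, b), n in pair_counts.items():
        for src, dst in ((a, b), (b, a)):
            # Integer comparison avoids floating-point trouble at the cut point.
            if 1000 * n > per_mille * totals[dst]:
                edges[(src, dst)] = n / totals[dst]
    return edges


def is_connected(nodes, edges):
    """True if the graph is one component when arrow directions are ignored."""
    neighbours = defaultdict(set)
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbours[stack.pop()] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return seen == set(nodes)


def largest_connected_threshold(pair_counts):
    """Largest cut point, in steps of 0.1%, that keeps the graph connected."""
    nodes = set(field_totals(pair_counts))
    for per_mille in range(999, -1, -1):
        if is_connected(nodes, build_edges(pair_counts, per_mille)):
            return per_mille / 10  # expressed as a percentage
    return None


edges = build_edges(PAIR_COUNTS, 124)  # the article's 12.4% cut point
best = largest_connected_threshold(PAIR_COUNTS)
```

Scanning downward from the highest cut point works because lowering the threshold can only add arrows, so the first connected graph found is the one with the largest threshold. The step of 0.1% stands in for "three significant figures", which matches only for percentages of 10% or more.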
I can think of more rigorous ways to pick the cut point because each threshold percentage suggests a model of the data. If I could look at data from 2009, I could assess each model and pick the one that was most predictive. Unfortunately, I don’t think I have the energy for collecting all this data. So, we are stuck with 12.4% for now.
One difficulty that I was not able to overcome is that the arXiv returns a maximum of 1000 papers in its search results. I was therefore unable to look at all papers in a field and instead had to stick to pairs of fields. As a result, single-field papers were omitted. The further complication of papers in more than two fields was also overlooked.
I have made the code I used to make the diagram fully available. It’s written in Mathematica. The image is available in full size if you click the image above. Finally, I have also posted the data set which I generated in order to make the diagram.
The diagram can easily be turned into the full idea that Terry mentioned because of the flexibility of Mathematica. It’s entirely possible to have each vertex replaced with an image. Therefore, it is just a matter of getting the pictures ready.
On this score, I have an idea. What we need is an image of an application for each mathematical area. There is a nice program called fotosketcher that can turn photographs into drawings. These can then be cropped and fashioned into nice displays similar to the ones in the original periodic table PDF.
Even if the project never moves any further, I enjoyed the process of making this map of mathematical fields.
Terry suggested that the size of the vertices should be scaled according to the number of papers. When I was designing the diagram, I had ruled that alternative out because I didn’t want it to be too tied to the original arXiv data set. All the same, the results of incorporating the number of papers are inherently interesting.
The radius of the vertices is proportional to the square root of the number of preprints in that category. Therefore, the area of the black disks grows linearly with the number of papers. For fun, I adjusted the color of the arrows to be darker for larger percentages of papers.
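The scaling rule can be checked with a few lines of Python. This is only a sketch: the function names and the `scale` and `lightest` parameters are hypothetical, and the linear gray mapping for the arrows is an assumption, since all that is stated above is that larger percentages are drawn darker.

```python
import math


def vertex_radius(paper_count, scale=1.0):
    """Radius proportional to the square root of the paper count."""
    return scale * math.sqrt(paper_count)


def disk_area(radius):
    """Area of a disk with the given radius."""
    return math.pi * radius ** 2


def arrow_gray(share, lightest=0.8):
    """Gray level for an arrow: 0.0 is black, so larger shares come out darker."""
    return lightest * (1.0 - share)


# Four times as many papers doubles the radius and quadruples the area.
small, large = vertex_radius(100), vertex_radius(400)
area_ratio = disk_area(large) / disk_area(small)
```

Because the area is pi times the radius squared, taking the square root in the radius is exactly what makes the area grow linearly with the number of papers.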