Keeping it together

Today is my birthday, so I decided to take a day off from my usual work (writing a dissertation about the production of “unit landscapes” in American history) and instead work on a side project that examines the same basic question from a very different perspective. In the dissertation, I’m primarily concerned with qualitative and historical explanations for how people came to see parts of the world as “single” places, and I try to emphasize how the recognition of a “single” place is caught up in a mess of social and political contingencies. Many of the characters in my story turned to empirical studies of social, economic, and ecological interconnectedness to make claims about which areas were really stitched together into wholes by a substantive web of relations. I’m highly skeptical that you can ever make the case for unit areas in empirical and statistical terms alone: the political, cultural, and ethical stakes of geographic cohesion are too high for the production of unit landscapes to be a disinterested administrative exercise. But I’m still curious about what kinds of observable data might offer suggestions about how human communities are materially interrelated in space.

Last month, the geographer Alasdair Rae published a wonderful working paper on American “mega-regions” based on commuting data from the American Community Survey. The paper is part of a project built on the observation that commuting patterns sketch out “how the places where people live connect with other places from a functional economic point of view, at a fairly fine-grained level.” As part of the project, Rae made the massive data set available online.

Rae’s project offers a striking visual representation of the “mega-regions” that take shape around some of the U.S.’s largest urban centers. But I was curious what would happen if I ran the data through an algorithmic detection of community borders, rather than relying on the human eye alone to pick out the natural groupings of commuter regions. So I turned to the Combo software developed by Stanislav Sobolevsky, Riccardo Campari, Alexander Belyi, and Carlo Ratti at the MIT SENSEable City Lab.

It took a fair amount of data wrangling in Python, SQLite, and QGIS to get all of the different pieces of software working together (I’ll write up more about the process at a later date). I decided to limit the analysis to the 102,221 commute entries from Rae’s database that had both their origin and destination points inside the state of Massachusetts. I did this to make the data a bit more manageable, and also to see whether community-detection patterns would emerge at a much smaller scale than the national level. When I finally managed to get Combo to work with the data, it split Massachusetts up into nine communities, with an optimality score of 0.568038. Here’s what that looks like when plotted back onto a map:

Massachusetts community partitions
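
The Massachusetts-only filter described above might be sketched roughly like this. It is a toy illustration, not the actual pipeline: the `commutes` table, its column names, and the sample rows are all hypothetical, and it assumes that origins and destinations are stored as Census tract identifiers whose first two digits are the state FIPS code (25 for Massachusetts).

```python
import sqlite3

MA_FIPS = "25"  # state FIPS code for Massachusetts


def massachusetts_flows(conn):
    """Return (origin, destination, commuters) rows with both ends in Massachusetts.

    Assumes a hypothetical table `commutes` whose tract IDs are stored as text
    and begin with the two-digit state FIPS code.
    """
    query = """
        SELECT origin_tract, dest_tract, commuters
        FROM commutes
        WHERE substr(origin_tract, 1, 2) = ?
          AND substr(dest_tract, 1, 2) = ?
        ORDER BY origin_tract, dest_tract
    """
    return conn.execute(query, (MA_FIPS, MA_FIPS)).fetchall()


# Toy in-memory table standing in for the real data set (all rows invented).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE commutes (origin_tract TEXT, dest_tract TEXT, commuters INTEGER)"
)
conn.executemany(
    "INSERT INTO commutes VALUES (?, ?, ?)",
    [
        ("25025000100", "25017352100", 40),  # MA -> MA, kept
        ("25003900100", "36083052000", 15),  # MA -> New York, dropped
        ("44007000101", "25005640100", 22),  # Rhode Island -> MA, dropped
        ("25001010100", "25001012000", 9),   # MA -> MA, kept
    ],
)

ma_rows = massachusetts_flows(conn)
for origin, dest, commuters in ma_rows:
    print(origin, dest, commuters)
```

Filtering on both ends of each flow throws away every commute that crosses the state line, so any communities detected afterwards are shaped by in-state connections only.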

What I found particularly exciting about this result is that Combo does not take any geographic data as input; the algorithm considers only the connectivity between nodes. That means that it could have spit out communities that were not geographically contiguous, which is what you would expect if commuting patterns were randomly distributed. But I was amazed to find that it partitioned the state into geographically whole areas, and interesting areas at that!
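
Because the algorithm never sees a map, contiguity is something that has to be checked after the fact. Here is a minimal sketch of one way to do that, assuming you have a table of which areas border which (for instance, one exported from a GIS). The function names, the `neighbors` structure, and the toy data are all hypothetical. Islands have no land neighbors, so an adjacency table built from shared borders alone would need extra care for places like Nantucket.

```python
from collections import deque


def is_contiguous(members, neighbors):
    """True if every area in `members` can be reached from any other
    by stepping only through bordering areas that are also in `members`."""
    members = set(members)
    if not members:
        return True
    start = next(iter(members))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search restricted to the community
        current = queue.popleft()
        for nxt in neighbors.get(current, ()):
            if nxt in members and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == members


def contiguity_report(assignment, neighbors):
    """Map each community label to whether its areas form one connected piece."""
    groups = {}
    for area, community in assignment.items():
        groups.setdefault(community, set()).add(area)
    return {c: is_contiguous(m, neighbors) for c, m in groups.items()}


# Toy geography: five areas in a row, A-B-C-D-E.
neighbors = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

compact = {"A": 0, "B": 0, "C": 1, "D": 1, "E": 1}    # two unbroken blocks
scattered = {"A": 0, "B": 1, "C": 0, "D": 1, "E": 1}  # interleaved labels

compact_report = contiguity_report(compact, neighbors)
scattered_report = contiguity_report(scattered, neighbors)
print(compact_report)
print(scattered_report)
```

In the toy example, the first assignment passes for both communities and the interleaved one fails for both, which is the kind of outcome a connectivity-only algorithm is free to produce.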

Here are a few interesting observations:

  • Counties in Massachusetts are mostly ceremonial, and have little meaningful political or community function. Yet two counties—Berkshire and Nantucket—remained perfectly intact in this analysis.
  • Meanwhile, the three counties of the Connecticut River Valley that were once all Hampshire County exhibit a functional commuter unity (with small corners in the northeast and southeast of the region more connected to the Worcester area). Thus, the splitting of this area into three counties in the early nineteenth century has been undone by modern commuting habits.
  • Martha’s Vineyard is integrated into Cape Cod (which itself has a neat commuter boundary at the Cape Cod Canal), while Nantucket, a much longer ferry ride away, forms its own isolated region.
  • The functional Greater Boston commuter area is eerily similar to the shape of the 27 cities and towns which Sylvester Baxter hoped to consolidate into a federated metropolis in the 1890s—one of the major stories of my dissertation.
  • Exurban Boston divides fairly cleanly into four recognizable regions, roughly North Shore, South Shore, Worcester, and Fall River–New Bedford.

What I find most surprising about this result is how well it matches a naïve sense of Massachusetts’s regions; it suggests a fairly close correspondence between the general perception of “single” places and the actual social patterns that lie beneath those perceptions. What is amazing to me about this map is how ordinary it looks: a computational algorithm looking only at structured patterns of connection was able to create a sliced-up Massachusetts that fits almost too well with the divisions that we already recognize.
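
To make “looking only at structured patterns of connection” a little more concrete, here is a small sketch of modularity, the kind of score that community-detection tools commonly try to maximize. I’m assuming here that the optimality score reported above is a modularity value, and I’m simplifying by treating commuting flows as undirected weighted edges. The function and the toy network are illustrative only, not Combo’s own code.

```python
def modularity(edges, community):
    """Weighted, undirected modularity of a partition.

    edges:     list of (u, v, weight) tuples, each edge listed once
    community: dict mapping every node to a community label

    For each community: (share of total weight that falls inside it)
    minus (the share expected if the same ties were placed at random).
    """
    total = sum(w for _, _, w in edges)
    internal = {}  # weight of edges with both ends in the community
    strength = {}  # summed edge weight attached to the community's nodes
    for u, v, w in edges:
        cu, cv = community[u], community[v]
        strength[cu] = strength.get(cu, 0.0) + w
        strength[cv] = strength.get(cv, 0.0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + w
    return sum(
        internal.get(c, 0.0) / total - (s / (2 * total)) ** 2
        for c, s in strength.items()
    )


# Toy network: two tightly knit triangles joined by a single tie (c-d).
edges = [
    ("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
    ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
    ("c", "d", 1),
]

split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
lumped = {node: 0 for node in split}

q_split = modularity(edges, split)
q_lumped = modularity(edges, lumped)
print(round(q_split, 4), round(q_lumped, 4))
```

Splitting the toy network along its one weak tie scores higher than lumping everything together, and nothing in the calculation refers to where the nodes sit on a map.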

Next, I’d like to try this process on larger areas, like New England as a whole. But I’m even more excited to see if it’s possible to apply this same analysis to historical data. Marriage records seem like they might be a good proxy for the same kinds of social and economic connections that commutes sketch out. If I can get such data stretching all the way back to the nineteenth century, I’ll have an interesting way of looking at these patterns as they’ve changed alongside larger arcs of economic and social transformation.