Maximizing Golden Gate Transit: Open Data

So far in this series, we’ve discussed how Golden Gate Transit might better communicate its routing, its scheduling and its headways. This transparency would be incredibly useful to the end-user, but the data that generated those maps would still be hidden away on the GGT servers, accessible only to internal users. Opening that data up for outside developers, a concept creatively known as open data, would allow anyone to present that data in ways – both useful and whimsical – that GGT would never even think of, much less fund.

The most obvious use for open data is integration with other regional systems. At the moment, getting around using GGT requires a trip to 511, but it’s clunky, unattractive and inflexible. Besides, anyone who is not from the Bay Area wouldn’t know that 511 exists. Google Maps is a good fallback for these visitors, but GGT isn’t on the system, which leaves Marin a black hole; only the ferries are really an option. (When asked, GGT said they were planning on integrating with Google Maps but had no timeline.)

In real estate, there are apartment and home finder tools based on travel time by transit. Type in an address, specify how many minutes you want to travel, and the map highlights how far you can get using transit only. The people who invented Walkscore have also invented a Transitscore, showing how accessible a given location is to transit at any given time of day. With open data, real estate agents could easily market a given area as highly transit accessible. This would appeal not just to potential residents but also to those who need to hire the young and carless, such as tech companies. Many people take transit accessibility into account when weighing job opportunities. If I can’t figure out on my tool of choice how far an employer is from me by transit, I’ll probably pass them by.
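To make the "how far can I get in N minutes" idea concrete, here is a minimal sketch of the calculation such a tool might run. Everything in it is an assumption for illustration: the stop names and travel times in NETWORK are made up, not GGT data, and the function reachable_within is a hypothetical helper. It uses a standard shortest-path search (Dijkstra's algorithm) and ignores waiting time and timetables, which a real tool would have to account for.

```python
import heapq

# Hypothetical network (not real GGT data): in-vehicle minutes between stops.
NETWORK = {
    "Stop A": {"Stop B": 10, "Stop C": 25},
    "Stop B": {"Stop C": 10, "Stop D": 30},
    "Stop C": {"Stop D": 15},
    "Stop D": {},
}


def reachable_within(network, origin, budget_minutes):
    """Return {stop: fastest travel time} for stops reachable within the budget."""
    best = {origin: 0}
    queue = [(0, origin)]  # (minutes elapsed, stop), cheapest first
    while queue:
        elapsed, stop = heapq.heappop(queue)
        if elapsed > best.get(stop, float("inf")):
            continue  # stale entry; a faster path to this stop was already found
        for neighbor, minutes in network.get(stop, {}).items():
            total = elapsed + minutes
            # Keep the connection only if it fits the budget and beats the best known time.
            if total <= budget_minutes and total < best.get(neighbor, float("inf")):
                best[neighbor] = total
                heapq.heappush(queue, (total, neighbor))
    return best


if __name__ == "__main__":
    for stop, minutes in sorted(reachable_within(NETWORK, "Stop A", 30).items()):
        print(f"{stop}: {minutes} min")
```

A mapping tool would then shade the area around each reachable stop. None of that is possible unless the underlying stop and schedule data is published in the first place.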

Open data also provides a wealth of information for those who love mapping, statistics, or both. Rather than paying tens of thousands of dollars for analyses of headway frequency, stop density, or the like, GGT could open up its performance, location and routing data and let advocates analyze it for themselves. They could combine it with census data to find out how many residents are covered by transit, or determine which routes run most frequently. They could chart scheduled departures and, if GGT invests in NextBus, show on-time performance for any given route. They could find the busiest corridor, the most densely populated corridor, the worst-performing corridor, and on and on.
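As a rough illustration of how little it takes once the data is public, here is a minimal sketch of a headway analysis. The routes and departure times in DEPARTURES are invented for the example, and the layout (one route name and one scheduled departure per row) is an assumption about how a published schedule might look, not a description of anything GGT provides today.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample (not real GGT data): scheduled departures at a single stop.
DEPARTURES = [
    ("Route A", "07:00"), ("Route A", "07:30"), ("Route A", "08:00"),
    ("Route B", "07:00"), ("Route B", "08:00"),
]


def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def average_headways(departures):
    """Return {route: average minutes between consecutive departures}."""
    by_route = defaultdict(list)
    for route, time in departures:
        by_route[route].append(to_minutes(time))
    result = {}
    for route, times in by_route.items():
        times.sort()
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        if gaps:  # a route with a single departure has no headway to measure
            result[route] = mean(gaps)
    return result


headways = average_headways(DEPARTURES)
# The most frequent route is the one with the shortest average headway.
most_frequent = min(headways, key=headways.get)

if __name__ == "__main__":
    for route, gap in sorted(headways.items()):
        print(f"{route}: every {gap} minutes on average")
    print("Most frequent:", most_frequent)
```

The same pattern extends to the other questions above: swap the departure list for stop locations and census tracts, and the question becomes how many residents live near a stop.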

Opening up data, then, provides free advertising on all the transit accessibility or utilization tools that want to include Marin. It allows advocates and enthusiasts to process data for the system for free, giving power to the people and giving GGT stronger tools to work with.

Opening data is not always a straightforward matter. The databases need to be converted to usable formats, the information needs to be scrubbed, and service disruptions need to be communicated in a consistent way. None of this is free. But the benefits of open data to an otherwise opaque and infrequent suburban system are too great to ignore.