I Quant New York: On the Power of Open Data in the City
On Tuesday, a New York Times headline read, “Airbnb Releases Trove of New York City Home-Sharing Data,” announcing the company’s long-awaited response to the Attorney General’s claims that the business was effectively running illegal hotels and thus owed millions in taxes on illegal short-term rentals. Transparency at last. Or not: a New York Daily News editorial soon called out “Airbnb’s Data Sharing Sham,” because you have to make an actual in-person appointment at Airbnb’s New York City office to look at any of the “trove,” which is a bit like visiting the National Archives and searching for something with Big Brother watching. The Daily News went on to say the data itself offered “limited value to anyone attempting to judge the company’s impact on residential buildings, and [avoided] a credible estimate of how many listings are actually short-term motels.” At least, it pointed out, Airbnb lets you take notes while you’re there!
Finally, Ben Wellington added his voice on Twitter. Wellington has been called a folk hero for the beautiful analyses of open city data he performs on his blog, I Quant NY, and he is likely best known for getting the MTA to create a $27.25 button on payment kiosks, the result of a mathematical proof he wrote showing that no available payment amount ever resulted in a zero balance on your MetroCard. “Well done @NYDailyNews! Claims of data transparency at @airbnb are a joke. Glad you agree,” he wrote, following another of his tweets about the company’s data release: “Worst ‘open & transparent’ release ever?” Wellington, though, had already compensated for Airbnb’s mockery of a data release: earlier this year he performed his own analysis of how New Yorkers are really using Airbnb, working with data that a photographer named Murray Cox managed to scrape from the company.
More than most, Wellington likes numbers: He has a Ph.D. in Computer Science from New York University and, when he is not busy being a new dad, he is a visiting professor in the City & Regional Planning program at Pratt and works as a quantitative analyst at tech company Two Sigma. It’s only in his teensy amount of free time that he runs his data-loving, sea-changing blog. And his love of numbers is not based solely in theory, but in very practical numbers attached to very real things like companies and city infrastructure: the kind of numbers that, when analyzed, reveal solutions to problems and help daily life and entire cities run better.
But many people, myself included, are scared of numbers, maybe because they contain an enormous innate power to produce immediate and measurable improvements, if they are harnessed correctly. (Ed. note: Or maybe too many of us just hated math in high school?) But maybe it’s that last bit, the “if they are (and can be) harnessed correctly” part, that is the Daily News’ and Wellington’s whole point: No numbers’ potential energy can ever turn kinetic if the data is either not accessible or not presented in a way that makes it easy to analyze, which is exactly how Airbnb handled its release.
New York City has long been a leader in championing open data. In 2012, the Bloomberg administration signed into law the “most ambitious and comprehensive open data legislation in the country,” which required all city agencies to make all of their data public. That data, it was envisioned, would be placed on a “single web portal” by 2018, where anyone could access it. At the time, the law was focused on providing web developers with the data they needed to make apps that could add to and boost New York City infrastructure. But, of course, once the data is out in the open, there’s no real controlling what people do with it. The awesomeness of open data is that absolutely anybody, not just web developers, can access it, and absolutely any calculation can be made with it to pinpoint an endless number of problems and their solutions, whether or not the result is a New York City-based app.
Airbnb is obviously not a city agency that must comply with the 2012 law, but even the agencies that must, says Wellington, are not usually great at releasing the most essential data, or at releasing it in a way that makes it easy to analyze, i.e., useful. But this is not just because, like Airbnb, city agencies might have something to hide. The reasons are often less conspiracy-centric and more pragmatic. As an example of why that happens, Wellington points to one of the city’s most notoriously inscrutable agencies, the NYPD.
While the NYPD does release crime data by incident, such as in this crime map that shows the locations where crimes have been committed along with some attendant data, like the kind of crime, the problem is that the information is not presented in a format that can be downloaded, and therefore it cannot be analyzed. And without analysis, numbers are impotent. It is, says Wellington, “a half-baked form of transparency.” In other places, the NYPD releases aggregate rather than incident-level data, and that is downloadable, but without the incident-level details found on the crime map, effective analysis (the kind that comes up with targeted solutions to specific issues) is impossible. So in both cases, the NYPD presents the data in exactly the wrong format. Elsewhere, Wellington says, there are already movements, as in Chicago and LA, to release the kind of granular, incident-level crime data that can be analyzed, thereby allowing tailored crime-reducing fixes to emerge.
But, to return to the original point, the reasons an agency like the NYPD might be hesitant to release too much incident-level data in easy-to-analyze formats can be practical. For example, Wellington says, particularly when it comes to crime data, there are privacy concerns. If there were a rape, for instance, it would not be right to reveal the address of the victim. And the NYPD’s job first and foremost is to protect people. “At the end of the day, it may not feel like that includes dealing with data requests,” says Wellington. “Crime would be reduced, but they might come back and say, ‘Look, we have a lot to do right now.’”
So, despite well-intentioned legislation, the struggle toward a truly open-data NYC is ongoing. That is the kind of city that can embody the revolutionizing analyses Wellington has done, like the one that saved New Yorkers thousands of dollars in erroneous parking tickets every year.
“We’re still leaders in the moment of open data, and we’re recognized globally for that,” says Wellington. “There will always be problems everywhere, but I think that things are moving in the right direction, slowly, because it’s government. But it’s moving.”
Imagine, says Wellington, a future where the Department of Sanitation put out an open call for anyone who could figure out the fastest schedule for garbage pickup: “I promise, you would get all sorts of great ideas. That is not science fiction, that is very real, a world where the city crowdsources ideas, and then innovates those ideas into action.”
Follow Natalie Rinn on Twitter @natalierinn