United States cities collect data on everything from reported potholes to bus ridership to municipal workers’ salaries. With all this information at their fingertips, they are well positioned to use big data to improve their efficiency and service delivery.
But some cities have even bigger plans for the data they collect.
The City of Chicago’s Open Data Portal, an initiative to promote access to government information, makes over a thousand raw data sets publicly available online. This allows researchers, technologists, and average citizens to conduct any analysis they want. Online since 2010, the portal has grown to include regularly updated data from every city agency.
Kellogg faculty recently visited the city’s Department of Innovation and Technology to talk big data on a trip organized by the Program on Data Analytics at Kellogg. Afterwards, Tom Schenk, Chicago’s chief data officer, sat down with Achal Bassamboo, a professor of managerial economics and decision sciences, to discuss the evolution of the portal and how data is driving operational change in the city and beyond.
This interview has been edited for length and clarity.
Bassamboo: Who are the audiences for the data portal?
Schenk: We see a lot of different audiences. Some folks are interested in transparency—city salaries are the most viewed data set. We see academic partners who use that data for research. There are what I call “civic technologists,” who are using that data to provide information or build civic applications. Nonprofits and community groups use that data to help inform what’s happening at the local level. Technologists use it to try to learn how to program and store data. Students and teachers use that data for classroom purposes. Members of the public use it to find out what’s happening around them. A lot of people reference the data portal to look at crimes or other activities like business openings.
Bassamboo: One of the biggest challenges in dealing with big data sets is curating them, making sure that they are properly handled and that there are no mistakes. The amount of effort that is needed to clean even one data set is one thing that people sometimes appreciate only once they have done it. Can you talk about that process for the open data portal?
Schenk: What we publish is the same data that city employees use. We don’t clean the data. We simply extract it and post it to the data portal. Cleaning data can be quite a time sink. Trying to clean data before we publish it would not be as productive as adding data sets to the data portal. Adding a data set increases what the public can do with it.
Bassamboo: You wanted to allow for more opportunities, rather than driving people to use the data in certain ways. How have you seen people using data differently than you might have expected?
Schenk: One novel way is a website called How’s Business Chicago.
On the national level, new home constructions and new business permits are leading economic indicators of where the U.S. GDP might be heading. It’s tougher on the local level.
We publish all the local business permits on our data portal, and How’s Business Chicago uses that data and some statistical analysis to try to give us an indication of where the local economy is going. When you combine that with other indicators, it’s somewhat similar to what we have at the national account levels.
Bassamboo: This open data portal is being used internally by the various departments at the City of Chicago and also by the public. How are the data sets being used internally right now?
Schenk: There was a lot of attention in the press about the difficulty that the City of Chicago was having meeting its obligations to do food inspections. There was a clear opportunity for analytics to help inform those decisions. The Department of Public Health had very progressive leadership that was looking to innovate in a number of different ways, so it was easy to reach out to them and make recommendations about how we could do a better job operationally on a day-to-day basis.
Bassamboo: Can you talk a little bit about that link between having this great information and figuring out how to incorporate that data into action plans?
Schenk: We look for the small changes that we can make and insert crucial parts of information to improve the decision-making process.
It’s important to us when we are doing the research project to minimize the impact on the business process—that is, to not ask too much of the manager. If anything, we want to make it easier for that manager to do her job.
You wouldn’t want the manager to have to come into work an hour earlier to do her job better, right? That’s asking too much. But if she can walk into the office, sit down at her computer, and have pieces of information that she did not have access to before that make it easier for her to make decisions, that works.
Bassamboo: When we talk about operations, the ideal organization works through collaborative effort rather than each department doing its own thing. Have you seen scenarios where people or organizations have used the shared data to make better decisions collaboratively?
Schenk: Quite often there are large lines of demarcation between departments. But there are encouraging conversations happening. For instance, the Department of Public Health is using 911 data to look at opioid overdoses, which are an epidemic lately. With that data, the Department of Public Health can be proactive in preventing opioid overdoses, whereas before they didn’t have access to the 911 data, so they didn’t have insight into where those overdoses were happening.
Bassamboo: Are there ways in which you feel open data, and this platform, can increase trust in government?
Schenk: The action of posting data sets makes everything much more transparent and easier to access. In the past, some of this data might have been public in the sense that you could have requested it from us, but now it’s something that you can get on demand.
By making data publicly available, I think you change the type of discourse that you have because you’re being proactively transparent, not reactively transparent. Your trust model changes.
For example, if somebody came up to you and told you some facts about themselves, about their life, without you even having to ask that question, you would be hard-pressed to call that person reclusive. Same thing here. Proactively publishing information makes it harder to say that the government is being reclusive—and that helps increase trust.
Bassamboo: What do you feel is missing right now from the open data portal that you would like to see?
Schenk: There’s so much. There’s so much that we do in the city that we have not yet transcribed on the data portal. We have a lot of nonemergency activities that we need to publish to the data portal. Some of those basic day-to-day services that the city provides—that’s probably one of the biggest blind spots that we have right now.
Bassamboo: With people coming to the public portal and taking that data to use, do you feel that there’s a responsibility to create a forum where their findings or results have an opportunity to be shared?
Schenk: It’s easy to reach out to us by email or by social media to share those results or ask questions. It’s also easy for us to reach out. We are present at community meetings—every Tuesday I’m at Chi Hack Night, which is a meeting of civic technologists who are using the data portal. We also go to meetings with community groups across the city to solicit their input and tell them about what we’re doing. That’s a core principle that we have.