Every time an inmate enters or leaves a correctional facility in Connecticut, a database is updated so the state has an accurate count of its incarcerated population. And each day, those numbers are used to produce a chart on the state’s website.
It’s one of the few state “datasets” that is updated daily for the public. But it takes persistence to find the chart because it’s buried on the website of the state’s Office of Policy and Management (OPM), and it exists nowhere else, including on other government websites with state criminal justice data.
Also, the feature to download the raw data is broken.
These types of problems are common even in the best government data, which can be hidden in various corners of the Internet and is often in formats that range from user-unfriendly to unusable. In short, data is difficult to gather and analyze.
Following the example of 39 states and 43 cities and counties, Connecticut is planning to launch an online data portal in February or March. The project is being run out of the governor’s office, with OPM doing much of the planning.
Every government agency produces some type of data, much of which is potentially of interest to the public, and an open data portal provides a uniform platform for that data to be published. When data is shared in different places, or not shared at all, the information never gets a chance to be analyzed, aggregated or combined — either by other agencies or private citizens.
“We’re doing other people’s work, so they have every right to see the data,” said OPM Secretary Ben Barnes.
But beyond the actual content, much of the interest in the state’s plan lies with what the portal means for the future of data transparency in the state, which allows citizens — and government — to be aware of what government is doing, without having to go on a scavenger hunt for information.
“What would be productive is producing a culture of transparency — that it’s the default option to publish data,” said Scott Gaul, director of the Hartford Foundation’s Community Indicators Project.
Barnes agrees with the sentiment to “shift the organizational culture to one of disclosure and transparency.” But the shift may take time, he said.
Many rules govern what data can be disclosed, so gatekeepers of that data have often erred on the side of caution because they didn’t want to be “caught on the wrong side” of those rules, Barnes said.
“I would admit that disclosure has not always been the culture in government — in Connecticut, or many other places I’ve worked,” he said.
Privacy and other concerns
People may often be concerned about data violating someone’s privacy.
Sheryl Horowitz, director of community research at the Connecticut Association for Human Services, gave the example of autism data; in some towns, there might only be three people who have autism, and “somebody’s going to know who those three people are.” This type of situation can make people fearful, she said, adding that there are ways around that problem.
Horowitz dealt with these issues when she chaired a data committee for the Early Childhood Cabinet, a state group that considered building a portal for childhood data to break down some of the data “silos” that exist. There are “obstructionists” not willing to share data, even within departments, Horowitz said. But she said the portal may shift that culture as people begin to see the power of sharing data with each other.
Another fear of publishing raw data is that it will be analyzed incorrectly. In fact, Horowitz predicts it will happen.
“We know that people can misuse data,” she said. “And there’s a fear that if you put data into the hands of anybody that doesn’t have knowledge, they can produce things that are erroneous.”
But putting the data into the open allows others to point out that data has been analyzed incorrectly. Horowitz said the possibility of that discussion is the best-case scenario.
“They’re going to make errors, but they’re also going to expose errors,” she said.
A technical challenge for some state departments, such as the Department of Social Services, is that some of their data is stored in PDFs, Horowitz said. While PDFs are good for presenting data, the format is difficult to manipulate and analyze. So this data may not reach the data portal until underlying organizational and technology problems are fixed.
That said, OPM and other departments currently publish a lot of data online, but not in a central location. Barnes said the portal may entirely replace some of the smaller data hubs at OPM. In addition, newer endeavors like the Criminal Justice Information System, which is being built to centralize crime data, may also host their public data on the portal. Barnes is also excited about the possibility of putting the Uniform Chart of Accounts for schools and municipal governments on the portal. The chart provides information on how towns and cities are spending state dollars.
Other departments maintain their own, relatively robust data collections, so they will have to decide whether to come in line with the open portal. For example, the State Comptroller’s office runs Open Connecticut, which launched in early 2012 to provide state financial data. But because it’s interpreted data, it doesn’t fit with the open data portal’s purpose of providing raw data, Barnes said.
“I respect what the comptroller (Kevin Lembo) is doing,” Barnes said. “It’s significantly more interpretive than an open data portal is intended to be, but they may decide to bring that in line with an open data portal.”
But merely publishing data won’t provide value. In fact, Gaul stressed the importance of data being more granular, so it could be “cut” different ways, and for it to be updated frequently. The state is currently working out who will be in charge of each dataset’s accuracy and timeliness. There will likely be a point-person for each dataset, Barnes said, though this hasn’t been finalized.
In addition, Gaul hopes the portal will reveal new datasets that weren’t previously available to the public. But Horowitz said she doesn’t think new datasets will be uploaded unless people ask for them.
One group that will play a crucial role in the portal’s success is the civic hacking community. While it’s not as prominent in Connecticut as it is in more tech-driven cities such as Boston or New York, these are the groups who will innovate with the data. The state has not yet developed an outreach plan, but Barnes said he expects the community to engage more after the initial launch.
Zack Beatty, who runs a New Haven-based civic hacking group, said he hopes to be involved before that.
“I would love to be invited to a meeting where I had a showing, ahead of the reveal,” he said.
Even though Beatty said he’s glad the state is launching the portal, he’s been planning to reach out to the City of New Haven to pitch an idea for a city-run portal. The city may not spend the money for it, but he said he hopes the community could curb costs by serving as a “pro-bono consultant” to the city. It wouldn’t be a competitor to the state’s portal; he pointed to Philadelphia, where the city-run data portal was preceded by a citizen-driven one. But Beatty added, “Until the two different types of sites are clearly shown to be redundant, I’d much rather err on the side of having both.”
A vendor for the portal will be chosen by the end of the month, Connecticut’s Chief Information Officer Mark Raymond said. Several companies provide open data services to governments; the most prominent is Socrata, which New York City and Chicago use.