A while ago I wrote a bit of a rant about Schooloscope, and how its over-simplification of school data made us feel perhaps smarter than we really are. Mike Gurstein, who is Executive Director of the Centre for Community Informatics Research, Development and Training (Vancouver BC and Cape Town, South Africa), has written another angle on a parallel issue. He argues that open data is, of course, a good thing, but that without proper training in its use it just empowers those with the social capital – Internet access, education, time – who can then, in the time-honoured fashion, suck resources away from the less-empowered.
An interesting example of how open data, with appropriate attention being given to some of these pre-conditions, in fact can provide a basis for effective use can be seen in how the UCLA Centre for Health Policy Research’s California Health Interview Survey (CHIS) has been put to use by Community Advocates in Solano County.
The CHPR conducts a bi-annual California Health Interview Survey in conjunction with the California Department of Health “to provide a snapshot of the health and healthcare of Californians”.
The survey is used by a range of political authorities but most interestingly they provide free and widely accessible training on how to use the information “to develop appropriate and targeted policy responses” and overall “to learn how to use and apply the data to improve health and health care”.
That is, the information is not only made accessible but attention is paid and resources are provided to ensure that the data is usable by those who might make effective use of it.
In this instance, the Solano County Community Advocates were trained so as to be able to take the data provided by the CHIS, and plot incidences of asthma by local electoral district. They were then able to create a map showing an extremely high frequency of asthma among residents in a particular local area. The Community Advocates successfully argued against developing another truck stop along I-80 in the county based on CHIS 2001 data estimates that showed Solano County to have the state’s highest rate of asthma symptom prevalence overall and one of the highest rates for children.
It’s a really interesting article (his Bangalore example is also great, but I don’t want to leach away his content). Go and read.
This is an aspect of open data I haven’t considered before. In an era of increasing availability of data it’s tempting to hope that people are going to become commensurately more sophisticated at using it.
On the other hand, the asthma case in the article is a great example of going on a statistical ‘fishing trip’, which can be dangerous. If you look through health data to see where there are geographical clusters of diseases then you are bound to see some. What are the chances of there being an even spread of every disease?
Establishing when a geographical cluster of health problems is caused by an environmental factor and when it’s just a coincidence is hard, but that’s what you need to do before you can usefully draw any conclusions.
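To make the 'bound to see some' point concrete, here is a minimal Python sketch. Everything in it is made up for illustration (50 districts, 500 residents each, a 10% true rate, and the function names); none of it comes from the CHIS data. Every district is given exactly the same underlying rate, and the code then asks how often the worst-looking district still appears at least 20% above that rate purely by chance.

```python
import random


def max_district_rate(n_districts=50, population=500, true_rate=0.10, rng=None):
    """Highest observed rate across districts that all share the same true rate."""
    if rng is None:
        rng = random.Random()
    rates = []
    for _ in range(n_districts):
        # Each resident independently has the condition with probability true_rate.
        cases = sum(1 for _ in range(population) if rng.random() < true_rate)
        rates.append(cases / population)
    return max(rates)


def share_of_runs_with_hotspot(excess=1.2, trials=100, true_rate=0.10, seed=0):
    """Fraction of simulated regions whose worst district looks at least
    `excess` times the true rate, although no district really differs."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        if max_district_rate(true_rate=true_rate, rng=rng) >= excess * true_rate:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(share_of_runs_with_hotspot())
```

With these hypothetical settings, most simulated regions should show at least one apparent 'hotspot' even though nothing distinguishes the districts. That is why a cluster on a map needs checking against what chance alone would produce before anyone blames an environmental cause.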
I’m sure in the asthma case all this was taken into account. I guess that might not always be the case.
It’s easy to think of open data as a neutral source of facts, but to be useful data needs to be properly interpreted.
In the end, though, having a lot of data to play with is surely the best way to get better at using it. Fingers crossed.
Didn’t mean this to be such a long comment!