Free Our Data?

Last night I was at the Free Our Data? discussion at the University of Manchester, running as part of the ESRC Festival of Social Science 2007. This was interesting, not least because I have been thinking about this debate purely in terms of geographical data, yet other types of data bring other issues and concerns. The question mark is important, as it represents the crux of the evening’s debate: should public sector data be available for free, or freely available? The session was recorded and webcast by the University, so I won’t try to summarise the entire debate, but the panelists raised many thought-provoking issues.

Charles Arthur, Technology Editor for the Guardian, fervently believes that digital data should be freely available at the point of use, as the costs of dissemination, reproduction etc. are virtually nil. He argued that some costs should be borne by increased taxes, but that the benefits to the economy created by free access to the data would outweigh them. I’m not sure I agree with the first point, that data dissemination doesn’t cost anything, but I can certainly see that at our small-fry level we could charge less for the work we do, and do a better job, if we weren’t incurring an overhead for the cost of licensing data.

Jill Matheson, Director of Census, Demographic and Regional Statistics at the Office for National Statistics, had three basic and eminently sensible points: that the value of data is in its use; that protecting confidentiality is paramount; and that there’s no such thing as free data, only hard decisions as to who pays for it. Jill then went on to say that the more people use data, the better the Quality Assurance Process. So, if we want more people to use the data, then let’s make it free. However, I’m not suggesting a “Statistipedia” approach, as that kind of editorial model would not be appropriate! Jill argued against Charles’ assertion that data can be disseminated at no cost, saying that making data accessible is what costs, rather than making it available. On the subject of confidentiality, I initially wondered if that was a bit of obfuscation, as once statistical data is anonymised confidentiality is obviously not an issue. I then wondered, however, how I would feel if my street was classed badly on the basis of some demographic analysis. That would be personal to me, but at what stage or scale does the data become safe: at the level of a zone, a town or city, a county?

Duncan MacNiven, the Registrar General for Scotland, highlighted the way that they are making a lot of information freely available north of the border. However, I was intrigued by his argument for charging for some types of data but not others. He argued that demographic and census data should be (and is) free at the point of use, subsidised by the state, but that genealogical data should not be. Why should Scottish taxpayers subsidise people in Australia looking for information on their family history?

I find this difficult to agree with. As a taxpayer in England I subsidise a lot of things that I am not interested in, such as healthcare for smokers. As a taxpayer in the North of England, I am subsidising the cost of hosting the Olympics in 2012 in London. As an archaeologist, I have seen jobs cancelled since “we” won the Olympic bid, because developments have been curtailed or stopped and the money channelled into the Olympic development. Furthermore, once the data is out there in the digital realm, it’s available to anyone, no matter what country they are in.

Duncan Shiel, Head of Strategy at Ordnance Survey, was always going to be on the least popular side in this debate, but he did make a good point that having good quality data is more important than free data. I agree with this completely and sometimes worry that too much focus on the cost takes away from sensible debate, and leads to poor quality solutions or ways around the problem. Duncan’s next point, though, was something I would have liked to challenge, had there been time. He said that the private sector should add value to public sector data. Lovely. So the private sector should pay to use the data, but should then give something enhanced back to the public sector for free? I don’t think so!

Finally, Peter Elias, Professor of Labour Economy at the University of Warwick and a Strategic Advisor to the ESRC, said that, in an ideal world, publicly funded work for the public good, carried out at a publicly funded institution such as a university and producing results that are going to be publicly available, should be able to use data freely. That seems sensible, but is not always economically viable.

I went away thinking that commercial archaeological units, in the UK at least, are in a difficult position. We are not academic, in the sense of being part of an educational institution, although some of us are educational charities. However, the work that we do comes from a piece of planning law, PPG16, which requires that the majority of development in this country has some level of archaeological assessment undertaken. We have a duty to preserve the archaeology, preferably in situ but by record if necessary, and to retain its context for future generations. Good units will create an academically rigorous report on their findings, often subject to peer review in a reputable journal. And yet, we have to pay through the nose to use public data, unless we happen to work for English Heritage, or can persuade the developers that they should lend us their data for the duration of the project.

End thoughts: it’s a very complicated and rich debate, and perhaps geographical data is the easiest to resolve as it doesn’t have confidentiality issues. Both sides are quite entrenched, but at least the discussion is happening.