Birmingham Open Data
How did I first get involved?
After checking with colleagues, I informally approached Digital Birmingham to help with their Open Government Data project. I had a meeting with their Head on 2012-04-23, was sent a document written by their former Implementation Manager Simon Whitehouse and agreed to come back with some bullet points. I then attended Transparency Camp 2012 in Washington D.C. at my own expense.
My initial bullet points
Attitude - check perception of openness
Contracts - tie partners in to opening data
Technology - what systems do we need, if any?
Engagement - train staff to make the most of this
Work so far...
Had lunch with Gerry McMullan (Information and Strategy Manager), then we met with Heike Schuster-James (Programme Manager).
Talked about aims of the project, issues we'll face and benefits we hope to bring.
I stressed the importance of the engagement model - bad data is an opportunity, not a threat.
Gerry suggested opening potholes data first, as people have been asking for it.
I suggested going after data that will be challenging to open first, then working our way down.
Heike suggested we prioritise the datasets to release and we all agreed.
Heike suggested building a 3D catalogue of the internal datasets, which chimed with my desire to map @BhamCityCouncil departments and processes.
I also advised that I will meet with Emer Coleman and users of the London DataStore on 2012-05-28.
Heike took more detailed notes and agreed to share via dropbox.
I agreed to officially join the project on 2012-05-14, working 1 day a week to start with.
We agreed to think about it over the weekend, then meet with his manager on 2012-05-17.
Tried to export my weekend's work on a Freeplane mindmap into the system that Digital Birmingham use, called MindJet MindManager, but failed. Will come back to that. Here is what it looks like, though: -
Had a meeting with Heike Schuster-James (Programme Manager) to discuss: -
Made contact with various local members of the Birmingham community, including: -
Start drafting an email to everyone I met (or wished I'd talked to) @TCampDC
Doesn't really feel like I've achieved much today.
Been doing a lot of reading and research around work already done and approaches to opening data.
Knocked out 3 slides of a project aims presentation, to help focus my attention.
Ran into several "we don't want to release that" brick walls today.
Decided I need a better pitch when coming up against this.
Looked at the following web pages: -
Set out a to-do list: -
Wow. What a difference a few days make.
Have a look at this http://j.mp/godhd-wiki_DRAFT
Well, I finally think I've got my head round this project. Here is the text of my July 2012 update for the head of Digital Birmingham (the original is at http://j.mp/bcc-odp-july2012): -
Birmingham City Council Open Data project update
James Cattell, Monday 9th July 2012. Twitter @jacattell
Digital Birmingham and partners are creating a project plan to build an open data platform. This platform will present council data, bringing opportunities to improve efficiency and stimulate economic growth. The plan, with costs, will be complete by September 2012.
Engage with local communities and national bodies
Catalogue - update our Information Asset Register
Policy - implement and improve
Pump - procure a system to move our data
Platform - procure a system to present our data
Pricing - develop a structure to charge for data analysis
As a public body, it is our duty to serve our community, follow Government direction and work with partners. This trinity drives our service development and therefore requires continuous engagement.
Since its inception several years ago, our open data project has engaged with all of these. Engagement increased when the Government started pushing open data as a way to stimulate economic growth.
Among others, we currently engage with: -
Internal political, directorate and key department leaders
External influencers, e.g.
West Midlands Open Data User Group
Cabinet Office’s Open Data User Group
Open Data Institute
Small & Medium Enterprises (SMEs)
Voluntary and Charity Sector
Try asking an interested party, “What data would you like?”
The usual response is, “Well, what data do you have?”
Central to this project is the catalogue, or Information Asset Register (IAR). It lists all data assets we manage, mapped to Birmingham City Council’s directorates, departments and teams. This helps identify service duplication and opportunities to make our organisation more efficient. It enables open knowledge and points towards the concept of an open service.
We will need to choose a schema to catalogue our data. Beyond categorisation, schemas help provide data about data (technically known as metadata). Just as you might read the back cover of a book to get an idea of what’s inside, metadata gives context, identifies the creator, records when the data was published and points to linked datasets.
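As a sketch only (no schema has been chosen, so the field names below are hypothetical, not a council standard), a minimal metadata record for one dataset might look like:

```python
# A minimal, hypothetical metadata record for one council dataset.
# Field names are illustrative only -- no schema has been chosen yet.
dataset_metadata = {
    "title": "School locations",           # what the dataset contains
    "creator": "Birmingham City Council",  # who produced it
    "published": "2012-07-09",             # when it was released
    "licence": "Open Government Licence",  # terms of reuse
    "format": "CSV",                       # file format
    "related": ["ward-boundaries"],        # pointers to linked datasets
}

# The "back cover" summary a catalogue could display:
summary = (f"{dataset_metadata['title']} ({dataset_metadata['format']}), "
           f"published {dataset_metadata['published']} "
           f"by {dataset_metadata['creator']}")
print(summary)
```

Whatever schema is eventually chosen, the catalogue entry plays the same role as this record: enough context to judge a dataset without opening it.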
Implementing a schema helps us create “linked data”. This allows different datasets across the internet to be combined, creating new insights into Government and beyond. Linked data requires: -
a uniform resource identifier (URI) for each dataset, e.g.
metadata pointing to related datasets, e.g.
constituency="Hall Green", ward="Moseley and Kings Heath"
This allows us to link the school data with related ward- or constituency-level data, e.g. information on housing costs.
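The linking described above boils down to a join on a shared identifier. The sketch below uses invented school and housing figures purely to show the mechanics:

```python
# Two tiny, made-up datasets sharing a "ward" identifier.
schools = [
    {"school": "Example Primary", "ward": "Moseley and Kings Heath"},
]
housing = [
    # The price figure is invented, for illustration only.
    {"ward": "Moseley and Kings Heath", "median_house_price": 180000},
]

# Index the housing data by ward, then link the datasets on that key.
housing_by_ward = {row["ward"]: row for row in housing}
linked = [
    {**school, **housing_by_ward[school["ward"]]}
    for school in schools
    if school["ward"] in housing_by_ward
]
print(linked)
```

With real linked data the join key would be a URI rather than a ward name, but the principle is the same: a shared identifier lets independent datasets be combined.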
Birmingham City Council’s current open data policy was approved by our Corporate Management Team (CMT) on 17th May 2011. A copy is available at http://j.mp/bcc-od-policy. It is reviewed annually.
It is a comprehensive document that states, “The sharing and open access to datasets using common standards amongst public sector organisations will turn data into intelligent and smart information. This has the potential to accelerate business growth and increase entrepreneurial opportunities.”
Common standards are important, because they allow us to use different data sources in similar ways, with similar tools. One popular common standard is comma-separated values (CSV).
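Part of CSV's appeal is that any tool with a CSV parser reads the same file the same way. A quick sketch using Python's standard library (the ward and pothole figures are invented):

```python
import csv
import io

# A small CSV fragment, as it might be published on the council website.
# Wards are real place names; the counts are invented for illustration.
raw = """ward,potholes_reported
Moseley and Kings Heath,42
Hall Green,17
"""

# Any CSV-aware tool (spreadsheet, database, script) can parse this.
rows = list(csv.DictReader(io.StringIO(raw)))
total = sum(int(row["potholes_reported"]) for row in rows)
print(f"{len(rows)} wards, {total} potholes reported")
```

The same file would open just as readily in a spreadsheet or load into a database, which is exactly what a common standard buys us.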
Unfortunately, our policy has not been widely promoted or adopted internally. This is evident in the plethora of Portable Document Format (PDF) files on our website. PDF is a print-oriented format and the data it contains is difficult to reuse, unlike CSV. Think of a PDF as an electronic printout – you can’t open a printout in a spreadsheet and start linking it to other data.
Service Birmingham (our partnership with Capita) maintains a list of applications used inside Birmingham City Council. Officially this list totals 348, although there may be more. Once we have catalogued the data in these applications, we need to begin selectively pumping it outside the organisation. This selection process will be heavily influenced by our community and national drivers.
An automated data pump allows us to control what goes where and how. Without it, employees would need to manually upload data, bringing accuracy, security and sustainability issues, to name but three.
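In the simplest terms, the pump is a scheduled job that exports only datasets explicitly approved for release. The sketch below is hypothetical (the datasets and approval list are invented) but shows the control point described above:

```python
import csv
import io

# Internal datasets (invented examples, for illustration only).
internal_datasets = {
    "potholes": [{"ward": "Hall Green", "open_reports": 17}],
    "staff_salaries": [{"name": "A. Person", "salary": 30000}],  # not for release
}

# The pump only moves datasets that have been explicitly approved.
approved = {"potholes"}

def pump(name):
    """Export an approved dataset as CSV; refuse anything unapproved."""
    if name not in approved:
        raise PermissionError(f"{name} is not approved for release")
    rows = internal_datasets[name]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=rows[0].keys())
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

print(pump("potholes"))
```

A real pump would pull from live applications and push to the platform on a schedule, but the approval gate is the part that replaces error-prone manual uploads.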
With the pump in action, we need somewhere to host the data output. This platform should also allow users to interrogate, engage with and rate our data. Feedback will improve the service and therefore our organisation, so a solid engagement model is necessary.
Several Government organisations already use a service called the Comprehensive Knowledge Archive Network (CKAN), including data.gov.uk. CKAN has various levels of service, ranging from catalogue only, to a full-blown repository with built-in engagement tools. If we choose to use this service, there will be ongoing monthly charges, ranging from hundreds to thousands of Euros.
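For a flavour of how a CKAN catalogue is queried: every operation is an "action" exposed over HTTP under /api/3/action/. The sketch below only builds the request URL (no request is made, and the dataset name is hypothetical):

```python
from urllib.parse import urlencode

def ckan_action_url(site, action, **params):
    """Build a CKAN Action API URL (no network request is made here)."""
    url = f"{site}/api/3/action/{action}"
    if params:
        url += "?" + urlencode(params)
    return url

# e.g. fetch the metadata for one (hypothetical) dataset on a CKAN instance:
url = ckan_action_url("https://data.gov.uk", "package_show",
                      id="birmingham-potholes")
print(url)
```

Hitting a URL like this returns the dataset's catalogue record as JSON, which is what makes CKAN instances easy to integrate with national catalogues.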
An alternative is to build our own platform. At no cost, we have acquired the code London uses to run its DataStore. If we build on this system, the resulting source code must be open to the community. This allows other organisations to benefit from our work. Open source code improves engagement with the developer community and fosters goodwill with other local authorities. It would then be possible to provide a paid consulting service to help others implement this tool, which would repay any costs to develop the code.
An important option is to allow direct, real-time access to live data. There are obvious but surmountable security issues here, which Service Birmingham and others can advise on. Sustainable live data is key to the smart cities agenda and allows the creation of useful services, e.g. countdown.tfl.gov.uk
I recommend a mix of the above; use CKAN to host the data, our platform for engagement and real-time access to rapidly changing data, such as transport movement.
Let us be clear: the majority of our community will not tolerate us charging for open data. The argument is, "Why, when we have already paid our taxes, should we have to pay again to access our data?" – and it is a fair point.
One way for us to make money from open data is to provide paid analysis or consultation. After all, who knows our data better than we do? External entities may have better analytical and visualisation skills, but I surmise we understand the source data better than anyone. The change-agent-based Birmingham Consulting Service is one potential outlet for this service.
Offering a tailored pricing structure is key. We have invited Birmingham City University and Wolverhampton University to research this subject. They have access to a rich source of existing research papers and should be able to self-fund this work.
Everything above is doable. The challenges are access to resources (people and money) and capitalising on the political will to make this happen. Digital Birmingham aims to complete this project by early 2013. The project plan will be delivered in September 2012 and will set out exact timings and costs.
We welcome the opportunity to work with all interested parties.
Into the final straight now.
Birmingham City Council Open Data project update
James Cattell, Saturday 4th August 2012. Twitter @jacattell
So, the planning stage is upon me. Just a couple more meetings on Monday, then time to knuckle down and meld my findings into a PRINCE2 plan to liberate Birmingham City Council's data.
The main themes will be: -
Schema - select the appropriate schema for our data, e.g. schema.org
License - will one license fit all our data? If so, which one? If not, what?
Data hunting - approach to find and secure access to all data
Information asset register - mould data and schema into a portable catalogue
(Inter)national catalogues - integration with data.gov.uk and others
Systems integration - specify interfaces to applications and databases
Platform - specify both internal and external data stores
Visualisation - what basic methods should be available to all users
Engagement - channels of interaction, service level agreements
Measure risk / success - design system to monitor and report both.
...and I have 5 working days to do this. No pressure then :-)