Birmingham Open Data

Birmingham from Wikipedia
Birmingham is the 2nd largest city in England.  Governance of the city is managed by Birmingham City Council, which produces, manages and shares a vast amount of public data. 

Open Data?

The simplest way to explain this is for you to watch this animation on YouTube.

What are we trying to do? 


How did I first get involved?  

After checking with colleagues, I informally approached Digital Birmingham to help with their Open Government Data project.   I had a meeting with their Head on 2012-04-23, was sent a document written by their former Implementation Manager Simon Whitehouse and agreed to come back with some bullet points.  I then attended Transparency Camp 2012 in Washington D.C. at my own expense. 

My initial bullet points  

  • Attitude - check perception of openness  
  • Contracts - tie partners in to opening data 
  • Technology - what systems do we need, if any? 
  • Engagement - train staff to make the most of this 

Work so far...

Meeting @digibrum (map)

2012-05-11
Had lunch with Gerry McMullan (Information and Strategy Manager), then we met with Heike Schuster-James (Programme Manager). 
Talked about aims of the project, issues we'll face and benefits we hope to bring. 
I stressed the importance of the engagement model - bad data is an opportunity, not a threat. 
Gerry suggested opening potholes data first, as people have been asking for it. 
I suggested going after data that will be challenging to open first, then working our way down. 
Heike suggested we prioritise the datasets to release and we all agreed. 
Heike suggested building a 3D catalogue of the internal datasets, which chimed with my desire to map @BhamCityCouncil departments and processes. 
I advised that Andrew Thomas and Sarah Mount had expressed interest in writing an academic research paper. 
Also advised I will meet with Emer Coleman on 2012-05-28 and users of London DataStore. 
Heike took more detailed notes and agreed to share them via Dropbox. 
I agreed to officially join the project on 2012-05-14, working 1 day a week to start with. 

Meeting @bccrentservices (map)

2012-05-11
My boss was pleased for me, but was concerned about the impact on my work in the Rent Service.  
We agreed to think about it over the weekend, then meet with his manager on 2012-05-17. 

First full day  @digibrum (map)

2012-05-14
Tried to export my weekend's work on a Freeplane mindmap into the system that Digital Birmingham use, called MindJet MindManager, but failed.  Will come back to that. Here is what it looks like, though: - 

Had a meeting with Heike Schuster-James (Programme Manager) to discuss: - 
Made contact with various local members of the Birmingham community, including: - 
Started drafting an email to everyone I met (or wished I'd talked to) @TCampDC
Formally invited Andrew Thomas and Sarah Mount to consider writing an academic research paper with me. 
Asked Surita Solanki (Partnership Support Officer) to find a contact @BhamCityCouncil to discuss opening potholes data with. 

Second full day  @digibrum (map)

2012-05-15 
Doesn't really feel like I've achieved much today. 
Been doing a lot of reading and research around work already done and approaches to opening data. 
Knocked out 3 slides of a project aims presentation, to help focus my attention. 
Ran into several "we don't want to release that" brick walls today. 
Decided I need a better pitch when coming up against this. 
Looked at the following web pages: - 
Set out a to-do list: - 

Fifth full day  @digibrum (map)

2012-06-11 
Wow.  What a difference a few days make. 
Have a look at this http://j.mp/godhd-wiki_DRAFT 

Seventh full day  @digibrum (map)

2012-07-09
Well, I finally think I've got my head round this project.  Here is the text of my July 2012 update for the head of Digital Birmingham (the original is at http://j.mp/bcc-odp-july2012): - 

Birmingham City Council Open Data project update

James Cattell, Monday 9th July 2012.  Twitter @jacattell

Executive summary

Digital Birmingham and partners are creating a project plan to build an open data platform.  This platform will present council data, bringing opportunities to improve efficiency and stimulate economic growth.  The plan, with costs, will be complete by September 2012.

Ongoing work


  1. Engage with local communities and national bodies
  2. Catalogue - update our Information Asset Register
  3. Policy - implement and improve
  4. Pump - procure a system to move our data
  5. Platform - procure a system to present our data
  6. Pricing - develop a structure to charge for data analysis

Engage

As a public body, it is our duty to serve our community, follow Government direction and work with partners.  This trinity drives our service development and therefore requires continuous engagement.

Since its inception several years ago, our open data project has engaged with all of these.  Engagement increased when the Government started pushing open data as a way to stimulate economic growth.

Among others, we currently engage with: -
  • Internal political, directorate and key department leaders
  • External influencers, e.g.
    • West Midlands Open Data User Group
    • Cabinet Office’s Open Data User Group
    • Open Data Institute
  • Small & Medium Enterprises (SMEs)
  • Voluntary and Charity Sector
  • Academia

Catalogue

Try asking an interested party, “What data would you like?”
The usual response is, “Well, what data do you have?”

Central to this project is the catalogue, or Information Asset Register (IAR).  It lists all data assets we manage, mapped to Birmingham City Council’s directorates, departments and teams.  This helps identify service duplication and opportunities to make our organisation more efficient.  It enables open knowledge and points towards the concept of an open service.
We will need to choose a schema to catalogue our data.  Apart from categorisation, schemas help provide data about data (technically known as metadata).  Just as you may read the back cover of a book to get an idea of what’s inside, metadata gives context, attributes the creator, records when the data was published and points to linked datasets.

Implementing a schema helps us create “linked data”.  This allows different datasets across the internet to be combined, creating new insights into Government and beyond.  Linked data requires: -
  • a unique Uniform Resource Identifier (URI) for each dataset, e.g.
    • opendata.birmingham.gov.uk/schools/moseley_junior_and_infants
  • metadata pointing to related datasets, e.g.
    • constituency=“Hall Green”, ward=“Moseley and Kings Heath”
This allows us to link the school data with related ward or constituency level data, e.g. information on housing costs.
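
As a rough sketch of the linking idea, the Python below joins the school record to a ward-level dataset through its ward metadata.  The field names, the housing URI and the rent figure are all assumptions made up for illustration; they do not reflect real council data or a chosen schema.

```python
# Illustrative only: field names, the housing URI and the rent figure
# are assumptions, not real council data or an agreed schema.

BASE_URI = "opendata.birmingham.gov.uk"

# A school dataset identified by its URI, with metadata pointing to related areas.
school = {
    "uri": f"{BASE_URI}/schools/moseley_junior_and_infants",
    "constituency": "Hall Green",
    "ward": "Moseley and Kings Heath",
}

# A hypothetical ward-level dataset, keyed by ward name.
housing_by_ward = {
    "Moseley and Kings Heath": {
        "uri": f"{BASE_URI}/housing/moseley_and_kings_heath",  # hypothetical URI
        "median_rent_gbp": 650,  # placeholder value, not a real statistic
    },
}


def link(record, related, key):
    """Return a copy of `record` with a pointer to the related dataset sharing `key`.

    If no related dataset matches, the record is returned unchanged (as a copy).
    """
    match = related.get(record[key])
    if match is None:
        return dict(record)
    return {**record, "related": match["uri"]}


linked_school = link(school, housing_by_ward, "ward")
print(linked_school["related"])
```

The shared metadata value (here the ward name) does the linking, which is why consistent naming across datasets matters as much as the URIs themselves.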

Policy

Birmingham City Council’s current open data policy was approved by our Corporate Management Team (CMT) on 17th May 2011.  A copy is available at http://j.mp/bcc-od-policy.  It is reviewed annually.

It is a comprehensive document that states, “The sharing and open access to datasets using common standards amongst public sector organisations will turn data into intelligent and smart information. This has the potential to accelerate business growth and increase entrepreneurial opportunities.”

Common standards are important, because they allow us to use different data sources in similar ways, with similar tools.  One popular common standard is comma-separated values (CSV).

Unfortunately, our policy has not been widely promoted or adopted internally.  This is evident in the plethora of Portable Document Format (PDF) files on our website.  PDF is a presentation format and the data it contains is difficult to reuse, unlike CSV.  Think of a PDF as an electronic printout – you can’t open a printout in a spreadsheet and start linking it to other data.
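
To show why CSV is easy to reuse, here is a minimal Python sketch that reads a small CSV and totals one column using only the standard library.  The sample rows and column names are invented for illustration and are not real council figures.

```python
import csv
import io

# Made-up sample standing in for a published CSV file.
SAMPLE_CSV = """ward,potholes_reported
Moseley and Kings Heath,12
Hall Green,7
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows(SAMPLE_CSV)

# Values arrive as strings, so convert before doing arithmetic.
total = sum(int(row["potholes_reported"]) for row in rows)
print(total)
```

Getting the same numbers out of a PDF would mean extracting text and guessing at the table layout first, which is where the "electronic printout" problem bites.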

Pump

Service Birmingham (our partnership with Capita) maintains a list of applications used inside Birmingham City Council.  Officially this list totals 348, although there may be more.  Once we have catalogued the data in these applications, we need to begin selectively pumping it outside the organisation.  This selection process will be heavily influenced by our community and national drivers.

An automated data pump allows us to control what goes where and how.  Without it, employees would need to upload data manually, which would bring accuracy, security and sustainability issues, to list but three.
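
One way to picture "control what goes where" is a set of publishing rules that the pump applies to every record.  The Python sketch below is an assumption about how such a rule set might look, not a description of any system we have procured; the dataset name, field names and sample record are hypothetical.

```python
# Hypothetical publishing rules: the fields of each dataset that may leave
# the organisation. Datasets without a rule are never published.
PUBLISH_RULES = {
    "potholes": ["ward", "reported_on", "status"],
}


def pump(dataset, records, rules=PUBLISH_RULES):
    """Return only the approved fields of each record for the named dataset."""
    allowed = rules.get(dataset)
    if allowed is None:
        return []
    return [
        {field: record[field] for field in allowed if field in record}
        for record in records
    ]


# Invented internal record containing a field that must stay private.
internal = [
    {
        "ward": "Moseley and Kings Heath",
        "reported_on": "2012-05-11",
        "status": "open",
        "reporter_email": "resident@example.com",
    },
]

public = pump("potholes", internal)
print(public)
```

Keeping the rules in one place means a decision about what to release is made once and applied the same way every time, rather than relying on each person doing a manual upload.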

Platform

With the pump in action, we need somewhere to host the data output.  This platform should also allow users to interrogate, engage with and rate our data.  Feedback will improve the service and therefore our organisation, so a solid engagement model is necessary.

Several Government organisations already use a service called the Comprehensive Knowledge Archive Network (CKAN), including data.gov.uk.  CKAN has various levels of service, ranging from catalogue only, to full-blown repository with built-in engagement tools.  If we choose to use this service, there will be ongoing monthly charges, ranging from hundreds to thousands of Euros.

An alternative is to build our own platform.  At no cost, we have acquired the code London use to run their DataStore.  When we build on this system, the resulting source code must be open to the community.  This allows other organisations to benefit from our work.  Open source code improves engagement with the developer community and fosters goodwill with other local authorities.  It will then be possible to provide a paid consulting service to help others implement this tool, which would repay any costs to develop the code.

An important option is to allow direct and real-time access to live data.  There are obvious and surmountable security issues here, which Service Birmingham and others can advise on.  Sustainable live data is key to the smart cities agenda and allows creation of useful services, e.g. countdown.tfl.gov.uk

I recommend a mix of the above; use CKAN to host the data, our platform for engagement and real-time access to rapidly changing data, such as transport movement.

Pricing

Let us be clear; the majority of our community will not tolerate us charging for open data.  The argument is, “Why, when we have already paid our taxes, should we have to pay again, to access our data?” and it is a fair point.

One way for us to make money from open data is to provide more paid analysis or consultation.  After all, who knows our data better than us?  External entities may have better analytical and visualisation skills, but I surmise we understand the source data better than anyone.  The change-agent-based Birmingham Consulting Service is one potential outlet for this service.

Offering a tailored pricing structure is key.  We have invited Birmingham City University and Wolverhampton University to research this subject.  They have access to a rich source of existing research papers and should be able to self-fund this work.

In summary

Everything above is doable.  The challenges are access to resources, i.e. people/money and capitalising on the political will to make this happen.  Digital Birmingham aims to complete this project by early 2013.  The project plan will be delivered in September 2012 and will set out exact timings and costs.  

We welcome the opportunity to work with all interested parties.

www.digitalbirmingham.co.uk/projects/open-government-data
opendata@birmingham.gov.uk
Twitter @digibrum

2012-08-04
Into the final straight now. 

Birmingham City Council Open Data project update

James Cattell, Saturday 4th August 2012.  Twitter @jacattell

So, the planning stage is upon me.  Just a couple more meetings on Monday, then time to knuckle down and meld my findings into a PRINCE2 plan to liberate Birmingham City Council's data.  

The main themes will be: - 
  • Schema - select the appropriate schema for our data, e.g. schema.org 
  • Licence - will one licence fit all our data? If so, which one? If not, what instead? 
  • Data hunting - approach to find and secure access to all data 
  • Information asset register - mould data and schema into a portable catalogue 
  • (Inter)national catalogues - integration with data.gov.uk and others 
  • Systems integration - specify interfaces to applications and databases 
  • Platform - specify both internal and external data stores 
  • Visualisation - what basic methods should be available to all users 
  • Engagement - channels of interaction, service level agreements 
  • Measure risk / success - design system to monitor and report both. 

...and I have 5 working days to do this.  No pressure then :-)