Data Enrichment Service (DES)
What is the Data Enrichment Service?
The Data Enrichment Service (DES) is a scalable, web-based information system built on top of the GATE framework. It allows bespoke processing of data, but it can also be used in conjunction with the core DES enrichment system, which provides continually improving entity extraction with a focus on UK-based data.
If you are handling mainly unstructured text that you know has information hidden away inside it, then the Data Enrichment Service available within the OpenUp™ platform will help you to get at that information. It does this by ‘extracting’ useful items of information, such as people's names, organisations or places. The extracted information can be made available in a variety of formats to suit the requirements of the user.
One of the available formats is RDF/XML. Output in this format can then be stored in the RDF store, making it available for querying. The combination of extraction and RDF store provides a powerful approach to maximising the information potential of documents.
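As a rough illustration of what consuming RDF/XML output could look like, the sketch below reads a small RDF/XML document and lists its subject, predicate and object statements using only the Python standard library. The sample document, the example.org URIs and the function name extract_triples are assumptions made for illustration; they are not real DES output or part of any DES interface, and only the simple rdf:Description form of RDF/XML is handled.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical sample document; this is NOT real DES output.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/doc/1">
    <ex:mentions rdf:resource="http://example.org/place/london"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/place/london">
    <rdfs:label>London</rdfs:label>
  </rdf:Description>
</rdf:RDF>
"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples from simple RDF/XML.

    Only rdf:Description elements with rdf:about are handled; typed
    nodes, blank nodes and nested descriptions are out of scope here.
    """
    root = ET.fromstring(rdf_xml)
    triples = []
    for node in root.findall(f"{{{RDF_NS}}}Description"):
        subject = node.get(f"{{{RDF_NS}}}about")
        for prop in node:
            # ElementTree tags look like "{namespace}local"; joining the
            # two parts gives the full predicate URI.
            predicate = prop.tag[1:].replace("}", "")
            obj = prop.get(f"{{{RDF_NS}}}resource")
            if obj is None:
                # No rdf:resource attribute, so treat the text as a literal.
                obj = (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


for triple in extract_triples(SAMPLE):
    print(triple)
```

For anything beyond a quick inspection, a dedicated RDF library and a proper RDF store would be the more robust choice, since RDF/XML allows many equivalent serialisations that this sketch does not cover.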
How can the DES help me and my organisation?
The Data Enrichment Service takes textual information from sources over which the user has limited control and adds value to the data to make it useful to other computer systems. The purpose of the DES is twofold:
- To identify key elements within unstructured and semi-structured data and introduce machine-readable markup
- To relate the markup to other data sources and create linked data
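The two purposes listed above can be pictured with a toy gazetteer lookup: known names are found in plain text, wrapped in machine-readable markup, and tied to an identifier from another data source. This is a minimal sketch under stated assumptions, not how the DES or GATE is implemented; the gazetteer entries, the example.org URIs, the span/resource markup and the function name annotate are all hypothetical.

```python
import re

# Hypothetical gazetteer mapping surface names to identifiers.
# The URIs are placeholders, not identifiers used by the DES.
GAZETTEER = {
    "Manchester": "http://example.org/id/place/manchester",
    "Department for Education": "http://example.org/id/org/dfe",
}


def annotate(text, gazetteer):
    """Wrap each gazetteer name found in text in a linking span element."""
    # Try longer names first so a long name wins over any shorter
    # name it happens to contain.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b"
    )

    def wrap(match):
        name = match.group(1)
        return f'<span resource="{gazetteer[name]}">{name}</span>'

    return pattern.sub(wrap, text)


print(annotate("The Department for Education visited Manchester.", GAZETTEER))
```

Exact string matching of this kind misses variants, abbreviations and ambiguous names, which is where a full extraction pipeline adds its value.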
The core DES service is free to use within a certain limit. For data publishers needing to enrich more than 10,000 documents per day with a Service Level Agreement (SLA), we can offer the professional version of the Data Enrichment Service for a monthly fee.
Find out more
To find out more about how TSO can help you to create and enrich data, visit http://openup.tso.co.uk
We have also produced two Information Sheets explaining what DES is and how it can help your organisation:
OpenUp Information Sheet [305KB]
OpenUp DES Information Sheet [321KB]
To discuss your requirements with one of our experts, email opendata@tso.co.uk
How can I access the DES?
You can access the DES through our OpenUp platform at http://openup.tso.co.uk/developer/des
The service is free to use within a certain limit. For data publishers needing to enrich more than 10,000 documents per day with a Service Level Agreement (SLA), we can offer the professional version of the Data Enrichment Service for a monthly fee.
More information on the DES – Data sets used
Data.gov.uk
Much of the data that drives the DES will come from the data.gov.uk initiative. The following data sets are currently used:
- Edubase data (http://education.data.gov.uk/)
- Local authorities and some countries data (http://statistics.data.gov.uk/)
- Government departments, MPs and Peers (http://reference.data.gov.uk/)
In addition to these data sets, URIs for dates are included where possible. These reference http://reference.data.gov.uk
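To make the idea of a date URI concrete, the sketch below turns a calendar date into a URI under reference.data.gov.uk. The /id/day/ path layout and the function name day_uri are assumptions made for illustration; this page does not specify the exact pattern the DES emits, so check the actual output before relying on it.

```python
from datetime import date

# Assumed path layout for day-level URIs; not confirmed by this page.
DAY_URI_BASE = "http://reference.data.gov.uk/id/day/"


def day_uri(day: date) -> str:
    """Build a day-level URI from a date, using its ISO 8601 form."""
    return DAY_URI_BASE + day.isoformat()


print(day_uri(date(2010, 5, 6)))
```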
Legislation.gov.uk
www.legislation.gov.uk is the UK’s new website for legislation. It provides an API that can be used to access data across all UK enacted and consolidated legislation.
Currently the DES only makes use of primary legislation information.
Ordnance Survey
Certain data from Ordnance Survey is also used. This consists of:
- British cities
- British towns
- British other settlements
- British water features
- Administrative London boroughs
- Administrative counties
Wikipedia
Information has been acquired from Wikipedia for certain gazetteer lists. These are:
- UK newspapers: http://en.wikipedia.org/wiki/List_of_newspapers_in_the_United_Kingdom
- English churches: http://en.wikipedia.org/wiki/Religion_in_England
- Scottish churches: http://en.wikipedia.org/wiki/Religion_in_Scotland
- Royal societies: http://en.wikipedia.org/wiki/List_of_Royal_Societies
- Place name adjectival and demonymic forms: http://en.wikipedia.org/wiki/List_of_adjectival_and_demonymic_forms_of_place_names
- Capital cities: http://en.wikipedia.org/wiki/List_of_national_capitals
- Royal Navy ships: http://en.wikipedia.org/wiki/Current_Royal_Navy_ships
- Trade unions: http://en.wikipedia.org/wiki/List_of_trade_unions
- Scottish Government: http://en.wikipedia.org/wiki/Scottish_Government
- Military regiments: http://en.wikipedia.org/wiki/British_Army
Other Data
Additional information on political parties is taken from http://openelectiondata.org/
In addition, some data from Geonames is used for countries, and a small amount of data from DBpedia is also used.
Provenance
The provenance of information can be an important factor for certain users. To that end, the DES is starting to implement the Open Provenance Model Vocabulary (OPMV), which is being developed as part of the ongoing work by data.gov.uk. More details on OPMV can be found at SourceForge.
Currently the DES adds provenance information to RDF/XML serialisations of documents. As the OPMV work is still ongoing, the provenance information generated is subject to change.
TSO Solutions
Publishing Solutions
- Call: +44 (0)870 600 5522
- Email: solutions@tso.co.uk
Follow us on Twitter: @TSO Solutions
Digital Information Management Solutions
- Call: +44 (0)870 600 5522
- Email: opendata@tso.co.uk
Follow us on Twitter: @TSO Technology






