Welcome to the Discovery Data Service knowledge base


These articles describe the Discovery Data Service: the architecture that underpins the data service, the components that make up the service, the testing and assurance processes, and the resources, open source technologies, and software that are used to host, develop, and support the service.

Discovery Data Service status

  • The data service is live.
  • It currently stores and processes the data from 28 million linked organisational patient records in the UK.
  • It contains long term records on 14 million unique UK citizens who have been resident in London at some point in the last 30 years (including people who have left or died).
  • Data on individuals are held in a single common data model with data values represented by a super ontology composed of the world's leading health ontologies.
  • Data in Discovery is submitted by health providers who provide primary, secondary, and urgent care services to patients or clients.
  • Data is kept up to date in close to real time - at worst within 24 hours; see Latency of extract data feeds for more details.

Discovery overview

These articles provide background to the ideas, philosophies and approaches that have resulted in the Discovery Data Service. They cover the original objectives, the approach to collaboration and shared tenancy, and the information governance rules that underpin the way privacy and sharing are balanced.

  • What is Discovery? - This article describes what Discovery is and how it is conceptualised at a very high level.
  • Case studies - An overview of some initial projects that access and utilise data from Discovery; information includes the applicant, project summary, and the latest status (Development/Live).
  • Latest news - The latest information for all Discovery Data Service releases and changes, plus known issues and updates from system suppliers and other third parties.
  • Current data sets - the broad types of data received from various systems.
  • Current published data - provides the latest data publisher information by sector, geographical area, service, and system provider.

Information model topics

These articles provide information about the services, structure, and functionality that operate using the Information Models, one of the fundamental tenets of the Discovery Data Service. This is the starting point for those interested in health data and the information derived from it, and in why Discovery may be different from other approaches:

  • Health Information model - An overview of the approach to the Discovery information model, the purpose, and type of content.
  • Modelling language - The W3C standard languages used to build the various components of the information model.
  • Basic health data set - a very high level overview of the types of data mapped from data held within the Discovery data stores.
  • Reference data - an overview of additional data held in the information model.
  • Mapping and matching concepts and structures - Introduces the approaches to matching and mapping concepts and the structural mappings in the model.
  • Information model services - the technical services, such as APIs and apps, that enable a model instance to be used or to implement an EHR based on the model or a particular data model.

Publishers and subscribers

These articles provide some brief high level information for organisations that send data to Discovery and those that use that data.

  • Publishers - includes technical interface information, FHIR mapping, system requirements, and how to make changes to publishing services.
  • Subscribers - includes set up, data mapping, and schema information.

Services and applications

These articles provide information about the services and applications provided by the Discovery information services.

  • Sextant - A data set distribution service with an option to post the data directly into the recipient's database.
  • Monitoring and alerting - provides an overview of the monitoring and alert systems in place.
  • Data Sharing Manager - DSM provides a visual representation of data that is being shared and processed and by which organisations, regions, and/or services.
  • ASSIGN - UPRN address matching application - a web based application that matches single addresses, or batches of hand entered addresses, to authoritative addresses and assigns a unique property reference number.
  • Information model manager - an application designed to provide a view of the common information model, and support the authoring of the content of the model.
  • Information model services - the technical services that enable the common information model to be used.
  • FHIR Get Structured Record API - returns a complete patient record in a structured format following GP and Care Connect standards.
  • HL7v2 API - Send HL7v2 ADT and ORU (clinical events) data to Discovery.
  • Get health record - the service that obtains a fully structured health record about a person, in real time via a standards based API.
  • Record publication service - the service that publishes data from health records for specific purposes, often immediately on receipt.
  • Data set distribution service - the service that distributes sets of data for populations of patients where the definition is complex or where the subscriber is hosting a copy of the data permanently.
  • NHS 111 Discovery frailty flagging / Frailty algorithm
  • GitHub repositories - descriptions and information relating to the application source code.


These articles provide information about the overall high level software architecture models, providing descriptions and visualisations of the Discovery Data Service and information service components and how they interact.

  • The Data Storage Architecture illustrates how data is stored within the service and the various implementation options available for organisations that wish to use some or parts of the services.
  • Identity, authentication and authorisation specifies the approach taken by Discovery for accessing utilities and controlling some aspects of access to application functionality and some data resources, after all of the other locks have been opened.
  • Application zone architecture illustrates how the data is accessed in different places and in different ways according to different use cases.
  • Pseudonymisation describes how Discovery generates pseudonymised data.
  • Data linking describes how Discovery links several patient records into a single person/citizen record.

Application help

  • Data Sharing Manager - Data Sharing Manager provides a visual representation of data that is being shared and processed and by which organisations, regions, and/or services.
    You can also amend or cancel data sharing and processing agreements in real time.
  • Record Viewer - Record Viewer shows a combined view of an individual's patient records, using data from different source organisations.
  • Jira - Jira is an effective way to share and track information and helps to make sure that tasks and issues are completed, and signed off, by all relevant teams.

Wiki tips