Remote Subscriber Database (RSD) SQL guide (Compass 1)

From Discovery Data Service
Revision as of 14:59, 5 May 2020 by JoC (talk | contribs)
Jump to navigation Jump to search

Introduction

This article describes the steps needed to create, and analyse, patient data selections from the Discovery database.

Datasets are needed to add patient data to a report. Each dataset represents the results gained from running a combination of SQL commands and stored procedures on a data source.

The following diagram shows the algorithm that is followed to generate these datasets:

The algorithm followed to generate datasets

Creating the Patient Cohort

The first step, when a list of patients marked for observational study is received, is to create a patient cohort. Patient data usually comes into the DDS in the form of pseudonymised personal data. Additional information about the patient can be obtained from the Discovery database using the pseudo identifier or the pseudo NHS identifier.

 

General patient cohort

The following entity diagram shows the relationships between patient and organisation:

General patient cohort

Patients are linked to the Discovery database using pseudo patient identifiers. However, if only pseudo NHS numbers are provided, the corresponding pseudo identifiers can be derived from the link distributor table. Patients are filtered by organisation ODS codes to ensure that only the patients of the required GP practices are selected.

 

 

Building a patient cohort using pseudo NHS numbers

The following example SQL shows how a patient cohort is built using pseudo NHS numbers:

Patient cohort SQL

 

Cohorts with filtered observations

Sometimes, the cohort is created with filtered observations.

The example below filters patients who are registered, alive, and have had diabetes. They belong to the age 12 years and over. Two additional tables are introduced for the query: observation(patients who have had diabetes) and episode of care (patients who are registered and have the appropriate registration type for this study).

Cohorts with filtered observations

Patients can be selected with a specific set of observations using a lookup table of relevant SNOMED codes.
Some patients prefer to withhold consent and are therefore are excluded from the cohort; the relevant SNOMED consent code can be checked in the observation table.
Patients are usually only select if they have registration type 2 (Regular/GMS).
Sometimes patients are also based on their age in years.

 

Building patient cohort SQL with filtered observations

Building patient cohort SQL with filtered observations

Creating the observation cohort

Observation cohort are created to improve query performance; instead of scanning through the entire observation table, a sub-set of the observation table is created, based on the patient cohort.

Building the observation cohort SQL 

Building observation cohort SQL

 

 

 

In this example, the age and LSOA code are derived from the patient table. The demographic dataset table is updated with those values when there is a match on the pseudo identifier.
If the update is large, it might be possible to optimise the update process by using batches as shown in the following example: