This article describes the steps needed to create, and analyse, patient data selections from the Discovery database.

Introduction

This outlines the steps we take to create the patient selection for analysis from the Discovery database.

To add this data to a report, we create datasets. Each dataset represents the result set from running a combination of SQL commands and stored procedures on a data source.

Fig 1.1 shows the algorithm we follow to generate these datasets

Creating the Patient Cohort

This is the first step when we receive a list of patients marked for observational study.

General Patient Cohort

Fig 2.1 Entity diagram showing the relationships between patient and organization

We filter patients by their organization ODS codes to ensure only patients of the required GP practices are selected.

Example 1.1 Building patient cohort SQL with NHS numbers

DROP TABLE IF EXISTS cohort_patient;

CREATE TABLE cohort_patient AS

SELECT

      p.id AS patient_id,

      p.nhs_number,

      p.person_id AS group_by,

      p.organization_id,

      p.date_of_birth -- add additional columns if required   

FROM patient p JOIN organization org ON p.organization_id = org.id

WHERE org.ods_code IN ('EMIS99','5PC64');  

-- add ods codes inside the brackets e.g. EMIS99, 5PC64

Cohort with Filtered Observations

Sometimes, the cohort is created with filtered observations.

In the example below, we create a cohort where the patients are registered, alive and have diabetes. They belong to age 12 and over.

Here we have introduced additional tables for the query: observation, concept and the episode of care.

Fig 2.2 Entity diagram showing the relationships between patient, observation and episode of care

By joining the observation table we can obtain the observational information about the patient. If specific information is required, like in this case we are looking for patients with diabetes, we would need to filter the observation based on that specific clinical term. That could be in the form of a stand alone code or a list of codes: SNOMED, READ, CTV3, etc.

The observation table does not store these codes directly but store them as concept ids. On querying the data then, the concept table will be queried first to get the list of concept ids that you are interested in for this observation.

If you intend to filter observations by SNOMEDS then add in the concept map table which will map any legacy concept ids, that is, READ, CTV3 etc, to the SNOMED concept ids, then you can filter by SNOMED codes as shown:

SELECT

...

FROM patient p

JOIN observation o on o.patient_id = p.id

JOIN concept_map cm on cm.legacy = o.non_core_concept_id

-- use the non core concept id for link to the concept table

JOIN concept c on c.dbid = cm.core

WHERE c.code IN (46635009, 44054006) -- e.g. diabetes snomed codes

From the episode of care, we can obtain whether the patient is currently registered and has the appropriate registration type for this study:

SELECT

...

FROM patient p

JOIN episode_of_care e ON e.patient_id = p.id

JOIN concept c ON c.dbid = e.registration_type_concept_id

WHERE c.code = 'R' -- registered

Some patients prefer to withhold consent and therefore are excluded from the cohort. We can check this in the observation table. Again we need to get a list of concept ids that you are interested in for this observation.

For example, you could approach it in this way:

1. Create a lookup table to hold the consent codes:

2. Create a view to filter out the concept ids on the observation table with these codes:

-- filter out the observation table with the relevant consent codes

DROP VIEW IF EXISTS obs_optout_view;

CREATE VIEW obs_optout_view AS

SELECT

ob.id,

ob.patient_id,

ob.organization_id,

ob.non_core_concept_id,

ob.clinical_effective_date,

cpt.dbid,

cpt.code,

cpt.name,

csc.status

FROM observation ob

JOIN concept cpt ON cpt.dbid = ob.non_core_concept_id

JOIN consent_code csc ON cpt.id = csc.code

AND cpt.id = BINARY csc.code;

3. Using the view to create a helper function to get the latest opt-out status of the patient:

-- function to retrieve the patient's latest opt out status

DROP FUNCTION IF EXISTS get_optout_status;

DELIMITER //

CREATE FUNCTION get_optout_status(p_patient_id BIGINT,

p_organization_id BIGINT)

RETURNS VARCHAR(1)

NOT DETERMINISTIC READS SQL DATA

BEGIN

DECLARE l_status VARCHAR(1);

DECLARE l_patient_id BIGINT;

DECLARE l_organization_id BIGINT;

SELECT

ob.patient_id,

ob.organization_id,

IF (ob.status = 'Opt-Out', 1, IF(ob.status = 'Opt-In',null,null))

INTO l_patient_id, l_organization_id, l_status

FROM obs_optout_view ob

LEFT JOIN obs_optout_view ob2 ON ob2.patient_id = ob.patient_id

AND (ob.clinical_effective_date < ob2.clinical_effective_date

OR (ob.clinical_effective_date = ob2.clinical_effective_date

AND ob.id < ob2.id))

WHERE ob2.patient_id IS NULL

AND ob.patient_id = p_patient_id

AND ob.organization_id = p_organization_id LIMIT 1;

RETURN l_status;

END//

DELIMITER ;

Where 1 = opt out, null = opt in or not recorded derived from the latest observation from the consent codes.

Putting all of the above together, patient cohort query becomes this:

CREATE TABLE cohort_patient AS

SELECT DISTINCT

p.id AS patient_id,

p.person_id AS group_by,

get_optout_status (p.id, p.registered_practice_organization_id)

AS optout_status,

IF (p.date_of_death IS NULL,

TIMESTAMPDIFF(YEAR, p.date_of_birth, CURDATE()),

TIMESTAMPDIFF(YEAR, p.date_of_birth, p.date_of_death)) age_years

FROM patient p

JOIN observation o ON p.id = o.patient_id

JOIN organization org ON org.id = o.organization_id

JOIN episode_of_care e ON e.patient_id = o.patient_id

JOIN concept c ON c.dbid = e.registration_type_concept_id

JOIN (SELECT o.patient_id

FROM observation o

JOIN concept_map cm ON cm.legacy = o.non_core_concept_id

JOIN concept c ON c.dbid = cm.core

WHERE c.code IN (46635009, 44054006) -- e.g. snomed codes

) ob ON ob.patient_id = p.id

WHERE org.ods_code IN ('EMIS99','5PC64') -- e.g. ods codes

AND p.date_of_death IS NULL

AND c.code = 'R' -- registered patient

AND e.date_registered <= now()

AND (e.date_registered_end > now() OR

e.date_registered_end IS NULL) -- active

AND IF (p.date_of_death IS NULL,

TIMESTAMPDIFF(YEAR, p.date_of_birth, CURDATE()),

TIMESTAMPDIFF(YEAR, p.date_of_birth, p.date_of_death)) >= 12

-- age 12 and over

AND get_optout_status(p.id, p.registered_practice_organization_id) IS NULL;

-- opt in status

3. Creating the Observations Cohort

We create a separate observations cohort for query performance. Instead of scanning through the entire observation table, we create a sub-set of the observation table based on the patient cohort.

Example 2.1 Building the Observations Cohort SQL

CREATE TABLE cohort_observations AS

SELECT DISTINCT

o.id,

o.person_id AS group_by,

o.patient_id,

o.person_id,

o.clinical_effective_date,

o.non_core_concept_id,

o.result_value,

o.result_value_units,

o.result_date,

o.result_text,

cr.age_year

FROM observation o JOIN cohort_patient cr

ON o.person_id = cr.person_id

AND o.organization_id = cr.organization_id

AND o.patient_id = cr.patient_id;

CREATE INDEX obs_cpt_ix ON cohort_observation(non_core_concept_id);

CREATE INDEX obs_pat_ix ON cohort_observation(patient_id);

4. Creating the Medications Cohort

Again, we create the medications cohort for query performance. We create a sub-set of the medication statement/order table based on the patient cohort.

Fig 4.1 Entity diagram showing the relationships between patient, medication statement and organization

Example 4.1 Building the Medications Cohort SQL

-- step 1: cohort of the medication statement

DROP TABLE IF EXISTS cohort_medications_stmt;

CREATE TABLE cohort_medications_stmt AS

SELECT DISTINCT

m.id,

m.non_core_concept_id,

m.person_id,

m.patient_id,

m.clinical_effective_date,

m.cancellation_date

FROM medication_statement m

JOIN cohort_patient cr

ON m.organization_id = cr.organization_id

AND m.person_id = cr.person_id

AND m.patient_id = cr.patient_id;

CREATE INDEX med_stmt_ix ON cohort_medications_stmt(id);

-- step 2: cohort of the medication oder

DROP TABLE IF EXISTS cohort_medications_ord;

CREATE TABLE cohort_medications_ord AS

SELECT DISTINCT

mo.id,

mo.medication_statement_id

mo.person_id,

mo.patient_id,

mo.clinical_effective_date

FROM medication_order mo

JOIN cohort_patient cr

ON mo.organization_id = cr.organization_id

AND mo.person_id = cr.person_id

AND mo.patient_id = cr.patient_id;

CREATE INDEX med_ord_ix ON cohort_medications_ord(medication_statement_id);

-- step 3: cohort of the medication - statement plus order

DROP TABLE IF EXISTS cohort_medications;

CREATE TABLE cohort_medications AS

SELECT DISTINCT

ms.id,

ms.non_core_concept_id,

ms.person_id,

ms.patient_id,

mo.clinical_effective_date,

ms.cancellation_date

FROM cohort_medications_stmt ms JOIN cohort_medications_ord mo

ON ms.id = mo.medication_statement_id;

CREATE INDEX med_ix ON cohort_medications(non_core_concept_id);

CREATE INDEX med_pat_ix ON cohort_medications(patient_id);

5. Building Datasets

This involves creating a number of temporary tables in the database.

Patient Demographic Dataset

A typical patient demographic dataset may consist of the following information:

For this example, we create the dataset to hold the patient demographic information as follows:

CREATE TABLE demographicsDataset (

ExtractDate DATETIME NULL,

Pseudo_id VARCHAR(255) NULL, -- unique identifier

Pseudo_NHSNumber VARCHAR(255) NULL,

Gender VARCHAR(50) NULL,

Age VARCHAR(50) NULL,

DateOfBirth DATE NULL,

EthnicityLCode VARCHAR(50) NULL,

EthnicityLTerm VARCHAR(200) NULL,

BirthCountryLCode VARCHAR(50) NULL,

BirthCountryLTerm VARCHAR(200) NULL,

RegistrationStart DATE NULL,

RegistrationEnd DATE NULL,

IMD2010 VARCHAR(50) NULL,

LSOA2011 VARCHAR(50) NULL,

PracticeODSCode VARCHAR(50) NULL,

PracticeODSName VARCHAR(255) NULL,

CCGName VARCHAR(100) NULL,

YearOfDeath INT(4) NULL,

F2fVisits_Total INT(11) DEFAULT 0,

F2fVisits_1year INT(11) DEFAULT 0,

F2fVisits_5years INT(11) DEFAULT 0

);

ALTER TABLE demographicsDataset ADD INDEX demo_pseudoid_idx (pseudo_id);

-- the unique identifier is usually the pseudo id

-- but to adapt for this schema we could use the person_id

We pre-populate the dataset with the unique identifiers from the patient cohort:

INSERT INTO demographicsDataset (pseudo_id, extractdate) SELECT DISTINCT group_by, now() FROM cohort_patient;

Then we run separate update SQL queries to populate the columns of the dataset table.

For example, add registration start date, registration end date, practice ods code, practice ods name:

DROP TEMPORARY TABLE IF EXISTS reg_sort;

DROP TEMPORARY TABLE IF EXISTS qry_reg;

CREATE TEMPORARY TABLE qry_reg AS

SELECT c.group_by,

e.date_regirstered,

e.date_registered_end,

o.name,

o.ods_code

FROM cohort_patient c

JOIN episode_of_care e ON e.person_id = c.group_by

JOIN organization o ON o.id = e.organization_id

WHERE o.ods_code IN ('EMIS99','5PC64'); -- e.g. ods codes

CREATE TEMPORARY TABLE reg_sort (

row_id INT,

group_by BIGINT,

date_registered DATE,

date_regiostered_end DATE,

ods_code VARCHAR(50),

name VARCHAR(255),

rnk INT, PRIMARY KEY(row_id)) AS

SELECT (@row_no := @row_no + 1) AS row_id,

a.group_by,

a.date_registered,

a.date_registered_end,

a.ods_code,

a.name,

a.rnk

FROM (SELECT q.group_by,

q.date_registered,

q.date_registered_end,

q.name,

q.ods_code,

@currank := IF(@curperson = q.group_by, @currank + 1, 1) AS rnk,

@curperson := q.group_by AS cur_person

FROM qry_reg q, (SELECT @currank := 0, @curperson := 0) r

ORDER BY q.group_by, q.date_registered DESC ) a,

(SELECT @row_no := 0) t

WHERE a.rnk = 1;

SET @row_id = 0;

WHILE EXISTS (SELECT row_id from reg_sort

WHERE row_id > @row_id AND row_id <= @row_id + 1000) DO

UPDATE demographicsDataset d

JOIN reg_sort reg ON d.pseudo_id = reg.group_by

SET d.RegistrationStart = reg.date_registered,

d.RegistrationEnd = reg.date_registered_end,

d.PracticeODSCode = reg.ods_code,

d.PracticeODSName = reg.name

WHERE reg.row_id > @row_id

AND reg.row_id <= @row_id + 1000;

SET @row_id = @row_id + 1000;

END WHILE;

In this example, we created 2 temporary tables. The first to hold the registration data. The second to select from the first the records which contain the latest registration date.

If the update is large in terms of the number of records to update then it might be possible to optimise the update process by using batches as demonstrated in the code example above.

Patient Observations Dataset

A typical patient observations dataset may ask for the following information:

A typical dataset can have over 100 to 200 fields. Usually, there will be several datasets to deliver. But this illustrates the kind of layout we use for constructing an observational dataset.

For this example, we create the dataset to hold the observational information as follows:

DROP TABLE IF EXISTS diagnoses2dataset;

CREATE TABLE diagnoses2dataset (

Pseudo_id VARCHAR(255) NULL,

Pseudo_NHSNumber VARCHAR(255) NULL,

AsthmaECode VARCHAR(50) NULL,

AsthmaETerm VARCHAR(200) NULL,

AsthmaEDate VARCHAR(50) NULL,

AsthmaEmergeECode VARCHAR(50) NULL,

AsthmaEmergeETerm VARCHAR(200) NULL,

AsthmaEmergeEDate VARCHAR(50) NULL,

AsthmaResolvedECode VARCHAR(50) NULL,

AsthmaResolvedETerm VARCHAR(200) NULL,

AsthmaResolvedEDate VARCHAR(50) NULL,

COPDECode VARCHAR(50) NULL,

COPDETerm VARCHAR(200) NULL,

COPDEDate VARCHAR(50) NULL,

PulmonaryFibrosisECode VARCHAR(50) NULL,

PulmonaryFibrosisETerm VARCHAR(200) NULL,

PulmonaryFibrosisEDate VARCHAR(50) NULL,

InterstitialLungDiseaseECode VARCHAR(50) NULL,

InterstitialLungDiseaseETerm VARCHAR(200) NULL,

InterstitialLungDiseaseEDate VARCHAR(50) NULL,

AgeRelatedMuscularDegenerationECode VARCHAR(50) NULL,

AgeRelatedMuscularDegenerationETerm VARCHAR(200) NULL,

AgeRelatedMuscularDegenerationEDate VARCHAR(50) NULL,

GlaucomaECode VARCHAR(50) NULL,

GlaucomaETerm VARCHAR(200) NULL,

GlaucomaEDate VARCHAR(50) NULL,

RheumatoidArthritisECode VARCHAR(50) NULL,

RheumatoidArthritisETerm VARCHAR(200) NULL,

RheumatoidArthritisEDate VARCHAR(50) NULL,

SystemicLupusECode VARCHAR(50) NULL,

SystemicLupusETerm VARCHAR(200) NULL,

SystemicLupusEDate VARCHAR(50) NULL,

InflammatoryBowelDiseaseECode VARCHAR(50) NULL,

InflammatoryBowelDiseaseETerm VARCHAR(200) NULL,

InflammatoryBowelDiseaseEDate VARCHAR(50) NULL,

CrohnsDiseaseECode VARCHAR(50) NULL,

CrohnsDiseaseETerm VARCHAR(200) NULL,

CrohnsDiseaseEDate VARCHAR(50) NULL,

UlcerativeColitisCodeECode VARCHAR(50) NULL,

UlcerativeColitisCodeETerm VARCHAR(200) NULL,

UlcerativeColitisCodeEDate VARCHAR(50) NULL,

AtopicDermatitisECode VARCHAR(50) NULL,

AtopicDermatitisETerm VARCHAR(200) NULL,

AtopicDermatitisEDate VARCHAR(50) NULL,

InheritedMucociliaryClearanceECode VARCHAR(50) NULL,

InheritedMucociliaryClearanceETerm VARCHAR(200) NULL,

InheritedMucociliaryClearanceEDate VARCHAR(50) NULL,

PrimaryCiliaryDyskinesiaECode VARCHAR(50) NULL,

PrimaryCiliaryDyskinesiaETerm VARCHAR(200) NULL,

PrimaryCiliaryDyskinesiaEDate VARCHAR(50) NULL,

MelanomaECode VARCHAR(50) NULL,

MelanomaETerm VARCHAR(200) NULL,

MelanomaEDate VARCHAR(50) NULL,

ProstateCancerECode VARCHAR(50) NULL,

ProstateCancerETerm VARCHAR(200) NULL,

ProstateCancerEDate VARCHAR(50) NULL,

LungCancerECode VARCHAR(50) NULL,

LungCancerETerm VARCHAR(200) NULL,

LungCancerEDate VARCHAR(50) NULL,

SmallBowelCancerECode VARCHAR(50) NULL,

SmallBowelCancerETerm VARCHAR(200) NULL,

SmallBowelCancerEDate VARCHAR(50) NULL,

ColorectalCancerECode VARCHAR(50) NULL,

ColorectalCancerETerm VARCHAR(200) NULL,

ColorectalCancerEDate VARCHAR(50) NULL,

BreastCancerECode VARCHAR(50) NULL,

BreastCancerETerm VARCHAR(200) NULL,

BreastCancerEDate VARCHAR(50) NULL,

MiscarriageECode VARCHAR(50) NULL,

MiscarriageETerm VARCHAR(200) NULL,

MiscarriageEDate VARCHAR(50) NULL

);

ALTER TABLE diagnoses2dataset ADD INDEX pseudoid_idx (pseudo_id);

INSERT INTO diagnoses2dataset (pseudo_id)

SELECT DISTINCT group_by FROM cohort_patient;

Each column of the dataset is named by joining pieces of identifiable information together by combining the observational type, the timeframe for the analysis and the requested field type.

Fig 5.1 Shows the naming convention of a dataset column

Again, we pre-populate the dataset with the unique identifiers from the patient cohort.

Then we run separate update SQL queries to populate the dataset table by calling multiple stored procedures.

Patient Medications Dataset

A typical medication request would look like this:

For this example, we create the dataset to hold the medication information as follows:

CREATE TABLE medications (

pseudo_id VARCHAR(255) DEFAULT NULL,

nhsnumber VARCHAR(10) DEFAULT NULL,

codedate VARCHAR(20) DEFAULT NULL,

codeterm VARCHAR(255) DEFAULT NULL,

code VARCHAR(100) DEFAULT NULL,

codevalue VARCHAR(100) DEFAULT NULL,

codeunit VARCHAR(100) DEFAULT NULL

);

Again, we pre-populate the dataset with the unique identifiers from the patient cohort.

Then we run separate update SQL queries to populate the dataset table by calling multiple stored procedures.

6. Calling Stored Procedures

Once we have generated the observation and medication datasets and cohorts, the next step is to populate the datasets using stored procedures.

We construct a series of populate stored procedures which in turn call on other stored procedures to complete the update process.

Fig 6.1 Shows the datasets update process using stored procedures

Calling the Populate Stored Procedures

Create a SQL script to call the stored procedure as follows:

CALL populateCodeDate

(0,'StillbirthE','diagnoses2dataset',0,'237364002,161743003',null,null,null,'N');

CALL populateCodeDate

(0,'PregnancyInducedHypertensionE','diagnoses2dataset',0,null,'48194001',null,null,'N');

CALL populateCodeDate

(0,'PreEclampsiaE','diagnoses2dataset',0,'398254007',null,null,null,'N');

CALL populateCodeDate

(0,'CholestasisE','diagnoses2dataset',0,'235888006',null,null,null,'N');

CALL populateCodeDate

(0,'GallstonesE','diagnoses2dataset',0,'266474003,407637009',null,null,null,'N');

CALL populateCodeDate

(0,'GoutE','diagnoses2dataset',0,'90560007,161451004',null,null,null,'N');

CALL populateCodeDate

(0,'AnkylosingSpondylitisE','diagnoses2dataset',0,'9631008',null,null,null,'N');

CALL populateCodeDate

(0,'JaundiceE','diagnoses2dataset',0,'18165001,161536006',null,null,null,'N');

CALL populateCodeDate

(0,'PsoriasisE','diagnoses2dataset',0,'9014002',null,null,null,'N');

CALL populateCodeDate

(0,'DeafnessE','diagnoses2dataset',0,'15188001',null,null,null,'N');

CALL populateCodeDate

(0,'HearingAidE','diagnoses2dataset',0,'365240006',null,null,null,'N');

The procedure accepts a list of input parameters (this may vary depending on requirements):

Parameter	Data Type	Description
filterType	INT	1 latest, 0 earliest, 2 ever, 3 pivot
col	VARCHAR(100)	the root of the column name
datasetTable	VARCHAR(100)	table name of dataset
reset	BIT	1 reset, 0 no reset of the dataset table
codesToAdd1	VARCHAR(5000)	include all snomed parents and their children
codesToAdd2	VARCHAR(5000)	include snomed parents only
codesToRemove3	VARCHAR(5000)	exclude snomed parents and their children
codesToRemove4	VARCHAR(5000)	exclude only snomed parents
ignorenulls	VARCHAR(1)	Y or N - whether to ignore null result value

A typical populate procedure is outlined below. (Please note this is only an example and may need modifications to it depending on your requirements).

DROP PROCEDURE IF EXISTS populateCodeDate;

DELIMITER //

CREATE PROCEDURE populateCodeDate (

IN filterType INT, -- 1 latest, 0 earliest, 2 ever, 3 pivot

IN col VARCHAR(100), -- the root of the column name

IN datasetTable VARCHAR(100), -- table name of dataset

IN reset BIT, -- 1 reset, 0 no reset

IN codesToAdd1 VARCHAR(5000), -- all parents and their children

IN codesToAdd2 VARCHAR(5000), -- parents only

IN codesToRemove3 VARCHAR(5000), -- parents and their children to be excluded

IN codesToRemove4 VARCHAR(5000), -- just parents to be excluded

IN ignorenulls VARCHAR(1)

)

BEGIN

-- reset snomeds table

DELETE FROM snomeds WHERE cat_id IN (1, 2, 3, 4);

-- reset store table

DELETE FROM store WHERE id IN (1, 2, 3, 4);

-- get snomed concept ids

IF codesToAdd1 IS NOT NULL THEN

CALL storeSnomedString(codesToAdd1, 1);

CALL getAllSnomedsConceptIds (1);

END IF;

IF codesToAdd2 IS NOT NULL THEN

CALL storeSnomedString (codesToAdd2, 2);

CALL getAllSnomedsConceptIds (2);

END IF;

IF codesToRemove3 IS NOT NULL THEN

CALL storeSnomedString (codesToRemove3, 3);

CALL getAllSnomedsConceptIds (3);

END IF;

IF codesToRemove4 IS NOT NULL THEN

CALL storeSnomedString (codesToRemove4, 4);

CALL getAllSnomedsConceptIds (4);

END IF;

-- filter observation based on the snomed codes

CALL filterObservations(filterType, 1, ignorenulls);

-- reset columns

IF (reset = 1) THEN

SET @reset_sql = CONCAT('UPDATE ', datasetTable, ' SET ',

col, "Code = null, ",

col, "Term = null, ",

col, "Date = null");

PREPARE resetStmt FROM @reset_sql;

EXECUTE resetStmt;

DEALLOCATE PREPARE resetStmt;

END IF;

DROP TEMPORARY TABLE IF EXISTS qry_tmp;

CREATE TEMPORARY TABLE qry_tmp (

row_id INT,

group_by VARCHAR(255),

original_code VARCHAR(20),

original_term VARCHAR(200),

clinical_effective_date DATE, PRIMARY KEY(row_id)

) AS

SELECT (@row_no := @row_no+1) AS row_id,

f.group_by,

f.original_code,

f.original_term,

f.clinical_effective_date

FROM filteredObservations f, (SELECT @row_no := 0) t;

SET @row_id = 0;

WHILE EXISTS (SELECT row_id from qry_tmp

WHERE row_id > @row_id AND row_id <= @row_id + 1000) DO

SET @sql = CONCAT('UPDATE ', datasetTable,

' d JOIN qry_tmp f ON d.pseudo_id = f.group_by SET ',

col, "Code = f.original_code, ",

col, "Term = f.original_term, ",

col, "Date = date_format(f.clinical_effective_date, '%d/%m/%Y')

WHERE f.row_id > @row_id AND f.row_id <= @row_id + 1000");

PREPARE stmt FROM @sql;

EXECUTE stmt;

DEALLOCATE PREPARE stmt;

SET @row_id = @row_id + 1000;

END WHILE;

END//

DELIMITER ;

Retrieving the SNOMED Concept Ids

The procedure accepts a list of SNOMED codes as input parameter.

Fig 6.2 Shows how the SNOMED codes are being retrieved using stored procedures

Calling the storeSnomedString procedure

We call the storeSnomedString procedure to separate the list of SNOMED codes into their individual values and store them into a temporary table:

DROP TABLE IF EXISTS store;

CREATE TABLE store (

id INT,

org_snomed_id BIGINT

);

ALTER TABLE store ADD INDEX store_idx (org_snomed_id);

DROP PROCEDURE IF EXISTS storeSnomedString;

DELIMITER //

CREATE PROCEDURE storeSnomedString (

IN stringValue VARCHAR(5000),

IN cat_id INT

)

BEGIN

DECLARE front VARCHAR(5000) DEFAULT NULL;

DECLARE frontlen INT DEFAULT NULL;

DECLARE TempValue VARCHAR(5000) DEFAULT NULL;

processloop:

LOOP

IF LENGTH(TRIM(stringValue)) = 0

OR stringValue IS NULL THEN

LEAVE process loop;

END IF;

SET front = SUBSTRING_INDEX(stringValue, ',', 1);

SET frontlen = LENGTH(front);

SET TempValue = TRIM(front);

INSERT INTO store (id, org_snomed_id)

VALUES (cat_id, CAST(TempValue AS SIGNED));

SET stringValue = INSERT(stringValue, 1, frontlen + 1, '');

END LOOP;

END//

DELIMITER ;

Calling the getAllSnomedsConceptIds procedure

We call another procedure getAllSnomedsConceptIdsFromSnomedCodes to loop through the temporary table. For each SNOMED code, we retrieve the corresponding SNOMED concept id.

We store the resultant set of concept ids into another temporary table where it will be used to query observational data.

DROP TABLE IF EXISTS snomeds;

CREATE TABLE snomeds (

cat_id INT,

snomed_concept_id BIGINT,

core_concept INT,

snomed_code BIGINT,

term VARCHAR(255)

);

ALTER TABLE snomeds ADD INDEX sno_cpt_idx(snomed_concept_id);

ALTER TABLE snomeds ADD INDEX sno_code_idx(snomed_code);

DROP PROCEDURE IF EXISTS getAllSnomedsConceptIds;

DELIMITER //

CREATE PROCEDURE getAllSnomedsConceptIds(p_cat_id INT)

BEGIN

DECLARE done INT;

DECLARE l_parent_id BIGINT;

DECLARE c_get_snomeds CURSOR FOR SELECT org_snomed_id

FROM store WHERE id = p_cat_id;

DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

SET done = 0;

OPEN c_get_snomeds;

processloop:

WHILE (done = 0) DO

FETCH c_get_snomeds INTO l_parent_id;

IF done = 1 THEN

LEAVE processloop;

END IF;

IF p_cat_id IN (1, 3) THEN

-- get parent snomed concept id based on the snomed code

INSERT INTO snomeds

SELECT

p_cat_id,

cm.legacy,

cm.core,

c.code,

c.name

FROM concept_map cm JOIN concept c ON c.dbid = cm.core

WHERE c.code = l_parent_id;

-- get all child snomed concept ids based on the parent snomed code

INSERT INTO snomeds

SELECT

p_cat_id,

cptm.legacy,

cptm.core,

cpt.code,

cpt.name

FROM concept_tct tc

JOIN snomeds sn ON sn.core_concept = tc.target

AND tc.property = '92842'

JOIN concept_map cptm on cptm.core = tc.source

AND tc.property = '92842'

JOIN concept cpt on cpt.dbid = cptm.core

WHERE sn.snomed_code = l_parent_id;

ELSE

-- include or exclude parent snomed

INSERT INTO snomeds

SELECT

p_cat_id,

cm.legacy,

cm.core,

c.code,

c.name

FROM concept_map cm JOIN concept c ON c.dbid = cm.core

WHERE c.code = l_parent_id;

END IF;

END WHILE processloop;

CLOSE c_get_snomeds;

SET done = 0;

END //

DELIMITER ;

Using the concept_tct table, which is a transitive closure table, we are able to find the corresponding child concept ids based on the parent concept id.

Match the parent against the value in the target column. This will return all the children in the source column.

Retrieving filtered Observations

Using the prepared observations cohort, the next step is to filter out this cohort based on the SNOMED code list and the timeframe, which have been passed in as input parameters from the populate procedure.

Fig 6.3 Shows the processing of the FilterObservations procedure

Calling the filterObservations procedure

By calling the filterObservations procedure, we pass in the following input parameters:

Parameter	Data Type	Description
filterType	INT	Timeframe of the analysis. 0 = earliest, 1 = latest, 2 = ever, 3 = pivot around cohort pivot date (6 months), 4 = All since
toCreate	INT	1 is the default value - to call createObservationsFromCohort
ignorenulls	VARCHAR(1)	Y or N - whether to ignore null result value

DROP PROCEDURE IF EXISTS filterObservations;

DELIMITER //

CREATE PROCEDURE filterObservations (

IN filterType INT,

-- 0 earliest, 1 latest, 2 ever,

-- 3 pivot around cohort pivot date (6 months),

-- 4 allSince

IN toCreate INT, -- 1 to createObservationsFromCohortv2

IN ignorenulls VARCHAR(1) -- Y or N

)

BEGIN

IF (toCreate = 1) THEN

CALL createObservationsFromCohort (filterType);

END IF;

DROP TABLE IF EXISTS filteredObservations;

IF (filterType = 0) THEN -- earliest

IF (ignorenulls = 'Y') THEN

CREATE TABLE filteredObservations AS

SELECT

ob.id,

ob.group_by,

ob.patient_id,

ob.person_id,

ob.original_code,

ob.original_term,

ob.result_value,

ob.clinical_effective_date,

ob.result_value_units,

ob.age_years,

ob.rnk

FROM (

SELECT

o.id,

o.group_by,

o.patient_id,

o.person_id,

o.original_code,

o.original_term,

o.result_value,

o.clinical_effective_date,

o.result_value_units,

o.age_years,

@currank :=

IF(@curperson = o.person_id, @currank + 1, 1) AS rnk,

@curperson := o.person_id AS cur_person

FROM observationsFromCohort o,

(SELECT @currank := 0, @curperson := 0) r

WHERE o.result_value IS NOT NULL

ORDER BY o.person_id,

o.clinical_effective_date ASC,

o.id ASC -- earliest

) ob

WHERE ob.rnk = 1;

ELSE

CREATE TABLE filteredObservations AS

SELECT

ob.id,

ob.group_by,

ob.patient_id,

ob.person_id,

ob.original_code,

ob.original_term,

ob.result_value,

ob.clinical_effective_date,

ob.result_value_units,

ob.age_years,

ob.rnk

FROM (

SELECT

o.id,

o.group_by,

o.patient_id,

o.person_id,

o.original_code,

o.original_term,

o.result_value,

o.clinical_effective_date,

o.result_value_units,

o.age_years,

@currank :=

IF(@curperson = o.person_id, @currank + 1, 1) AS rnk,

@curperson := o.person_id AS cur_person

FROM observationsFromCohort o,

(SELECT @currank := 0, @curperson := 0) r

ORDER BY o.person_id,

o.clinical_effective_date ASC,

o.id ASC -- earliest

) ob

WHERE ob.rnk = 1;

END IF;

ELSEIF (filterType = 2 OR filterType = 4) THEN -- ever or allSince

IF (ignorenulls = 'Y') THEN

CREATE TABLE filteredObservations AS

SELECT DISTINCT

mc.group_by,

mc.patient_id,

mc.person_id,

mc.original_code,

mc.original_term,

mc.result_value,

mc.clinical_effective_date,

mc.result_value_units,

mc.age_years

FROM observationsFromCohort mc

WHERE mc.result_value IS NOT NULL;

ELSE

CREATE TABLE filteredObservations AS

SELECT DISTINCT

mc.group_by,

mc.patient_id,

mc.person_id,

mc.original_code,

mc.original_term,

mc.result_value,

mc.clinical_effective_date,

mc.result_value_units,

mc.age_years

FROM observationsFromCohort mc;

END IF;

ELSEIF (filterType = 1 or filterType = 3) THEN -- latest or pivot

IF (ignorenulls = 'Y') THEN

CREATE TABLE filteredObservations as

SELECT

ob.id,

ob.group_by,

ob.patient_id,

ob.person_id,

ob.original_code,

ob.original_term,

ob.result_value,

ob.clinical_effective_date,

ob.result_value_units,

ob.age_years,

ob.rnk

FROM (

SELECT

o.id,

o.group_by,

o.patient_id,

o.person_id,

o.original_code,

o.original_term,

o.result_value,

o.clinical_effective_date,

o.result_value_units,

o.age_years,

@currank :=

IF(@curperson = o.person_id, @currank + 1, 1) AS rnk,

@curperson := o.person_id AS cur_person

FROM observationsFromCohort o,

(SELECT @currank := 0, @curperson := 0) r

WHERE o.result_value IS NOT NULL

ORDER BY o.person_id,

o.clinical_effective_date DESC,

o.id DESC -- latest

) ob

WHERE ob.rnk = 1;

ELSE

CREATE TABLE filteredObservationsV2 as

SELECT

ob.id,

ob.group_by,

ob.patient_id,

ob.person_id,

ob.original_code,

ob.original_term,

ob.result_value,

ob.clinical_effective_date,

ob.result_value_units,

ob.age_years,

ob.rnk

FROM (

SELECT

o.id,

o.group_by,

o.patient_id,

o.person_id,

o.original_code,

o.original_term,

o.result_value,

o.clinical_effective_date,

o.result_value_units,

o.age_years,

@currank :=

IF(@curperson = o.person_id, @currank + 1, 1) AS rnk,

@curperson := o.person_id AS cur_person

FROM observationsFromCohort o,

(SELECT @currank := 0, @curperson := 0) r

ORDER BY o.person_id,

o.clinical_effective_date DESC,

o.id DESC -- latest

) ob

WHERE ob.rnk = 1;

END IF;

ELSE

SIGNAL SQLSTATE '45000'

SET MESSAGE_TEXT = 'filterType not recognised';

END IF;

END//

DELIMITER ;

Calling the createObservationsFromCohort procedure

From the filterObservations procedure, we call the createobservationsFromCohort procedure, passing in the filter type as the input parameter:

CALL createObservationsFromCohort (filterType);

DROP PROCEDURE IF EXISTS createObservationsFromCohort;

DELIMITER //

CREATE PROCEDURE createObservationsFromCohort (

IN filterType INT

-- 0 earliest, 1 latest, 2 ever,

-- 3 pivot around cohort_patient pivot date (6 months)

)

BEGIN

DROP TEMPORARY TABLE IF EXISTS snomeds_tmp;

CREATE TEMPORARY TABLE snomeds_tmp AS

SELECT cat_id,

snomed_concept_id

FROM snomeds

WHERE cat_id IN (3, 4);

-- delete from snomeds all snomed_concept ids to be excluded

DELETE t1 FROM snomeds t1 JOIN snomeds_tmp t2

ON t1.snomed_concept_id = t2.snomed_concept_id

WHERE t1.cat_id IN (1, 2);

DROP TABLE IF EXISTS observationsFromCohort;

IF (filterType = 3) THEN

-- pivot over 6 months from pivot date (already set in cohort)

CREATE TABLE observationsFromCohort AS

SELECT DISTINCT

o.id,

o.patient_id,

o.person_id,

o.group_by,

o.clinical_effective_date,

o.non_core_concept_id AS original_code,

SUBSTRING(s.term, 1, 200) AS original_term,

o.result_value,

o.result_value_units,

o.age_years

FROM cohort_observations o

JOIN snomeds s ON s.snomed_concept_id = o.non_core_concept_id

WHERE o.clinical_effective_date IS NOT NULL

AND o.clinical_effective_date

BETWEEN DATE_SUB(c.pivot_date, INTERVAL 6 MONTH)

AND DATE_SUB(c.pivot_date, INTERVAL -6 MONTH);

ELSE

-- latest or earliest or ever (so all observations)

CREATE TABLE observationsFromCohort AS

SELECT DISTINCT

o.id,

o.patient_id,

o.person_id,

o.group_by,

o.clinical_effective_date,

o.non_core_concept_id AS original_code,

SUBSTRING(s.term, 1, 200) AS original_term,

o.result_value,

o.result_value_units,

o.age_years

FROM cohort_observations o

JOIN snomeds s ON s.snomed_concept_id = o.non_core_concept_id

WHERE o.clinical_effective_date IS NOT NULL;

END IF;

ALTER TABLE observationsFromCohort ADD INDEX obv_pat_idx(patient_id);

END//

DELIMITER ;

This returns a new observations cohort based on the SNOMED code list.

Filter Observations by Timeframe

Using the new observations cohort, we pass in the filter type to retrieve the observations based on the timeframe for the analysis, and using the clinical effective date to determine the time intervals. We store the result set into a temporary table. This is illustrated in the code snippet of the filterObservations procedure below:

CREATE TABLE filteredObservationsV2 as

SELECT

ob.id,

ob.group_by,

ob.patient_id,

ob.person_id,

ob.original_code,

ob.original_term,

ob.result_value,

ob.clinical_effective_date,

ob.result_value_units,

ob.age_years,

ob.rnk

FROM (

SELECT o.id,

o.group_by,

o.patient_id,

o.person_id,

o.original_code,

o.original_term,

o.result_value,

o.clinical_effective_date,

o.result_value_units,

o.age_years,

@currank :=

IF(@curperson = o.person_id, @currank + 1, 1) AS rnk,

@curperson := o.person_id AS cur_person

FROM observationsFromCohort o,

(SELECT @currank := 0, @curperson := 0) r

WHERE o.result_value IS NOT NULL

ORDER BY o.person_id,

o.clinical_effective_date ASC,

o.id ASC -- earliest

) ob

WHERE ob.rnk = 1;

Updating Dataset Table Columns

Using the result set derived from the above step, we can now update the dataset table columns with the appropriate observational values. This is illustrated in the code snippet of the populateCodeDate procedure:

DROP TEMPORARY TABLE IF EXISTS qry_tmp;

CREATE TEMPORARY TABLE qry_tmp (

row_id INT,

group_by VARCHAR(255),

original_code VARCHAR(20),

original_term VARCHAR(200),

clinical_effective_date DATE, PRIMARY KEY(row_id)

) AS

SELECT (@row_no := @row_no+1) AS row_id,

f.group_by,

f.original_code,

f.original_term,

f.clinical_effective_date

FROM filteredObservationsV2 f, (SELECT @row_no := 0) t;

SET @row_id = 0;

WHILE EXISTS (SELECT row_id from qry_tmp

WHERE row_id > @row_id AND row_id <= @row_id + 1000) DO

SET @sql = CONCAT('UPDATE ', datasetTable,

' d JOIN qry_tmp f ON d.pseudo_id = f.group_by SET ',

col, "Code = f.original_code, ",

col, "Term = f.original_term, ",

col, "Date = date_format(f.clinical_effective_date, '%d/%m/%Y')

WHERE f.row_id > @row_id AND f.row_id <= @row_id + 1000");

PREPARE stmt FROM @sql;

EXECUTE stmt;

DEALLOCATE PREPARE stmt;

SET @row_id = @row_id + 1000;

END WHILE;

To populate medication datasets, we use a similar approach as above.

APPENDIX

Some supplementary material that maybe helpful when developing SQL query, in particular, working with concept ids.

Helper Functions

• Function to retrieve the description field on the concept table based on the concept id:

DELIMITER //

CREATE FUNCTION get_concept_desc(p_concept_id int)

RETURNS VARCHAR(400)

NOT DETERMINISTIC READS SQL DATA

BEGIN

DECLARE l_description VARCHAR(400);

SELECT description INTO l_description

FROM concept cpt

WHERE cpt.dbid = p_concept_id;

RETURN l_description;

END//

DELIMITER ;

Usage: SELECT get_concept_desc(pat.gender_concept_id) FROM patient pat;

• Function to retrieve the code field on the concept table based on the concept id:

DROP FUNCTION IF EXISTS get_concept_code;

DELIMITER //

CREATE FUNCTION get_concept_code(p_concept_id int)

RETURNS VARCHAR(20)

NOT DETERMINISTIC READS SQL DATA

BEGIN

DECLARE l_code VARCHAR(20);

SELECT code INTO l_code

FROM concept cpt

WHERE cpt.dbid = p_concept_id;

RETURN l_code;

END//

DELIMITER ;

Usage: SELECT get_concept_code(epoc.registration_type_concept_id)

FROM episode_of_care epoc;

• Function to retrieve the scheme field on the concept table based on the concept id:

DROP FUNCTION IF EXISTS get_concept_scheme;

DELIMITER //

CREATE FUNCTION get_concept_scheme(p_concept_id int)

RETURNS VARCHAR(50)

NOT DETERMINISTIC READS SQL DATA

BEGIN

DECLARE l_scheme VARCHAR(50);

SELECT scheme INTO l_scheme

FROM concept cpt

WHERE cpt.dbid = p_concept_id;

RETURN l_scheme;

END//

DELIMITER ;

Usage: SELECT get_concept_scheme(ob.non_core_concept_id)

FROM observation ob;

• Function to retrieve the ods code based on the organization id:

DROP FUNCTION IF EXISTS get_ods;

DELIMITER //

CREATE FUNCTION get_ods(p_org_id bigint)

RETURNS VARCHAR(50)

NOT DETERMINISTIC READS SQL DATA

BEGIN

DECLARE l_ods_code VARCHAR(50);

SELECT ods_code INTO l_ods_code

FROM organization org

WHERE org.id = p_org_id;

RETURN l_ods_code;

END//

DELIMITER ;

Usage: SELECT get_ods(pat.registered_practice_organization_id) FROM patient pat;

• Function to retrieve the organization name based on the organization id:

DROP FUNCTION IF EXISTS get_org_name;

DELIMITER //

CREATE FUNCTION get_org_name(p_org_id bigint)

RETURNS VARCHAR(255)

NOT DETERMINISTIC READS SQL DATA

BEGIN

DECLARE l_org_name VARCHAR(255);

SELECT name INTO l_org_name

FROM organization org

WHERE org.id = p_org_id;

RETURN l_org_name;

END//

DELIMITER ;

Usage: SELECT get_ods(pat.registered_practice_organization_id)

FROM patient pat;

• Function to retrieve the event type from the description of the concept table:

DROP FUNCTION IF EXISTS get_event_type;

DELIMITER //

CREATE FUNCTION get_event_type(p_description VARCHAR(400))

RETURNS VARCHAR(50)

NOT DETERMINISTIC NO SQL

BEGIN

DECLARE l_event VARCHAR(50);

SELECT substr(p_description,instr(p_description,"(") + 1,

instr(p_description,")") - instr(p_description,"(") - 1)

INTO l_event;

RETURN l_event;

END//

DELIMITER ;

Usage: SELECT get_event_type(get_concept_desc(ob.core_concept_id)) event_type

FROM observation ob;

• Function to retrieve the latest medication order issue date based on the medication statement id:

DROP FUNCTION IF EXISTS get_lastest_med_issue_date;

DELIMITER //

CREATE FUNCTION get_lastest_med_issue_date (p_medication_stmt_id bigint)

RETURNS DATE

NOT DETERMINISTIC READS SQL DATA

BEGIN

DECLARE l_date DATE;

DECLARE l_rnk INT;

-- get latest issue date

SELECT

med.clinical_effective_date, med.rnk INTO l_date, l_rnk

FROM (

SELECT

mo.id,

mo.medication_statement_id,

mo.clinical_effective_date,

@currank :=

IF(@curmedstmt = mo.medication_statement_id,@currank + 1,1) AS rnk,

@curmedstmt := mo.medication_statement_id AS cur_med_stmt

FROM medication_order mo,(SELECT @currank := 0, @curmedstmt := 0) r

WHERE mo.medication_statement_id = p_medication_stmt_id

ORDER BY mo.medication_statement_id DESC,

mo.clinical_effective_date DESC,

mo.id DESC -- latest

) med

WHERE med.rnk = 1;

RETURN l_date;

END//

DELIMITER ;

Usage: SELECT DATE_FORMAT(get_lastest_med_issue_date(med.id),"%Y%m%d") date_of_last_issue

FROM medication_statement med;

• Function to retrieve the earliest medication order issue date based on the medication statement id:

DROP FUNCTION IF EXISTS get_earliest_med_issue_date;

DELIMITER //

CREATE FUNCTION get_earliest_med_issue_date (p_medication_stmt_id bigint)

RETURNS DATE

NOT DETERMINISTIC READS SQL DATA

BEGIN

DECLARE l_date DATE;

DECLARE l_rnk INT;

-- get earliest issue date

SELECT

med.clinical_effective_date, med.rnk INTO l_date, l_rnk

FROM (

SELECT

mo.id,

mo.medication_statement_id,

mo.clinical_effective_date,

@currank :=

IF(@curmedstmt = mo.medication_statement_id,@currank + 1,1) AS rnk,

@curmedstmt := mo.medication_statement_id AS cur_med_stmt

FROM medication_order mo,(SELECT @currank := 0, @curmedstmt := 0) r

WHERE mo.medication_statement_id = p_medication_stmt_id

ORDER BY mo.medication_statement_id ASC,

mo.clinical_effective_date ASC,

mo.id ASC -- earliest

) med

WHERE med.rnk = 1;

RETURN l_date;

END//

DELIMITER ;

Usage: SELECT DATE_FORMAT(get_earliest_med_issue_date(med.id),"%Y%m%d") issue_date

FROM medication_statement med;

• Function to retrieve the event log last changed date based on the record id:

DROP FUNCTION IF EXISTS get_event_maxdate;

DELIMITER //

CREATE FUNCTION get_event_maxdate (p_table_id tinyint, p_record_id bigint)

RETURNS DATETIME

READS SQL DATA

BEGIN

DECLARE l_max_date DATETIME;

SELECT MAX(etlg.dt_change) modified_date INTO l_max_date

FROM event_log etlg

WHERE etlg.table_id = p_table_id AND record_id = p_record_id;

RETURN l_max_date;

END//

DELIMITER ;

Usage: SELECT get_event_maxdate(2,pat.id) FROM patient pat;

Where 2 is the table id for the patient table.

• Function to retrieve the event log last changed date for medication order based on the medication statement id:

DROP FUNCTION IF EXISTS get_event_maxdate_medo;

DELIMITER //

CREATE FUNCTION get_event_maxdate_medo (p_table_id tinyint, p_record_id bigint)

RETURNS datetime

READS SQL DATA

BEGIN

DECLARE l_max_date datetime;

SELECT MAX(etlg.dt_change) INTO l_max_date

FROM event_log etlg JOIN medication_order mo

ON mo.medication_statement_id = p_record_id

AND mo.id = etlg.record_id

WHERE etlg.table_id = p_table_id;

RETURN l_max_date;

END//

DELIMITER ;

SQL Statement Examples

Patient Information

SELECT

pat.id patient_id,

get_ods(pat.registered_practice_organization_id) source_organization,

IF (pat.date_of_death IS NULL,

TIMESTAMPDIFF(YEAR,pat.date_of_birth,CURDATE()),

TIMESTAMPDIFF(YEAR,pat.date_of_birth,pat.date_of_death)) age,

get_concept_desc(pat.gender_concept_id) sex,

DATE_FORMAT(epoc.startregdate,"%Y%m%d") start_regdate,

get_concept_code(epoc.registration_type_concept_id) reg_status,

pat.nhs_number nhsnumber,

pad.postcode,

DATE_FORMAT(pat.date_of_birth,"%Y%m%d") date_of_birth,

epoc.usualgp,

pat.first_names forename,

pat.last_name surname,

get_concept_desc(pat.ethnic_code_concept_id) ethnic_category,

DATE_FORMAT(pat.date_of_death,"%Y%m%d") date_of_death,

DATE_FORMAT(epoc.endregdate,"%Y%m%d") end_reg_date,

get_optout_status(pat.id, pat.registered_practice_organization_id) optout_status,

pad.address_line_1,

pad.address_line_2,

pad.address_line_3,

pad.address_line_4,

pad.city,

NULLIF(GREATEST(COALESCE(get_event_maxdate(2,pat.id),0),

COALESCE(get_event_maxdate(20,pad.id),0),

COALESCE(get_event_maxdate(6,epoc.id),0)),0) last_modified_date

FROM patient pat

LEFT JOIN patient_address pad ON pat.current_address_id = pad.id AND pat.id = pad.patient_id

LEFT JOIN

(SELECT

epc.id,

epc.organization_id,

epc.patient_id,

epc.registration_type_concept_id,

epc.registration_status_concept_id,

epc.date_registered startregdate,

epc.date_registered_end endregdate,

epc.usual_gp_practitioner_id,

prac.name usualgp

FROM episode_of_care epc LEFT JOIN practitioner prac

ON prac.id = epc.usual_gp_practitioner_id

AND prac.organization_id = epc.organization_id) epoc

ON epoc.patient_id = pat.id

AND epoc.organization_id = pat.registered_practice_organization_id;

Prescription Information

SELECT

med.id event_id,

get_ods(med.organization_id) source_organization,

med.patient_id patient_id,

pat.nhs_number nhs_number,

get_concept_desc(med.authorisation_type_concept_id) issue_type,

DATE_FORMAT(get_earliest_med_issue_date(med.id),"%Y%m%d") issue_date,

DATE_FORMAT(get_lastest_med_issue_date(med.id),"%Y%m%d") date_of_last_issue,

DATE_FORMAT(med.clinical_effective_date,"%Y%m%d") recorded_date,

get_concept_code(med.core_concept_id) dmdcode,

med.dose dosage,

med.quantity_value quantity,

med.quantity_unit quantity_unit,

get_concept_desc(med.core_concept_id) dmd_description,

NULLIF(GREATEST(COALESCE(get_event_maxdate(10,med.id),0),

COALESCE(get_event_maxdate_medo(9,med.id),0)),0) last_modified_date

FROM medication_statement med JOIN patient pat ON med.patient_id = pat.id;

Clinical Event Information

SELECT

ob.id event_id,

get_ods(ob.organization_id) source_organization,

ob.patient_id,

pat.nhs_number nhs_number,

DATE_FORMAT(ob.clinical_effective_date,"%Y%m%d") effective_date,

get_concept_code(ob.non_core_concept_id) original_code,

get_concept_desc(ob.non_core_concept_id) original_term,

get_concept_desc(get_concept_scheme(ob.non_core_concept_id)) original_code_scheme,

ob.result_value result_value,

CASE

WHEN ob.is_problem = 1 AND ob.is_review = 0 THEN 'N - New'

WHEN ob.is_problem = 1 AND ob.is_review = 1 THEN 'F - Follow Up'

ELSE 'O - Other'

END AS episode_type,

prac.role_desc hcp_role,

enc.type encounter_type,

get_concept_code(ob.core_concept_id) snomed_concept_id,

get_concept_desc(ob.core_concept_id) snomed_term,

ob.parent_observation_id parent_event,

get_ods(enc.organization_id) event_location,

get_org_name(enc.organization_id) location_name,

enc.appointment_id appointment_slot_id,

get_event_type(get_concept_desc(ob.core_concept_id)) event_type,

NULLIF(GREATEST(COALESCE(get_event_maxdate(11,ob.id),0),

COALESCE(get_event_maxdate(5,enc.id),0),

COALESCE(get_event_maxdate(13,prac.id),0)),0) last_modified_date

FROM observation ob JOIN patient pat ON ob.patient_id = pat.id

LEFT JOIN practitioner prac ON prac.id = ob.practitioner_id

LEFT JOIN encounter enc ON enc.id = ob.encounter_id;

Patient Registration History Information

SELECT

get_ods(epoc.organization_id) source_origanization,

pat.nhs_number,

get_concept_code(epoc.registration_status_concept_id) reg_status,

get_concept_desc(epoc.registration_status_concept_id) reg_status_desc,

get_concept_code(epoc.registration_type_concept_id) registration_type,

get_concept_desc(epoc.registration_type_concept_id) registration_type_desc,

DATE_FORMAT(epoc.date_registered,"%Y%m%d") date_registered,

DATE_FORMAT(epoc.date_registered_end,"%Y%m%d") date_registered_end,

epoc.id sequence_number,

NULLIF(COALESCE(get_event_maxdate(6,epoc.id),0),0) last_modified_date

FROM episode_of_care epoc JOIN patient pat ON epoc.patient_id = pat.id;

Patient Address Information

SELECT

padr.id address_id,

padr.address_line_1 address1,

padr.address_line_2 address2,

padr.address_line_3 address3,

padr.address_line_4 address4,

padr.city address5,

padr.postcode postcode,

parn.uprn,

DATE_FORMAT(padr.start_date,"%Y%m%d") start_date,

DATE_FORMAT(padr.end_date,"%Y%m%d") end_date,

padr.lsoa_2001_code,

padr.lsoa_2011_code,

padr.msoa_2001_code,

padr.msoa_2011_code,

pat.nhs_number,

parn.abp_address_number matched_address_1,

parn.abp_address_street matched_address_2,

parn.abp_address_locality matched_address_3,

parn.abp_address_town matched_address_4,

parn.abp_address_postcode matched_postcode,

parn.classification,

parn.abp_address_organization business_name,

parn.match_date uprn_match_date,

parn.status uprn_status,

parn.latitude,

parn.longitude,

NULLIF(COALESCE(get_event_maxdate(20,padr.id),0),0) last_modified_date

FROM patient_address padr JOIN patient pat ON padr.patient_id = pat.id

LEFT JOIN patient_address_match parn ON padr.id = parn.id;

Appointment Slot Information

SELECT

appt.id slot_id,

get_ods(appt.organization_id) source_origanization,

appt.schedule_id session_id,

DATE_FORMAT(appt.start_date,"%Y%m%d %T") start_time,

appt.planned_duration planned_duration,

appt.actual_duration actual_duration,

get_concept_desc(appt.appointment_status_concept_id) status,

DATE_FORMAT(appt.date_time_sent_in,"%Y%m%d %T") actual_start_time,

appt.patient_wait wait_time,

DATE_FORMAT(appt.cancelled_date,"%Y%m%d %T") date_cancelled,

DATE_FORMAT(appt.date_time_left,"%Y%m%d %T") time_left,

pat.nhs_number,

NULLIF(COALESCE(get_event_maxdate(18,appt.id),0),0) last_modified_date

FROM appointment appt JOIN patient pat ON appt.patient_id = pat.id;

Encounter Information

SELECT

enc.id encounter_id,

get_ods(enc.organization_id) source_organization,

pat.nhs_number,

DATE_FORMAT(enc.clinical_effective_date,"%Y%m%d %T") encounter_startdate,

DATE_FORMAT(enc.end_date,"%Y%m%d %T") encounter_enddate,

get_concept_desc(enc.non_core_concept_id) original_type,

prac.role_desc hcp_role,

enc.type encounter_type,

get_ods(enc.service_provider_organization_id) event_organization,

enc.appointment_id appointment_slot_id,

NULLIF(GREATEST(COALESCE(get_event_maxdate(5,enc.id),0),

COALESCE(get_event_maxdate(13,prac.id),0)),0) last_modified_date

FROM encounter enc JOIN patient pat ON enc.patient_id = pat.id

LEFT JOIN practitioner prac ON prac.id = enc.practitioner_id;

Organization Information

SELECT

org.id organization_id,

org.ods_code,

org.name organization_name,

org.type_code,

org.type_desc,

org.postcode,

org.parent_organization_id,

NULLIF(COALESCE(get_event_maxdate(12,org.id),0),0) last_modified_date

FROM organization org;

Appointment Session Information

SELECT

sch.id session_id,

sch.name session_name,

get_ods(sch.organization_id) source_organization,

prac.role_desc hpc_role,

DATE_FORMAT(sch.start_date,"%Y%m%d %T:%f") start_time,

sch.type session_type,

sch.location location,

NULLIF(GREATEST(COALESCE(get_event_maxdate(17,sch.id),0),

COALESCE(get_event_maxdate(13,prac.id),0)),0)

last_modified_date

FROM schedule sch JOIN practitioner prac ON prac.id = sch.practitioner_id;

Remote Subscriber Database (RSD) SQL guide (Compass 2)

Contents

Introduction

Creating the Patient Cohort

General Patient Cohort

Cohort with Filtered Observations

3. Creating the Observations Cohort

4. Creating the Medications Cohort

5. Building Datasets

Patient Demographic Dataset

Patient Observations Dataset

Patient Medications Dataset

6. Calling Stored Procedures

Calling the Populate Stored Procedures

Retrieving the SNOMED Concept Ids

Calling the storeSnomedString procedure

Calling the getAllSnomedsConceptIds procedure

Retrieving filtered Observations

Calling the filterObservations procedure

Calling the createObservationsFromCohort procedure

Filter Observations by Timeframe

Updating Dataset Table Columns

APPENDIX

Helper Functions

SQL Statement Examples

Navigation menu

Remote Subscriber Database (RSD) SQL guide (Compass 2)

Introduction

Creating the Patient Cohort

General Patient Cohort

Cohort with Filtered Observations

3. Creating the Observations Cohort

4. Creating the Medications Cohort

5. Building Datasets

Patient Demographic Dataset

Patient Observations Dataset

Patient Medications Dataset

6. Calling Stored Procedures

Calling the Populate Stored Procedures

Retrieving the SNOMED Concept Ids

Calling the storeSnomedString procedure

Calling the getAllSnomedsConceptIds procedure

Retrieving filtered Observations

Calling the filterObservations procedure

Calling the createObservationsFromCohort procedure

Filter Observations by Timeframe

Updating Dataset Table Columns

APPENDIX

Helper Functions

SQL Statement Examples

Navigation menu

Search