Information Model Mapping (v1)

From Discovery Data Service

Introduction

Given contextual data, the information model mapping API returns both a property concept (representing the context) and a value concept (representing the value within that context). The structure of the mapping module allows multiple contexts to reference a common mapping node (for example, when the same type of value is received from multiple sources) and also allows multiple value types to be referenced from a common mapping node.

Logical Structure

Logical Layout

The initial phase of the mapping process is to identify the context within which the mapping is operating. This is given as a combination of the supplying organisation, the originating system, the schema/format from that system (if multiple are supported) and the data location within that schema (note that although the terms "Table" and "Column" are used, they could equally be "CSV Filename" and "Field name", for example). Multiple map contexts can reference a single map node. For example, a CDS inpatient admission code from Barts/Cerner would reference the same node as one from Homerton/Cerner, as they are in fact the same concept ("CDS Inpatient Admission Code"). A treatment function code, however, would reference different nodes for Barts and Homerton, as they are based on local Cerner codes.
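The many-to-one relationship between contexts and map nodes can be sketched as a simple keyed lookup. This is a minimal illustration only: `MapContext`, `CONTEXT_TO_NODE` and `find_node` are hypothetical names, and the in-memory dictionary stands in for whatever storage the mapping module actually uses. The node paths are the ones used in the examples on this page.

```python
from typing import Dict, NamedTuple, Optional


class MapContext(NamedTuple):
    """The five parts that identify where a value came from."""
    organisation: str
    system: str
    schema: str
    table: str   # could equally be a CSV filename
    column: str  # could equally be a field name


# Hypothetical context table: several contexts may point at one map node.
CONTEXT_TO_NODE: Dict[MapContext, str] = {
    MapContext("Barts", "Cerner", "CDS", "inpatient", "admission_method_code"):
        "/CDS/INPTNT/ADMISSN_MTHD",
    MapContext("Homerton", "Cerner", "CDS", "inpatient", "admission_method_code"):
        "/CDS/INPTNT/ADMISSN_MTHD",
    MapContext("Barts", "Cerner", "CDS", "inpatient", "treatment_function"):
        "/BRTS/CRNR/CDS/INPTNT/TRTMNT_FCTN",
}


def find_node(context: MapContext) -> Optional[str]:
    """Return the map node path for a context, or None if it is unknown."""
    return CONTEXT_TO_NODE.get(context)
```

Both admission code contexts resolve to the shared node, while the Barts treatment function context resolves to an organisation-specific node.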

Once the map node is identified, this gives the "Property" concept for the context, for example "DM_methodOfAdmssion | has admission method".

Given the context of the property, further context is provided via the value type. This allows multiple input maps for the same property context. For example, the "Method of admission" map node could have a value type (code scheme) of "Snomed", "NHS Data Dictionary", "Barts local" or null for free-text matching. A function must be provided for each value type. There are currently two supported functions:

| Name | Description |
| --- | --- |
| Lookup() | Uses the "Value Lookup" table to determine the value concept |
| Format(<format>) | Uses a simple string formatting function to generate the concept iri based on the input data |
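As a rough sketch of how the two functions might behave, the snippet below turns a function specification string into a Python callable. The names `VALUE_LOOKUP` and `build_value_function` are hypothetical, the lookup table is keyed by scheme and value purely for illustration, and the `%s` placeholder is assumed to follow printf-style formatting, as the `Format(BC_%s)` example on this page suggests.

```python
import re
from typing import Callable, Dict, Optional, Tuple

# Hypothetical stand-in for the "Value Lookup" table: (scheme, value) -> concept iri.
VALUE_LOOKUP: Dict[Tuple[Optional[str], str], str] = {
    ("NHS Data Dictionary", "21"): "CM_AdmMetCasSame",
}

# Matches specifications such as "Format(BC_%s)" and captures the template.
_FORMAT_SPEC = re.compile(r"^Format\((.+)\)$")


def build_value_function(spec: str, scheme: Optional[str]) -> Callable[[str], Optional[str]]:
    """Turn a function specification into a callable mapping a value to a concept iri."""
    if spec == "Lookup()":
        # Lookup(): consult the value lookup table; None means "not found".
        return lambda value: VALUE_LOOKUP.get((scheme, value))
    match = _FORMAT_SPEC.match(spec)
    if match:
        template = match.group(1)
        # Format(<format>): plain printf-style substitution of the input value.
        return lambda value: template % (value,)
    raise ValueError(f"Unsupported value function: {spec!r}")
```

Calling `build_value_function("Format(BC_%s)", "BartsCerner")("123456")` yields `"BC_123456"`, while an unrecognised specification raises an error rather than silently returning nothing.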

Examples

| Organisation | System | Schema | Table/File | Column/Field | Scheme | Value |
| --- | --- | --- | --- | --- | --- | --- |
| Barts | Cerner | CDS | inpatient | admission_method_code | NHS Data Dictionary | 21 |

The Organisation, System, Schema, Table and Column are used to identify the initial context and thus, in this case, the `/CDS/INPTNT/ADMISSN_MTHD` map node. That map node gives us the property concept (DM_methodOfAdmssion | has admission method). Next, the scheme is used to determine the value function, in this case "Lookup()", so the Value is used in the value lookup table to retrieve the final value concept (CM_AdmMetCasSame | Emergency admission from same trust). Note that the same data from Homerton/Cerner would reference the same common map node, as the data it contains is the same.

| Organisation | System | Schema | Table/File | Column/Field | Scheme | Value |
| --- | --- | --- | --- | --- | --- | --- |
| Barts | Cerner | CDS | inpatient | treatment_function | BartsCerner | 123456 |

In this case, the initial context would lead to the `/BRTS/CRNR/CDS/INPTNT/TRTMNT_FCTN` map node. The property concept associated with that node is "DM_treatmentFunctionAdmit | Treatment function". Here, however, the scheme is known to be a Barts local code and so the function is "Format(BC_%s)". Rather than performing a lookup, a simple string format is applied to the code, resulting in the concept "BC_123456".
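The two examples above can be traced end to end with a small sketch. Everything here is illustrative: `MapNode`, `NODES` and `resolve` are hypothetical names, the value functions are modelled as plain Python callables rather than stored specifications, and the dictionaries stand in for the real context, node and value lookup tables.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

Context = Tuple[str, str, str, str, str]  # organisation, system, schema, table, column


@dataclass
class MapNode:
    path: str
    property_concept: str
    # One value function per value type (scheme); the key None would mean free text.
    functions: Dict[Optional[str], Callable[[str], str]] = field(default_factory=dict)


# Hypothetical stand-in for the "Value Lookup" table of the admission method node.
ADMISSION_METHOD_LOOKUP = {"21": "CM_AdmMetCasSame"}

NODES: Dict[Context, MapNode] = {
    ("Barts", "Cerner", "CDS", "inpatient", "admission_method_code"): MapNode(
        path="/CDS/INPTNT/ADMISSN_MTHD",
        property_concept="DM_methodOfAdmssion",
        # Behaves like Lookup() for this scheme.
        functions={"NHS Data Dictionary": lambda v: ADMISSION_METHOD_LOOKUP[v]},
    ),
    ("Barts", "Cerner", "CDS", "inpatient", "treatment_function"): MapNode(
        path="/BRTS/CRNR/CDS/INPTNT/TRTMNT_FCTN",
        property_concept="DM_treatmentFunctionAdmit",
        # Behaves like Format(BC_%s) for this scheme.
        functions={"BartsCerner": lambda v: "BC_%s" % (v,)},
    ),
}


def resolve(context: Context, scheme: Optional[str], value: str) -> Tuple[str, str]:
    """Return (property concept, value concept) for a known context and scheme."""
    node = NODES[context]                   # step 1: context -> map node
    value_function = node.functions[scheme] # step 2: scheme -> value function
    return node.property_concept, value_function(value)  # step 3: value -> concept
```

In this sketch an unknown context or scheme raises `KeyError`; the real API instead creates draft nodes and concepts, as described in the next section.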

New/unknown data

If the mapping API is called with new/unknown context or codes, then new nodes and concepts will be generated and marked as draft for later evaluation.

| Organisation | System | Schema | Table/File | Column/Field | Scheme | Value |
| --- | --- | --- | --- | --- | --- | --- |
| BHRUT | Medway | MedwayBI | PMI | CAUSEOFDEATH | Medway | A1 |

On failing to identify an existing context and map node, a new draft context would be created and pointed to a new draft map node (/BHRUT/MDWY/MDWYBI/PMI/CSFDTH). For the map node, a new draft property concept would be created with an iri and description based on the context, in the format "LP_<Org>_<Sys>_<Scm>_<Tbl>_<Col>_<Dbid>". In this example, "LP_BHR_MDW_MDW_PMI_CSF_nnnnnn | Legacy property concept for /BHRUT/MDWY/MDWYBI/PMI/CSFDTH" (where "nnnnnn" is the concept's dbid, to ensure uniqueness). The function for the node will default to "Lookup()".

The final stage is to generate a new draft value concept (in the format "LPV_<Org>_<Sys>_<Scm>_<Tbl>_<Col>_<Scm>_<Val>_<Dbid>") and to create an appropriate value node entry that references it.

Note: The IRI generation uses 3-character shortening (after stripping vowels), but the full context information is kept in the concept name for clarity.
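The draft property IRI format can be approximated as follows. This is a sketch under stated assumptions, not the service's actual algorithm: `shorten` and `draft_property_iri` are hypothetical names, "Y" is not treated as a vowel, and when fewer than three characters remain after stripping vowels the original text is used instead (an assumption made here so that "PMI" stays "PMI", as in the example above).

```python
from typing import Sequence

VOWELS = set("AEIOU")  # assumption: "Y" is kept as a consonant


def shorten(part: str) -> str:
    """Strip vowels, then keep the first three characters (upper-cased)."""
    upper = part.upper()
    stripped = "".join(ch for ch in upper if ch not in VOWELS)
    # Assumed fallback: if too little survives, shorten the original text instead.
    source = stripped if len(stripped) >= 3 else upper
    return source[:3]


def draft_property_iri(context: Sequence[str], dbid: int) -> str:
    """Build an iri in the form LP_<Org>_<Sys>_<Scm>_<Tbl>_<Col>_<Dbid>."""
    parts = [shorten(part) for part in context]
    return "LP_" + "_".join(parts) + "_" + str(dbid)


# The dbid below is a made-up placeholder for the concept's database id.
iri = draft_property_iri(("BHRUT", "Medway", "MedwayBI", "PMI", "CAUSEOFDEATH"), 123456)
```

With these assumptions the context from the example produces the prefix "LP_BHR_MDW_MDW_PMI_CSF_", followed by the dbid. A draft value IRI in the "LPV_" format would extend the same pattern with the scheme and value, although how those two parts are shortened is not specified here.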