Data map API

From Discovery Data Service
Revision as of 11:52, 28 May 2020

This page is under review.

The mappings package is one of the five main component categories in the information model.

Information model packages - mappings.png

For the information model to be more useful than a simple reference, it is helpful to be able to map published data to the common model, and to map common model content to implemented databases. This provides a mechanism for resolving many-to-many relationships between a source and a destination whilst at the same time 'cleaning' the data on the way to provide a standard.

In order to support this, there are two main data resources required.

  1. A map that takes some source published data and outputs the data in the common model format.
  2. A map that takes some common model data and outputs the data to a database that holds an actual implementation of part of, or the whole of, the model.

For this to operate, a mapping server API is required so that a client wishing to transform some data from A to B is able to obtain the mapping information in a computable manner.

Fundamentals

Mapping pipeline

At its most basic level the objective of a mapping process is to take some source value and produce a target structure, which provides information about the object, property, and target value that the source value maps to.

For the source value to be understood, it is necessary to provide some context for it. All values are therefore set in some form of context. As a minimum this context would be the source property, i.e. the property that the value belongs to. In many cases though, the context will be much more extensive.

For example, the word 'negative', set in the field 'result text' associated with code value '12345' set in the property 'test', set in a table of 'clinical events' used in the system 'Cerner Millennium', in the hospital 'Barts NHS trust', may mean something completely different to 'negative' with the code '12345' set in another hospital, even with the same system.

A further layer of context includes the Domain in which the mapping takes place. A map generated for one purpose may be different when generated for another purpose.

For example, when processing published data into Discovery, the domain in question could be described as the 'inbound publisher mapping domain'.

Consequently, the implicit idea of a context and a domain is explicitly modelled as a 'Domain map context' object. The class and its various supporting classes are described in the following sections.
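The dependence of a map on its full context can be sketched as a lookup keyed on every contextual element rather than on the value alone. This is a minimal illustration only: the `SourceContext` type, the `MAPS` table, the second hospital's name and the target strings are all hypothetical placeholders, not part of the information model or the API.

```python
from typing import NamedTuple, Optional


class SourceContext(NamedTuple):
    """Hypothetical flattened context for one source value."""
    domain: str
    provider: str
    system: str
    table: str
    field: str
    code: str
    value: str


BARTS = SourceContext("inbound publisher mapping domain", "Barts NHS trust",
                      "Cerner Millennium", "clinical events", "result text",
                      "12345", "negative")
# Same system, code and text, but a different (hypothetical) provider.
OTHER = BARTS._replace(provider="Another hospital")

# Placeholder targets: the point is only that the two contexts may differ.
MAPS = {BARTS: "placeholder-target-A", OTHER: "placeholder-target-B"}


def lookup(ctx: SourceContext) -> Optional[str]:
    """Return the target for an exact context match, or None if unmapped."""
    return MAPS.get(ctx)
```

Because the whole context forms the key, the same text 'negative' with the same code can resolve to different targets for different providers, and a context that has never been mapped returns nothing rather than a guess.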

Mapping Convergence

Mapping convergence is the means by which we try to rationalise the huge number of source types to a smaller number in order to improve map making.

Let us say that we are trying to map a drug code from an EMIS drug issue to the common model. We understand that the code comes from Chrisp Street Health Centre, which uses EMIS Web, and the source table in question is the Prescribing issue table, with field of Codeid, and the code value is '12345'.

Likewise, we are also trying to map a drug code from a TPP practice source. We understand that the code comes from the Parkdale medical centre, that they use TPP SystmOne, that the source table in question is the PrimaryCareMedication table, the field is code and the id is 232000001108.

We also know that EMIS provides a look up table between their code and DM&D, but SystmOne provides the DM&D code itself.

We know that the IM target we are seeking is the following:

{"Class" : ":DM_MedicationRequest",
 "Property" : ":DM_hasRequestedMedication",
  "Value" : "SN_232000001108"}

The first thing we recognise is that the contexts of the above two sources appear to be nearly equivalent. Both are equivalent to the same target structure and both will end up with the same target value when mapped from DM&D. There is a variation in EMIS in that, before getting to exactly the same context, there is a prior step to perform: the map between EMIS 'X' and the DM+D 'Y'. However, if that mapping were to occur first, then the two contexts would be exactly equivalent.

It appears that there is some form of convergence from two sources. This can be illustrated in the following way:

Convergent context.jpg

Starting from the two sources, the first convergence is as follows: for ALL providers, IF the system is EITHER EMIS Web OR SystmOne, and IF the table is EITHER Prescription issue OR SRPrimaryCareMedication, THEN we have converged at this point.

The SystmOne drug and value are output as a target and carried forward as a source into the next node.

However, EMIS has another mapping context (of code id and code) which requires a look up. Thus the EMIS value now needs a look up (to get the DM+D code) via another mapping node. The output target of this map is the property of drug and a DM+D code.

The input mapping source of drug and DM+D code now acts as a slot to the now fully converged mapping node (together with the already created output from the converged map context), and the only remaining look up is the DM&D to common model. This is generated as an output IM target.
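The convergence just described can be sketched as two small steps: bring each source value to a DM+D code, then map the DM+D code to the IM target. This is a sketch under stated assumptions, not the Discovery implementation: the function names are hypothetical, the single `EMIS_TO_DMD` entry is invented so that both sources reach the same code, and treating the `SN_` prefix from the target example as a general rule is also an assumption.

```python
# Source contexts (system, table, field) that converge on the same node.
CONVERGED_SOURCES = {
    ("EMIS Web", "Prescribing issue", "Codeid"),
    ("TPP SystmOne", "PrimaryCareMedication", "code"),
}

# Hypothetical EMIS look up table; this one entry is invented for illustration.
EMIS_TO_DMD = {"12345": "232000001108"}


def to_dmd(system: str, table: str, field: str, value: str) -> str:
    """Return the DM+D code for a value from one of the converged contexts."""
    if (system, table, field) not in CONVERGED_SOURCES:
        raise KeyError(f"no converged context for {(system, table, field)}")
    if system == "EMIS Web":
        return EMIS_TO_DMD[value]  # prior step: EMIS code id to DM+D
    return value  # SystmOne already supplies the DM+D code


def to_im_target(dmd_code: str) -> dict:
    """Final step: DM+D code to the IM target structure."""
    return {
        "Class": ":DM_MedicationRequest",
        "Property": ":DM_hasRequestedMedication",
        "Value": "SN_" + dmd_code,  # assumed prefix, taken from the example
    }


emis_target = to_im_target(
    to_dmd("EMIS Web", "Prescribing issue", "Codeid", "12345"))
tpp_target = to_im_target(
    to_dmd("TPP SystmOne", "PrimaryCareMedication", "code", "232000001108"))
```

Under these assumptions both calls produce the same target structure, which is the sense in which the two sources converge once the EMIS look up has been performed.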

Domain mapping node

MapClass.jpg

The mapping class 'Domain mapping node' sums up the idea that a number of different source structures can in the end converge and map to a single target structure, by dint of the common domain in which the map occurs and the common context that a particular set of sources share.

The primary object is the 'Domain mapping node', which has a uniquely identified context to which a number of sources map. Each different source is referred to as a 'map source' and each map source has a number of 'map slots', all of which must be filled in order for the map source to be established. Once the mapping node object has been established, the map target can be generated by mapping the source value to a target value.

This class provides information about the mapping domain and the context within which a particular mapping will take place. This is the target object for a source. For example, this is the convergent context object for the medication requests from a number of supplier systems:

{"DomainMap": {
      "DomainIRI": ":InboundPublisherMaps",
      "ContextIRI": ":DMDMedOrders",
      "Description": "Convergent map for mapping inbound medication order codes"
    } }

Many source map classes map to one map context class, and each of the source objects must have its slots filled for the source object to be valid. A map source slot may be another context object, i.e. it "inherits" converged contexts from other maps.
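The relationship between a mapping node, its map sources and their slots can be sketched with two small classes. The class and attribute names below are hypothetical and only mirror the prose; the `:EMISToDMD` context name is invented, and a slot value may be either a literal or the IRI of another context.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MapSource:
    """One source that maps to a node; slot values are None until filled."""
    name: str
    slots: Dict[str, Optional[str]]

    def is_established(self) -> bool:
        # A source is established only when it has slots and all are filled.
        return bool(self.slots) and all(
            v is not None for v in self.slots.values())


@dataclass
class DomainMappingNode:
    domain_iri: str
    context_iri: str
    sources: List[MapSource] = field(default_factory=list)

    def established_sources(self) -> List[MapSource]:
        return [s for s in self.sources if s.is_established()]


node = DomainMappingNode(":InboundPublisherMaps", ":DMDMedOrders")
node.sources.append(MapSource("TPP", {"system": "TPP SystmOne",
                                      "table": "PrimaryCareMedication"}))
# The EMIS source has a slot that is itself another (hypothetical) context,
# which stays unfilled until that other map has produced its output.
node.sources.append(MapSource("EMIS", {"system": "EMIS Web",
                                       "dmd_context": None}))
```

Here only the TPP source is established; the EMIS source becomes established once its `dmd_context` slot is filled, for example with `":EMISToDMD"`.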



Thus mappings may be used by both subscriber databases and publisher databases, inside and outside of Discovery.

Furthermore, it can be seen from the logical two-step mappings that it is equally practical for clients to consider a direct map from source to destination, knowing that the data has been mapped to the common model as part of the process. This style of mapping contrasts with conventional integration mappings, which map from many to many directly. In other words, by mapping in two stages we get a series of one-to-one maps which appear to be one to many.
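The two-stage idea can be sketched by composing two lookups, one from source to common model and one from common model to a database target. The tables below are illustrative only: the source key and IM value reuse identifiers from the examples on this page, while the database-side tuple (table, field, stored value) is a hypothetical placeholder.

```python
# Stage 1: original source -> common model value.
SOURCE_TO_IM = {
    ("Barts", "CernerMillenium", "APC", "PATIENT_CLASSIFICATION_CODE", "1"):
        ":CM_AdmClassOrdinary",
}

# Stage 2: (schema, common model value) -> database target.
# The target tuple is a made-up placeholder, not a real Compass mapping.
IM_TO_DB = {
    ("Compass_version_1", ":CM_AdmClassOrdinary"):
        ("encounter", "type", "placeholder_subtype_value"),
}


def direct_map(source_key: tuple, schema: str) -> tuple:
    """Appear to map source to DB directly, going via the common model."""
    im_value = SOURCE_TO_IM[source_key]
    return IM_TO_DB[(schema, im_value)]


result = direct_map(
    ("Barts", "CernerMillenium", "APC", "PATIENT_CLASSIFICATION_CODE", "1"),
    "Compass_version_1")
```

With N sources and M target schemas, this arrangement needs maps on the order of N + M rather than N × M, because each side maps only to the common model.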

To demonstrate more easily how mappings work, there is a working example showing a walk-through of the use of the mapping API using the resource examples illustrated below.

Target DB schema resources

DB Schema class

Before doing any mappings, it is necessary to model a target schema in order to map to it.

Implementation schema resources are a set of objects of the class DBSchema (to the right).

The class is designed as a simple entity relationship class with 2 additional properties:

  1. The name of the table's extension tables. These are optional triple tables designed so that a schema can continue to extend to additional properties and values using the information model to determine the properties and data types. This avoids the need to continually change the relational schema with new data items.
  2. The name of the field holding the subtype indicator. This is described as the entity subtype attribute.

The following is an example of a snippet from an encounter table:

A schema table example showing extension table and subtype field

{"DBSchema": {
      "DBSchemaName": "Compass_version_1",
      "DBTable": {
        "DBTableName": "encounter",
        "DBExtensionTable": {
          "DBTableName": "encounter_extension"
        },
        "DBSubTypeField": "type" } } }

The encounter table expects subtypes to be authored and therefore has a subtype field (here named "type") in the table, in order to avoid generating many subtype tables.

Original source Resources

Every map has a source and a target. From the perspective of the information model, an 'original source' represents a data model created from a publisher's source data, i.e. it is likely to be a relational or JSON representation of publisher data that might originally have been delivered as HL7 V2, XML, JSON, CSV or pipe-delimited flat files. Source resources are therefore not representations of the actual data, but representations of a model that would be used when transforming to the common model. An example of this is a staging table.

Source resource description

It is assumed that a source may contain many tables, each with many fields, each with many values including text. It is not necessary for there to be actual tables and fields; any object structure masquerading as such can be used. The terms 'table' and 'field' are used for convenience and refer to objects and properties just as well.

There may be differences between one provider and another using the same system, and between different versions of the system. Thus there is a need to provide context for each source resource.

Each element of source data must explicitly inherit the context so that the mapping API can recognise the context with each request.

The Original source resource object reflects a single logical thing to map. In most cases this will be a single field and single value. However, in some cases (such as free text sources), the source is derived from a list of fields, each with certain values.


For example, a piece of text saying "negative", when contextualised as a result against a test for 'Hepatitis B surface antigen', would use a compound context consisting of the table, the test field, the test code, the result field and the result text of negative.

An example original source object from a CDS admitted patient care record with a value of '1' for the admitted patient classification:

 { "OriginalSource": {
          "Provider": "Barts",
          "System": "CernerMillenium",
          "Context": {
            "id": 1,
            "Table": "APC",
            "Field": "PATIENT_CLASSIFICATION_CODE",
            "Value": 1 } } }

The original source resource would be used as part of a mapping request submitted via the API.

Information model target resource

The common information model target resource is the target of a map from an original source.

The resource is delivered as part of the IM mapping API response.

Target resource for a map

The target resource indicates the class and properties of the target object (which may need to be created, or added to with the target property) and the target value.

In some cases multiple sources may link to one target, and in other cases multiple targets may link to one source, the link item being the "from id" link from the target.


An example target resource from a request containing an original source object

  {"IMTarget": {
          "Fromid": 1,
          "Class": ":CM_HospitalInpAdmitEncounter",
          "DependentRelationship": {
            "Relationship": ":RM_isComponentOf",
            "Class": ":DM_HospitalInpEntry"
          },
          "PropertyValue": {
            "Property": ":DM_admissionPatientClassification",
            "Value": ":CM_AdmClassOrdinary"  } }}

The class, properties and values of the IM resource all reference IM concepts.

This resource says that the target IM class is a "hospital inpatient admission encounter" type. This object is dependent on the presence of a container encounter, in this case of type "hospital inpatient stay", and the relationship between them is 'is subcomponent of'. The source field value results in the property 'admission patient classification' and the value 'Ordinary admission'.

Note that the information model resource uses subtypes as classes, in line with the ontology. It avoids the complexity involved in populating database schemas. However, the DB target resource does include the specific instructions as to how to populate the types.

Map Request

For a map to be provided it needs to be requested and it is the job of the IM mapping API to respond to a request.

Mapping resource with 3 map options

A mapping resource is used both as a request and a reference i.e. can be exported as a set of maps or mapped on request.

Note that as the mapping API is designed to be used as individual requests for individual values, this class does not inherit properties or classes in the source or target. A typical map request using the above example could be:


An example request from a source system to a DB target

 {"Mapping":[   {
        "Request": "SourcetoDB",
        "OriginalSource": {
          "Provider": "Barts",
          "System": "CernerMillenium",
          "Context": {
            "id": 1,
            "Table": "AdmittedPatientCare",
            "Field": "PatientClassificationCode",
            "Value": 1
          }
        },
        "TargetDBSchema": "Compass_version_1" } ] }

Bringing it all together

A working example of the above is illustrated as a mapping working example for a client wishing to go from a source to a database target.
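As a minimal sketch of the client side, a request body of the shape shown in the map request example can be assembled and serialised as follows. The function name and its parameters are hypothetical, and no endpoint or transport is shown because this page does not specify one; the schema name is taken from the DBSchema example.

```python
import json


def build_map_request(provider: str, system: str, table: str, field: str,
                      value, target_schema: str,
                      request: str = "SourcetoDB",
                      context_id: int = 1) -> dict:
    """Assemble a single-item mapping request as a plain dictionary."""
    return {
        "Mapping": [{
            "Request": request,
            "OriginalSource": {
                "Provider": provider,
                "System": system,
                "Context": {
                    "id": context_id,
                    "Table": table,
                    "Field": field,
                    "Value": value,
                },
            },
            "TargetDBSchema": target_schema,
        }]
    }


request_body = build_map_request(
    "Barts", "CernerMillenium", "AdmittedPatientCare",
    "PatientClassificationCode", 1, "Compass_version_1")
payload = json.dumps(request_body, indent=2)  # JSON text ready to submit
```

Building the request as a dictionary and serialising it with `json.dumps` guarantees balanced braces and correct quoting, which is easy to get wrong when writing the JSON by hand.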