Mapping and matching concepts: Difference between revisions

From Discovery Data Service

Revision as of 12:02, 5 April 2021

Information consists of ideas. Another word for an idea is a 'concept'. A concept may be named, in which case its meaning can usually be understood from the name, or it may be an unnamed expression, which is a set of interrelated named or unnamed concepts.

For example, the term "chest pain", implying the idea of a pain in the chest, is a named concept. "Chest pain, worsened by exercise" is an example of an expression style concept, made up from the concept of "chest pain" and the statement that it is "made worse by -> exercise".
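The distinction between a named concept and an expression can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names `NamedConcept` and `Expression`, the `describe` helper and the printed layout are hypothetical and are not part of any Discovery model.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class NamedConcept:
    """A concept whose meaning can usually be understood from its name."""
    name: str


@dataclass(frozen=True)
class Expression:
    """An unnamed concept: a focus concept refined by (attribute, value) pairs."""
    focus: "Concept"
    refinements: Tuple[Tuple[str, "Concept"], ...]


# A concept is either named or an expression built from other concepts.
Concept = Union[NamedConcept, Expression]


def describe(concept: Concept) -> str:
    """Render a concept as readable text, recursing into expressions."""
    if isinstance(concept, NamedConcept):
        return concept.name
    parts = [f"{attr} -> {describe(value)}" for attr, value in concept.refinements]
    return f"{describe(concept.focus)} : " + ", ".join(parts)


chest_pain = NamedConcept("chest pain")
exercise = NamedConcept("exercise")
# "Chest pain, worsened by exercise" expressed as an unnamed concept.
exertional_chest_pain = Expression(chest_pain, (("made worse by", exercise),))

print(describe(exertional_chest_pain))
```

Because an expression may itself contain expressions, the structure is recursive, which matches the statement that an expression is built from named or unnamed concepts.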

The new generation of health record management systems tends towards the recording of concepts, with the objective that the record entry closely matches the idea behind it. These types of concepts can be called term based concepts.

A modern term based concept is defined in relation to other concepts by a set of assertions indicating whether the concept is equivalent to, or a subtype of, a set of other concepts. The standard approach to this is via the use of Description Logic (DL). By using DL, a computer can automatically classify a concept. Snomed-CT is the world's largest ontology of term based concepts. A collection of concepts defined in this way constitutes an "Ontology".
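A real DL classifier does far more than this, but the basic idea that a computer can infer where a concept sits from asserted subtype links can be sketched as a transitive walk over those links. The concept names and the `SUBTYPE_OF` table below are illustrative assumptions, not actual Snomed-CT content.

```python
from typing import Dict, Set

# Asserted "is a subtype of" links (illustrative names, not real Snomed-CT content).
SUBTYPE_OF: Dict[str, Set[str]] = {
    "angina": {"ischaemic heart disease"},
    "ischaemic heart disease": {"heart disease"},
    "heart disease": set(),
}


def ancestors(concept: str) -> Set[str]:
    """Return every supertype reachable from `concept` via asserted links."""
    found: Set[str] = set()
    stack = list(SUBTYPE_OF.get(concept, ()))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(SUBTYPE_OF.get(parent, ()))
    return found


def is_subtype_of(concept: str, candidate: str) -> bool:
    """True if `candidate` is an inferred supertype of `concept`."""
    return candidate in ancestors(concept)


# Only "angina -> ischaemic heart disease" is asserted directly;
# the link to "heart disease" is inferred.
print(is_subtype_of("angina", "heart disease"))
```

This sketch handles only subtype assertions; equivalence assertions and full DL reasoning are outside its scope.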

Coded concepts originate from a different starting point. The intention of a coded entry is to pre-classify an entry before it is recorded. The classification marker, or code, is designed for a particular set of business processes, e.g. analytics or payment. A coded concept, being pre-classified, relies on categorisation of the codes, which may or may not imply subtypes. Consequently, as the philosophy is different, code based concepts have to be dealt with differently from term based concepts, even if they seem to be saying the same thing.

Because of their history, it is not always possible to assert the exact meaning of a code based concept. However, it is often the case that meaning can be inferred or approximated from a coded entry. This inference is achieved via the use of maps.

A map is a statement that something is possibly or probably similar to something else. It has much less weight than an asserted relationship. Code based concepts can be mapped to term based concepts, which enables the use of the vast volumes of data already recorded in systems. Maps must be used with care, as whether a mapped code belongs in a query almost always depends on the purpose of the query. This means that mappings are more of a guide to the things to include than a confident statement of meaning.

Maps generally fall into 4 patterns. These are illustrated in the context of code based concepts as follows:

  • A coded concept may have one map which is mapped to one term based concept, the mapping having a certain weighting or category. For example, the ICD10 code for angina may have a map to a single term based Snomed-CT concept of angina, with a category indicating that the source concept is properly classified. Note that many coded concepts may be mapped to a single term based concept. The map is viewed from the perspective of the coded concept.
icd10:I209 |Angina Pectoris (ICD10 I20.9)|
            hasMap [mappedTo sn:194828000 |Angina (disorder)|;
                    mapCategory sn:447637006 |Map source concept is properly classified|]
  • A coded concept may have more than one map, and each map may map to more than one potential term based concept, i.e. a union of concepts.
icd10:E140 |Unspecified diabetes mellitus with coma|
            // This map has a number of target concepts
            hasMap [mappedTo UnionOf [
                          sn:26298008 |Ketoacidotic coma due to.....|,
                          sn:421725003 |Hypoglycemic coma due to diabetes mellitus|];
                    mapCategory sn:447637006 |Map source concept is properly classified|]

            // This map depends on the context in which it is used
                   [mappedTo sn:267384006 |Coma due to hypoglycemia|;
                    mapCategory sn:447639009 |Map of source concept is context dependent|]
  • An unnamed concept consisting of a combination of coded concepts (e.g. A and B) has a map which maps to a term based concept.
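The first two patterns above can be sketched in code as follows. This is a minimal illustration, assuming a hypothetical `ConceptMap` structure and a hypothetical `candidate_targets` helper; the concept identifiers and map categories are copied from the examples above, while the `include_context_dependent` flag is an assumed way of letting the purpose of a query decide whether context dependent maps are followed. The third pattern (a combination of coded concepts as the source) is not covered.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

PROPERLY_CLASSIFIED = "sn:447637006"  # Map source concept is properly classified
CONTEXT_DEPENDENT = "sn:447639009"    # Map of source concept is context dependent


@dataclass(frozen=True)
class ConceptMap:
    """One map: a union of one or more target concepts plus a map category."""
    mapped_to: FrozenSet[str]
    map_category: str


# Maps keyed by the coded (source) concept, taken from the examples above.
MAPS: Dict[str, List[ConceptMap]] = {
    "icd10:I209": [
        ConceptMap(frozenset({"sn:194828000"}), PROPERLY_CLASSIFIED),
    ],
    "icd10:E140": [
        ConceptMap(frozenset({"sn:26298008", "sn:421725003"}), PROPERLY_CLASSIFIED),
        ConceptMap(frozenset({"sn:267384006"}), CONTEXT_DEPENDENT),
    ],
}


def candidate_targets(code: str, include_context_dependent: bool = False) -> List[str]:
    """Return the term based concepts a coded concept might correspond to."""
    targets = set()
    for concept_map in MAPS.get(code, []):
        # Skip context dependent maps unless the caller opts in.
        if concept_map.map_category == CONTEXT_DEPENDENT and not include_context_dependent:
            continue
        targets |= concept_map.mapped_to
    return sorted(targets)


print(candidate_targets("icd10:I209"))
print(candidate_targets("icd10:E140"))
print(candidate_targets("icd10:E140", include_context_dependent=True))
```

The result is a list of candidates rather than a single answer, in keeping with the point that a map is a guide to what to include and not an assertion of meaning.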