Subsumption test

From Discovery Data Service
Latest revision as of 09:00, 17 May 2020

A subsumption test is the act of testing X to see if it is subsumed by Y. The question can also be reversed by asking "Does Y subsume X?".

If X is subsumed by Y, it means that all objects of type X are also objects of type Y. Class X is thus either a subclass of class Y, or is equivalent to class Y.
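
As a minimal sketch of this definition, assume a toy hierarchy held in a Python dictionary. The class names and the `is_subsumed_by` helper are hypothetical and are not part of Discovery. Subsumption over stated subclass links can then be tested by walking upward from X:

```python
# Toy "is a" hierarchy: each class maps to its directly stated superclasses.
# All names here are illustrative assumptions, not real ontology content.
PARENTS = {
    "coronavirus_2": {"coronavirus"},
    "coronavirus": {"virus"},
    "virus": {"organism"},
    "organism": set(),
}


def is_subsumed_by(x, y, parents=PARENTS):
    """Return True if x is equivalent to y or a (transitive) subclass of y."""
    if x == y:
        return True
    seen = set()
    stack = [x]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent == y:
                return True
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return False


print(is_subsumed_by("coronavirus_2", "virus"))  # X subsumed by Y
print(is_subsumed_by("virus", "coronavirus_2"))  # the reverse does not hold
```

This only follows links that are already stated. It does not perform the inference over definitions described further down.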

Subsumption testing is the mainstay of health record query. It is the major differentiator of health care records from conventional database records that have simple codes or data types as values.

To take an example, when examining an entry in a health record that purports to be a diagnosis, one could ask "is this diagnosis a communicable disease caused by a virus?".

Conversely, one might be interested in finding all the entries in a population of health records that are "communicable diseases caused by a virus", or even a question against an ontology such as "Get me a list of all communicable diseases caused by a virus".

The challenge to overcome is that, in most cases, the concepts X of interest do not sit within a hierarchy of codes organised in the way Y specifies. There is no classification of diseases organised by whether they are communicable AND caused by viruses. Instead it is necessary to use a process called inference to determine whether one concept is subsumed by another.

To make this work it is usually necessary to have a concept Y that is properly defined by an axiom expressed in an ontology language such as OWL2.

For example, the axiom (in Manchester OWL syntax):

Covid 19 {EquivalentTo : Disease
                and (causative_agent some coronavirus-2)
                and (has_pathological_process some infectious_process)}.

Thus a search for diseases that are infectious and have causative agents that are viruses would pick up Covid 19 because coronavirus 2 is a subclass of virus.  
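
The inference step can be sketched as a simplified structural comparison of two definitions. This is an illustration under stated assumptions, not how an OWL reasoner or Discovery works: each definition is reduced to one base class plus a set of (property, value) restrictions, and every name below, including `definition_subsumed_by`, is hypothetical.

```python
# Stated "is a" links for a toy ontology (illustrative names only).
PARENTS = {
    "coronavirus_2": {"virus"},
    "virus": {"organism"},
    "bacterium": {"organism"},
}


def is_a(x, y):
    """True if class x equals y or reaches y through stated superclass links."""
    if x == y:
        return True
    return any(is_a(parent, y) for parent in PARENTS.get(x, ()))


# Each definition: a base class plus "property some value" restrictions.
covid_19 = {
    "base": "disease",
    "roles": {
        ("causative_agent", "coronavirus_2"),
        ("has_pathological_process", "infectious_process"),
    },
}
viral_infectious_disease = {
    "base": "disease",
    "roles": {
        ("causative_agent", "virus"),
        ("has_pathological_process", "infectious_process"),
    },
}


def definition_subsumed_by(x, y):
    """x is subsumed by y if x's base is a kind of y's base and every
    restriction in y is matched by an equal or more specific one in x."""
    if not is_a(x["base"], y["base"]):
        return False
    return all(
        any(px == py and is_a(vx, vy) for px, vx in x["roles"])
        for py, vy in y["roles"]
    )


print(definition_subsumed_by(covid_19, viral_infectious_disease))
print(definition_subsumed_by(viral_infectious_disease, covid_19))
```

The first call succeeds only because `coronavirus_2` is recorded as a kind of `virus`. No hierarchy of diseases by causative agent had to exist beforehand.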

The following show the same expression in the different syntaxes supported by Discovery:

Discovery syntax

{"iri": ":840539006 |Disease caused by severe acute respiratory syndrome coronavirus 2 (disorder)",
 "EquivalentTo": {
   "Intersection": [{
     "Class": "64572001 |Disease (disorder)",
     "ObjectSome": {
       "Property": "609096000 |Role group (attribute)",
       "Intersection": [{
         "ObjectSome": {
           "Property": "246075003 |Causative agent (attribute)",
           "Class": "840533007 |Severe acute respiratory syndrome coronavirus 2 (organism)"}},
         {"ObjectSome": {
           "Property": "370135005 |Pathological process (attribute)",
           "Class": "441862004 |Infectious process (qualifier value)"}}]}}]}}


OWL2 Functional syntax

EquivalentClasses(
	:840539006 |Disease caused by severe acute respiratory syndrome coronavirus 2 (disorder)|
	ObjectIntersectionOf(
		:64572001 |Disease (disorder)|
		ObjectSomeValuesFrom(
			:609096000 |Role group (attribute)|
			ObjectIntersectionOf(
				ObjectSomeValuesFrom(
					:246075003 |Causative agent (attribute)|
					:840533007 |Severe acute respiratory syndrome coronavirus 2 (organism)|
				)
				ObjectSomeValuesFrom(
					:370135005 |Pathological process (attribute)|
					:441862004 |Infectious process (qualifier value)|)))))


SNOMED CT compositional grammar

=== 186747009 |Coronavirus infection (disorder)| :
    { 370135005 |Pathological process (attribute)| = 441862004 |Infectious process (qualifier value)|,
      246075003 |Causative agent (attribute)| = 840533007 |Severe acute respiratory syndrome coronavirus 2 (organism)| }

In practice, one of two techniques is employed when using subsumption testing in queries:

Generate a list of concepts of type Y. In this case a function determines all of the concepts X1 ... Xn in the ontology that are subsumed by Y. This list is used as a join or test against the concepts stored in records. This is the most common technique. Type Y is normally part of a value set.

Is concept X subsumed by Y? In this case the definition of concept X is tested against the definition of concept Y, a test that usually also takes into account all other definitions in the ontology that are subsumed by Y. This is more commonly used in decision support.
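
The two techniques can be contrasted in a short sketch. It assumes the hierarchy has already been inferred and is available as a parent-to-children dictionary. The concept names, the record layout and the `expand` and `is_subsumed_by` helpers are hypothetical.

```python
# Inferred hierarchy, parent -> direct children (illustrative names only).
CHILDREN = {
    "viral_disease": {"covid_19", "influenza"},
    "covid_19": set(),
    "influenza": {"influenza_a"},
    "influenza_a": set(),
}


def expand(y):
    """Technique 1: return y together with every concept subsumed by y."""
    members = {y}
    stack = [y]
    while stack:
        for child in CHILDREN.get(stack.pop(), ()):
            if child not in members:
                members.add(child)
                stack.append(child)
    return members


def is_subsumed_by(x, y):
    """Technique 2: test a single concept x against y."""
    return x == y or any(is_subsumed_by(x, c) for c in CHILDREN.get(y, ()))


records = [
    {"patient": 1, "concept": "influenza_a"},
    {"patient": 2, "concept": "fracture"},
    {"patient": 3, "concept": "covid_19"},
]

# Technique 1: expand Y once, then join the list against the records.
value_set = expand("viral_disease")
matches = [r["patient"] for r in records if r["concept"] in value_set]
print(matches)

# Technique 2: ask about one entry, as a decision support rule might.
print(is_subsumed_by("influenza_a", "viral_disease"))
```

Expanding Y once suits population queries, because the resulting list can be reused across many records. The single test suits a check on one entry at the point of care.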

Thus subsumption testing requires an information model that contains a properly authored ontology.