Welcome to the London Discovery Data Service knowledge base and Collaborative Glossary: Difference between pages

This is an area of the Wiki for members of the Discovery Collaborative to share definitions of common terms. It is public and can be referenced from other documents and websites as necessary.
{| class="wikitable sortable"
|+
!Term
!Definition
!External Links
|-
|Cloud
|The Cloud refers to IT infrastructure, platforms, and services that are hosted in a remote data centre and managed by a third party. Typically these data centres are run and managed by large companies such as Microsoft (Azure), Amazon (AWS), and Google (GCP), but there are many other providers too. Using the cloud removes the need for customers to buy and maintain their own physical IT equipment. Instead, the cloud provider typically charges customers daily rates based on storage used, computational power used, memory, number of servers, etc.
|[[wikipedia:Cloud_computing|Cloud Computing]]
|-
|Cloud Native
|Cloud Native refers to building software and systems in the cloud that make use of services made available by cloud providers, to run scalable, resilient applications. For example, instead of creating a SQL database on a server in the cloud, which the customer would need to monitor, maintain, patch, back up, etc., they could make use of a “SQL database as a service” from the cloud provider, which would automatically take care of all the maintenance without any further intervention from the customer.
|[[wikipedia:Cloud_native_computing|Cloud Native Computing]]
|-
|Cluster
|A group of concept codes that together represent another idea. A Cluster may be made up of concepts from different ontologies or classifications, e.g. SNOMED CT and ICD-10. Synonyms for Cluster are refset, value set, codelist, code set, and grouper. The name "cluster" was strongly advocated for by John Robson (Clinical Effectiveness Group (CEG), NEL) because of his work in primary care.
|[http://app.powerbi.com/view?r=eyJrIjoiZTY0ODY0YzEtMjhhYy00ZTViLWJjZDQtMWZjOWVkMTZlOGExIiwidCI6IjUwZjYwNzFmLWJiZmUtNDAxYS04ODAzLTY3Mzc0OGU2MjllMiIsImMiOjh9 Primary Care Domain Reference Set Portal]
|-
|Common Data Model
|This is the shared Data Model which has been adopted as a standard for an organisation or group. It defines how the elements of data relate to one another and enables data to be transferred with ease between different systems which share the Common Data Model.
It defines the transformation of data from multiple different databases with potentially multiple coding systems into a common storage format (data model) as well as a common representation (terminologies, vocabularies, coding schemes).
|[https://www.jamesserra.com/archive/2019/06/common-data-model/ Microsoft Example]
|-
|Concept
|A coded idea. For example, Name = Baker's Asthma, SNOMED CT code = 34015007
|
|-
|Conceptual Data Model
|This is a high level model, describing business processes and how they relate to one another. In simple terms, it is a map of concepts and rules relating to the business which can then be used to define relationships between data entities in order to describe the business concepts – this leads to the creation of a logical data model
|
|-
|Data Dictionary
|A centralised repository of information about data, such as its meaning, relationships to other data, origin, usage, and format, e.g. the NHS Data Dictionary.
|[[wikipedia:Data_dictionary|Data Dictionary]]
|-
|Data Model
|Data models exist at different levels of abstraction, from conceptual data models which are high-level and consider business activity and relationships, to a physical data model which represents the implementation of a database. A data model shows how elements of data relate to one another and how they are structured, how they relate to business needs, and how they can be manipulated.
|[[wikipedia:Data_model|Data Model]]
[https://digital.nhs.uk/services/nhs-data-model-and-dictionary-service NHS Data Model and Dictionary Service]
|-
|Data Platform
|A Data Platform is any digital solution which enables an organisation to store, transform, aggregate, and analyse data. It integrates many different technologies for different purposes to meet the data needs of the organisation. It includes security and access controls to ensure information governance (IG) compliance.
|
|-
|Database Index
|Indexes are used to quickly sort and retrieve data in a database. They may take the form of a physical, clustered index, which is the physical ordering of the data in the table (e.g. by ID or surname), or a non-clustered index which describes the logical ordering of the data separately – like the index at the back of a book: the index key is ordered (e.g. surname) and next to it is information showing where to find the record.
|
|-
|Database Schema
|This is the definition of the structure of a relational database and defines how data is organised, including the table names, fields (also known as columns), data types, constraints, indexes, and relationships between these entities. The schema is a blueprint of how the database is constructed.
|
|-
|Data Warehouse
|A data warehouse stores data for the main purpose of reporting and data analysis. They are central repositories which bring together disparate data sets into one place, to enable reporting at scale and pace.
|[[wikipedia:Data_warehouse|Data Warehouse]]
|-
|DDS Information Model Manager
|The Information Model Manager is an application developed by the Endeavour Charitable Trust which provides a view of the Common Information Model and supports authoring of the content of the model. It can be used to view the model structure and content, download artefacts (e.g. value sets), and manage the ontology. It can be used to view entities and their relationships (e.g. an A&E admission, which is an Encounter and has a subject of a Patient and has a sub-encounter of a Triage Encounter etc)
|[https://wiki.endeavourhealth.org/index.php?title=Discovery_health_information_model Discovery Health Information Model]
|-
|Endeavour Information Model
|
|
|-
|Expression (SNOMED CT)
|A combination of concepts that represents something that is not already represented as a Concept or Cluster. It typically links things across concepts, e.g. disease of infection (finding) with bacteria name (organism), with sepsis (observable entity), and with kidney (body structure).
|[https://confluence.ihtsdotools.org/display/DOCGLOSS/SNOMED+CT+expression SNOMED CT Expression]
|-
|FHIR
|FHIR stands for Fast Healthcare Interoperability Resources. It is an international standard for health care data exchange and is published by HL7. It is designed to enable the exchange of healthcare-related information, including clinical, administrative, public health, and research data. DDS stores data in the core in FHIR format before it is restructured and sent to subscriber databases.
|
|-
|Functional Requirements
|These are product features or functionality that must be developed in order for the system to satisfy requirements. They must be implemented in the system in order for it to achieve what it is supposed to accomplish. Examples include "users should be able to download data based on their filtered search criteria" or "only admin-level users should be able to reset passwords for other users"
|[[wikipedia:Functional_requirement|Functional requirement]]
|-
|GitHub
|GitHub is a provider of version control and source-code management using [[wikipedia:Git|Git]] (which is software used for tracking changes in files). It enables software developers to safely store source code and work on it collaboratively, merging changes from different developers into the main "production" branch safely. It provides additional features such as bug/issue tracking, continuous integration, feature requests etc. GitHub was acquired by Microsoft in 2018.
|[[wikipedia:GitHub|GitHub]]
|-
|Graph Database
|Graph Databases are used to represent data as elements and their relationships (known as nodes and edges). They are particularly beneficial when representing large datasets with complex and numerous relationships, such as social media contacts and relations. Unlike a relational database, where to return data it is often necessary to join many tables, in a graph database this isn’t necessary. Relationships are stored natively alongside the data elements, enabling much faster querying.
|
|-
|HDRUK Phenotype Library
|This is a comprehensive, open access resource which exists to provide the research community with information, tools, and phenotyping algorithms for UK electronic health records (EHRs)
|[https://phenotypes.healthdatagateway.org/ HDR UK Phenotype Library]
|-
|Information Model
|A model describing ontologically standardised data items following a specific database schema within a specific relational database architecture.
|
|-
|Knowledge Graph
|An information architecture where concepts are represented as nodes and edges in a network of relationships. This is usually implemented in a NoSQL architecture designed around relationships between concepts.
|
|-
|[[Logical Data Model]]
|The logical data model (or logical schema) describes how data objects relate to each other, but is independent of the technology or database management system used. Logical data models typically show entities (e.g. a patient, an appointment, a result), relationships (e.g. an appointment is linked to a patient via a unique identifier), and attributes (information that is useful to further describe the entities)
|[[wikipedia:Logical_schema|Logical Schema]]
|-
|Mauro Data Mapper
|Mauro Data Mapper is a third-party toolkit for the design and documentation of databases, data flows, and data standards, as well as related software artefacts such as data schemas and data forms. It was originally developed for the description of data in clinical research.
|[https://maurodatamapper.github.io/about/introduction/ Mauro Data Mapper]
|-
|Medical Classification
|Medical Classifications transform descriptions of procedures or diagnoses into standardised codes through the process of clinical coding. Examples include ICD-10, OPCS-4, and LOINC. These can also be called code sets. Hence, we call them a medical classification to avoid confusion with Clusters.
|[[wikipedia:Medical_classification|Medical Classification]]
|-
|Metadata Library
|A tool that allows management of clusters
|
|-
|NHS Terminology Server
|The NHS Terminology Server “is a FHIR compliant solution that holds and disseminates assured international terminologies and classifications (such as SNOMED-CT and ICD-10) and national terminologies (such as NHS Data Model and Dictionary codes).”
See <nowiki>https://digital.nhs.uk/services/terminology-servers</nowiki>. Content is delivered in machine readable format and can be accessed as a real-time resource through APIs to support other applications.
|[https://digital.nhs.uk/services/terminology-servers NHSD Terminology Server]
|-
|Non-functional Requirements
|These are requirements concerning the operation of a system rather than functionality, and typically relate to security, scalability, performance, reliability etc. Examples include "the page must load within 500ms" or "the website must conform to WCAG AA accessibility standards"
|[[wikipedia:Non-functional_requirement|Non-functional Requirements]]
|-
|Normalisation (health data)
|In healthcare, this is the process of taking data from different sources, in different formats and with different code sets, and converting the data into a singular, unified clinical language or terminology – for example mapping proprietary codes from EPR systems to SNOMED codes
|[https://www.imohealth.com/ideas/article/data-normalization-working-toward-a-common-clinical-terminology/ Healthcare Normalisation]
[https://blog.clinicalarchitecture.com/what-is-data-normalization What is data normalisation?]
|-
|Normalisation (database)
|This is the process of structuring a relational database in accordance with a series of derived rules, called ''normal forms'', in order to reduce redundancy and duplication and to improve data integrity.
|[[wikipedia:Database_normalization|Normalisation]]
|-
|NoSQL Database
|NoSQL stands for Not Only SQL, and is a generic name given to databases which store data in a non-tabular format. They are not relational databases – they may use storage techniques such as key-value pairs, document stores, or graph databases. In some cases, SQL can still be used to query them.
|
|-
|OMOP Data Model
|OMOP stands for Observational Medical Outcomes Partnership, which was formed to inform the appropriate use of observational healthcare databases. OHDSI (Observational Health Data Sciences and Informatics) is a collaborative that now includes all of the original OMOP research investigators and will continue to develop tools using the OMOP common data model and vocabulary (OMOP is no longer an active programme).
|[https://www.ohdsi.org/data-standardization/the-common-data-model/ OMOP]
[https://github.com/OHDSI/CommonDataModel/blob/v5.4.0/inst/ddl/5.4/sql_server/OMOPCDM_sql_server_5.4_ddl.sql OMOP Common Data Model (GitHub)]
|-
|On-Premise (On-Prem)
|IT infrastructure built and maintained in-house – for example, database servers sitting in the basement of a hospital building which the IT department maintain and upgrade themselves.
|
|-
|Ontology
|An Ontology is a set of concepts and categories showing the relationships and properties between them in a particular domain. In healthcare, an ontology is used for modelling the semantics of medical concepts and to enable the exchange of medical data between systems. The most commonly used ontology is SNOMED CT. The DDS ontology is made up of several ontologies (such as READ, CTV3, etc.) mapped to SNOMED where appropriate SNOMED terms already exist, and extended where they don't.
A representation, formal naming, and definition of the categories, properties, and relations between the concepts, data, and entities that substantiate one, many, or all domains of discourse (e.g., medicine). Example: Snomed CT.
|[[wikipedia:Ontology_(information_science)|Ontology]]
[https://termbrowser.nhs.uk/ NHSD SNOMED browser]
[https://www.snomed.org/ SNOMED.org]
[https://www.england.nhs.uk/digitaltechnology/digital-primary-care/snomed-ct/ SNOMED - NHS England]
|-
|OpenCodelists
|OpenCodelists was created by OpenSAFELY for creating and sharing codelists. See the "Cluster" definition above.
|[https://www.opencodelists.org/ OpenCodelists]
[https://www.opensafely.org/ OpenSAFELY]
|-
|PESTLE Analysis
|A PESTLE analysis studies the key external factors (Political, Economic, Sociological, Technological, Legal and Environmental) that influence an organisation. It can be used in a range of different scenarios, and can guide people professionals and senior managers in strategic decision-making.
|[https://www.cipd.co.uk/knowledge/strategy/organisational-development/pestle-analysis-factsheet Pestle Analysis]
|-
|Physical Data Model
|The physical data model describes how a database should be structured and is a representation of table structures, columns, column names, column constraints, primary keys, foreign keys, and any other physical features of the database. A database is an implementation of a physical data model.
|[[wikipedia:Physical_schema|Physical Schema]]
|-
|Primary Care Reference Set
|This is a cluster of codes used within business rules authored and maintained by NHSD's primary care domain.
|[https://digital.nhs.uk/data-and-information/data-collections-and-data-sets/data-collections/quality-and-outcomes-framework-qof/quality-and-outcome-framework-qof-business-rules/primary-care-domain-reference-set-portal NHSD Primary Care Domain Reference Set Portal]
|-
|Relational Database
|A Relational database is one where the data is stored in tables, featuring rows and columns, and has predefined relationships between the data items. Typically, a table will have a primary key, which is a unique identifier for items in that table. Other tables will reference that primary key for items that are related to it by means of a foreign key. For example, in a table of patients, the primary key might be the NHS number.
|
|-
|SNOMED CT Concept
|A clinical idea to which a unique concept identifier has been assigned.
|[https://confluence.ihtsdotools.org/display/DOCGLOSS/concept SNOMED CT Concept]
|-
|SQL
|SQL stands for Structured Query Language. It is the standard language used to define, query, and manipulate data held in relational databases.
|[[wikipedia:SQL|Structured Query Language]]
|-
|SQL Database
|A SQL Database is a relational database, which is a collection of tables storing a specific set of structured data, with a fixed schema, which can be queried using SQL.
|
|}

__TOC__
These articles describe the Discovery Data Service: the architecture that underpins the data service, the components that make up the service, the testing and assurance processes, and the resources, open source technologies, and software that are used to host, develop, and support the service.

= Discovery Data Service status =

* The data service is live.
* It currently stores and processes the data from 28 million linked organisational patient records in the UK.
* It contains long term records on 14 million unique UK citizens who have been resident in London at some point in the last 30 years (including people who have left or died).
* Data on individuals are held in a single common data model with data values represented by a super ontology composed of the world's leading health ontologies.
* Data in Discovery is submitted by health providers who provide primary, secondary, and urgent care services to patients or clients.
* Data is kept up to date in close to real time - at worst within 24 hours; see [[Latency of extract data feeds]] for more details.

= Discovery overview =

These articles provide background to the ideas, philosophies and approaches that have resulted in the Discovery Data Service. They cover the original objectives, the approach to collaboration and shared tenancy, and the information governance rules that underpin the way privacy and sharing are balanced.

*[[What_is_Discovery|What is Discovery]]? - This article describes what Discovery is and how it is conceptualised at a very high level.
*[[Case studies]] - An overview of some initial projects that access and utilise data from Discovery; information includes the applicant, project summary, and the latest status (Development/Live).
*[[Latest news]] - The latest information for all Discovery Data Service releases and changes, plus known issues and updates from system suppliers and other third parties.
*[[Current_data_sets|Current data sets]] - the broad types of data received from various systems.
*[[Current published data]] - provides the latest data publisher information by sector, geographical area, service, and system provider.

= Information model topics =

These articles provide information about the services, structure, and functionality that operate using the Information Models, one of the fundamental tenets of the Discovery Data Service. This is the starting point for those interested in health data and the information derived from it, and why Discovery may be different from other approaches:

*[https://wiki.endeavourhealth.org/index.php?title=Discovery_health_information_model Health Information model] - An overview of the approach to the Discovery information model, the purpose, and type of content.
*[https://wiki.endeavourhealth.org/index.php?title=Discovery_health_information_model#Information_model_language Modelling language] - The W3C standard languages used to build the various components of the information model.
*[[Health_data_model_content|Basic health data set]] - a very high level overview of the types of data mapped from data held within the Discovery data stores.
*[[Information model reference data|Reference data]] - an overview of additional data held in the information model.
*[[Mapping and matching concepts|Mapping and matching concepts and structures]] - Introduces the approaches to matching and mapping concepts and the structural mappings in the model.
*[[Information_model_service|Information model services]] - the technical services, such as APIs and apps, that enable a model instance to be used or an EHR to be implemented based on the model or a particular data model.

= Publishers and subscribers =
These articles provide some brief high level information for organisations that send data to Discovery and those that use that data.

* [[Publishers]] - includes technical interface information, FHIR mapping, system requirements, and how to make changes to publishing services.
* [[Subscribers]] - includes set up, data mapping, and schema information.
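
The Relational Database, Database Index, and SQL entries in the glossary can be made concrete with a small sketch. The example below uses Python's built-in <code>sqlite3</code> module with an in-memory database. The table names, column names, and sample values are hypothetical and chosen only for illustration; they are not taken from the Discovery schema. It shows a primary key, a foreign key that references it, a separate index on surname, and a SQL query that joins the two tables.

```python
import sqlite3

# Hypothetical schema for illustration only; not the Discovery data model.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

conn.executescript("""
    CREATE TABLE patient (
        nhs_number TEXT PRIMARY KEY,  -- primary key: unique identifier for each row
        surname    TEXT NOT NULL
    );
    CREATE TABLE appointment (
        appointment_id INTEGER PRIMARY KEY,
        nhs_number     TEXT NOT NULL REFERENCES patient(nhs_number),  -- foreign key
        booked_for     TEXT NOT NULL
    );
    -- Separate (non-clustered) index: ordered surnames pointing at their rows
    CREATE INDEX idx_patient_surname ON patient(surname);
""")

# Made-up sample data
conn.execute("INSERT INTO patient VALUES ('0000000001', 'Example')")
conn.execute(
    "INSERT INTO appointment (nhs_number, booked_for) VALUES ('0000000001', '2022-05-01')"
)

# SQL query joining the two tables through the key relationship
rows = conn.execute("""
    SELECT p.surname, a.booked_for
    FROM appointment AS a
    JOIN patient AS p ON p.nhs_number = a.nhs_number
    WHERE p.surname = 'Example'
""").fetchall()

# An appointment for a patient that does not exist violates the foreign key
try:
    conn.execute(
        "INSERT INTO appointment (nhs_number, booked_for) VALUES ('9999999999', '2022-05-02')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

In this sketch the foreign key is what stops an appointment being recorded against an unknown patient, and the index is what a database engine could use to find rows by surname without scanning the whole table.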
 
= Services and applications =
 
These articles provide information about the services and applications provided by the Discovery information services.
 
* [[Discovery Query|Sextant]] - A data set distribution service with an option to post the data directly into the recipient's database.
*[[Monitoring and alerting]] - provides an overview of the monitoring and alert systems in place.
*[[Data Sharing Manager|Data Sharing Manager]] - DSM provides a visual representation of the data that is being shared and processed, and by which organisations, regions, and/or services.
*[[UPRN address match application|ASSIGN - UPRN address matching application]] - a web-based application that matches single or batched hand-entered addresses to authoritative addresses and assigns a unique property reference number.
*[[Information model manager]] - an application designed to provide a view of the common information model, and support the authoring of the content of the model.
*[[Information_model_service|Information model services]] - the technical services that enable the common information model to be used.
*[[FHIR Get Structured Record API]] - returns a complete patient record in a structured format following GP and Care Connect standards.
*[[HL7v2 API]] - Send HL7v2 ADT and ORU (clinical events) data to Discovery.
*Get health record - the service that obtains a fully structured health record about a person, in real time via a standards based API.
*Record publication service - the service that publishes data from health records for specific purposes, often immediately on receipt.
*Data set distribution service - the service that distributes sets of data for populations of patients where the definition is complex or where the subscriber is hosting a copy of the data permanently.
*[[NHS 111 Discovery frailty flagging]] / [[Frailty algorithm]]
*[[GitHub repositories]] - descriptions and information relating to the application source code.
 
= Architectures =
These articles provide information about the overall high level software architecture models, providing descriptions and visualisations of the Discovery Data Service and information service components and how they interact.
 
* The [[Data Storage Architecture]] illustrates how data is stored within the service and the various implementation options available for organisations that wish to use some or parts of the services.
*[[Identity Authentication Authorisation|Identity, authentication and authorisation]] specifies the approach taken by Discovery for accessing utilities and controlling some aspects of access to application functionality and some data resources, after all of the other locks have been opened.
*[[Application zone architecture]] illustrates how the data is accessed in different places and in different ways according to different use cases
*[[Pseudonymisation]] describes how Discovery generates pseudonymised data.
*[[Data linking]] describes how Discovery links several patient records into a single person/citizen record.
 
= Application help =
 
*[[Data_Sharing_Manager|Data Sharing Manager]] - Data Sharing Manager provides a visual representation of the data that is being shared and processed, and by which organisations, regions, and/or services.
*:You can also amend or cancel data sharing and processing agreements in real time.
*[[Record Viewer]] - Record Viewer shows a combined view of an individual's patient records, using data from different source organisations.
*[[Jira]] - Jira is an effective way to share and track information and helps to make sure that tasks and issues are completed, and signed off, by all relevant teams.
 
= Wiki tips =
 
*[[Adding a new page to the wiki]]
*[[Adding tables from word documents]]
*[https://www.mediawiki.org/wiki/Help:Tables Mediawiki Help:Tables]
*[[Creating a DDS wiki watchlist]]
*[[Creating a new user]]
*[https://en.wikipedia.org/wiki/Help:List Wikipedia Help:Lists]
*[[Test page]]

Revision as of 11:35, 1 May 2022

This is an area of the Wiki for members of the Discovery Collaborative to share definitions of common terms. It is public and can be referenced from other documents and websites as necessary

Term Definition External Links
Cloud The Cloud refers to IT infrastructure, platforms, and services that are hosted in a remote data centre, managed by a third party. Typically these data centres are run and managed by large companies such as Microsoft (Azure), Amazon (AWS), and Google (GWP), but there are many other providers too. Using the cloud removes the need for customers to buy and maintain their own physical IT equipment. Instead, the cloud provider typically charges customers daily rates based on storage used, computational power used, memory, number of servers etc Cloud Computing
Cloud Native Cloud Native refers to building software and systems in the cloud that make use of services made available by cloud providers, to run scalable, resilient applications. For example, instead of creating a SQL database on a server in the cloud, which the customer would need to monitor, maintain, patch, back-up etc, they could make use of a “SQL database as a service” from the cloud provider, who would automatically take care of all the maintenance without any further intervention from the customer Cloud Native Computing
Cluster A group of concept codes making another idea. The Cluster may be made up of concepts from different ontologies or classifications e.g., Snomed CT and ICD10. Synonyms for Cluster are refset, value set, codelist, code set, grouper. The name "cluster" was strongly advocated for by John Robson (Clinaical Effectiveness Group (CEG) NEL) because of his work in primary care. Primary Care Domain Reference Set Portal
Common Data Model This is the shared Data Model which has been adopted as a standard for an organisation or group. It defines how the elements of data relate to one another and enables data to be transferred between different systems which share the Common Data Model with ease.

It defines the transformation of data from multiple different databases with potentially multiple coding systems into a common storage format (data model) as well as a common representation (terminologies, vocabularies, coding schemes)

Microsoft Example
Concept A coded idea. For example, Name = Baker's Asthma, SNOMED CT code = 34015007
Conceptual Data Model This is a high level model, describing business processes and how they relate to one another. In simple terms, it is a map of concepts and rules relating to the business which can then be used to define relationships between data entities in order to describe the business concepts – this leads to the creation of a logical data model
Data Dictionary A centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format e.g. NHS Data Dictionary. Data Dictionary
Data Model Data models exist at different levels of abstraction, from conceptual data models which are high-level and consider business activity and relationships, to a physical data model which represents the implementation of a database. A data model shows how elements of data relate to one another and how they are structured, how they relate to business needs, and how they can be manipulated. Data Model

NHS Data Model and Dictionary Service

Data Platform A Data Platform is any digital solution which enables an organisation to store, transform, aggregate, and analyse data. It integrates many different technologies for different purposes to meet the data needs of the organisation. It includes security and access controls to ensure IG compliance
Database Index Indexes are used to quickly sort and retrieve data in a database. They may take the form of a physical, clustered index, which is the physical ordering of the data in the table (e.g. by ID or surname), or a non-clustered index which describes the logical ordering of the data separately – like the index at the back of a book: the index key is ordered (e.g. surname) and next to it is information showing where to find the record.
Database Schema This is the definition of the structure of a relational database and defines how data is organised, including the table names, fields (also known as columns), data types, constraints, indexes, and relationships between these entities. The schema is a blueprint of how the database is constructed.
Data Warehouse A data warehouse stores data for the main purpose of reporting and data analysis. They are central repositories which bring together disparate data sets into one place, to enable reporting at scale and pace. Data Warehouse
DDS Information Model Manager The Information Model Manager is an application developed by the Endeavour Charitable Trust which provides a view of the Common Information Model and supports authoring of the content of the model. It can be used to view the model structure and content, download artefacts (e.g. value sets), and manage the ontology. It can be used to view entities and their relationships (e.g. an A&E admission, which is an Encounter and has a subject of a Patient and has a sub-encounter of a Triage Encounter etc) Discovery Health Information Model
Endeavour Information Model
Expression (Snomed CT) A combination of concepts that represent something that is not already represented as a Concept or Cluster. It typically links things across concepts (disease of infection (finding), with bacteria name (organism), with sepsis (observable entity) with kidney (body structure). Snomed CT Expression
FHIR FHIR stands for Fast Healthcare Interoperability Resources. It is an international standard for health care data exchange and is published by HL7. It is designed to enable the exchange of healthcare related information, including clinical, administrative, publich-health, and research data. DDS stores data in the core in FHIR format before it is restructured and sent to subscriber databases
Functional Requirements These are product features or functionality that must be developed in order for the system to satisfy requirements. They must be implemented in the system in order for it to achieve what it is supposed to accomplish. Examples include "users should be able to download data based on their filtered search criteria" or "only admin-level users should be able to reset passwords for other users" Functional requirement
GitHub GitHub is a provider of version control and source-code management using Git (which is software used for tracking changes in files). It enables software developers to safely store source code and work on it collaboratively, merging changes from different developers into the main "production" branch safely. It provides additional features such as bug/issue tracking, continuous integration, feature requests etc. GitHub was acquired by Microsoft in 2018. GitHub
Graph Database Graph Databases are used to represent data as elements and their relationships (known as nodes and edges). They are particularly beneficial when representing large datasets with complex and numerous relationships, such as social media contacts and relations. Unlike a relational database, where to return data it is often necessary to join many tables, in a graph database this isn’t necessary. Relationships are stored natively alongside the data elements enabling much faster querying
HDRUK Phenotype Library This is a comprehensive, open access resource which exists to provide the research community with information, tools, and phenotyping algorithms for UK electronic health records (EHRs) HDR UK Phenotype Library
Information Model A model describing ontologically standardised data items following a specific database schema within a specific relational database architecture.
Knowledge Graph An information architecture where concepts are represented as nodes and edges in a network of relationships. This is usually represented in a NoSQL architecture designed around relationships between concepts.
Logical Data Model The logical data model (or logical schema) describes how data objects relate to each other, but is independent of the technology or database management system used. Logical data models typically show entities (e.g. a patient, an appointment, a result), relationships (e.g. an appointment is linked to a patient via a unique identifier), and attributes (information that is useful to further describe the entities) Logical Schema
Mauro Data Mapper Mauro Data Mapper is a third-party toolkit for the design and documentation of databases, data flows, and data standards, as well as related software artefacts such as data schemas and data forms. It was originally developed for the description of data in clinical research. Mauro Data Mapper
Medical Classification Medical Classifications transform descriptions of procedures or diagnoses into standardised codes through the process of clinical coding. E.g., ICD10, OPCS4, LOINC. These can also be called code sets. Hence, we call them a medical classification to avoid confusion with Clusters Medical Classification
Metadata Library A tool that allows management of clusters
NHS Terminology Server The NHS Terminology Server “is a FHIR compliant solution that holds and disseminates assured international terminologies and classifications (such as SNOMED-CT and ICD-10) and national terminologies (such as NHS Data Model and Dictionary codes).”

See https://digital.nhs.uk/services/terminology-servers. Content is delivered in machine readable format and can be accessed as a real-time resource through APIs to support other applications.

NHSD Terminology Server
Non-functional Requirements These are requirements concerning the operation of a system rather than functionality, and typically relate to security, scalability, performance, reliability etc. Examples include "the page must load within 500ms" or "the website must conform to WCAG AA accessibility standards" Non-functional Requirements
Normalisation (health data) In healthcare, this is the process of taking data from different sources, in different formats and with different code sets, and converting the data into a singular, unified clinical language or terminology – for example mapping proprietary codes from EPR systems to SNOMED codes Healthcare Normalisation

What is data normalisation?

Normalisation (database) This is the process of structuring a relational database in accordance with a series of derived rules, called normal forms, in order to reduce redundancy, duplication, and improve data integrity. Normalisation
NoSQL Database NoSQL stands for Not Only SQL, and is a generic name given to databases which store data in a non-tabular format. They are not relational databases – they may use storage techniques such as key-value pairs, document stores, graph databases. In some cases, SQL can still be used to query them
OMOP Data Model OMOP stands for Observational Medical Outcomes Partnership, which was formed to inform the appropriate use of observational healthcare databases. OHDSI (Observational Health Data Sciences and Informatics) is a collaborative that now includes all of the original OMOP research investigators and will continue to develop tools using the OMOP common data model and vocabulary (OMOP is no longer an active programme). OMOP

OMOP Common Data Model (GitHub)

On-Premise (On-Prem) IT infrastructure built and maintained in-house – for example, database servers sitting in the basement of a hospital building which the IT department maintain and upgrade themselves
Ontology An Ontology is a set of concepts and categories showing the relationships and properties between them in a particular domain. In healthcare, an ontology is used for modelling the semantics of medical concepts and to enable the exchange of medical data between systems. The most commonly used ontology is SNOMED CT - the DDS ontology is made up of several ontologies (such as READ, CTV3 etc) mapped to SNOMED where appropriate SNOMED terms already exist, and extended where they don't.

A representation, formal naming, and definition of the categories, properties, and relations between the concepts, data, and entities that substantiate one, many, or all domains of discourse (e.g., medicine). Example: Snomed CT.

Ontology

NHSD SNOMED browser

SNOMED.org

SNOMED - NHS England

OpenCodelists OpenCodelists was created by OpenSAFELY for creating and sharing codelists. See "Cluster" definition above. OpenCodelists

OpenSAFELY

PESTLE Analysis A PESTLE analysis studies the key external factors (Political, Economic, Sociological, Technological, Legal and Environmental) that influence an organisation. It can be used in a range of different scenarios, and can guide people professionals and senior managers in strategic decision-making Pestle Analysis
Physical Data Model The physical data model describes how a database should be structured and is a representation of table structures, columns, column names, column constraints, primary keys, foreign keys, and any other physical features of the database. A database is an implementation of a physical data model. Physical Schema
Primary Care Reference Set This is a cluster of codes used within business rules authored and maintained by NHSD's primary care domain. NHSD Primary Care Domain Reference Set Portal
Relational Database A Relational database is one where the data is stored in tables, featuring rows and columns and has predefined relationships between the data items. Typically, a table will have a primary key, which is a unique identifier for items in that table. Other tables will reference that primary key for items that are related to it by means of a foreign key. For example, in a table of patients, the primary key might be the NHS number
SNOMED CT Concept A clinical idea to which a unique concept identifier has been assigned. SNOMED CT Concept
SQL SQL stands for Structured Query Language. It is the standard language used to define, query, and update data held in relational databases, for example selecting the rows in a table of patients that match a given condition, or joining a table of appointments to a table of patients via a shared key Structured Query Language
SQL Database A SQL Database is a relational database, which is a collection of tables storing a specific set of structured data, with a fixed schema, which can be queried using SQL
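
The Relational Database, SQL and SQL Database entries above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module purely for convenience; the table names, column names and NHS numbers are made up for illustration and are not taken from DDS. It shows a primary key, a foreign key that references it, and a SQL query that joins the two tables.

```python
import sqlite3

# In-memory database; every table name, column name and value is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled

conn.execute("""
    CREATE TABLE patient (
        nhs_number  TEXT PRIMARY KEY,   -- primary key: unique identifier per patient
        family_name TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE appointment (
        appointment_id   INTEGER PRIMARY KEY,
        nhs_number       TEXT NOT NULL REFERENCES patient (nhs_number),  -- foreign key
        appointment_date TEXT NOT NULL
    )
""")

# Placeholder data, not real NHS numbers.
conn.execute("INSERT INTO patient VALUES (?, ?)", ("0000000001", "Example"))
conn.executemany(
    "INSERT INTO appointment (nhs_number, appointment_date) VALUES (?, ?)",
    [("0000000001", "2024-01-15"), ("0000000001", "2024-02-20")],
)

# A SQL query that joins the two tables through the key relationship.
rows = conn.execute("""
    SELECT p.family_name, a.appointment_date
    FROM patient AS p
    JOIN appointment AS a ON a.nhs_number = p.nhs_number
    ORDER BY a.appointment_date
""").fetchall()

# The foreign key rejects an appointment for a patient that does not exist.
try:
    conn.execute(
        "INSERT INTO appointment (nhs_number, appointment_date) VALUES (?, ?)",
        ("9999999999", "2024-03-01"),
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(rows)
print("Orphan appointment rejected:", orphan_rejected)
```

The join in the query is the step that a graph database avoids by storing relationships alongside the data elements, as described in the Graph Database entry.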