REST API over UMLS Terminologies

by Nadeem Nazeer

The Unified Medical Language System (UMLS) is a repository of biomedical vocabularies developed by the US National Library of Medicine (NLM).

UMLS integrates over 2 million names for some 900 000 concepts from more than 60 families of biomedical vocabularies, as well as 12 million relations among these concepts.

Vocabularies integrated in the UMLS Metathesaurus include:

  • NCBI taxonomy
  • Gene Ontology
  • Medical Subject Headings (MeSH)
  • OMIM
  • Digital Anatomist Symbolic Knowledge Base.

In our series of tutorials on UMLS, we have seen how to install the official UMLS tool, MetamorphoSys, and use it to browse its large, rich data sets and explore items, their details and the relations between them. We have also seen how to load the same data into our local databases for use in our own custom apps and utilities.

Before starting, we should know that there is an official API available.

The UMLS Terminology Services (UTS) is the set of machines, programs and Application Programmer Interfaces (APIs), written in Java and maintained by staff at the NLM, that provide access to these services.

The UTS provides two interface mechanisms:

  • The first is through the UTS website, which provides browsers to search the Metathesaurus, SNOMED CT and the Semantic Network. There are also links to other UMLS resources such as RxNorm, MetaMap, Newborn Screening Coding, etc.
  • The second is through an Application Programmer Interface (API) that connects user programs to the UTS. This is meant for application developers who wish to make calls and retrieve UMLS data within their own applications.

Now the question remains: what is the most logical way to use that data? Whenever a web API is called for, a RESTful architecture is usually preferred, since it can easily be consumed by a variety of apps and utilities. Moreover, it is quick and easy to learn.

To start providing UMLS data over an API, we need to identify the most prominent resources to expose for consumption, and the filters and constraints we should offer to make it a useful UMLS Metathesaurus API.

Starting with the resources, we have:

  • The Concept Resource
  • The Terminology Code Resource
  • The Terminology Relationship Resource
  • The Terminology Mapping Resource

Many other resources can be identified, depending on our needs and the dataset at hand.
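To make the idea of a resource more concrete, here is a minimal sketch of how a Concept and a Terminology Code might be represented and serialized for an API response. The class names, field names and the `to_json` helper are assumptions made for illustration, not the schema of any existing API, and the sample values are illustrative only.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class TerminologyCode:
    """One code in a source vocabulary (hypothetical shape)."""
    sab: str   # source vocabulary abbreviation, e.g. "MSH"
    code: str  # the code inside that vocabulary
    term: str  # a display string for the code


@dataclass
class Concept:
    """A UMLS concept identified by its CUI (hypothetical shape)."""
    cui: str
    preferred_name: str
    codes: List[TerminologyCode] = field(default_factory=list)


def to_json(resource) -> str:
    """Serialize a resource the way a response body might carry it."""
    return json.dumps(asdict(resource), sort_keys=True)


# Illustrative values only.
concept = Concept(
    cui="C0001175",
    preferred_name="Acquired Immunodeficiency Syndrome",
    codes=[
        TerminologyCode(
            sab="MSH",
            code="D000163",
            term="Acquired Immunodeficiency Syndrome",
        )
    ],
)
print(to_json(concept))
```

Nesting the codes inside the concept is only one possible design; a larger API would more likely expose them as a separate resource linked by CUI.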

The data files listed below contain information obtained from source vocabularies. Concept Unique Identifiers (CUI) link concept data across files. The table below illustrates what information populates each data file.

Metadata File Name   Contents
MRCONSO.RRF          Names, Synonyms, Terms, Term Types, Codes
MRREL.RRF            Relationships
MRHIER.RRF           Hierarchies
MRSAT.RRF            Attributes
MRDEF.RRF            Definitions
MRMAP.RRF            Mappings
MRSMAP.RRF           Simplified Mappings
MRSTY.RRF            Semantic Types
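As a rough illustration of how these files can feed an API, the sketch below reads rows in the pipe-delimited MRCONSO.RRF layout and filters them by term and source vocabulary (SAB). The function names `read_mrconso` and `search_concepts` are hypothetical, and the sample rows only mimic the layout; their identifiers should not be treated as release data. The column order is assumed to match the documented Rich Release Format, so check it against the release you load.

```python
from typing import Dict, Iterable, Iterator, List, Optional

# Column order assumed for MRCONSO.RRF (Rich Release Format).
MRCONSO_FIELDS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]


def read_mrconso(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per non-empty row, keyed by column name."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Rows end with a trailing "|", so split() returns one extra
        # empty value; zip() drops it by stopping at the last field.
        yield dict(zip(MRCONSO_FIELDS, line.split("|")))


def search_concepts(rows: Iterable[Dict[str, str]], term: str,
                    sab: Optional[str] = None) -> List[str]:
    """Return sorted CUIs whose string contains `term` (case-insensitive),
    optionally restricted to one source vocabulary."""
    needle = term.lower()
    return sorted({
        row["CUI"] for row in rows
        if needle in row["STR"].lower() and (sab is None or row["SAB"] == sab)
    })


# Illustrative rows in the MRCONSO layout, not copied from a release.
SAMPLE = """\
C0001175|ENG|P|L0001175|VO|S0010340|Y|A0019182||M0000245|D000163|MSH|PM|D000163|Acquired Immunodeficiency Syndromes|0|N||
C0001175|ENG|S|L0000001|PF|S0000001|N|A0000001||||SNOMEDCT_US|PT|62479008|AIDS|9|N||
C0000002|ENG|P|L0000002|PF|S0000002|Y|A0000002||||MSH|MH|D000000|Example Heading|0|N||
"""

rows = list(read_mrconso(SAMPLE.splitlines()))
print(search_concepts(rows, "immunodeficiency", sab="MSH"))
print(search_concepts(rows, "AIDS"))
```

A real service would query a database loaded from the RRF files rather than scan rows in memory, but the filtering logic stays the same.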

 

Below are some sample calls we identified for some of our apps:

  • Search concepts by term and source terminology (SAB)
  • Search concepts by terms in bulk and source terminology (SAB)
  • Get full details for a concept (specified by CUI)
  • Get all children for a given concept. By default it returns all children based on UMLS (REL=CHD); the relationships can optionally be restricted to a given vocabulary
  • Get all parents for a given concept. By default it returns all parents based on UMLS (REL=PAR); the relationships can optionally be restricted to a given vocabulary
  • Get all parents for a given list of concepts. By default it returns all parents based on UMLS (REL=PAR,RB); the relationships can optionally be restricted to a given vocabulary
  • Get all synonym strings for a given Concept (:cui). Optionally, restrict the synonyms to a list of vocabularies.
  • Get all relationships for a given concept.
  • Get all ancestor_trees for a given concept. It will get all ancestor_trees based on UMLS (MRREL). These can be restricted to a given vocabulary
  • Get all descendant_trees for a given concept. It will get all descendant_trees based on UMLS (MRREL). These can be restricted to a given vocabulary
  • Search all codes
  • Get details about a code.
  • Get semantically equivalent mappings in sab2 for code1 in sab1

And the list may go on and on.
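A few of the calls above (children, parents and ancestor trees) can be sketched over relationship rows held in memory. This is a minimal sketch under stated assumptions: the `Relation` tuple keeps only four MRREL columns, the function names are hypothetical, and REL is read as describing how the second concept relates to the first, so a row (CUI1=A, REL=CHD, CUI2=B) means B is a child of A. Verify that reading against your own release. The CUIs are made up.

```python
from typing import Iterable, List, NamedTuple, Optional, Set, Tuple


class Relation(NamedTuple):
    """A trimmed-down MRREL row (hypothetical shape)."""
    cui1: str
    rel: str
    cui2: str
    sab: str


def related(relations: Iterable[Relation], cui: str, rels: Set[str],
            sab: Optional[str] = None) -> List[str]:
    """Return sorted CUI2 values linked to `cui` by any REL in `rels`."""
    return sorted({
        r.cui2 for r in relations
        if r.cui1 == cui and r.rel in rels and (sab is None or r.sab == sab)
    })


def get_children(relations, cui, sab=None):
    return related(relations, cui, {"CHD"}, sab)


def get_parents(relations, cui, sab=None):
    return related(relations, cui, {"PAR"}, sab)


def ancestor_paths(relations, cui, sab=None,
                   _seen: Tuple[str, ...] = ()) -> List[List[str]]:
    """Return every path from `cui` up to a concept with no parents."""
    seen = _seen + (cui,)
    # Skip parents already on the current path to avoid looping on cycles.
    parents = [p for p in get_parents(relations, cui, sab) if p not in seen]
    if not parents:
        return [[cui]]
    paths = []
    for parent in parents:
        for path in ancestor_paths(relations, parent, sab, seen):
            paths.append([cui] + path)
    return paths


# Made-up identifiers for demonstration.
RELATIONS = [
    Relation("C0000001", "CHD", "C0000002", "MSH"),
    Relation("C0000002", "PAR", "C0000001", "MSH"),
    Relation("C0000002", "CHD", "C0000003", "MSH"),
    Relation("C0000003", "PAR", "C0000002", "MSH"),
    Relation("C0000003", "PAR", "C0000004", "SNOMEDCT_US"),
    Relation("C0000004", "CHD", "C0000003", "SNOMEDCT_US"),
]

print(get_children(RELATIONS, "C0000001"))
print(get_parents(RELATIONS, "C0000003", sab="MSH"))
print(ancestor_paths(RELATIONS, "C0000003"))
```

Passing `sab` restricts the walk to one vocabulary, which mirrors the "restricted to a given vocabulary" option in the list; descendant trees would follow the same pattern with `get_children`.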

We at Applied have one such API built in Django. Take a look here: https://github.com/chintanop/health-vocabulary-rest-api.

In the next part we will introduce the app and how to get started with using it.

Leave a query and I will get back to you. 

 

 
