Categories
Healthcare Analytics Information Systems OpenSource

OHDSI OMOP to FHIR mapper

TL;DR Below is an open-source command-line tool for converting an OHDSI OMOP cohort (defined in ATLAS) to a FHIR bundle and vice versa.

Originally published by Bell Eapen at nuchange.ca on July 22, 2020. If you have some feedback, reach out to the author on Twitter, LinkedIn or GitHub.

OHDSI OMOP CDM is one of the most popular clinical data models for health data warehouses. Its simple but clinically motivated data structure is intuitively appealing to clinicians, which has led to good adoption. In this respect, it has overtaken HL7-V3, which is more robust but has a steeper learning curve, especially for clinicians. The OHDSI OMOP CDM is widely used in the pharmaceutical industry for drug monitoring.

FHIR is emerging as the de facto standard for health system interoperability, owing largely to its simplicity and its use of existing and popular standards such as REST. As NoSQL databases become more and more popular in healthcare, FHIR can also be a good persistence schema. It aligns well with search technologies such as Elasticsearch.

As both standards are popular, conversion from one to the other may be commonly required. Researchers at Georgia Tech have an open-source tool, GT-FHIR2, for exposing an existing OHDSI OMOP CDM database as a FHIR endpoint. However, conversion between existing systems may not be easy with a full-stack solution.

I have a simpler solution that I believe will be useful in the following scenarios:

  • To export a cohort to a FHIR based analytics tool.
  • To load new resources to OMOP CDM databases for incremental ETL.

Omopfhirmap is a command-line tool for mapping an OHDSI cohort, defined in ATLAS, to a FHIR bundle that can optionally be submitted to a FHIR server for processing. Conversely, it can process a FHIR bundle and add its resources to an existing CDM database, ignoring duplicates. Unlike GT-FHIR2 (the OMOP on FHIR project at Georgia Tech), omopfhirmap does not expose the OMOP database as FHIR endpoints.

I have used Spring Boot and JPA for easy wiring of services and abstraction of the database, and HAPI FHIR because it is an obvious choice for any Java-based FHIR application. It is still a work in progress and any help will be appreciated (refer to CONTRIBUTING.md).
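To make the idea of the mapping concrete, here is a minimal Python sketch of turning rows of the OMOP CDM person table into a FHIR transaction bundle. It is not omopfhirmap's code (the tool itself is written in Java), and it covers only the Patient resource. The function names and the identifier system are assumptions made for illustration; the gender concept IDs 8507 and 8532 are the standard OMOP concepts for male and female.

```python
import uuid

# Standard OMOP gender concepts mapped to FHIR administrative-gender codes.
GENDER_CONCEPT_TO_FHIR = {8507: "male", 8532: "female"}


def birth_date(person: dict) -> str:
    """Build a FHIR date (YYYY, YYYY-MM or YYYY-MM-DD) from the OMOP birth fields."""
    parts = [f"{person['year_of_birth']:04d}"]
    if person.get("month_of_birth"):
        parts.append(f"{person['month_of_birth']:02d}")
        if person.get("day_of_birth"):
            parts.append(f"{person['day_of_birth']:02d}")
    return "-".join(parts)


def person_to_patient(person: dict) -> dict:
    """Map one row of the OMOP `person` table to a FHIR Patient resource."""
    return {
        "resourceType": "Patient",
        "identifier": [{
            "system": "urn:example:omop-person-id",  # hypothetical identifier system
            "value": str(person["person_id"]),
        }],
        "gender": GENDER_CONCEPT_TO_FHIR.get(person["gender_concept_id"], "unknown"),
        "birthDate": birth_date(person),
    }


def cohort_to_bundle(persons: list) -> dict:
    """Wrap the mapped patients in a transaction bundle that a FHIR server can process."""
    entries = [
        {
            "fullUrl": f"urn:uuid:{uuid.uuid4()}",
            "resource": person_to_patient(person),
            "request": {"method": "POST", "url": "Patient"},
        }
        for person in persons
    ]
    return {"resourceType": "Bundle", "type": "transaction", "entry": entries}
```

A real cohort export would also need to map conditions, observations and other clinical tables; the sketch only shows the shape of the conversion.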

Categories
Health Research Methodology Healthcare Analytics OpenSource

DADpy: The swiss army knife for discharge abstract database

The Discharge Abstract Database (DAD) is a Canada-wide database of hospital admission and discharge data, excluding the province of Quebec, maintained by the Canadian Institute for Health Information (CIHI). The data points in DAD include patient demographics, comorbidities coded in the International Statistical Classification of Diseases and Related Health Problems (ICD), interventions encoded in the Canadian Classification of Health Interventions (CCI) and the length of stay. A de-identified 10% sample of DAD is available to academic researchers under the Data Liberation Initiative (DLI). DAD is arguably the most comprehensive country-wide discharge dataset in the world.


The Discharge Abstract Database is used for creating public reports for hospitals, researchers, and the general public. DAD data has also been used for disease-specific research and analysis, including public health, disease surveillance, and health services research. CIHI provides DAD in the SPSS (.sav) format, with each record having horizontal fields for 20 comorbidities and 25 interventions. This wide format is not ideal for slicing and dicing the data or for building visualizations that help clinicians obtain clinical insights.
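The difficulty with the wide layout is easier to see in code. The sketch below reshapes records with numbered code columns into long (record, position, code) rows that are simple to count and plot. It uses only the standard library, and the column names (record_id, DIAG_1 to DIAG_20) are hypothetical placeholders, not the actual DAD field names.

```python
from collections import Counter


def wide_to_long(records, prefix, count, id_field="record_id"):
    """Turn numbered columns (prefix_1 .. prefix_count) into (id, position, code) rows."""
    rows = []
    for record in records:
        for position in range(1, count + 1):
            # Missing columns, None and blank strings all mean "no code in this slot".
            code = (record.get(f"{prefix}_{position}") or "").strip()
            if code:
                rows.append((record[id_field], position, code))
    return rows


# Two made-up records using the hypothetical column names.
records = [
    {"record_id": 1, "DIAG_1": "E660", "DIAG_2": "I10", "DIAG_3": ""},
    {"record_id": 2, "DIAG_1": "E660", "DIAG_2": None},
]

long_rows = wide_to_long(records, prefix="DIAG", count=20)
# Frequency of each code across all records and positions.
code_counts = Counter(code for _, _, code in long_rows)
```

Once the data is in the long form, questions such as "which codes co-occur most often" become a matter of grouping and counting.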

DADpy provides a set of functions for using the DAD dataset for machine learning and visualization. The package does not include the dataset. Academic researchers can request the DAD dataset from CIHI. This is an unofficial repo, and I’m not affiliated with CIHI. Please retain the disclaimer below in forks.

Installation: (Will add to pypi soon)

pip install https://github.com/E-Health/dadpy/releases/download/1.0.0/dadpy-1.0.0-py3-none-any.whl

from dadpy import DadLoad
from dadpy import DadRead

# path to the folder containing the sample file, with the trailing slash
dl = DadLoad('/path/to/dad/sample/spss/sav/file/') # clin_sample_spss.sav
dr = DadRead(dl.sample)

# records with obesity as pandas df
print(dr.has_diagnosis('E66'))
# Partial gastrectomy for repair of gastric diverticulum
print(dr.has_treatment('1NF80')) 

# comorbidities as dict for visualization
print(dr.comorbidity('E66')) # Obesity
# co-occurrence of treatments as dict
print(dr.interventions('1NF80')) # Partial gastrectomy for repair of gastric diverticulum

# Get the one-hot-encoded vector for machine learning
dr.vector(dr.has_diagnosis('E66'), significant_chars=3, include_treatments=True)
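
For readers new to one-hot encoding, the sketch below shows one plausible reading of what truncating codes to their first three characters and encoding them involves. It is an illustration only and not DADpy's implementation; the function name and the input layout (one list of codes per record) are assumptions.

```python
def one_hot(records, significant_chars=3):
    """Encode each record's codes as a 0/1 vector over the truncated code vocabulary."""
    # Keep only the leading characters, so E660 and E669 both become E66.
    truncated = [{code[:significant_chars] for code in codes} for codes in records]
    vocabulary = sorted(set().union(*truncated)) if truncated else []
    index = {code: position for position, code in enumerate(vocabulary)}
    vectors = []
    for codes in truncated:
        row = [0] * len(vocabulary)
        for code in codes:
            row[index[code]] = 1
        vectors.append(row)
    return vocabulary, vectors


# Each inner list holds the ICD codes of one record (made-up example records).
vocabulary, vectors = one_hot([["E660", "I10"], ["E669", "E119"]])
```

Truncating the codes trades clinical detail for a smaller, denser feature space, which is often helpful when the sample is small.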

We use Poetry for development. PRs are welcome. Please see CONTRIBUTING.md in the repo. Start by renaming .env.example to .env and adding the path needed for the tests to run. Add Jupyter notebooks to the notebook folder. Include the disclaimer below.

Disclaimer: Parts of this material are based on the Canadian Institute for Health Information Discharge Abstract Database Research Analytic Files (sampled from fiscal years 2016-17). However the analysis, conclusions, opinions and statements expressed herein are those of the author(s) and not those of the Canadian Institute for Health Information.

Let us know if you use DADpy for creating interesting Jupyter notebooks.

Categories
Healthcare Analytics OpenSource Resources

OSCAR EMR EForm Export (CSV) to FHIR

This is a simple application, written in Go, that converts a CSV file to a FHIR bundle and posts it to a FHIR server. The OSCAR EMR has an EForm export tool that exports EForms to a CSV file that can be downloaded. This tool can load that CSV file to a FHIR server for consolidated analysis. It can be used with any CSV file, provided the columns specified below (CSV format section) are present.

Use Cases

This is useful for family practice groups with multiple OSCAR EMR instances. Analysts at each site can use this to send data to a central FHIR server for centralized data analysis and reporting. Public health agencies using OSCAR or similar health information systems can use this to consolidate data collection.

How to build

First, go get all dependencies. This package includes three tools (go build each of them separately from the cmd folder):

Fhirpost: The application for posting the CSV file to the FHIR server

Serverfhir: A simple FHIR server for testing (requires MongoDB). We recommend using PHIS-DW for production.

Report: A simple application for descriptive statistics on the CSV file

Format of the CSV file


Using a standard vocabulary such as SNOMED for field names in the EForm is very useful for consolidated analysis.

Each record should have:

demographicNo → The patient ID
dateCreated
efmfid → The ID of the EForm
fdid → The ID of each form field
(The EForm export CSV of OSCAR typically has all these fields and requires no further processing)

Mapping

  • Bundle with unique patients. All columns mapped to observations.
  • Submitter mapped to Practitioner.
  • Document type bundle with composition as the first entry
  • Unique fullUrls are generated.
  • PatientID is location + demographicNo
  • Bundle of 1 composition, 1 practitioner, 1 or more patients, and many observations
  • Validates with R4 schema
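
The mapping rules above can be sketched in a few lines of Python. The tool itself is written in Go, so this is not its code, only an illustration under assumptions: the function names, the Composition title and type text, and the treatment of every non-key column as a string-valued Observation are mine. Date handling is left out because the format of dateCreated is not specified here, and the sketch makes no claim to pass R4 validation.

```python
import csv
import io
import uuid
from datetime import datetime, timezone

# Columns that identify a record rather than carry a form value.
KEY_COLUMNS = {"demographicNo", "dateCreated", "efmfid", "fdid"}


def entry(resource: dict) -> dict:
    """Wrap a resource in a bundle entry with a unique fullUrl."""
    return {"fullUrl": f"urn:uuid:{uuid.uuid4()}", "resource": resource}


def csv_to_document_bundle(csv_text: str, location: str, submitter: str) -> dict:
    practitioner = entry({"resourceType": "Practitioner", "name": [{"text": submitter}]})
    patients = {}       # patient ID -> bundle entry, so each patient appears once
    observations = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        patient_id = f"{location}{row['demographicNo']}"  # location + demographicNo
        if patient_id not in patients:
            patients[patient_id] = entry(
                {"resourceType": "Patient", "identifier": [{"value": patient_id}]})
        for column, value in row.items():
            if column is None or column in KEY_COLUMNS or not value:
                continue
            observations.append(entry({
                "resourceType": "Observation",
                "status": "final",
                "code": {"text": column},
                "subject": {"reference": patients[patient_id]["fullUrl"]},
                "valueString": value,
            }))
    now = datetime.now(timezone.utc).isoformat()
    composition = entry({
        "resourceType": "Composition",
        "status": "final",
        "type": {"text": "EForm export"},   # hypothetical type text
        "date": now,
        "author": [{"reference": practitioner["fullUrl"]}],
        "title": "EForm export",            # hypothetical title
    })
    return {
        "resourceType": "Bundle",
        "type": "document",
        "identifier": {"system": "urn:ietf:rfc:3986", "value": f"urn:uuid:{uuid.uuid4()}"},
        "timestamp": now,
        # A document bundle must start with the Composition.
        "entry": [composition, practitioner, *patients.values(), *observations],
    }


SAMPLE_CSV = (
    "demographicNo,dateCreated,efmfid,fdid,weight\n"
    "1,2020-01-01,5,10,70\n"
    "1,2020-02-01,5,11,72\n"
    "2,2020-01-15,5,12,65\n"
)
bundle = csv_to_document_bundle(SAMPLE_CSV, location="clinicA-", submitter="analyst")
```

Prefixing the patient ID with the location keeps patients from different OSCAR instances apart when their data lands on the same central server.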

How to use:

  • Change the settings in .env
  • You can compile this for Windows, Mac or Linux. Check the fhirmap.go file and make any desired changes. You should be able to figure out the mapping rules from this file.
  • It reads the data.csv file from the same folder by default. (A different file can be specified with the -file command-line argument: fhirpost -file=data.csv)
  • Start MongoDB and run the server and fhirpost in separate windows for testing.
  • On Windows, you can just double-click the executables to run them. (The window closes automatically after the run.)

Privacy and security:
This application does not encrypt the data. Use it only in a secure network.

Disclaimer:
This is an experimental application. Use it at your own risk.

Categories
Information Systems OpenSource

OSCAR EMR and FHIR

OSCAR (Open Source Clinical Application and Resource) EMR is a web-based electronic medical record (EMR) system initially developed for primary care clinics in Canada. OSCAR is a Java Spring-based web application with a relatively old codebase. It is widely used in the provinces of Ontario and British Columbia and is supported by many OSCAR service providers.

Fast Healthcare Interoperability Resources (FHIR) is an HL7 standard describing a data schema and a RESTful API for health information exchange. FHIR is fast emerging as the de facto standard for interoperability between health information systems because of its simplicity and its use of existing web standards such as REST.

OSCAR, being designed primarily for primary care clinics, does not support interoperability with other systems out of the box. FHIR in its entirety is not supported by OSCAR. A partial implementation of FHIR, supporting the immunization dataflow as FHIR bundles, is available. One of the requests that constantly pops up in the OSCAR community is for a full FHIR API implementation for OSCAR.

We had some initial discussions on how to go about implementing a FHIR API for OSCAR EMR. FHIR is a REST API exposing FHIR resources such as Patient, Observation and CarePlan as JSON. The HAPI FHIR Java library defines all the FHIR resources and the associated functions. The first step in building the API is to map the relatively messy OSCAR data model to FHIR resources. The Patient resource has been mapped and is available in the OSCAR repository. This (/src/main/java/org/oscarehr/integration/fhir/model/Patient.java) can be used as the template to map other required resources.
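As a rough illustration of this first step, the Python sketch below maps a demographic record to a FHIR Patient and builds the URLs for the standard FHIR read and search interactions. It is not taken from the OSCAR codebase (which is Java): the field names of the input record and the base URL in the usage note are assumptions, and a real mapping would go through the HAPI FHIR classes instead of plain dictionaries.

```python
import json
from urllib.parse import urlencode

SEX_TO_GENDER = {"M": "male", "F": "female"}


def demographic_to_patient(demographic: dict) -> dict:
    """Map a demographic record (assumed field names) to a FHIR Patient resource."""
    birth_date = "-".join([
        demographic["year_of_birth"],
        demographic["month_of_birth"].zfill(2),
        demographic["date_of_birth"].zfill(2),  # assumed to hold the day of the month
    ])
    return {
        "resourceType": "Patient",
        "id": str(demographic["demographic_no"]),
        "name": [{"family": demographic["last_name"],
                  "given": [demographic["first_name"]]}],
        "gender": SEX_TO_GENDER.get(demographic["sex"], "unknown"),
        "birthDate": birth_date,
    }


def fhir_read_url(base: str, resource_type: str, resource_id: str) -> str:
    """URL for the FHIR read interaction: GET [base]/[type]/[id]."""
    return f"{base}/{resource_type}/{resource_id}"


def fhir_search_url(base: str, resource_type: str, **params: str) -> str:
    """URL for the FHIR search interaction: GET [base]/[type]?name=value."""
    return f"{base}/{resource_type}?{urlencode(params)}"


patient = demographic_to_patient({
    "demographic_no": 42, "first_name": "Jane", "last_name": "Doe",
    "sex": "F", "year_of_birth": "1975", "month_of_birth": "3", "date_of_birth": "9",
})
patient_json = json.dumps(patient, indent=2)  # what the API would return for a read
```

With a hypothetical base URL of https://example.org/fhir, reading this patient would be a GET on fhir_read_url(base, "Patient", "42"), and fetching the patient's observations a GET on fhir_search_url(base, "Observation", patient="42").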

The next step is to extend the REST API that is currently available to expose FHIR APIs after authentication. If you have some ideas/expertise/interest in this, please comment below.

Categories
OpenSource Resources

UMLS APIs for clinical vocabularies

Originally published by Bell Eapen at nuchange.ca on August 20, 2019. If you have some feedback, reach out to the author on Twitter, LinkedIn or GitHub.

UMLS, or Unified Medical Language System, is a set of files and software that brings together many health and biomedical vocabularies and standards to enable interoperability between computer systems.

Natural Language Processing (NLP) on the vast amount of data captured by electronic medical records (EMR) is gaining popularity. The recent advances in machine learning (ML) algorithms and the democratization of high-performance computing (HPC) have reduced the technical challenges in NLP. However, the real challenge is not the technology or the infrastructure, but the lack of interoperability — in this case, the inconsistent use of terminology systems.



NLP tasks start with recognizing medical terms in the corpus of text and converting them into a standard terminology space such as SNOMED or ICD. This requires a terminology mapping service that can do the mapping in an easy and consistent manner. The Unified Medical Language System (UMLS) terminology server is the most popular service for integrating and distributing key terminology, classification and coding standards. The consistent use of UMLS resources leads to effective and interoperable biomedical information systems and services, including EMRs.


To make things easier, UMLS provides both REST-based and SOAP-based services that can be integrated into software applications. Using these resources efficiently calls for a high-level library that encapsulates the services and makes the REST calls easy for the user. Umlsjs is one such high-level JavaScript library for the UMLS REST web services. It is free, open-source and available on NPM, making it easy to integrate into any JavaScript (browser) or Node.js application.
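
To give a feel for what such a library wraps, here is a small Python sketch that builds the URL for a UMLS term search. It does not reflect the umlsjs API. The base URL, the search path, and the string, apiKey and sabs query parameters follow the UMLS REST documentation as I understand it, so check them against the current documentation before relying on them; the function name is hypothetical and no request is actually sent.

```python
from urllib.parse import urlencode

# Base URL of the UMLS REST services (verify against the current UMLS documentation).
UMLS_BASE = "https://uts-ws.nlm.nih.gov/rest"


def umls_search_url(term, api_key, version="current", sabs=None):
    """Build the URL for a UMLS term search without sending the request."""
    params = {"string": term, "apiKey": api_key}
    if sabs:
        # Restrict the search to the given source vocabularies.
        params["sabs"] = ",".join(sabs)
    return f"{UMLS_BASE}/search/{version}?{urlencode(params)}"


url = umls_search_url("heart attack", api_key="YOUR-API-KEY", sabs=["SNOMEDCT_US"])
```

A wrapper library adds the parts left out here: authentication, paging through results and turning the JSON response into convenient objects.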


The umlsjs package is available on GitHub and NPM. It is still a work in progress and any coding or documentation contributions are welcome. Please read the CONTRIBUTING.md file in the repository for instructions. If you use it and find any issues, please report them on GitHub.


Categories
OpenSource Resources

Hephestus: Health data warehousing tool for public health and clinical research

Originally published by Bell Eapen at nuchange.ca on November 3, 2018. If you have some feedback, reach out to the author on Twitter, LinkedIn or GitHub.

Health data warehousing is becoming an important requirement for deriving knowledge from the vast amount of health data that healthcare organizations collect. A data warehouse is vital for collaborative and predictive analytics. The first step in designing a data warehouse is to decide on a suitable data model. This is followed by the extract-transform-load (ETL) process that converts source data to the new data model amenable for analytics.

The OHDSI – OMOP Common Data Model is one such data model that allows for the systematic analysis of disparate observational databases and EMRs. The data from diverse systems needs to be extracted, transformed and loaded on to a CDM database. Once a database has been converted to the OMOP CDM, evidence can be generated using standardized analytics tools that are already available.

Each data source requires customized ETL tools for this conversion from the source data to the CDM. The OHDSI ecosystem has made some tools available to help with the ETL process, such as White Rabbit and Rabbit in a Hat. However, the health data warehousing process is still challenging because of the variability of source databases in terms of structure and implementation.

Hephestus is an open-source Python tool for this ETL process, organized into modules to allow code reuse between ETL tools for various open-source EMR systems and data sources. Hephestus uses SQLAlchemy for database connections and for automapping tables to classes, and bonobo for managing ETL. The ultimate aim is to develop a tool that can translate the report from the OHDSI tools into an ETL script with minimal intervention. This is a good Python starter project for eHealth geeks.
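
For readers new to ETL, the sketch below shows the extract, transform and load steps as three small Python functions chained together. It is not Hephestus code: it uses the standard library's sqlite3 module in place of SQLAlchemy and bonobo, the source record layout is made up, and the person table is trimmed to a few of the columns that the OMOP CDM defines.

```python
import sqlite3

# Standard OMOP gender concepts; 0 is the "no matching concept" placeholder.
GENDER_TO_CONCEPT = {"M": 8507, "F": 8532}


def extract(source_rows):
    """Yield records from the source system (here simply an in-memory list)."""
    yield from source_rows


def transform(rows):
    """Convert source records into tuples shaped like the OMOP person table."""
    for row in rows:
        year, month, day = (int(part) for part in row["dob"].split("-"))
        yield (row["id"], GENDER_TO_CONCEPT.get(row["sex"], 0), year, month, day)


def load(connection, person_rows):
    """Insert the rows into a trimmed-down person table, skipping existing IDs."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS person ("
        "person_id INTEGER PRIMARY KEY, gender_concept_id INTEGER, "
        "year_of_birth INTEGER, month_of_birth INTEGER, day_of_birth INTEGER)"
    )
    connection.executemany(
        "INSERT OR IGNORE INTO person VALUES (?, ?, ?, ?, ?)", person_rows)
    connection.commit()


# Made-up source records standing in for an EMR export.
source = [
    {"id": 1, "sex": "F", "dob": "1980-05-17"},
    {"id": 2, "sex": "M", "dob": "1972-11-02"},
]
connection = sqlite3.connect(":memory:")
load(connection, transform(extract(source)))
```

Because each step consumes and yields rows lazily, the same chain works for sources too large to hold in memory, which is the style of pipeline that ETL frameworks formalize.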

Anyone anywhere in the world can build their own environment that can store patient-level observational health data, convert their data to OHDSI’s open community data standards (including the OMOP Common Data Model), run open-source analytics using the OHDSI toolkit, and collaborate in OHDSI research studies that advance our shared mission toward reliable evidence generation. Join the journey!

Disclaimer: Hephestus is just my experiment and is not a part of the official OHDSI toolset.

Categories
OpenSource

Are you ready to ‘Git’ into Open Source?

Open Source health information systems provide cost-effective tools for healthcare. Even if you are not a coder, you may be able to contribute to open source projects. As a matter of fact, some open-source projects find it difficult to get volunteers to document and test the code. E-Health enthusiasts from the clinical and management fields often want to contribute to popular open source projects, but do not know how. 

Open source projects involve a collaboration of people with various skills, often with no way of physically meeting each other. In a complex software product, even a misplaced comma can break the system. How do open source projects collaborate effectively while avoiding such code-breaking mistakes? Well, they use some specialized tools and workflows to manage code, many of which are not familiar to non-programmers. In the next few posts, I will introduce you to the most important tool that coders use: the versioning system. We shall discuss Git (the most popular versioning system) from a non-programmer's perspective.

This is not for those who are familiar with Git, and we will not be discussing advanced Git usage. Hence, let me state the assumptions that I am making about you as the reader. You have not heard of Git before. You are as scared of code as you are scared of python. When you hear Java, the first thing that comes to your mind is the island in Indonesia. You don’t know what ‘typing on the command line’ means. But you own a computer, know how to download and install programs, know how to navigate the web, want to learn more about contributing to open-source projects and, above all, want to help save lives, especially in resource-deprived areas. Watch the video below for inspiration.

At the end of this journey, you will know how to follow open-source projects and make minor code contributions. This might initiate you into learning computer programming, but that is not my intention. You might even win a free T-Shirt from DigitalOcean. If you are ready to jump right in, follow the steps here: http://wiki.canehealth.com/index.php/GIT:_First_Steps_(Creating_GitHub_Account_and_Downloading_SourceTree)