Categories
Machine Learning Viewpoint

AI will never replace the doctor. Or will it?

Let me start with the usual and widely accepted narrative: AI is emerging as a major technological disrupter in medicine, crunching large volumes of data and providing accurate diagnosis and treatment. But it will never (and can never) replace a doctor, even in specialties amenable to machine-driven automation such as radiology or dermatology [1]. However, these assumptions are based on the current paradigms of medicine, constrained by the boundaries of current cognitive abilities. Are we oblivious to a paradigm shift happening in medicine?


The Human Genome Project [2] and the subsequent democratization of the ‘omics’ fields promised the new paradigm of ‘personalized medicine’, which has never really materialized (at least so far) [3]. AI (used here as an encompassing term including big data analytics and machine learning) can potentially take personalized medicine to the realm of holistic medicine. Time will tell whether this paradigm shift will materialize. But it is important to understand how some of the concepts that we take for granted may get redefined and reconceptualized in the new paradigm (if it happens), just as modern medicine emerged from natural and alternative medical traditions.


The major tenets of modern medicine are diagnosis, prognosis and therapeutics (treatment). Diagnosis is the process of bucketing a given case into a pattern of observations that has been previously characterized, often represented by a recognizable name. Diabetes, hypertension and typhoid fever are examples. The prognosis and the treatment depend on the diagnostic label assigned. Patterns that do not fit into the list emerge from time to time. A pneumonia-like pattern that emerged recently in Wuhan, China, caused by the coronavirus SARS-CoV-2, was labelled COVID-19. A common use case of AI in medicine is to assign a given set of observations to one of these named entities (diagnostic decision support systems). The clinical community argues that AI can help a clinician in this process, but cannot replace him or her. One of the main reasons for the clinician’s self-belief in irreplaceability is the fact that AI learns from existing labels (the training data set) that the clinicians themselves prepare.


The process of making a diagnosis is to reduce the stochastic observations in the human body into a set of named patterns (diagnoses) that humans can comprehend, identify and utilize. In an AI-dominated world ‘diagnoses’ lose their relevance as the machines can recognize, identify and utilize a potentially infinite number of patterns and entities. Even if ‘diagnoses’ exist, their number is likely to be huge, much beyond the cognitive capabilities of humans.


Currently, the prognosis of any disease state is based on limited observations and limited data points. Big data will extend these limits thereby making prognostic predictions more accurate. Machine learning models that drive such predictions are likely to be at best partially explainable and at worst complete black boxes. However, explainable or not, such prognostic predictors are likely to improve health system optimizations. The role of clinicians is going to be identifying the variables to optimize.


In the therapeutics realm, AI may push us closer to the promised personalized medicine. Traditional clinical research, relying mostly on ‘rigorous’ randomized controlled trials (RCTs), may lose its relevance in the new paradigm. Some argue that RCTs have already become unsustainable, with long turnaround times and mounting costs. Since no two humans have the same omics profile, the level of abstraction introduced by a statistically significant difference between the ‘random’ treatment and control groups is useful for humans, but not for AI. Emerging methods such as nanotechnology, nanorobotics and 3D printing, combined with advanced predictive analytics, molecular modelling and drug design, would lead to tailored interventions that are created ‘just-in-time’ for every individual according to his or her needs. This process is likely to be beyond the reach of human comprehension, but human intervention may be needed to maintain the flow of data through the system.

‘Health’ is another concept that is taken for granted as something that everybody can instinctively understand. Health is widely recognized as a state of absence of disease. As disease/diagnosis states become infinite, ‘health’ may need a reconceptualization too. Let us call it Health 3.0 for now. Medicine ceases to be the art of restoring health and becomes the art of optimizing Health 3.0. I do not attempt to provide a framework to define Health 3.0 here, but posit that it will include abstract concepts such as happiness and quality of life, paradoxically beyond the cognitive capabilities of AI.

Clinicians may still be irreplaceable, but in helping AI to define health!

Some of the changes that AI and allied technologies can bring are already visible. The omics fields have introduced several subcategories of existing diagnostic entities [4]. In most cases, clinicians ignore these subtypes, seeing things at a higher and more manageable level. Reinforcement Learning (RL) algorithms can potentially learn from big data that are not labelled by clinicians [5]. RL is closer to cognitive computing (computerized models that simulate human thought), optimizing ‘reward’, a concept closer to Health 3.0. Computer-aided drug design is becoming increasingly popular, supplemented by an enormous amount of data derived from electronic medical records [6].

I am neither trying to predict the future impact of AI in medicine nor arguing for or against the role of ‘human’ clinicians. The media and the scientific literature are replete with stories of AI approaching, and in some cases surpassing, clinicians in certain tasks. AI may not be merely an incremental disrupter that changes the way we practice. As paradigms change, some of the questions that we ask today, such as ‘Can AI make the correct diagnosis?’ and ‘Can AI choose the correct treatment?’, may lose relevance. AI may never replace doctors, but it may change what doctors do and may take us a step closer to holistic medicine!

References

  1. Karches KE. Against the iDoctor: why artificial intelligence should not replace physician judgment. Theor Med Bioeth. Published online April 2018:91-110. doi:10.1007/s11017-018-9442-3
  2. Collins FS. The Human Genome Project: Lessons from Large-Scale Biology. Science. Published online April 11, 2003:286-290. doi:10.1126/science.1084564
  3. Chen R, Snyder M. Promise of personalized omics to precision medicine. WIREs Syst Biol Med. Published online November 26, 2012:73-82. doi:10.1002/wsbm.1198
  4. Boyd S, Galli S, Schrijver I, Zehnder J, Ashley E, Merker J. A Balanced Look at the Implications of Genomic (and Other “Omics”) Testing for Disease Diagnosis and Clinical Care. Genes. Published online September 1, 2014:748-766. doi:10.3390/genes5030748
  5. Chen M, Herrera F, Hwang K. Cognitive Computing: Architecture, Technologies and Intelligent Applications. IEEE Access. Published online 2018:19774-19783. doi:10.1109/access.2018.2791469
  6. Qian T, Zhu S, Hoshida Y. Use of big data in drug development for precision medicine: an update. Expert Review of Precision Medicine and Drug Development. Published online May 4, 2019:189-200. doi:10.1080/23808993.2019.1617632
Cite this article as: Eapen BR. (July 7, 2021). CanEHealth.com - AI will never replace the doctor. Or will it?. Retrieved October 20, 2021, from https://canehealth.com/2021/07/ai-will-never-replace-the-doctor-or-will-it/.
Categories
HIS

Death by beep? Bad sound design costs lives

This article was first published on Brighter World. Read the original article.

Medical alarms have appeared on the Emergency Care Research Institute’s list of top medical hazards four times — twice in the number one spot. According to a recent FDA survey, bad sound design for medical devices accounted for 566 deaths over four years, mostly because the sounds can be so annoying that they get turned down so doctors and nurses can concentrate, leading to potentially deadly consequences.

In this TEDx McMaster talk from February 2021, Michael Schutz, an associate professor of music cognition and percussion, explains how his research with the Music Acoustics Perception Learning (MAPLE) lab is helping to create better alarms — and better patient outcomes.

Read the papers related to this research

Categories
Healthcare Analytics Information Systems OpenSource

OHDSI OMOP to FHIR mapper

TL;DR Below is an open-source command-line tool for converting an OHDSI OMOP cohort (defined in ATLAS) to a FHIR bundle and vice versa.

Originally published by Bell Eapen at nuchange.ca on July 22, 2020. If you have some feedback, reach out to the author on Twitter, LinkedIn or GitHub.

OHDSI OMOP CDM is one of the most popular clinical data models for health data warehouses. The simple but clinically motivated data structure is intuitively appealing to clinicians, which has led to its good adoption. In this respect, it has overtaken HL7-V3, which is more robust but has a steeper learning curve, especially for clinicians. The OHDSI OMOP CDM is widely used in the pharmaceutical industry for drug monitoring.

FHIR is emerging as the de facto standard for health system interoperability, owing largely to its simplicity and the use of existing and popular standards such as REST. As NoSQL databases become more and more popular in healthcare, FHIR can also be a good persistence schema. It aligns well with search technologies such as Elasticsearch.

As both standards are popular, conversion from one to the other may be commonly required. Researchers at Georgia Tech have an open-source tool, GT-FHIR2, for exposing an existing OHDSI OMOP CDM database as a FHIR endpoint. However, conversion between existing systems may not be easy with a full-stack solution.

I have a simpler solution that I believe will be useful in the following scenarios:

  • To export a cohort to a FHIR based analytics tool.
  • To load new resources to OMOP CDM databases for incremental ETL.

Omopfhirmap is a command-line tool for mapping an OHDSI cohort, defined in ATLAS, to a FHIR bundle that can optionally be submitted to a FHIR server for processing. Conversely, it can process a FHIR bundle and add resources to an existing CDM database, ignoring duplicates. Unlike GT-FHIR2 (the OMOP on FHIR project at Georgia Tech), omopfhirmap does not expose the OMOP database as FHIR endpoints.

I have used spring-boot and JPA for easy wiring of services and abstraction of the database, and hapi-fhir, as it is the obvious choice for any Java-based FHIR application. It is still a work in progress and any help will be appreciated (refer to CONTRIBUTING.md).

Categories
Health Research Methodology Healthcare Analytics

OHDSI OMOP CDM ETL Tools in Python, .Net and Go

TL;DR Here are a few OHDSI OMOP CDM tools that may save you time if you are developing ETL tools!

Originally published by Bell Eapen at nuchange.ca on June 11, 2020. If you have some feedback, reach out to the author on Twitter, LinkedIn or GitHub.

Python: pyomop | pypi
.NET: omopcdmlib | NuGet
Golang: gocdm

The COVID-19 pandemic brought to light many of the vulnerabilities in our data collection and analytics workflows. The lack of uniform data models limits the analytical capabilities of public health organizations, and many of them have to reinvent the wheel even for basic analysis. As many other sectors embrace big data and machine learning, many healthcare analysts are still stuck with basic data wrangling in Excel.

The OHDSI OMOP CDM (Common Data Model) for observational data is a popular initiative for bringing data into a common format that allows for collaborative research, large-scale analytics, and sharing of sophisticated tools and methodologies. Though OHDSI OMOP CDM is primarily for patient-centred observational analysis, mostly for clinical research, it can be used with minor tweaks for public health and epidemiologic data as well. We have written about some of the technical details here.

The OHDSI OMOP CDM is simpler and more intuitive for clinical teams than emerging standards such as FHIR. Though the relational database approach and some of the software tools associated with OHDSI OMOP CDM are archaic, the data model is clinically motivated. There is an ecosystem of analytics tools that can be used out of the box. The Observational Medical Outcomes Partnership (OMOP) CDM, now in its version 6.0, has simple but powerful vocabulary management. OHDSI OMOP CDM is a good choice for healthcare organizations moving towards health data warehousing and OLAP.

One weakness of OHDSI is the lack of tools for efficient ETL from existing EHR and HIS. Converting existing EHR data to the CDM is still a complex task that requires technical expertise. With the additional “home time” during the COVID pandemic, I created three software libraries for ETL tool developers. These libraries in Python, .NET and Golang encapsulate the V6.0 CDM and help in writing and reading data from a variety of databases with the V6.0 tables. The libraries also support creating the CDM tables for new databases and loading the vocabulary files.

These libraries might save you some time if you are building scripts for ETL to CDM. They are all open-source and free to use in your tools. Do give me a shout if you find these libraries useful and please star the repositories on GitHub.

Categories
Health Research Methodology Healthcare Analytics OpenSource

DADpy: The Swiss Army knife for the Discharge Abstract Database

The Discharge Abstract Database (DAD) is a Canada-wide database of hospital admission and discharge data, excluding the province of Quebec, maintained by the Canadian Institute for Health Information (CIHI). The data points in DAD include patient demographics, comorbidities coded in the International Statistical Classification of Diseases and Related Health Problems (ICD), interventions encoded in the Canadian Classification of Health Interventions (CCI) and the length of stay. The DAD referred to here is the de-identified 10% sample available under the Data Liberation Initiative (DLI) for academic researchers. DAD is arguably the most comprehensive country-wide discharge dataset in the world.

The Discharge Abstract Database is used for creating public reports for hospitals, researchers, and the general public. DAD data has also been used for disease-specific research and analysis, including public health, disease surveillance, and health services research. CIHI provides DAD in the SPSS (.sav) format, with each record having horizontal fields for 20 comorbidities and 25 interventions. This wide format is not ideal for slicing and dicing the data or for building visualizations from which clinicians can obtain insights.
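
To illustrate why the wide layout is awkward, here is a minimal sketch in plain Python that reshapes one wide record into one row per code. The column names (record_id, dx_1 to dx_20, cci_1 to cci_25) and the codes are placeholders chosen for this example, not the actual CIHI field names, and the function is not part of DADpy's API.

```python
def wide_to_long(record, prefix, count, id_field="record_id"):
    """Turn numbered wide columns (prefix_1 .. prefix_count) into long rows."""
    rows = []
    for position in range(1, count + 1):
        code = record.get(f"{prefix}_{position}")
        if code:  # skip empty slots; a record rarely fills every column
            rows.append({id_field: record[id_field], "position": position, "code": code})
    return rows


# Hypothetical record: three of the 20 diagnosis slots and one of the
# 25 intervention slots are filled. Column names and codes are placeholders.
record = {
    "record_id": 1,
    "dx_1": "I10",
    "dx_2": "E11",
    "dx_3": "J18",
    "cci_1": "CCI-EXAMPLE",
}

diagnoses = wide_to_long(record, "dx", 20)
interventions = wide_to_long(record, "cci", 25)
print(len(diagnoses), len(interventions))
```

Once the data is in this long shape, counting or grouping by code becomes a one-line operation in most analysis tools.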

DADpy provides a set of functions for using the DAD dataset for machine learning and visualization. The package does not include the dataset. Academic researchers can request the DAD dataset from CIHI. This is an unofficial repo, and I’m not affiliated with CIHI. Please retain the disclaimer below in forks.

Installation: (will be added to PyPI soon)

We use poetry for development. PRs are welcome. Please see CONTRIBUTING.md in the repo. Start by renaming .env.example to .env and add the path for the tests to run. Add Jupyter notebooks to the notebook folder. Include the disclaimer below.

Disclaimer: Parts of this material are based on the Canadian Institute for Health Information Discharge Abstract Database Research Analytic Files (sampled from fiscal years 2016-17). However the analysis, conclusions, opinions and statements expressed herein are those of the author(s) and not those of the Canadian Institute for Health Information.

Let us know if you use DADpy for creating interesting Jupyter notebooks.

Categories
Healthcare Analytics OpenSource Resources

OSCAR EMR EForm Export (CSV) to FHIR

This is a simple application, written in Golang, to convert a CSV file to a FHIR bundle and post it to a FHIR server. The OSCAR EMR has an EForm export tool that exports EForms to a CSV file that can be downloaded. This tool can load that CSV file to a FHIR server for consolidated analysis. It can be used with any CSV file if the columns specified below (CSV format section) are present.

Use Cases

This is useful for family practice groups with multiple OSCAR EMR instances. Analysts at each site can use this to send data to a central FHIR server for centralized data analysis and reporting. Public health agencies using OSCAR or similar health information systems can use this to consolidate data collection.

How to build

First, go get all dependencies. This package includes three tools (go build them separately from the cmd folder):

Fhirpost: The application for posting the CSV file to the FHIR server

Serverfhir: A simple FHIR server for testing (requires mongodb). We recommend using PHIS-DW for production.

Report: A simple application for descriptive statistics on the csv file

Format of the CSV file


Using a standard vocabulary such as SNOMED for field names in the EForm is very useful for consolidated analysis.

Each record should have:

demographicNo → The patient ID
dateCreated
efmfid → The ID of the eform
fdid → The ID of each form field.
(The EForm export CSV of OSCAR typically has all these fields and requires no further processing)
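
As a quick pre-flight check before posting a file, the required columns can be verified with a few lines of Python. This is a minimal sketch using only the standard library; it is not part of the Go tool, and the sample data is made up.

```python
import csv
import io

REQUIRED_COLUMNS = ("demographicNo", "dateCreated", "efmfid", "fdid")


def missing_columns(csv_file):
    """Return the required columns that are absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    return [name for name in REQUIRED_COLUMNS if name not in header]


# Illustrative data only; "weight" stands in for an EForm field.
sample = io.StringIO(
    "demographicNo,dateCreated,efmfid,fdid,weight\n"
    "101,2020-01-15,7,555,72\n"
)
print(missing_columns(sample))  # an empty list means the file is usable
```

For a real export, pass an open file instead of the in-memory sample, for example open("data.csv", newline="").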

Mapping

  • Bundle with unique patients. All columns mapped to observations.
  • Submitter mapped to Practitioner.
  • Document type bundle with composition as the first entry
  • Unique fullUrls are generated.
  • PatientID is location + demographicNo
  • Bundle of 1 composition, 1 practitioner, 1 or more patients, and many observations
  • Validates with R4 schema
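
The shape of the resulting bundle can be sketched in Python as below. This is a simplified illustration of the mapping rules listed above, not the logic in fhirmap.go: the function names are hypothetical, only a few elements of each resource are filled in, and the output is not claimed to pass R4 validation as written.

```python
import uuid


def entry(resource):
    """Wrap a resource in a bundle entry with a unique urn:uuid fullUrl."""
    return {"fullUrl": f"urn:uuid:{uuid.uuid4()}", "resource": resource}


def build_bundle(location, submitter, rows):
    """rows: dicts holding demographicNo plus one key per form field."""
    practitioner = entry({"resourceType": "Practitioner", "name": [{"text": submitter}]})
    patients, observations = {}, []
    for row in rows:
        patient_id = f"{location}{row['demographicNo']}"  # location + demographicNo
        if patient_id not in patients:
            patients[patient_id] = entry(
                {"resourceType": "Patient", "identifier": [{"value": patient_id}]}
            )
        for column, value in row.items():
            if column == "demographicNo":
                continue  # simplification: the ID column is not turned into an observation
            observations.append(entry({
                "resourceType": "Observation",
                "status": "final",
                "code": {"text": column},
                "subject": {"reference": patients[patient_id]["fullUrl"]},
                "valueString": str(value),
            }))
    composition = entry({
        "resourceType": "Composition",
        "status": "final",
        "title": "EForm export",
        "author": [{"reference": practitioner["fullUrl"]}],
        "section": [{"entry": [{"reference": o["fullUrl"]} for o in observations]}],
    })
    # Document-type bundle: the composition must come first.
    return {
        "resourceType": "Bundle",
        "type": "document",
        "entry": [composition, practitioner, *patients.values(), *observations],
    }
```

Calling build_bundle("siteA", "Dr X", rows) with two rows for the same demographicNo yields one Patient entry and one Observation per form field.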

How to use:

  • Change the settings in .env
  • You can compile this for Windows, Mac or Linux. Check the fhirmap.go file and make any desired changes. You should be able to figure out the mapping rules from this file.
  • It reads the data.csv file from the same folder by default. (A different file can be specified with the -file command-line argument: fhirpost -file=data.csv)
  • Start mongodb and run server and fhirpost in separate windows for testing.
  • On Windows, you can just double-click the executables to run them. (The window closes automatically after the run)

Privacy and security:
This application does not encrypt the data. Use it only in a secure network.

Disclaimer:
This is an experimental application. Use it at your own risk.

Categories
Information Systems OpenSource

OSCAR EMR and FHIR

OSCAR (Open Source Clinical Application and Resource) EMR is a web-based electronic medical record (EMR) system initially developed for primary care clinics in Canada. OSCAR is a Java Spring-based web application with a relatively old codebase. OSCAR is widely used in the provinces of Ontario and British Columbia and is supported by many OSCAR service providers.

Fast Healthcare Interoperability Resources (FHIR) is an HL7 standard describing data schema and a RESTful API for health information exchange. FHIR is fast emerging as the de facto standard for interoperability between health information systems because of its simplicity and the use of existing web standards such as REST.

OSCAR, being primarily designed for primary care clinics, does not support interoperability with other systems out of the box. FHIR in its entirety is not supported by OSCAR. A partial implementation of FHIR to support the immunization dataflow as FHIR bundles is available. One of the requests that constantly pops up in the OSCAR community is the need for a full FHIR API implementation for OSCAR.

We had some initial discussions on how to go about implementing a FHIR API for OSCAR EMR. FHIR is a REST API exposing FHIR resources such as Patient, Observation and CarePlan as JSON resources. The HAPI-FHIR Java library defines all the FHIR resources and the associated functions. The first step in building the API is to map the relatively messy OSCAR data model to FHIR resources. The Patient resource has been mapped and is available in the OSCAR repository. This (/src/main/java/org/oscarehr/integration/fhir/model/Patient.java) can be used as the template to map other required resources.
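
To give a feel for what such a mapping involves, here is a minimal Python sketch that turns a demographic record into a FHIR Patient JSON resource. The input field names (demographic_no, first_name, last_name, sex, date_of_birth) are assumptions made for this example, and the function does not mirror the Patient.java mapping in the OSCAR repository.

```python
import json


def to_fhir_patient(demographic):
    """Map a (hypothetical) demographic record to a minimal FHIR Patient."""
    gender_map = {"M": "male", "F": "female"}
    return {
        "resourceType": "Patient",
        "id": str(demographic["demographic_no"]),
        "name": [{
            "family": demographic["last_name"],
            "given": [demographic["first_name"]],
        }],
        # FHIR restricts gender to male, female, other or unknown
        "gender": gender_map.get(demographic.get("sex"), "unknown"),
        "birthDate": demographic["date_of_birth"],  # expected as YYYY-MM-DD
    }


# Made-up record for illustration.
record = {
    "demographic_no": 101,
    "first_name": "Jane",
    "last_name": "Doe",
    "sex": "F",
    "date_of_birth": "1980-02-29",
}
print(json.dumps(to_fhir_patient(record), indent=2))
```

A REST endpoint such as GET /Patient/101 would then return this JSON document after authentication.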

The next step is to extend the REST API that is currently available to expose FHIR APIs after authentication. If you have some ideas/expertise/interest in this, please comment below.

Categories
OpenSource Resources

UMLS APIs for clinical vocabularies

Originally published by Bell Eapen at nuchange.ca on August 20, 2019. If you have some feedback, reach out to the author on Twitter, LinkedIn or GitHub.

UMLS, or Unified Medical Language System, is a set of files and software that brings together many health and biomedical vocabularies and standards to enable interoperability between computer systems.

Natural Language Processing (NLP) on the vast amount of data captured by electronic medical records (EMR) is gaining popularity. The recent advances in machine learning (ML) algorithms and the democratization of high-performance computing (HPC) have reduced the technical challenges in NLP. However, the real challenge is not the technology or the infrastructure, but the lack of interoperability — in this case, the inconsistent use of terminology systems.


NLP tasks start with recognizing medical terms in the corpus of text and converting them into a standard terminology space such as SNOMED or ICD. This requires a terminology mapping service that can do the mapping in an easy and consistent manner. The Unified Medical Language System (UMLS) terminology server is the most popular service for integrating and distributing key terminology, classification and coding standards. The consistent use of UMLS resources leads to effective and interoperable biomedical information systems and services, including EMRs.


To make things easier, UMLS provides both REST-based and SOAP-based services that can be integrated into software applications. Efficient use of these resources requires a high-level library that encapsulates these services and makes the REST calls easy for the user. Umlsjs is one such high-level library for the UMLS REST web services, written for JavaScript. It is free, open-source and available on NPM, making it easy to integrate into any JavaScript (browser) or Node.js application.
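
To show the kind of REST call such a library wraps, here is a minimal sketch in Python (not umlsjs). The base URL, the /search/{version} path and the string, apiKey and sabs query parameters reflect my understanding of the public UTS REST documentation and should be treated as assumptions to verify; an API key from a UMLS account is required for the actual request.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

UTS_BASE = "https://uts-ws.nlm.nih.gov/rest"  # assumed base URL of the UMLS REST service


def search_url(term, api_key, version="current", sabs=None):
    """Build the search URL for a term, optionally restricted to source vocabularies."""
    params = {"string": term, "apiKey": api_key}
    if sabs:
        params["sabs"] = sabs  # for example "SNOMEDCT_US"
    return f"{UTS_BASE}/search/{version}?{urlencode(params)}"


def search(term, api_key):
    """Run the search and return the list of matches (needs network access and a valid key)."""
    with urlopen(search_url(term, api_key)) as response:
        payload = json.load(response)
    return payload["result"]["results"]  # assumed response layout


print(search_url("heart attack", "YOUR_API_KEY"))
```

Each match is expected to carry a concept identifier and a preferred name, which an NLP pipeline can then attach to the recognized term.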


The umlsjs package is available on GitHub and NPM. It is still a work in progress and any coding/documentation contributions are welcome. Please read the CONTRIBUTING.md file on the repository for instructions. If you use it and find any issues, please report them on GitHub.


Categories
Healthcare Analytics Research

Elasticsearch for analyzing CORD-19 dataset

The COVID-19 Open Research Dataset (CORD-19) is a dataset of approximately 47,000 scholarly articles about COVID-19, SARS-CoV-2 and related coronaviruses, made freely available to the research community by a coalition of research groups. The articles are provided as JSON files for the global research community to apply natural language processing.

Siouxsie Wiles and Toby Morris / CC BY-SA (https://creativecommons.org/licenses/by-sa/4.0)

Elasticsearch (ES) is a Lucene-based text search engine using schema-free JSON documents. Elasticsearch is fast and has client libraries available for most programming languages, including Python. Loading the COVID-19 data into an ES instance will be helpful for easy search and analysis, all within the comfort of a Jupyter notebook. The availability of the Apache Spark (Spark) connector makes the exchange of data between ES and Spark easy. I have listed below the simple steps to load the files into an ES instance.

First, download and install ES and the ES-Spark connector from here, and start ES. Next, download and install Apache Spark from here: https://spark.apache.org/. The CORD-19 dataset is available here.

STEP 1: Create a Spark session: (add the path to the connector jar)
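
A minimal sketch of this step, assuming pyspark is installed and the ES-Spark connector jar has been downloaded. The jar path, the application name and the helper names are placeholders chosen for this example.

```python
def session_options(connector_jar):
    """Spark settings used in this walkthrough; the jar path is supplied by the reader."""
    return {"spark.jars": connector_jar}


def create_session(connector_jar, app_name="cord19"):
    """Create (or reuse) a Spark session that can see the ES-Spark connector."""
    # Imported here so that the helper above can be used without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in session_options(connector_jar).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

In a notebook this would be called once, for example spark = create_session("/path/to/elasticsearch-spark.jar"), where the path is whatever location the connector jar was saved to.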

STEP 2: Load the JSON files:
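
A minimal sketch of this step, assuming spark is an existing Spark session and json_dir points at the folder (or glob) holding the CORD-19 JSON files. The function name is a placeholder; multiLine is set because each article is a pretty-printed JSON document spanning many lines.

```python
def load_cord19(spark, json_dir):
    """Read every CORD-19 article under json_dir into a single Spark DataFrame."""
    # Each file holds one JSON object spread over many lines, so multiLine is required.
    return spark.read.json(json_dir, multiLine=True)
```

Calling df = load_cord19(spark, "/path/to/cord19/") followed by df.show(1) prints the first row in the tabular form used in this post.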

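+--------+--------------------+--------------------+--------------------+--------+--------------------+--------------------+
|abstract|         back_matter|         bib_entries|           body_text|metadata|            paper_id|         ref_entries|
+--------+--------------------+--------------------+--------------------+--------+--------------------+--------------------+
|      []|[[[[456,, 453, 8 ...|[[[[R, Zhang, [],...|[[[], [], , i c ,...|  [[], ]|28b10724357672324...|[[, Fumagalli M, ...|
+--------+--------------------+--------------------+--------------------+--------+--------------------+--------------------+
only showing top 1 row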

STEP 3: Select the required fields:
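
A minimal sketch of this step, assuming df is the DataFrame of loaded articles. The aliases (title, authors, abstract_text, body) are a choice made for this example to avoid dots in field names; the output shown in this post keeps the dotted names instead.

```python
SELECTED_FIELDS = [
    "paper_id",
    "metadata.title AS title",
    "metadata.authors AS authors",
    "abstract.text AS abstract_text",  # list of abstract paragraphs
    "body_text.text AS body",          # list of body paragraphs
]


def select_fields(df):
    """Keep only the identifier, title, authors, abstract text and body text."""
    return df.selectExpr(*SELECTED_FIELDS)
```

The nested text fields come back as arrays of paragraphs, one array per article.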

+--------------------+--------------------+--------------------+--------------------+--------------------+
|            paper_id|      metadata.title|    metadata.authors|       abstract.text|      body_text.text|
+--------------------+--------------------+--------------------+--------------------+--------------------+
|28b10724357672324...|                    |                  []|                  []|[i c , a n t i f ...|
|1aa3e788fc6b03c14...|Dark Proteome of ...|[[[Himachal Prade...|[Recently emerged...|[World health org...|
|558d318e1655da9f5...|Connectivity anal...|[[[University of ...|[We utilized a ce...|[Schizophrenia is...|
+--------------------+--------------------+--------------------+--------------------+--------------------+
only showing top 3 rows

STEP 4: Create an index and write the Spark df into ES:
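
A minimal sketch of this step, assuming df is the DataFrame with the selected fields, ES is running locally on the default port, and the session was started with the ES-Spark connector jar. The index name cord19 and the function name are placeholders; the option names are the connector's standard settings as I understand them.

```python
def write_to_es(df, index="cord19", host="localhost", port="9200"):
    """Write the DataFrame to an ES index through the ES-Spark connector."""
    (
        df.write.format("org.elasticsearch.spark.sql")
        .option("es.nodes", host)
        .option("es.port", port)
        .option("es.mapping.id", "paper_id")  # use the article ID as the document ID
        .mode("append")
        .save(index)  # the index is created on first write with default settings
    )
```

Using paper_id as the document ID means that re-running the load updates existing documents rather than duplicating them.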

STEP 5: Do the search!
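
A minimal sketch of a search, here going through the ES REST API with the Python standard library rather than through Spark. The index name cord19 and the field names title and abstract_text are assumptions; use whatever names the documents were indexed under.

```python
import json
from urllib.request import Request, urlopen


def build_query(text, fields=("title", "abstract_text"), size=5):
    """Query DSL body: full-text match of `text` across the given fields."""
    return {
        "size": size,
        "query": {"multi_match": {"query": text, "fields": list(fields)}},
    }


def search(text, index="cord19", host="http://localhost:9200"):
    """Return (score, title) pairs for the best matches; needs a running ES instance."""
    request = Request(
        f"{host}/{index}/_search",
        data=json.dumps(build_query(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request) as response:
        hits = json.load(response)["hits"]["hits"]
    return [(hit["_score"], hit["_source"].get("title")) for hit in hits]
```

For example, search("incubation period") would list the five highest-scoring articles for that phrase.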

That’s it! You can now use it for search and do analytics on the returned records. Next, I will show you how to use QRMine on CORD-19!

Categories
Research Resources

McMaster develops tool for COVID-19 battle

This article was first published on Brighter World. Read the original article.

McMaster University researchers have developed a tool to share with the international health sciences community which can help determine how the coronavirus that causes COVID-19 is spreading and whether it is evolving.

Simply put, the tool is a set of molecular ‘fishing hooks’ to isolate the virus, SARS-CoV-2, from biological samples. This allows laboratory researchers to gain insight into the properties of the isolated virus by then using a technology called next-generation sequencing.

The details were published on Preprints.org.

“You wouldn’t use this technology to diagnose the patient, but you could use it to track how the virus evolves over time, how it transmits between people, how well it survives outside the body, and to find answers to other questions,” said principal investigator Andrew McArthur, associate professor of biochemistry and biomedical sciences, and a member of the Michael G. DeGroote Institute for Infectious Disease Research (IIDR) at McMaster.

“Our tool, partnered with next-generation sequencing, can help scientists understand, for example, if the virus has evolved between patient A and patient B.”

McArthur points out that the standard technique to isolate the virus involves culturing it in cells in contained labs by trained specialists. The McMaster tool gives a faster, safer, easier and less-expensive alternative, he said.

“Not every municipality or country will have specialized labs and researchers, not to mention that culturing a virus is dangerous,” he said.

“This tool removes some of these barriers and allows for more widespread testing and analyses.”

First author Jalees Nasir, a PhD candidate in biochemistry and biomedical sciences at McMaster, has been working with McMaster and Sunnybrook Health Sciences Centre researchers to develop a bait capture tool that can specifically isolate respiratory viruses. When news recently broke of COVID-19, Nasir knew he could develop a “sequence recipe” to help researchers to isolate the novel virus more easily.

“When you have samples from a patient, for example, it can consist of a combination of virus, bacteria and human material, but you’re really only interested in the virus,” Nasir said. “It’s almost like a fishing expedition. We are designing baits that we can throw into the sample as hooks and pull out the virus from that mixture.”

The decision was made to release the sequences publicly without the normal practice of peer-review or clinical evaluation to ensure this tool was available to all quickly, recognizing the urgency of the situation, said McArthur.

The research team plans to collaborate with Sunnybrook for further testing but also hopes other scientists can quickly perform their own validation.

McArthur added that a postdoctoral fellow in his lab, David Speicher, is currently communicating details of the technology to the international clinical epidemiology community.

“Since we’re dealing with an outbreak, there was no value in us doing a traditional academic study and the experiments,” said McArthur. “We designed this tool and are releasing it for use by others.

“In part, we’re relying on our track record of knowing what we are doing, but we’re also relying on people who have the virus samples in hand being able to do the validation experiment so that it’s reliable.”

The research was funded by the Comprehensive Antibiotic Resistance Database at McMaster.
