einstein (São Paulo). 15/Aug/2023;21(Suppl 1):EISIC_MV0022.
Paving the way to Precision Medicine in ICU: Biobank integration with OMOP-CDM. Challenges, opportunities, and insights from Hospital Israelita Albert Einstein, Brazil
DOI: 10.31744/einstein_journal/2023ABS_EISIC_MV0022
III Einstein International Symposium on Intensive Care and the XXX International Symposium on Mechanical Ventilation. Aug 16-18, 2023.
Category: Safety / Quality / Management
Introduction:
Integrating biobank information into the OMOP Common Data Model (CDM) presents both opportunities and challenges for healthcare data management. Mapping biobank information to the OMOP CDM involves several challenges, including data harmonization, standardization of terminologies, data quality assurance, and privacy and ethical considerations.
Objective:
To demonstrate the challenges encountered in mapping the biobank patients of Hospital Israelita Albert Einstein (HIAE) to the OMOP CDM, as well as the opportunities and insights that arose in the context of critically ill patients.
Methods:
Qualitative and descriptive study. Biobank data were converted to the OMOP CDM through Extract, Transform, and Load (ETL) processes implemented in SQL and Python scripts. The OHDSI Atlas tool is used to visualize the mapped data and create cohorts. The Impala engine is used to retrieve data from files stored in OMOP CDM format in the Hadoop Distributed File System.
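As an illustration only, the ETL step described above can be sketched in Python for the OMOP PERSON table. This is a minimal sketch under stated assumptions, not the HIAE pipeline: the source field names (`biobank_id`, `sex`, `birth_date`), the sample records, and the use of an in-memory SQLite database in place of Impala over HDFS are all hypothetical. The gender concept IDs (8507 for male, 8532 for female) and the convention of 0 for an unmapped value follow the OMOP standard vocabulary.

```python
import sqlite3
from dataclasses import dataclass, astuple
from datetime import date

# OMOP standard gender concepts; 0 means "no matching concept".
GENDER_CONCEPTS = {"M": 8507, "F": 8532}


@dataclass
class PersonRow:
    """Subset of the OMOP CDM PERSON table columns."""
    person_id: int
    gender_concept_id: int
    year_of_birth: int
    race_concept_id: int
    ethnicity_concept_id: int
    person_source_value: str
    gender_source_value: str


def extract():
    # Hypothetical biobank records; the real source schema is not described.
    return [
        {"biobank_id": "BB-0001", "sex": "F", "birth_date": "1958-03-14"},
        {"biobank_id": "BB-0002", "sex": "m", "birth_date": "1971-11-02"},
        {"biobank_id": "BB-0003", "sex": None, "birth_date": "1990-06-30"},
    ]


def transform(records):
    # Map each source record to a PERSON row, standardizing the sex code.
    rows = []
    for person_id, rec in enumerate(records, start=1):
        source_sex = rec.get("sex") or ""
        rows.append(PersonRow(
            person_id=person_id,
            gender_concept_id=GENDER_CONCEPTS.get(source_sex.strip().upper(), 0),
            year_of_birth=date.fromisoformat(rec["birth_date"]).year,
            race_concept_id=0,        # not available in this hypothetical source
            ethnicity_concept_id=0,   # not available in this hypothetical source
            person_source_value=rec["biobank_id"],
            gender_source_value=source_sex,
        ))
    return rows


def load(rows, conn):
    # SQLite stands in for the target store; returns the number of rows loaded.
    conn.execute(
        "CREATE TABLE person ("
        "person_id INTEGER PRIMARY KEY, gender_concept_id INTEGER, "
        "year_of_birth INTEGER, race_concept_id INTEGER, "
        "ethnicity_concept_id INTEGER, person_source_value TEXT, "
        "gender_source_value TEXT)"
    )
    conn.executemany(
        "INSERT INTO person VALUES (?, ?, ?, ?, ?, ?, ?)",
        [astuple(row) for row in rows],
    )
    return conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(load(transform(extract()), connection), "person rows loaded")
```

Keeping the original value in `gender_source_value` while writing 0 to `gender_concept_id` for unmapped codes preserves the source information for later review, which is one way such a mapping can surface terminology gaps.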
Results:
The architecture of the HIAE OMOP Biobank is built on a local Hadoop cluster with a total of 168 cores, 655 GB of memory, and 32.4 TB of storage. This cluster is shared with other applications and can be accessed through Impala or Spark. Currently, 4,788 biobank patients have been successfully mapped, with their information spread across 11 OMOP-CDM Domain tables, for a total of 35,093,493 records and 5,942 unique terms. Of those, 186 patients are critically ill, and 2,790 of their samples are stored. By aligning biobank data with the standardized structure of the OMOP-CDM, researchers and healthcare providers can leverage biobank resources for real-world evidence generation, observational research, and personalized medicine. In the near future, this structure could be used to propose and test precision medicine-based interventions in clinical trials, which is relevant in the ICU setting, where many trials have been negative or neutral and diagnoses are often generic (syndromes). The project encountered several challenges, some of which are commonly faced by projects adopting this model. These include a shortage of teams experienced in OMOP, the requirement for specialized computational resources, the need for ETL work, and the integration of diverse databases. Working with genetic data introduces additional complexities: OMOP-CDM standardization for genetic data is still under development, so appropriate concepts for detailed mapping of this information are lacking.
Conclusion:
The initial experience of integrating biobank data with the OMOP-CDM at HIAE, reported here, is believed to lay the groundwork for bringing precision medicine to the ICU. Enabling collaborative research on this infrastructure can help Intensive Care Medicine overcome the challenge of treating complex and almost unique patients.