Journal:Infrastructure tools to support an effective radiation oncology learning health system

From LIMSWiki
Revision as of 22:25, 9 May 2024 by Shawndouglas (talk | contribs) (Saving and adding more.)
Full article title Infrastructure tools to support an effective radiation oncology learning health system
Journal Journal of Applied Clinical Medical Physics
Author(s) Kapoor, Rishabh; Sleeman IV, William C.; Ghosh, Preetam; Palta, Jatinder
Author affiliation(s) Virginia Commonwealth University
Primary contact rishabh dot kapoor at vcuhealth dot org
Year published 2023
Volume and issue 24(10)
Article # e14127
DOI 10.1002/acm2.14127
ISSN 1526-9914
Distribution license Creative Commons Attribution 4.0 International
Website https://aapm.onlinelibrary.wiley.com/doi/10.1002/acm2.14127
Download https://aapm.onlinelibrary.wiley.com/doi/pdfdirect/10.1002/acm2.14127 (PDF)

Abstract

Purpose: The concept of the radiation oncology learning health system (RO‐LHS) represents a promising approach to improving the quality of care by integrating clinical, dosimetry, treatment delivery, and research data in real‐time. This paper describes a novel set of tools to support the development of an RO‐LHS and the current challenges they can address.

Methods: We present a knowledge graph‐based approach to map radiotherapy data from clinical databases to an ontology‐based data repository using FAIR principles. This strategy ensures that the data are easily discoverable, accessible, and usable by other clinical decision support systems. It allows for visualization, presentation, and analysis of valuable data and information to identify trends and patterns in patient outcomes. We designed a search engine that uses ontology‐based keyword searching and synonym‐based term matching; it leverages the hierarchical nature of ontologies to retrieve patient records based on parent and child classes, and it connects to the Bioportal database to retrieve relevant clinical attributes. To identify similar patients, we employ a method that involves text corpus creation and vector embedding models (Word2Vec, Doc2Vec, GloVe, and FastText), using cosine similarity and distance metrics.
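The ontology‐based search described in the Methods can be illustrated with a minimal Python sketch. The class hierarchy, synonym table, and patient records below are hypothetical placeholders, not content taken from the Radiation Oncology Ontology or Bioportal. The sketch only shows how a query term might be resolved through synonyms and then expanded to its descendant classes before records are matched.

```python
# Hypothetical toy ontology: child class label -> parent class label.
PARENT = {
    "lung cancer": "thoracic cancer",
    "esophageal cancer": "thoracic cancer",
    "non-small cell lung cancer": "lung cancer",
    "small cell lung cancer": "lung cancer",
}

# Hypothetical synonym table mapping free-text terms to class labels.
SYNONYMS = {
    "nsclc": "non-small cell lung cancer",
    "sclc": "small cell lung cancer",
    "lung carcinoma": "lung cancer",
}

# Hypothetical patient records, each annotated with one class label.
RECORDS = [
    {"patient_id": "P1", "diagnosis": "non-small cell lung cancer"},
    {"patient_id": "P2", "diagnosis": "esophageal cancer"},
    {"patient_id": "P3", "diagnosis": "small cell lung cancer"},
    {"patient_id": "P4", "diagnosis": "lung cancer"},
]


def resolve(term):
    """Normalize a query term and map it to a class label via synonyms."""
    key = term.strip().lower()
    return SYNONYMS.get(key, key)


def descendants(label):
    """Return the class itself plus all of its child classes, at any depth."""
    found = {label}
    frontier = [label]
    while frontier:
        current = frontier.pop()
        for child, parent in PARENT.items():
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def search(term):
    """Return sorted patient IDs whose diagnosis falls under the query class."""
    classes = descendants(resolve(term))
    return sorted(r["patient_id"] for r in RECORDS if r["diagnosis"] in classes)
```

In this sketch, searching for a parent class such as "thoracic cancer" also retrieves patients annotated with its child classes, which is the behavior the hierarchical search is meant to provide.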

Results: The data pipeline and tool were tested with 1,660 patient clinical and dosimetry records, resulting in 504,180 RDF (Resource Description Framework) tuples and visualized data relationships using graph‐based representations. Patient similarity analysis using embedding models showed that the Word2Vec model had the highest mean cosine similarity, while the GloVe model exhibited more compact embeddings with lower Euclidean and Manhattan distances.
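For readers unfamiliar with the three measures named in the Results, the following Python sketch shows how cosine similarity, Euclidean distance, and Manhattan distance can be computed between two embedding vectors. The vectors are made-up illustrations, not embeddings from the study, and the function names are our own.

```python
import math


def _check_lengths(u, v):
    """Both vectors must have the same dimensionality."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1.0 means same direction)."""
    _check_lengths(u, v)
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line (L2) distance between u and v."""
    _check_lengths(u, v)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan_distance(u, v):
    """Sum of absolute coordinate differences (L1 distance)."""
    _check_lengths(u, v)
    return sum(abs(a - b) for a, b in zip(u, v))


# Hypothetical embeddings for two patients (illustrative values only).
patient_a = [1.0, 2.0, 2.0]
patient_b = [2.0, 4.0, 4.0]
```

The two example vectors point in the same direction but differ in magnitude, so their cosine similarity is high even though their Euclidean and Manhattan distances are nonzero. This is why a model can score well on one measure and differently on the others.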

Conclusions: The framework and tools described support the development of an RO‐LHS. By integrating diverse data sources and facilitating data discovery and analysis, they contribute to continuous learning and improvement in patient care. The tools enhance the quality of care by enabling the identification of cohorts, clinical decision support, and the development of clinical studies and machine learning (ML) programs in radiation oncology.

Keywords: FAIR, learning health system infrastructure, ontology, Radiation Oncology Ontology, Semantic Web

Background and significance

For the past three decades, there has been growing interest in building learning organizations to address the most pressing and complex business, social, and economic challenges facing society today. [1] For healthcare, the National Academy of Medicine has defined the concept of a learning health system (LHS) as an entity where science, incentive, culture, and informatics are aligned for continuous innovation, with new knowledge capture and discovery as an integral part of practicing evidence-based medicine. [2] The current dependency on randomized controlled clinical trials, which create scientific evidence in a controlled environment using only a small percentage (<3%) of patient samples, is inadequate now and may be irrelevant in the future, since these trials take too much time, are too expensive, and are fraught with questions of generalizability. The Agency for Healthcare Research and Quality has also been promoting the development of LHSs as a key strategy for healthcare organizations to make transformational changes that improve healthcare quality and value. Large-scale healthcare systems are now recognizing the need to build infrastructure capable of continuous learning and improvement in delivering care to patients and addressing critical population health issues. [3] In an LHS, data should be collected from various sources such as electronic health records (EHRs), treatment delivery records, imaging records, patient-generated data records, and administrative and claims data. The aggregated data can then be analyzed to generate new insights and knowledge that can be used to improve patient care and outcomes.

However, only a few attempts at leveraging existing infrastructure tools used in routine clinical practice to transform the healthcare domain into an LHS have been made. [5, 6] Some examples of actual implementation have emerged, but by and large these concepts have been mostly discussed as conceptual ideas and strategies in the literature. There are several data organization and management challenges that must be addressed in order to effectively implement a radiation oncology LHS:

1. Data integration: Radiation oncology data are generated from a variety of sources, including EHRs, imaging systems, treatment planning systems (TPSs), and clinical trials. Integration of this data into a single repository can be challenging due to differences in data formats, terminologies, and storage systems. There is often significant semantic heterogeneity in the way that different clinicians and researchers use terminology to describe radiation oncology data. For example, different institutions may use different codes or terms to describe the same condition or treatment.
2. Data stored in disparate database schemas: Presently, the EHR, TPS, and treatment management system (TMS) data are housed in a series of relational database management systems (RDMS), which have rigid database structures and varying data schemas, and can include large amounts of uncoded textual data. Tumor registries also store data in their own defined schemas. Although the column names in the relational databases of two software products might be the same, their semantic meaning, which depends on the application in which they are used, may be completely different. Changing a database schema requires considerable programming effort and code changes because of the rigid structure of the stored data, and it is generally advisable to retire old tables and build new tables with the added column definitions.
3. Episodic linking of records: Episodic linking of records refers to the process of integrating patient data from multiple encounters or episodes of care into a single comprehensive record. This record includes information about the patient's medical history, diagnosis, treatment plan, and outcomes, which can be used to improve care delivery, research, and education. Linking multiple data sources based on the patient's episodic history of care is quite challenging because these heterogeneous data sources do not normally follow any common data storage standards.
4. Build data query tools based on semantic meaning of the data: Since the data are currently stored in multiple RDMSs for the specific purpose of supporting the operational aspects of patient care, extracting common semantic meaning from this data is very challenging. Common semantic meaning in healthcare data is typically achieved through the use of standardized vocabularies and ontologies that define concepts and the relationships between them. Developing data query tools based on semantic meaning requires a high level of expertise in both the technical and domain-specific aspects of radiation oncology. Moreover, executing complex data queries, including tree-based queries, recursive queries, and derived data queries, requires joining multiple tables in RDMSs, which is a costly operation.
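The integration and semantic-query challenges listed above can be made concrete with a small Python sketch. It maps rows from two hypothetical source systems, which use different column names, units, and local terms for the same treatment technique, onto subject-predicate-object triples that share one vocabulary. It then answers a query by meaning rather than by column name. All field names, codes, and predicates here are illustrative assumptions, not the authors' schema or ontology.

```python
# Hypothetical rows from two source systems with different schemas and terms.
TPS_ROWS = [{"pat_id": "P1", "technique": "IMRT", "dose_cgy": 6000}]
EHR_ROWS = [{"mrn": "P2", "tx_type": "intensity modulated RT", "total_dose_gy": 70.0}]

# Hypothetical mapping from local terms to one shared concept label.
TERM_MAP = {
    "imrt": "IntensityModulatedRadiotherapy",
    "intensity modulated rt": "IntensityModulatedRadiotherapy",
}


def to_concept(term):
    """Translate a local term into the shared concept label."""
    return TERM_MAP[term.strip().lower()]


def tps_to_triples(row):
    """Map a TPS row to triples, converting the dose from cGy to Gy."""
    subject = f"patient:{row['pat_id']}"
    return [
        (subject, "hasTechnique", to_concept(row["technique"])),
        (subject, "hasPrescribedDoseGy", row["dose_cgy"] / 100.0),
    ]


def ehr_to_triples(row):
    """Map an EHR row to triples using the same predicates as the TPS rows."""
    subject = f"patient:{row['mrn']}"
    return [
        (subject, "hasTechnique", to_concept(row["tx_type"])),
        (subject, "hasPrescribedDoseGy", row["total_dose_gy"]),
    ]


def build_graph():
    """Combine triples from both sources into one set."""
    graph = set()
    for row in TPS_ROWS:
        graph.update(tps_to_triples(row))
    for row in EHR_ROWS:
        graph.update(ehr_to_triples(row))
    return graph


def subjects_with(graph, predicate, obj):
    """Return sorted subjects that have the given predicate and object."""
    return sorted(s for s, p, o in graph if p == predicate and o == obj)
```

Once both sources are expressed with the same predicates and concept labels, a single call such as `subjects_with(build_graph(), "hasTechnique", "IntensityModulatedRadiotherapy")` finds patients from either system without the query needing to know the original column names.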

While we are on the cusp of an artificial intelligence (AI) revolution in biomedicine, with the fast-growing development of advanced machine learning (ML) methods that can analyze complex datasets, there is an urgent need for a scalable intelligent infrastructure that can support these methods. The radiation oncology domain is also one of the most technically advanced medical specialties, with a long history of electronic data generation (e.g., radiation treatment simulation, treatment planning, etc.) that is modeled for each individual patient. This large volume of patient-specific real-world data captured during routine clinical practice, dosimetry, and treatment delivery makes this domain ideally suited for rapid learning. [4] Rapid learning concepts could be applied using an LHS, providing the potential to improve patient outcomes and care delivery, reduce costs, and generate new knowledge from real-world clinical and dosimetry data.


Abbreviations, acronyms, and initialisms

Acknowledgements

References

Notes

This presentation is faithful to the original, with only a few minor changes to presentation, though grammar and word usage were substantially updated for improved readability. In some cases important information was missing from the references, and that information was added.