In the pursuit of safe and effective medical care, personalized medicine at scale requires methods that can distill insights from longitudinal patient journeys. These journeys are essentially sequences of medical events. Pretrained foundation models on large-scale medical event data offer a promising direction for scaling real-world evidence generation and generalizing to diverse downstream tasks. Leveraging Epic Cosmos, a dataset comprising de-identified longitudinal health records for over 300 million unique patients and 16.3 billion encounters from 310 health systems, the Cosmos Medical Event Transformer (CoMET) models were introduced. These decoder-only transformer models were pretrained on a massive dataset representing 118 million patients and 115 billion discrete medical events. A comprehensive scaling-law study was conducted for medical event data, establishing a methodology for pretraining and revealing power-law scaling relationships for compute, tokens, and model size. Subsequently, a series of compute-optimal models with up to 1 billion parameters were pretrained. Conditioned on a patient's real-world history, CoMET can autoregressively generate the next medical event, simulating patient health timelines. The study encompassed 78 real-world tasks including diagnosis prediction, disease prognosis, and healthcare operations. Remarkably, CoMET outperformed or matched task-specific supervised models on these tasks without requiring task-specific fine-tuning or few-shot examples. The predictive power of CoMET consistently improved as the model and pretraining scale increased. Results demonstrate that CoMET effectively captures complex clinical dynamics as a generative medical event foundation model. This framework provides an extensible and generalizable approach to support clinical decision-making, streamline healthcare operations, and enhance patient outcomes. 
Furthermore, the introduction of Epic Cosmos has addressed challenges in leveraging real-world data for personalized medicine at scale by aggregating de-identified longitudinal health records across multiple health systems. The platform unifies various clinical data types to support patient care and accelerate scientific discovery while delivering actionable insights to clinicians at the point of care through features like Cosmos Median Length of Stay and Best Care Choices for My Patient™. Despite the vast potential of Cosmos data in informing healthcare decisions and research priorities, such as understanding trends in healthcare utilization or investigating rare diseases, answering specific clinical questions still requires manual effort in crafting custom cohort definitions and feature-engineering pipelines. Enabling routine clinical decision-making with personalized medicine at scale using real-world evidence (RWE) demands tools that can learn from integrated patient records and efficiently answer complex medical inquiries across diverse contexts.
- Personalized medicine at scale requires distilling insights from longitudinal patient journeys, which are sequences of medical events.
- Pretrained foundation models on large-scale medical event data, such as the Cosmos Medical Event Transformer (CoMET) models, offer a promising direction for scaling real-world evidence generation and generalizing to diverse downstream tasks.
- CoMET, pretrained on a dataset representing 118 million patients and 115 billion discrete medical events, can autoregressively generate the next medical event based on a patient's history. It outperformed or matched task-specific supervised models across 78 real-world tasks without requiring task-specific fine-tuning or few-shot examples.
- The study established a methodology for pretraining transformer models and revealed power-law scaling relationships for compute, tokens, and model size in medical event data.
- CoMET effectively captures complex clinical dynamics as a generative medical event foundation model, supporting clinical decision-making, streamlining healthcare operations, and enhancing patient outcomes.
Summary
- Personalized medicine means using information from a person's medical history to help them get better.
- Scientists have created a special computer program called CoMET that can predict what might happen next in someone's medical journey.
- CoMET was trained on data from millions of patients and billions of medical events, and it can make predictions without needing extra training.
- The study showed how to train these computer models and found patterns in how they work with medical data.
- CoMET helps doctors make better decisions, run hospitals more smoothly, and improve patient health.
Definitions
- Personalized medicine: Tailoring medical treatment to individual characteristics of each patient.
- Longitudinal: Relating to data collected repeatedly from the same individuals over a long period of time.
- Pretrained: Describes a model that has been trained on a large dataset before being applied to specific tasks.
- Autoregressively: Generating each item in a sequence based on the items that came before it.
- Generative: Capable of producing new content or information.
Introduction
In the world of healthcare, personalized medicine has emerged as a promising approach to providing safe and effective medical care. This method involves tailoring treatments and interventions to individual patients based on their unique characteristics, such as genetics, lifestyle, and medical history. However, implementing personalized medicine at scale presents challenges in distilling insights from longitudinal patient journeys – essentially sequences of medical events.
To address this issue, researchers have turned to foundation models pretrained on large-scale medical event data. These models offer a promising direction for scaling real-world evidence (RWE) generation and generalizing to diverse downstream tasks. This article examines a recent research paper that introduces the Cosmos Medical Event Transformer (CoMET) models – decoder-only transformer models pretrained on a massive dataset representing 118 million patients and 115 billion discrete medical events.
The Dataset: Epic Cosmos
The CoMET models were trained using Epic Cosmos – a dataset comprising de-identified longitudinal health records for over 300 million unique patients and 16.3 billion encounters from 310 health systems. This platform addresses challenges in leveraging real-world data for personalized medicine at scale by aggregating de-identified longitudinal health records across multiple health systems.
Epic Cosmos unifies various clinical data types to support patient care and accelerate scientific discovery while delivering actionable insights to clinicians at the point of care through features like Cosmos Median Length of Stay and Best Care Choices for My Patient™. It provides an extensible and generalizable approach to support clinical decision-making, streamline healthcare operations, and enhance patient outcomes.
Despite its vast potential in informing healthcare decisions and research priorities such as understanding trends in healthcare utilization or investigating rare diseases, answering specific clinical questions still requires manual effort in crafting custom cohort definitions and feature-engineering pipelines.
The Study: Pretraining CoMET Models
The researchers pretrained CoMET models on a massive dataset representing 118 million patients and 115 billion discrete medical events. They also conducted a comprehensive scaling-law study for medical event data, establishing a methodology for pretraining and revealing power-law scaling relationships for compute, tokens, and model size.
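A power-law relationship of the form y = a * x^b appears as a straight line when both axes are on a logarithmic scale, so its exponent can be estimated with ordinary least squares on the logarithms. The sketch below illustrates only that general idea. The function name `fit_power_law` and the compute and loss values are made up for illustration; they are not the study's fitting procedure or its results.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on (log x, log y).

    Illustrative helper (hypothetical name), not the paper's method.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope of the best-fit line in log-log space is the exponent b.
    cov = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    var = sum((lx - mean_x) ** 2 for lx in log_x)
    b = cov / var
    # Intercept of the line is log(a).
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic values for illustration only (NOT results from the study):
# losses generated from an assumed law loss = 5.0 * compute**-0.05.
compute = [1e15, 1e16, 1e17, 1e18]
loss = [5.0 * c ** -0.05 for c in compute]

a, b = fit_power_law(compute, loss)
print(f"coefficient a = {a:.3f}, exponent b = {b:.3f}")
```

Because the synthetic points lie exactly on a power law, the fit recovers the assumed coefficient and exponent. Real scaling-law data is noisy, and a fit of this kind is only as trustworthy as the range of scales it covers.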
Subsequently, a series of compute-optimal models with up to 1 billion parameters were pretrained. Conditioned on a patient's real-world history, CoMET can autoregressively generate the next medical event, simulating patient health timelines.
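Autoregressive generation can be sketched as a loop that samples one event, appends it to the history, and repeats. The example below is a toy under stated assumptions: the event names, the probability table `TOY_NEXT_EVENT_PROBS`, and both function names are hypothetical. The stand-in "model" looks only at the most recent event, whereas a decoder-only transformer such as CoMET conditions on the full history.

```python
import random

# Hypothetical toy vocabulary and next-event probabilities (illustration only).
TOY_NEXT_EVENT_PROBS = {
    "ENCOUNTER_START": {"DX_HYPERTENSION": 0.5, "LAB_A1C": 0.5},
    "DX_HYPERTENSION": {"RX_LISINOPRIL": 0.7, "ENCOUNTER_END": 0.3},
    "LAB_A1C": {"DX_DIABETES": 0.4, "ENCOUNTER_END": 0.6},
    "DX_DIABETES": {"RX_METFORMIN": 0.8, "ENCOUNTER_END": 0.2},
    "RX_LISINOPRIL": {"ENCOUNTER_END": 1.0},
    "RX_METFORMIN": {"ENCOUNTER_END": 1.0},
}


def next_event_distribution(history):
    """Stand-in for a trained model: maps a history to next-event probabilities."""
    return TOY_NEXT_EVENT_PROBS[history[-1]]


def simulate_timeline(history, max_new_events, rng):
    """Extend a history one sampled event at a time (autoregressive loop)."""
    timeline = list(history)
    for _ in range(max_new_events):
        if timeline[-1] == "ENCOUNTER_END":
            break  # stop once the toy end-of-sequence event is reached
        probs = next_event_distribution(timeline)
        events = list(probs)
        weights = [probs[e] for e in events]
        # Sample the next event, then feed it back in as part of the history.
        timeline.append(rng.choices(events, weights=weights, k=1)[0])
    return timeline


rng = random.Random(0)  # fixed seed so the sketch is reproducible
timeline = simulate_timeline(["ENCOUNTER_START"], max_new_events=10, rng=rng)
print(" -> ".join(timeline))
```

Each call produces one possible future. Calling `simulate_timeline` repeatedly yields a set of alternative timelines for the same patient history.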
Results
The study encompassed 78 real-world tasks including diagnosis prediction, disease prognosis, and healthcare operations. Remarkably, CoMET outperformed or matched task-specific supervised models on these tasks without requiring task-specific fine-tuning or few-shot examples. The predictive power of CoMET consistently improved as the model and pretraining scale increased.
These results demonstrate that CoMET effectively captures complex clinical dynamics as a generative medical event foundation model. This framework provides an extensible and generalizable approach to support clinical decision-making, streamline healthcare operations, and enhance patient outcomes.
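This article does not describe how generated timelines are turned into task predictions. One plausible approach, stated here purely as an assumption, is to simulate many futures for a patient and report the fraction in which an event of interest occurs. In the sketch below, `toy_simulate`, its 0.2 readmission rate, and the event names are all hypothetical placeholders for a real generative model.

```python
import random


def toy_simulate(history, rng):
    """Hypothetical stand-in for a generative model: returns future events.

    Assumed toy rule: a readmission occurs with probability 0.2.
    """
    future = []
    if rng.random() < 0.2:
        future.append("READMISSION")
    future.append("END")
    return future


def estimate_event_probability(simulate, history, target_event, n_samples, rng):
    """Monte Carlo estimate: share of simulated futures containing target_event."""
    hits = 0
    for _ in range(n_samples):
        if target_event in simulate(history, rng):
            hits += 1
    return hits / n_samples


rng = random.Random(0)  # fixed seed for reproducibility
p_readmit = estimate_event_probability(
    toy_simulate, ["DISCHARGE"], "READMISSION", n_samples=5000, rng=rng
)
print(f"estimated readmission probability: {p_readmit:.3f}")
```

A design like this needs no task-specific training, since the same simulator can be queried for any event by changing `target_event`. The trade-off is that the estimate carries sampling noise that shrinks only as more timelines are simulated.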
Conclusion
In conclusion, the introduction of Epic Cosmos has addressed challenges in leveraging real-world data for personalized medicine at scale by aggregating de-identified longitudinal health records across multiple health systems. The use of pretrained foundation models like CoMET offers a promising direction for scaling RWE generation and generalizing to diverse downstream tasks.
This research paper highlights the potential of large-scale medical event data to improve healthcare outcomes through personalized medicine at scale. With further advances in technology and access to comprehensive datasets like Epic Cosmos, we can expect significant progress in this field – ultimately leading to better patient care worldwide.