MDS-ED: Multimodal Decision Support in the Emergency Department -- a Benchmark Dataset for Diagnoses and Deterioration Prediction in Emergency Medicine

AI-generated keywords: Multimodal Decision Support, Emergency Department, MIMIC-IV, Dataset, Benchmarking Protocol, AI Models

AI-generated Key Points

  • Introduction of a dataset and benchmarking protocol for evaluating multimodal decision support in the emergency department (ED)
  • Dataset based on MIMIC-IV with diverse data modalities collected within the first 1.5 hours of patient arrival
  • Analysis covers 1443 clinical labels for predicting diagnoses with ICD-10 codes and forecasting patient deterioration
  • Multimodal diagnostic model achieves a statistically significant AUROC above 0.8 for 357 of 1428 conditions, including cardiac issues like myocardial infarction and non-cardiac conditions such as renal disease and diabetes
  • Deterioration model achieves a statistically significant AUROC above 0.8 for 13 of 15 targets, including cardiac arrest, mechanical ventilation, ICU admission, and short- and long-term mortality
  • Incorporating raw ECG waveform data significantly improves model performance
  • Potential to revolutionize decision-making in acute and emergency medicine: AI models could enable early diagnosis, predict admissions to different care units, estimate survival rates, and provide cost-effective diagnoses
  • Methodology involves linking ECG waveforms from MIMIC-IV-ECG to clinical features from MIMIC-IV and MIMIC-IV-ED to predict discharge diagnoses and patient deterioration during an ED visit

Authors: Juan Miguel Lopez Alcaraz, Nils Strodthoff

14 pages, 1 figure, code available under https://github.com/AI4HealthUOL/MDS-ED
License: CC BY 4.0

Abstract: Background: Benchmarking medical decision support algorithms often struggles due to limited access to datasets, narrow prediction tasks, and restricted input modalities. These limitations affect their clinical relevance and performance in high-stakes areas like emergency care, complicating replication, validation, and improvement of benchmarks. Methods: We introduce a dataset based on MIMIC-IV, benchmarking protocol, and initial results for evaluating multimodal decision support in the emergency department (ED). We use diverse data modalities from the first 1.5 hours of patient arrival, including demographics, biometrics, vital signs, lab values, and electrocardiogram waveforms. We analyze 1443 clinical labels across two contexts: predicting diagnoses with ICD-10 codes and forecasting patient deterioration. Results: Our multimodal diagnostic model achieves an AUROC score over 0.8 in a statistically significant manner for 357 out of 1428 conditions, including cardiac issues like myocardial infarction and non-cardiac conditions such as renal disease and diabetes. The deterioration model scores above 0.8 in a statistically significant manner for 13 out of 15 targets, including critical events like cardiac arrest and mechanical ventilation, ICU admission as well as short- and long-term mortality. Incorporating raw waveform data significantly improves model performance, which represents one of the first robust demonstrations of this effect. Conclusions: This study highlights the uniqueness of our dataset, which encompasses a wide range of clinical tasks and utilizes a comprehensive set of features collected early during the emergency after arriving at the ED. The strong performance, as evidenced by high AUROC scores across diagnostic and deterioration targets, underscores the potential of our approach to revolutionize decision-making in acute and emergency medicine.
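The abstract reports AUROC scores that exceed 0.8 "in a statistically significant manner" but does not say here how significance is assessed. One common way to check such a claim is a percentile bootstrap: resample the test set with replacement, recompute the AUROC each time, and require the lower end of the confidence interval to lie above the threshold. The following is a minimal sketch under that assumption, not the authors' procedure; the function names and the defaults (`n_boot`, `alpha`, `seed`) are hypothetical.

```python
import random


def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) formulation; ties get average ranks."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at sorted position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def bootstrap_auroc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUROC (assumed procedure)."""
    auroc(labels, scores)  # fails early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


def significantly_above(labels, scores, threshold=0.8, **kwargs):
    """True if the lower confidence bound of the AUROC exceeds `threshold`."""
    lower, _ = bootstrap_auroc_ci(labels, scores, **kwargs)
    return lower > threshold
```

Applied per label, a check of this kind would count a condition as passing only when the whole interval, not just the point estimate, sits above 0.8.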

Submitted to arXiv on 25 Jul. 2024

Results of the summarizing process for the arXiv paper: 2407.17856v1

The researchers introduce a dataset and benchmarking protocol for evaluating multimodal decision support in the emergency department (ED), addressing common obstacles in benchmarking medical decision support algorithms: limited dataset access, narrow prediction tasks, and restricted input modalities. The dataset is based on MIMIC-IV and includes diverse data modalities collected within the first 1.5 hours of patient arrival: demographics, biometrics, vital signs, lab values, and electrocardiogram (ECG) waveforms. It is built by linking ECG waveforms from MIMIC-IV-ECG to clinical features from MIMIC-IV and MIMIC-IV-ED. The analysis covers 1443 clinical labels across two contexts: predicting discharge diagnoses coded in ICD-10 (1428 labels) and forecasting patient deterioration (15 targets).

The multimodal diagnostic model achieves a statistically significant AUROC above 0.8 for 357 of the 1428 conditions, including cardiac issues like myocardial infarction and non-cardiac conditions such as renal disease and diabetes. The deterioration model does so for 13 of the 15 targets, including cardiac arrest, mechanical ventilation, ICU admission, and short- and long-term mortality. Incorporating raw waveform data significantly improves model performance, which the authors describe as one of the first robust demonstrations of this effect.

The study emphasizes that the dataset covers a wide range of clinical tasks using features available early in the ED visit, and argues that this approach has the potential to revolutionize decision-making in acute and emergency medicine. It also discusses how AI models can enhance acute care by enabling early diagnosis, predicting admissions to different care units, estimating survival rates, and providing cost-effective diagnoses. Although AI-related publications in healthcare have grown exponentially, many studies are limited by a narrow scope or a reliance on costly diagnostic tests. Overall, the work provides a benchmark for developing and comparing AI models that use multimodal data from early in the ED visit.
Created on 30 Sep. 2025
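
The linking step described in the summary (ECG records matched to ED visits, restricted to the first 1.5 hours after arrival) can be sketched as follows. This is an illustrative sketch only, not the authors' pipeline: the record types, the field names (`subject_id`, `stay_id`, `intime`, `study_id`, `ecg_time`), and the inclusive window boundaries are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Observation window from the summary: first 1.5 hours after ED arrival.
WINDOW = timedelta(hours=1.5)


@dataclass(frozen=True)
class EdStay:
    """One ED visit (hypothetical minimal record)."""
    subject_id: int
    stay_id: int
    intime: datetime  # arrival time at the ED


@dataclass(frozen=True)
class EcgRecord:
    """One ECG recording (hypothetical minimal record)."""
    subject_id: int
    study_id: int
    ecg_time: datetime


def link_ecgs_to_stays(stays, ecgs, window=WINDOW):
    """Map each stay_id to the study_ids of ECGs recorded within `window` of arrival.

    Stays without a matching ECG are omitted. Boundaries are inclusive (an assumption).
    """
    by_subject = {}
    for ecg in ecgs:
        by_subject.setdefault(ecg.subject_id, []).append(ecg)

    linked = {}
    for stay in stays:
        matches = [
            e for e in by_subject.get(stay.subject_id, [])
            if stay.intime <= e.ecg_time <= stay.intime + window
        ]
        matches.sort(key=lambda e: e.ecg_time)  # earliest ECG first
        if matches:
            linked[stay.stay_id] = [e.study_id for e in matches]
    return linked
```

Matching on the patient identifier first and then filtering by time keeps ECGs from a patient's other visits out of the sample, which matters when the same patient appears in the ED more than once.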

