Natural language processing to identify lupus nephritis phenotype in electronic health records

AI-generated keywords: Systemic Lupus Erythematosus

AI-generated Key Points

Systemic lupus erythematosus (SLE) is a rare autoimmune disorder characterized by unpredictable flares and remission with diverse manifestations.
Lupus nephritis, a major manifestation of SLE, can cause organ damage and mortality.
Accurate identification of lupus nephritis is crucial for large cohort observational studies and clinical trials.
Procedure codes and structured data in electronic health records (EHRs) can help recognize lupus nephritis, but critical information like histologic reports and medical history narratives requires sophisticated text processing.
Researchers developed algorithms to identify lupus nephritis using EHR data, with and without natural language processing (NLP).
Four algorithms were created: a rule-based algorithm using only structured data as the baseline, and three algorithms utilizing different NLP models.
The best-performing NLP model improved F measure over the baseline algorithm in both the NMEDW (0.41 vs 0.79) and VUMC (0.62 vs 0.96) datasets.
NLP-based algorithms have the potential to accurately identify lupus nephritis in EHRs.
This has important implications for recruitment, study design, and analysis in large cohort observational studies and clinical trials focused on SLE.

Authors: Yu Deng, Jennifer A. Pacheco, Anh Chung, Chengsheng Mao, Joshua C. Smith, Juan Zhao, Wei-Qi Wei, April Barnado, Chunhua Weng, Cong Liu, Adam Cordon, Jingzhi Yu, Yacob Tedla, Abel Kho, Rosalind Ramsey-Goldman, Theresa Walunas, Yuan Luo

arXiv: 2112.10821v1 (cs.LG)

License: CC BY 4.0

Abstract: Systemic lupus erythematosus (SLE) is a rare autoimmune disorder characterized by an unpredictable course of flares and remission with diverse manifestations. Lupus nephritis, one of the major disease manifestations of SLE for organ damage and mortality, is a key component of lupus classification criteria. Accurately identifying lupus nephritis in electronic health records (EHRs) would therefore benefit large cohort observational studies and clinical trials where characterization of the patient population is critical for recruitment, study design, and analysis. Lupus nephritis can be recognized through procedure codes and structured data, such as laboratory tests. However, other critical information documenting lupus nephritis, such as histologic reports from kidney biopsies and prior medical history narratives, requires sophisticated text processing to mine information from pathology reports and clinical notes. In this study, we developed algorithms to identify lupus nephritis with and without natural language processing (NLP) using EHR data. We developed four algorithms: a rule-based algorithm using only structured data (baseline algorithm) and three algorithms using different NLP models. The three NLP models are based on regularized logistic regression and use different sets of features including positive mention of concept unique identifiers (CUIs), number of appearances of CUIs, and a mixture of three components, respectively. The baseline algorithm and the best-performing NLP algorithm were externally validated on a dataset from Vanderbilt University Medical Center (VUMC). Our best-performing NLP model, incorporating features from structured data, regular expression concepts, and mapped CUIs, improved F measure in both the NMEDW (0.41 vs 0.79) and VUMC (0.62 vs 0.96) datasets compared to the baseline lupus nephritis algorithm.

Submitted to arXiv on 20 Dec. 2021

Results of the summarizing process for the arXiv paper: 2112.10821v1

Comprehensive Summary

Systemic lupus erythematosus (SLE) is a rare autoimmune disorder characterized by an unpredictable course of flares and remission with diverse manifestations. Lupus nephritis, a major manifestation of SLE, can cause organ damage and mortality, so identifying it accurately matters for large cohort observational studies and clinical trials. Procedure codes and structured data such as laboratory tests can help recognize lupus nephritis in electronic health records (EHRs), but critical information such as histologic reports from kidney biopsies and medical history narratives requires sophisticated text processing.

In this study, the researchers developed algorithms to identify lupus nephritis using EHR data, with and without natural language processing (NLP). They created four algorithms: a rule-based baseline that uses only structured data, and three NLP models based on regularized logistic regression. The three NLP models differ in their features: positive mentions of concept unique identifiers (CUIs), the number of appearances of each CUI, and a mixture of three components.

The baseline and the best-performing NLP algorithm were externally validated on a dataset from Vanderbilt University Medical Center (VUMC). The best-performing NLP model, which combined features from structured data, regular expression concepts, and mapped CUIs, improved F measure over the baseline in both the NMEDW dataset (0.41 vs 0.79) and the VUMC dataset (0.62 vs 0.96).

These results suggest that NLP-based algorithms can identify lupus nephritis in EHRs more accurately than structured data alone, which matters for recruitment, study design, and analysis in large cohort observational studies and clinical trials focused on SLE. Mining pathology reports and medical history narratives gives researchers a fuller characterization of patients with the lupus nephritis phenotype.
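The three feature sets described in the summary can be illustrated with a small sketch. This is not the authors' code: the CUI strings, the indicator names (`has_biopsy_code`, `regex_nephritis_mention`), and the choice to use binary CUI indicators inside the mixed set are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical input: CUIs found as positive (non-negated) mentions in each
# patient's notes. "CUI_A" and "CUI_B" are placeholders, not real UMLS codes.
patient_cuis = {
    "p1": ["CUI_A", "CUI_A", "CUI_B"],
    "p2": ["CUI_B"],
    "p3": [],
}

# Hypothetical structured-data and regular-expression indicators per patient.
extra_names = ["has_biopsy_code", "regex_nephritis_mention"]
extras = {
    "p1": {"has_biopsy_code": 1, "regex_nephritis_mention": 1},
    "p2": {"has_biopsy_code": 0, "regex_nephritis_mention": 1},
    "p3": {"has_biopsy_code": 0, "regex_nephritis_mention": 0},
}


def build_vocabulary(cuis_by_patient):
    """Sorted list of every CUI seen in the cohort (fixes the column order)."""
    return sorted({c for cuis in cuis_by_patient.values() for c in cuis})


def binary_features(cuis, vocab):
    """Feature set 1: 1 if the CUI is positively mentioned at least once."""
    present = set(cuis)
    return [1 if c in present else 0 for c in vocab]


def count_features(cuis, vocab):
    """Feature set 2: number of appearances of each CUI."""
    counts = Counter(cuis)
    return [counts[c] for c in vocab]


def mixed_features(cuis, vocab, patient_extras, names):
    """Feature set 3 (assumed layout): structured and regex indicators,
    followed by binary CUI indicators."""
    return [patient_extras[n] for n in names] + binary_features(cuis, vocab)


vocab = build_vocabulary(patient_cuis)
for pid, cuis in patient_cuis.items():
    print(pid,
          binary_features(cuis, vocab),
          count_features(cuis, vocab),
          mixed_features(cuis, vocab, extras[pid], extra_names))
```

Each function returns one row of a feature matrix that a classifier such as regularized logistic regression could consume.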

Layman's Summary

Systemic lupus erythematosus (SLE) is a rare disease where the body's immune system attacks itself and causes different symptoms that come and go. Lupus nephritis is a serious kidney problem caused by SLE that can damage organs and even cause death. It is important to find lupus nephritis accurately for big studies and clinical trials. Electronic health records (EHRs) can help find lupus nephritis, but some information is written as free text and needs special computer processing. Scientists made computer programs to find lupus nephritis in EHRs, using different methods. The best program improved a lot compared to the basic one. These programs can help with big studies and clinical trials for SLE.

Definitions

- Systemic lupus erythematosus (SLE): A rare disease where the body's immune system attacks itself.
- Autoimmune disorder: When the immune system mistakenly attacks healthy cells in the body.
- Flares: Periods when symptoms of a disease become worse.
- Remission: Periods when symptoms of a disease improve or disappear.
- Manifestations: Different ways a disease shows up or affects the body.
- Lupus nephritis: Kidney problems caused by systemic lupus erythematosus (SLE).
- Organ damage: Harm or injury to organs in the body.
- Mortality: Death, or how often a disease causes death.
- Accurate identification: Finding something correctly without mistakes.
- Cohort observational studies: Studies that follow groups of people over time to learn about their

Understanding Lupus Nephritis with Natural Language Processing

Systemic lupus erythematosus (SLE) is a rare autoimmune disorder characterized by unpredictable flares and remission with diverse manifestations. One of its major manifestations is lupus nephritis, which can cause organ damage and mortality. Identifying lupus nephritis accurately is therefore essential for large cohort observational studies and clinical trials, where characterizing the patient population drives recruitment, study design, and analysis. In this study, researchers developed algorithms to identify lupus nephritis using electronic health records (EHRs). They created four algorithms: a rule-based algorithm using only structured data as the baseline, and three algorithms utilizing different natural language processing (NLP) models. The NLP models were based on regularized logistic regression and differed in their features: positive mention of concept unique identifiers (CUIs), the number of appearances of CUIs, and a mixture of three components. The baseline and the best-performing NLP algorithm were externally validated on a dataset from Vanderbilt University Medical Center (VUMC), and results are reported for both the NMEDW and VUMC datasets.
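To make "regularized logistic regression" concrete, here is a minimal sketch trained with batch gradient descent. The source does not say which penalty the authors used; the L2 penalty, the hyperparameters (`lam`, `lr`, `epochs`), and the toy data are assumptions for illustration only.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Probability that feature vector x belongs to the positive class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logreg_l2(X, y, lam=0.1, lr=0.5, epochs=500):
    """Minimize mean log-loss + (lam / 2) * ||w||^2 by gradient descent.
    The intercept b is not penalized."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(d):
            # Data gradient plus the L2 shrinkage term lam * w[j].
            w[j] -= lr * (grad_w[j] / n + lam * w[j])
        b -= lr * grad_b / n
    return w, b


# Toy data (hypothetical): the first feature determines the label,
# the second feature is uninformative.
X = [[1, 1], [1, 0], [0, 0], [0, 1]]
y = [1, 1, 0, 0]
w, b = train_logreg_l2(X, y)
print("weights:", w, "intercept:", b)
print("P(y=1 | x=[1, 0]):", predict_proba(w, b, [1, 0]))
```

The penalty shrinks weights toward zero, which helps when the number of CUI features is large relative to the number of labeled patients.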

Baseline Algorithm

The baseline algorithm is rule-based and uses only structured data, such as procedure codes and laboratory tests, to recognize lupus nephritis in EHRs. It was compared with three NLP-based algorithms, the best of which combined features from structured data, regular expression concepts, and mapped CUIs.
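A rule-based classifier over structured data might look like the sketch below. The actual rules of the paper's baseline are not given here, so everything in this example is hypothetical: the code names (`DX_SLE`, `PROC_KIDNEY_BIOPSY`), the record fields, the proteinuria threshold, and the way the rules are combined.

```python
from dataclasses import dataclass, field

# Placeholder code sets; real diagnosis and procedure codes would go here.
SLE_DIAGNOSIS_CODES = {"DX_SLE"}
KIDNEY_BIOPSY_CODES = {"PROC_KIDNEY_BIOPSY"}


@dataclass
class PatientRecord:
    """Hypothetical structured-data view of one patient."""
    diagnosis_codes: set = field(default_factory=set)
    procedure_codes: set = field(default_factory=set)
    urine_protein_g_per_day: list = field(default_factory=list)


def baseline_flag(rec, protein_threshold=0.5):
    """Flag a patient as a possible lupus nephritis case when an SLE diagnosis
    code co-occurs with a kidney biopsy code or an elevated urine protein
    result. The threshold is illustrative, not taken from the paper."""
    has_sle = bool(rec.diagnosis_codes & SLE_DIAGNOSIS_CODES)
    has_biopsy = bool(rec.procedure_codes & KIDNEY_BIOPSY_CODES)
    high_protein = any(v > protein_threshold for v in rec.urine_protein_g_per_day)
    return has_sle and (has_biopsy or high_protein)


case = PatientRecord({"DX_SLE"}, {"PROC_KIDNEY_BIOPSY"}, [])
lab_only = PatientRecord({"DX_SLE"}, set(), [0.2, 0.9])
no_sle = PatientRecord(set(), {"PROC_KIDNEY_BIOPSY"}, [1.2])
print(baseline_flag(case), baseline_flag(lab_only), baseline_flag(no_sle))
```

Rules like these cannot see a biopsy result described only in a pathology report, which is the gap the NLP features are meant to fill.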

Results

The best-performing NLP model improved F measure over the baseline algorithm in both the NMEDW dataset (0.41 vs 0.79) and the VUMC dataset (0.62 vs 0.96). This highlights the potential of NLP-based algorithms to identify lupus nephritis accurately from EHRs, improving patient characterization for recruitment, study design, and analysis in large cohort observational studies and clinical trials focused on SLE.
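For readers unfamiliar with the metric, the sketch below computes precision, recall, and F measure from gold labels and predictions. It assumes the balanced F1 variant (the harmonic mean of precision and recall); the source says only "F measure". The labels are made-up toy data, not results from the paper.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Because F1 is a harmonic mean, it is high only when precision and recall are both high, so it penalizes an algorithm that finds cases but mislabels many non-cases, and vice versa.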

Conclusion

This paper demonstrates that sophisticated text processing can mine pathology reports and medical history narratives to identify the lupus nephritis phenotype more accurately than structured data alone.

Created on 30 Sep. 2023
