Evaluation of Large Language Models for Structured Data Extraction From Interstitial Lung Disease Clinical Notes: Comparative Study

By
Stephanie Ji Chen
Manoj Venkat Maddali
Curtis Langlotz
Christian Bluethgen
Jonathan Chen
Rishi Raj
June 26, 2026

Journal of Medical Internet Research (JMIR)

Overview

This study evaluates the performance of large language models (LLMs) in extracting structured binary data from clinical notes related to interstitial lung disease (ILD).

Background

Clinical notes often hold critical information in unstructured free text, which makes data extraction labor-intensive and costly. The problem is particularly pronounced in ILD, where documentation can be verbose and ambiguous.

Data Highlights

No numerical data or trial data were provided in the source material.

Key Findings

LLMs can extract binary data (yes/no answers) from clinical notes effectively.
Prompt engineering is crucial for obtaining accurate responses from LLMs.
The study involved a cohort of patients from the Stanford Interstitial Lung Disease Clinic.
Ethical approval was obtained for the retrospective chart review study.
LLMs may facilitate the creation and maintenance of ILD registries.
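
The yes/no extraction and prompt engineering points above can be illustrated with a minimal sketch. It is not the study's actual prompt or pipeline: the template wording, the function names `build_prompt` and `parse_binary_answer`, and the example question are all hypothetical, and no real LLM API is called. It only shows one way to constrain a model to a binary answer and to parse the reply defensively.

```python
import re
from typing import Optional

# Hypothetical prompt template (an assumption, not the study's prompt).
# It restricts the model to a one-word yes/no answer.
PROMPT_TEMPLATE = (
    "You are reviewing a clinical note from an interstitial lung disease clinic.\n"
    "Answer the question with exactly one word: yes or no.\n\n"
    "Question: {question}\n\n"
    "Clinical note:\n{note}\n\n"
    "Answer:"
)


def build_prompt(question: str, note: str) -> str:
    """Fill the template with one binary question and one clinical note."""
    return PROMPT_TEMPLATE.format(question=question.strip(), note=note.strip())


def parse_binary_answer(response: str) -> Optional[bool]:
    """Map a model reply to True (yes), False (no), or None (unparseable).

    Only a leading 'yes' or 'no' is accepted, optionally wrapped in quotes
    or asterisks; anything else is returned as None for manual review.
    """
    match = re.match(r"\s*[\"'*]*\s*(yes|no)\b", response, flags=re.IGNORECASE)
    if match is None:
        return None
    return match.group(1).lower() == "yes"
```

Returning `None` for replies that do not start with a clear yes or no keeps ambiguous outputs out of the structured field, so they can be routed to a human reviewer instead of being silently coerced.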

Clinical Implications

Using LLMs in clinical settings may streamline data extraction from unstructured clinical notes.

Conclusion

The study demonstrates the potential of LLMs to assist in the extraction of structured data from clinical notes.

