Leveraging large language models to extract smoking history from clinical notes for lung cancer surveillance - Takeaways - MDSpire

Leveraging large language models to extract smoking history from clinical notes for lung cancer surveillance

By Ingrid Luo, Anna Graber-Naidich, Mengrui Zhang, Rakshit Kaushik, Grant M. Nieda, Tony Chen, Bo Gu, Eunji Choi, Victoria Y. Ding, Fatma Gunturkun, Mina Satoyoshi, Archana Bhat, Tae Yoon Lee, Chloe C. Su, Timothy John Ellis-Caleo, A. Solomon Henry, Manisha Desai, Leah M. Backhus, Natalie S. Lui, Ann Leung, Joel W. Neal, Allison W. Kurian, Curtis P. Langlotz, Heather A. Wakelee, Su-Ying Liang, Aparajita Khan, and Summer S. Han

November 28, 2025


  1. Accurate smoking documentation in EHRs is essential for risk assessment and monitoring in lung cancer patients.

  2. Generative LLMs demonstrated over 96% accuracy in extracting smoking history from clinical notes, outperforming BERT-based models.

  3. The developed framework combines LLMs with rule-based techniques to enhance the quality of longitudinal smoking data.

  4. Deployment of the LLM framework on 79,408 notes improved lung cancer surveillance compared to NCCN Guidelines.

  5. This study highlights the potential of LLMs to improve smoking history documentation, benefiting broader clinical applications.
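As a rough illustration of how rule-based cleaning might sit on top of LLM output (the third takeaway), the sketch below harmonizes per-note smoking-status labels over time. The `NoteLabel` type, the `harmonize` function, the three status values, and the single rule shown (a "never" label that follows any "current" or "former" label is treated as "former") are all assumptions made for this example; the summary does not describe the study's actual rules.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class NoteLabel:
    """Hypothetical per-note label, e.g. as returned by an LLM extraction step."""
    note_date: date
    status: str  # assumed label set: "never", "former", "current"


def harmonize(labels):
    """Return labels in date order with one illustrative consistency rule applied.

    Assumed rule (not taken from the study): once a patient has any
    "current" or "former" label, a later "never" label is inconsistent
    and is rewritten as "former".
    """
    ordered = sorted(labels, key=lambda lab: lab.note_date)
    ever_smoked = False
    cleaned = []
    for lab in ordered:
        status = lab.status
        if status in ("current", "former"):
            ever_smoked = True
        elif status == "never" and ever_smoked:
            status = "former"
        cleaned.append(NoteLabel(lab.note_date, status))
    return cleaned


labels = [
    NoteLabel(date(2020, 1, 1), "never"),
    NoteLabel(date(2019, 1, 1), "current"),
    NoteLabel(date(2021, 1, 1), "former"),
]
cleaned_statuses = [lab.status for lab in harmonize(labels)]
```

A real pipeline would likely need more rules (quit dates, pack-years, conflicting notes on the same day); this only shows the general shape of a post-processing pass over longitudinal labels.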
