Improving disease misclassification and prevalence estimates by linking primary and secondary care electronic health records: an illustration from arthritis research - Summary - MDSpire

By Belay Birlie Yimer, Fangyuan Zhang, Jenny Humphreys, Mark Lunt, Meghna Jani, John McBeth, William G Dixon

September 17, 2025

Objective:

To examine and adjust for misclassification in disease prevalence estimates by linking primary care records with text-mined outpatient letters, using psoriatic arthritis (PsA) as an illustration of how misclassification affects the accuracy of health data.

Key Findings:
  • Observed prevalence of PsA from primary care codes alone was 0.13% (95% CI, 0.11%-0.15%), a substantial underestimate.
  • Primary care codes identified 188 true PsA cases but missed 196 hospital-diagnosed cases, a more than 2-fold underestimation of case numbers that points to gaps in primary care coding.
  • Prevalence adjusted for misclassification was 0.25% (95% CI, 0.21%-0.28%), roughly double the observed estimate.
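
The case counts above can be turned into a rough sensitivity figure for the primary care codes. The sketch below is only illustrative arithmetic on the two reported counts. It assumes both counts are measured against the same hospital-based reference standard, and the function names are hypothetical. It ignores false positives, so it is not the study's adjusted-prevalence calculation.

```python
# Counts reported in the key findings.
TRUE_CASES_CODED = 188  # true PsA cases identified by primary care codes
MISSED_CASES = 196      # hospital-diagnosed PsA cases with no primary care code


def code_sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of confirmed cases that the primary care codes capture."""
    return true_positives / (true_positives + false_negatives)


def underestimation_factor(true_positives: int, false_negatives: int) -> float:
    """Confirmed cases per case that the primary care codes capture."""
    return (true_positives + false_negatives) / true_positives


sensitivity = code_sensitivity(TRUE_CASES_CODED, MISSED_CASES)
factor = underestimation_factor(TRUE_CASES_CODED, MISSED_CASES)

print(f"sensitivity of primary care codes: {sensitivity:.2f}")
print(f"underestimation factor: {factor:.2f}")
```

Under these assumptions the codes capture about 49% of confirmed cases (188 of 384), and the count of confirmed cases is about 2.04 times the count the codes capture, which matches the "more than 2-fold" figure.
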
Interpretation:

Linking primary care records with hospital records identified both false-positive and false-negative PsA codes in primary care, enabling more accurate prevalence estimates.
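
How linkage exposes both kinds of error can be sketched with simple set operations. This is a minimal illustration and not the authors' pipeline. The patient identifiers, the three input sets, and the function name `classify_linked_records` are all hypothetical, and the sketch assumes the text-mined hospital letters serve as the reference standard for linked patients.

```python
def classify_linked_records(
    coded: set[str],      # patients with a PsA code in primary care
    linked: set[str],     # patients who also have hospital letters
    confirmed: set[str],  # linked patients whose letters record a PsA diagnosis
) -> dict[str, set[str]]:
    """Compare primary care codes with hospital letters for linked patients."""
    coded_and_linked = coded & linked
    confirmed_and_linked = confirmed & linked
    return {
        # Coded in primary care and confirmed by the hospital.
        "true_positive": coded_and_linked & confirmed_and_linked,
        # Coded in primary care but not confirmed by the hospital.
        "false_positive": coded_and_linked - confirmed_and_linked,
        # Confirmed by the hospital but never coded in primary care.
        "false_negative": confirmed_and_linked - coded,
        # Coded patients with no hospital letters cannot be checked.
        "unverified": coded - linked,
    }


# Hypothetical toy data, not taken from the study.
result = classify_linked_records(
    coded={"p1", "p2", "p3", "p6"},
    linked={"p1", "p2", "p3", "p4", "p5"},
    confirmed={"p1", "p2", "p4"},
)
for label, patients in result.items():
    print(label, sorted(patients))
```

In the toy data, `p3` is a false positive and `p4` a false negative. Coded patients without hospital letters are kept in a separate "unverified" group because the absence of a letter is not evidence against the diagnosis.
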

Limitations:
  • The findings may not generalize beyond the region and population studied.
  • Text-mining accuracy can vary, which may affect the reliability of the diagnoses extracted from outpatient letters.
Conclusion:

Integrating primary and secondary care data enhances the accuracy of disease classification and prevalence assessments, underscoring the importance of addressing misclassification in health data for future research and policy.
