Accuracy of AI Laryngeal Disorder Detection - Summary - MDSpire

Accuracy of AI Laryngeal Disorder Detection

By Andrea Surnit • June 22, 2026 • 4 min read

Objective:

To evaluate the accuracy of artificial intelligence systems in detecting laryngeal disorders.

Approach:
Key Findings:
  • AI models performed best in binary classification, with accuracies of 88% to 99% for distinguishing healthy from pathologic voices.
  • Performance declined to approximately 70% to 90% for broader pathophysiologic categories and generally remained below 75% for specific disorders.
  • Performance varied by model architecture and data type: traditional machine-learning models achieved 88% to 96% accuracy on binary tasks, while deep-learning systems achieved 97% to 99% on standardized datasets.
  • Most studies relied on internal validation, and performance often declined by 10 to 20 percentage points on independent cohorts.
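To make the last finding concrete, the sketch below applies a 10 to 20 percentage-point drop to an internally validated accuracy. The 95% starting value and the function name are illustrative assumptions, not figures from the reviewed studies; only the 10 to 20 point range comes from the summary above.

```python
def external_accuracy_range(internal_acc_pct, drop_min_pts=10, drop_max_pts=20):
    """Return (low, high) accuracy in percent after an absolute drop.

    The drop is expressed in percentage points (absolute),
    not as a relative percentage of the starting accuracy.
    """
    low = max(internal_acc_pct - drop_max_pts, 0)
    high = max(internal_acc_pct - drop_min_pts, 0)
    return low, high


# Hypothetical model reporting 95% accuracy under internal validation.
low, high = external_accuracy_range(95)
print(f"Plausible accuracy on an independent cohort: {low}% to {high}%")
```

Under these assumptions, a model reported at 95% internally would land between 75% and 85% on an independent cohort, which is why externally validated figures are the more relevant ones for clinical use.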
Interpretation:

The decline in performance from detection to diagnosis is attributed to acoustic overlap among laryngeal disorders, where distinct diseases can produce similar voice abnormalities.

Limitations:
  • Many studies raised methodological concerns, including reliance on a limited set of databases, class imbalance, and a lack of demographic diversity.
  • Approximately 82% of studies used sustained-vowel tasks, which may not capture clinically relevant vocal variability.
  • Fewer than 15% of studies shared source code or complete model documentation, limiting reproducibility.
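Class imbalance matters because raw accuracy can look strong even when a model performs poorly on the smaller class. The sketch below compares accuracy with balanced accuracy (the mean of sensitivity and specificity). The confusion-matrix counts are hypothetical and are not taken from any reviewed study.

```python
def accuracy(tp, fn, tn, fp):
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity, so each class counts equally."""
    sensitivity = tp / (tp + fn)  # pathologic voices correctly flagged
    specificity = tn / (tn + fp)  # healthy voices correctly cleared
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced test set: 900 pathologic and 100 healthy voices.
tp, fn = 880, 20  # pathologic voices: detected, missed
tn, fp = 40, 60   # healthy voices: cleared, falsely flagged

print(f"Accuracy:          {accuracy(tp, fn, tn, fp):.2f}")
print(f"Balanced accuracy: {balanced_accuracy(tp, fn, tn, fp):.2f}")
```

In this made-up case accuracy is 0.92 while balanced accuracy is about 0.69, because the model misclassifies most of the healthy minority class. Reporting both figures would make results from imbalanced databases easier to compare.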
Conclusion:

Current evidence supports AI primarily as a tool for screening, triage, and decision support rather than as an autonomous diagnostic system.
