From voice biomarkers to telemedicine screening: developing and evaluating a voice-based AI model for laryngeal lesion detection using the Bridge2AI-Voice dataset - Summary - MDSpire

  • By

  • Phillip D. Jenkins

  • Steven Bedrick

  • Lisa Karstens

  • William Hersh

  • the Bridge2AI-Voice Consortium

  • David A. Dorr

  • July 1, 2026


Objective:

To determine whether the derived-feature release of Bridge2AI-Voice v3.0.0 can support a high-sensitivity screening model for laryngeal lesions and to evaluate its translational readiness using telemedicine implementation frameworks.

Approach:
  • Data Analysis: Fitted an L2-regularized logistic regression model to 131 static OpenSMILE acoustic features from 205 adult participants (136 controls, 52 with benign vocal fold lesions, 13 with precancerous lesions, 4 with laryngeal cancer).
  • Validation: Evaluated model performance under participant-level stratified 10-fold nested cross-validation and conducted pre-specified validity tests against age confounding.
  • Robustness Checks: Assessed alternative feature modalities and classifier families to ensure robustness of findings.
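
The validation scheme above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the data are synthetic (only the group sizes and feature count come from the summary), and the standardization step, the grid of `C` values, the 5-fold inner loop, and the random seeds are all assumptions. It uses one row per participant, so plain stratified folds are already participant-level; with several recordings per participant, a grouped splitter would be needed instead.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 136 controls, 69 lesion cases, 131 features.
# The values are random and are NOT the Bridge2AI-Voice features.
rng = np.random.default_rng(0)
n_controls, n_lesions, n_features = 136, 69, 131
y = np.r_[np.zeros(n_controls, dtype=int), np.ones(n_lesions, dtype=int)]
X = rng.normal(size=(y.size, n_features))
X[y == 1, :10] += 1.0  # inject an artificial signal so the example is non-trivial

# LogisticRegression applies an L2 penalty by default; C is its inverse strength.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=2000))

# Inner loop tunes C; outer loop produces out-of-fold predictions (assumed settings).
inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
outer_cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=2)
search = GridSearchCV(
    pipe,
    param_grid={"logisticregression__C": [0.01, 0.1, 1.0]},
    scoring="roc_auc",
    cv=inner_cv,
)

# Each participant is scored by a model that never saw them during fitting or tuning.
scores = cross_val_predict(search, X, y, cv=outer_cv, method="predict_proba")[:, 1]
auc = roc_auc_score(y, scores)
print(f"Nested cross-validated AUC on synthetic data: {auc:.3f}")
```

Because scaling and tuning sit inside the pipeline and the inner loop, no information from a held-out fold leaks into model selection. The printed AUC reflects the artificial signal only and says nothing about the reported 0.812.
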
Key Findings:
  • The OpenSMILE-based model achieved a cross-validated AUC of 0.812 (95% CI 0.744–0.876), indicating good discrimination between participants with laryngeal lesions and controls.
  • At the chosen operating point, sensitivity was 0.870 (95% CI 0.767–0.939) and specificity was 0.566 (95% CI 0.479–0.651): the threshold favors sensitivity at the cost of specificity, in line with the screening objective.
  • Model discrimination significantly exceeded an age-only baseline (DeLong p = 0.0008), so the acoustic features carry information beyond age alone.
  • Sensitivity was similar in the benign (0.865) and precancerous (0.846) lesion subgroups, although the precancerous estimate rests on only 13 participants.
  • Alternative feature modalities did not provide incremental discriminative information beyond OpenSMILE.
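
The operating-point figures above can be illustrated with a short worked example. The summary does not report a confusion matrix or name its interval method, so both are assumptions here: the counts below (60 of 69 lesion cases flagged, 77 of 136 controls cleared) are simply the integers consistent with the rounded sensitivity and specificity, and the Wilson score interval is a swapped-in, standard-library-only choice that will not necessarily reproduce the reported bounds.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def screening_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity and specificity with Wilson intervals from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


# Hypothetical counts inferred from the rounded rates; not reported in the summary.
m = screening_metrics(tp=60, fn=9, tn=77, fp=59)
for name in ("sensitivity", "specificity"):
    lo, hi = m[f"{name}_ci"]
    print(f"{name}: {m[name]:.3f} (Wilson 95% CI {lo:.3f} to {hi:.3f})")
```

Under these assumed counts, 59 of 136 controls would be referred for follow-up, which is the practical cost of tuning the threshold for high sensitivity.
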
Interpretation:

Derived acoustic features from the Bridge2AI-Voice v3.0.0 release combined with basic demographic information support cross-validated discrimination of vocal fold lesions, aligning with existing literature on voice-based pathology detection.

Limitations:
  • The study is based on a limited sample size of 205 participants, which may affect the generalizability of the findings.
  • Further confirmatory investigation in a larger, prospectively recruited cohort is warranted to validate these results.
Conclusion:

These results are preliminary. Voice-based AI models for laryngeal lesion detection need confirmatory studies before they can be considered for telemedicine screening.
