From voice biomarkers to telemedicine screening: developing and evaluating a voice-based AI model for laryngeal lesion detection using the Bridge2AI-Voice dataset - Summary - MDSpire
Objective:
To determine whether the derived-feature release of Bridge2AI-Voice v3.0.0 can support a high-sensitivity screening model for laryngeal lesions, and to evaluate its translational readiness using telemedicine implementation frameworks.
Approach:
Data Analysis: Analyzed data from 205 adult participants (136 controls, 52 with benign vocal fold lesions, 13 with precancerous lesions, and 4 with laryngeal cancer) using an L2-regularized logistic regression model fitted to 131 OpenSMILE static acoustic features.
Validation: Evaluated model performance under participant-level stratified 10-fold nested cross-validation and conducted pre-specified validity tests against age confounding.
Robustness Checks: Assessed alternative feature modalities and classifier families to check whether the findings depended on the chosen feature set or model.
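The validation scheme described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the study's code: the data are synthetic stand-ins with the same shape (205 participants, 131 features), and the inner 5-fold split, the grid of `C` values, and the random seeds are all assumptions, since the summary does not report them. It also assumes one feature row per participant, so that ordinary stratified folds are already participant-level.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: NOT the Bridge2AI-Voice features.
rng = np.random.default_rng(0)
n_controls, n_lesions, n_features = 136, 69, 131
y = np.concatenate([np.zeros(n_controls, dtype=int), np.ones(n_lesions, dtype=int)])
X = rng.normal(size=(y.size, n_features))
X[y == 1, :5] += 0.8  # arbitrary weak signal in five features, for illustration only

# Scaling sits inside the pipeline so it is refit on every training fold.
# LogisticRegression applies an L2 penalty by default; C is the inverse penalty strength.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=2000))

# Inner loop tunes C; outer loop estimates performance (both splits are assumptions).
inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
outer_cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=2)

search = GridSearchCV(
    model,
    param_grid={"logisticregression__C": [0.001, 0.01, 0.1, 1.0]},  # hypothetical grid
    scoring="roc_auc",
    cv=inner_cv,
)

# Each participant is scored by a model that did not see them during tuning or fitting.
oof_scores = cross_val_predict(search, X, y, cv=outer_cv, method="predict_proba")[:, 1]
pooled_auc = roc_auc_score(y, oof_scores)
print(f"Pooled out-of-fold AUC on synthetic data: {pooled_auc:.3f}")
```

Nesting the hyperparameter search inside the outer folds keeps the tuning step from seeing the held-out participants, which is the point of a nested design. If a dataset held several recordings per participant, a grouped splitter such as `StratifiedGroupKFold` would be needed instead.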
Key Findings:
The OpenSMILE-based model achieved a cross-validated AUC of 0.812 (95% CI 0.744–0.876), indicating good discrimination between participants with laryngeal lesions and controls.
Operating-point sensitivity was 0.870 (95% CI 0.767–0.939) and specificity was 0.566 (95% CI 0.479–0.651), so the chosen threshold favors sensitivity at the cost of specificity.
Model discrimination significantly exceeded an age-only baseline (DeLong p = 0.0008), suggesting that the acoustic features carry information beyond age alone.
Subgroup analysis showed similar sensitivity in the benign (0.865) and precancerous (0.846) lesion subgroups, although the precancerous subgroup included only 13 participants.
Alternative feature modalities did not provide incremental discriminative information beyond the OpenSMILE features.
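To make the operating-point figures concrete, the short sketch below recomputes sensitivity and specificity from confusion counts. The counts (60 of 69 lesion cases flagged, 77 of 136 controls cleared) are back-calculated from the reported proportions and group sizes and are an assumption, since the summary does not state them. The Wilson score interval is likewise an illustrative choice; the summary does not say how the reported confidence intervals were obtained, so the intervals printed here need not match them.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives roughly 95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Back-calculated counts (assumption, not stated in the summary).
true_pos, n_lesions = 60, 69    # 52 benign + 13 precancerous + 4 cancer
true_neg, n_controls = 77, 136

sensitivity = true_pos / n_lesions
specificity = true_neg / n_controls
false_pos = n_controls - true_neg  # controls flagged for follow-up at this threshold

sens_lo, sens_hi = wilson_interval(true_pos, n_lesions)
spec_lo, spec_hi = wilson_interval(true_neg, n_controls)

print(f"Sensitivity {sensitivity:.3f} (Wilson 95% CI {sens_lo:.3f} to {sens_hi:.3f})")
print(f"Specificity {specificity:.3f} (Wilson 95% CI {spec_lo:.3f} to {spec_hi:.3f})")
print(f"Controls flagged: {false_pos} of {n_controls}")
```

Under these assumed counts, the cost of the high-sensitivity threshold is visible directly: a little under half of the controls would be flagged for follow-up.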
Interpretation:
Derived acoustic features from the Bridge2AI-Voice v3.0.0 release combined with basic demographic information support cross-validated discrimination of vocal fold lesions, aligning with existing literature on voice-based pathology detection.
Limitations:
The study is based on a limited sample size of 205 participants, which may affect the generalizability of the findings.
Further confirmatory investigation in a larger, prospectively recruited cohort is warranted to validate these results.
Conclusion:
Voice-based AI models for laryngeal lesion detection merit further research, and additional studies are needed to confirm these findings.