Objective:
To assess the diagnostic accuracy of AI-assisted systems in distinguishing benign from malignant thyroid nodules in clinical practice.
Key Findings:
AI-assisted diagnostic systems showed pooled sensitivity of 0.89 and specificity of 0.84, with a positive likelihood ratio of 5.60 and a negative likelihood ratio of 0.13.
The diagnostic odds ratio was 43.94, with an SROC area under the curve of 0.93.
Diagnostic accuracy was higher in studies conducted in Asian countries and in studies that used external validation cohorts.
EDLC-TN, an ensemble deep learning model, demonstrated the highest diagnostic accuracy.
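The likelihood ratios and diagnostic odds ratio reported above are related to sensitivity and specificity by standard definitions: LR+ = sensitivity / (1 - specificity), LR- = (1 - sensitivity) / specificity, and DOR = LR+ / LR-. The following is a minimal sketch that applies these definitions to the rounded pooled sensitivity and specificity. The function name `likelihood_ratios` is hypothetical and used for illustration only; this is not the pooling method used in the meta-analysis.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float, float]:
    """Return (LR+, LR-, DOR) for a single sensitivity/specificity pair.

    Illustrative helper only: it applies the textbook definitions to one
    pair of values and does not perform any meta-analytic pooling.
    """
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1.0 - specificity)   # positive likelihood ratio
    lr_neg = (1.0 - sensitivity) / specificity   # negative likelihood ratio
    dor = lr_pos / lr_neg                        # diagnostic odds ratio
    return lr_pos, lr_neg, dor


# Rounded pooled estimates taken from the Key Findings above.
lr_pos, lr_neg, dor = likelihood_ratios(0.89, 0.84)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.1f}")
```

With these inputs the direct calculation gives an LR+ of about 5.56, an LR- of about 0.13, and a DOR of about 42.5, close to the reported 5.60, 0.13, and 43.94. Small differences of this kind are to be expected if the pooled ratios were estimated within the meta-analytic model rather than derived from the rounded summary values.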
Interpretation:
AI models, particularly deep learning systems, distinguish benign from malignant thyroid nodules with high accuracy. Performance was strongest in studies from Asian countries, so the benefit may vary across patient populations, which may influence clinical decision-making.
Limitations:
Most studies were conducted in Asian regions, limiting generalizability.
Significant heterogeneity was observed in diagnostic accuracy across different cohorts, which may affect the reliability of the findings.
Conclusion:
Future AI development should focus on international multicenter datasets, adaptability, algorithmic transparency, and diverse representation in training data.