Objective:
To systematically review explainable artificial intelligence (XAI) methods applied to deep learning models for clinical voice and speech analysis.
Approach:
Research Questions: The review addresses the following questions:
What XAI methods have been used to explain the decisions of deep learning voice and speech AI systems in healthcare?
What insights are derived from the application of these XAI methods?
What limitations do these XAI methods have for clinical audio, in terms of clinical applicability and stakeholder relevance?
Key Findings:
Voice and speech biomarkers have potential applications in various medical fields, including voice pathology detection and neurodegenerative disease diagnosis.
AI-driven, audio-based medical systems can improve access to healthcare services.
The integration of AI in clinical settings is hindered by the lack of high-quality data and the 'black-box' problem of deep learning models.
Explainable AI (XAI) aims to enhance the transparency of AI models, but existing techniques are often more developed for image-based domains than for voice analysis.
Interpretation:
The review highlights the need for multidisciplinary efforts in designing XAI methods that cater to the diverse backgrounds of stakeholders in healthcare.
Limitations:
Current XAI methods may not adequately address the complexities of audio data interpretation.
There is no consensus in the AI literature on the definitions of explainability and interpretability.
Conclusion:
The review emphasizes the importance of developing XAI approaches tailored to clinical voice and speech analysis in order to increase trust in these systems and their applicability in healthcare settings.