Assessing ChatGPT vs. evidence-based online responses for polycystic ovary syndrome self-management and education: an international cross-sectional blinded survey of healthcare professionals - Report - MDSpire
Evaluating ChatGPT vs Evidence-Based Resources for PCOS Education
Overview
In a blinded international survey of 43 healthcare providers from 14 countries, ChatGPT-generated responses to 12 frequently asked PCOS questions were rated significantly higher than evidence-based online resources. Readability did not differ significantly between the two sources, and simplification by ChatGPT significantly improved it, suggesting ChatGPT's potential as a complementary tool for PCOS patient education.
Background
Polycystic ovary syndrome (PCOS) affects approximately 11%–13% of women worldwide and presents with diverse symptoms that make diagnosis and management challenging. Patients often seek information online, but many sources lack quality and accuracy. AI-powered large language models such as ChatGPT offer interactive, tailored health information, but their reliability and clinical applicability for PCOS education require rigorous evaluation.
Data Highlights
| Measure | ChatGPT Rating | Evidence-Based Rating | Effect Size (rrb) | p-value |
| --- | --- | --- | --- | --- |
| Average Rating (12 questions) | 0.824 units higher | Reference | -0.46 to -1.00 | <0.05 |
| Inter-rater Agreement (κ) | Fair for 7 questions | Poor to fair overall | 0.24–0.37 | <0.05 |
| Readability Comparison | No significant difference | No significant difference | Not applicable | Not significant |
| Readability After Simplification by ChatGPT | Significant improvement | Not applicable | Not applicable | <0.05 |
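To make the effect-size column concrete, here is a minimal sketch of how a matched-pairs rank-biserial correlation can be computed from paired ratings. It assumes that rrb denotes the rank-biserial correlation and that each rater scored both sources, which this summary does not confirm. The function name, the sign convention (first minus second), and the sample ratings are hypothetical and are not taken from the study.

```python
def rank_biserial_paired(first, second):
    """Matched-pairs rank-biserial correlation (simple difference formula).

    Assumption: differences are taken as first - second, so a negative
    value means the second set of ratings tends to be higher.
    """
    # Zero differences carry no direction, so they are dropped.
    diffs = [a - b for a, b in zip(first, second) if a != b]
    if not diffs:
        return 0.0

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return (positive - negative) / (positive + negative)


# Hypothetical ratings from five raters for one question (not study data).
evidence_ratings = [3, 2, 3, 2, 4]
chatgpt_ratings = [4, 4, 5, 3, 5]
print(rank_biserial_paired(evidence_ratings, chatgpt_ratings))  # -1.0
```

Under this convention, a value of -1.00 means every rater who expressed a preference scored the second source higher, while values closer to zero indicate a more even split.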
Key Findings
ChatGPT responses to PCOS FAQs were rated significantly higher than evidence-based online resources for 11 of 12 questions by healthcare providers.
Effect sizes ranged from moderate to large (rank-biserial correlation, rrb = -0.46 to -1.00), indicating meaningful differences favoring ChatGPT.
Inter-rater agreement among healthcare providers was poor to fair, with fair agreement achieved on seven questions (κ = 0.24–0.37).
Readability analyses showed no significant difference between ChatGPT and evidence-based responses initially.
Using ChatGPT to simplify responses significantly improved readability, which may aid patient comprehension.
ChatGPT offers interactive and tailored engagement, which may complement static evidence-based resources for PCOS self-education.
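The readability findings above can be illustrated with a standard readability formula. This summary does not name the metric the study used, so the sketch below assumes the Flesch Reading Ease score purely for illustration. The syllable counter is a rough heuristic, and the function names and sample sentences are hypothetical.

```python
import re


def count_syllables(word):
    """Approximate syllables by counting vowel groups (heuristic only)."""
    word = word.lower()
    groups = len(re.findall(r"[aeiouy]+", word))
    # Discount a trailing silent "e", but keep endings such as "-le".
    if word.endswith("e") and not word.endswith("le") and groups > 1:
        groups -= 1
    return max(groups, 1)


def flesch_reading_ease(text):
    """Flesch Reading Ease: higher scores indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Hypothetical example texts (not responses from the study).
original = ("Polycystic ovary syndrome presents heterogeneous manifestations "
            "complicating diagnostic evaluation.")
simplified = "PCOS can look very different from one woman to the next."
print(flesch_reading_ease(original) < flesch_reading_ease(simplified))  # True
```

Comparing scores before and after simplification, as in the last line, mirrors the kind of comparison the study reports, although the study's actual metric and thresholds may differ.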
Clinical Implications
ChatGPT may serve as a valuable adjunct for patient education in PCOS by providing accessible, interactive, and simplified information that healthcare providers rate highly. Clinicians should consider integrating AI tools like ChatGPT to enhance patient understanding and engagement while continuing to rely on validated evidence-based guidelines. Further research is needed to optimize AI integration and confirm clinical applicability in diverse patient populations.
Conclusion
This study demonstrates that ChatGPT can deliver credible, user-friendly PCOS information rated favorably by healthcare professionals, highlighting its promise as a complementary self-management and education tool. Ongoing validation and strategic implementation are essential to maximize its clinical utility.
References
Authors/Monash University/2025 -- Evaluating ChatGPT and Evidence-Based Online Resources for Self-Management and Education in Polycystic Ovary Syndrome