Assessing ChatGPT vs. evidence-based online responses for polycystic ovary syndrome self-management and education: an international cross-sectional blinded survey of healthcare professionals - Report - MDSpire

By Sandro Graca, Alexander Dallaway, Folashade Alloh, Harpal S. Randeva, Chris Kite, Ioannis Kyrou

March 31, 2026

Evaluating ChatGPT vs Evidence-Based Resources for PCOS Education

Overview

In a blinded international survey of 43 healthcare providers from 14 countries, ChatGPT-generated responses to 12 frequently asked PCOS questions were rated significantly higher than evidence-based online resources for 11 of the 12 questions. Readability was comparable between the two sources, and using ChatGPT to simplify responses significantly improved readability, suggesting its potential as a complementary tool for PCOS patient education.

Background

Polycystic ovary syndrome (PCOS) affects approximately 11%–13% of women worldwide and presents with diverse symptoms leading to diagnostic and management challenges. Patients often seek information online, but many sources lack quality and accuracy. The rise of AI-powered large language models like ChatGPT offers interactive, tailored health information, yet their reliability and clinical applicability for PCOS education require rigorous evaluation.

Data Highlights

Measure | ChatGPT Rating | Evidence-Based Rating | Effect Size (rrb) | p-value
Average Rating (12 questions) | 0.824 units higher | Reference | -0.46 to -1.00 | <0.05
Inter-rater Agreement (κ) | Fair for 7 questions | Poor to fair overall | 0.24–0.37 | <0.05
Readability Comparison | No significant difference | No significant difference | Not applicable | Not significant
Readability After Simplification by ChatGPT | Significant improvement | Not applicable | Not applicable | <0.05
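
To make the effect size column easier to interpret, here is a minimal sketch of a matched-pairs rank-biserial correlation. It assumes that "rrb" denotes the rank-biserial correlation, and that differences are taken as evidence-based rating minus ChatGPT rating so that negative values favor ChatGPT. This summary confirms neither assumption. The function name and the sample ratings are hypothetical and are not data from the study.

```python
def rank_biserial_paired(ratings_a, ratings_b):
    """Matched-pairs rank-biserial correlation: (W+ - W-) / (W+ + W-).

    Differences are a - b. Zero differences are dropped, and tied
    absolute differences share their average rank.
    """
    diffs = [a - b for a, b in zip(ratings_a, ratings_b) if a != b]
    if not diffs:
        return 0.0  # no non-zero differences, so no direction to report

    # Rank the absolute differences, averaging ranks within ties.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_pos = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_neg = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return (w_pos - w_neg) / (w_pos + w_neg)


# Hypothetical ratings from four raters for one question (not study data).
evidence_based = [3, 3, 2, 4]
chatgpt = [4, 5, 4, 5]
print(rank_biserial_paired(evidence_based, chatgpt))  # -1.0: every rater preferred ChatGPT
```

Under these assumptions, a value of -1.00 means every rater with a preference scored the ChatGPT response higher, while -0.46 corresponds to a majority of the rank-weighted preferences going to ChatGPT.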

Key Findings

  • ChatGPT responses to PCOS FAQs were rated significantly higher than evidence-based online resources for 11 of 12 questions by healthcare providers.
  • Effect sizes ranged from moderate to large (rrb = -0.46 to -1.00), indicating meaningful differences favoring ChatGPT.
  • Inter-rater agreement among healthcare providers was poor to fair, with fair agreement achieved on seven questions (κ = 0.24–0.37).
  • Readability analyses showed no significant difference between the original ChatGPT responses and the evidence-based responses.
  • Using ChatGPT to simplify responses significantly improved readability, which may make the information easier for patients to understand.
  • ChatGPT offers interactive and tailored engagement, which may complement static evidence-based resources for PCOS self-education.
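
The inter-rater agreement figures above can be illustrated with a short sketch. This summary does not say which kappa statistic the study used. Fleiss' kappa, shown here, is one common choice when more than two raters score the same items, so treat it as an assumption. The function name and the example counts are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of category counts.

    counts[i][j] is the number of raters who assigned item i to category j.
    Every item must be rated by the same number of raters (at least two).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Share of all assignments that fell into each category.
    p_category = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per item, then averaged over items.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items
    # Agreement expected by chance alone.
    p_expected = sum(p * p for p in p_category)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical example: three raters, two items, two rating categories.
print(fleiss_kappa([[3, 0], [0, 3]]))  # 1.0: the raters agree completely
```

Kappa is 1 for complete agreement and 0 when agreement is no better than chance, so the reported range of 0.24–0.37 sits well below complete agreement even where ChatGPT was clearly preferred.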

Clinical Implications

ChatGPT may serve as a valuable adjunct for patient education in PCOS by providing accessible, interactive, and simplified information that healthcare providers rate highly. Clinicians should consider integrating AI tools like ChatGPT to enhance patient understanding and engagement while continuing to rely on validated evidence-based guidelines. Further research is needed to optimize AI integration and confirm clinical applicability in diverse patient populations.

Conclusion

This study demonstrates that ChatGPT can deliver credible, user-friendly PCOS information rated favorably by healthcare professionals, highlighting its promise as a complementary self-management and education tool. Ongoing validation and strategic implementation are essential to maximize its clinical utility.

References

  1. Authors/Monash University/2025 -- Evaluating ChatGPT and Evidence-Based Online Resources for Self-Management and Education in Polycystic Ovary Syndrome
