This study analyzed how well GPT models extract OPS codes from surgical reports of meningioma resections performed between January 2023 and December 2024.
Key Findings:
Surgeons and professional coders achieved sufficient coding in 99-100% of cases, compared with 78-89% for the LLMs.
GPT CodeMedic outperformed GPT-4o by more than 11 percentage points in sufficient coding.
Professional coders achieved the highest optimal coding performance (94%), while surgeons had the highest error rate (69% of optimal coding).
GPT CodeMedic significantly outperformed surgeons in optimal coding.
Interpretation:
While human coders performed better overall, GPT CodeMedic showed promise in OPS coding, particularly in optimal coding, where it outperformed surgeons.
Limitations:
The study was limited to meningioma resections at a single institution, so the results may not represent broader coding practices.
Potential biases in report selection and coding accuracy assessment could influence the results.
Conclusion:
GPT CodeMedic demonstrates potential for improving OPS coding accuracy, but human coders remain superior in overall performance.