Low-energy Small Language Models with Retrieval-Augmented Generation can Surpass Large-Model Performance in Rheumatology - Scorecard - MDSpire

  • By Felde, Sabine; Buchkremer, Rüdiger; Chehab, Gamal; Thielscher, Christian; Distler, Jörg HW; Schneider, Matthias; Richter, Jutta G

  • April 23, 2026

Clinical Scorecard: Smaller Language Models Enhanced with Retrieval-Augmented Generation May Outperform Larger Models in Rheumatology Applications

At a Glance

Category           Detail
Condition          Rheumatology applications in clinical decision support
Key Mechanisms     Integration of smaller language models with retrieval-augmented generation
Target Population  Patients requiring rheumatology care
Care Setting       Clinical decision support environments
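
To make the retrieval-augmented generation mechanism concrete, here is a minimal sketch of the retrieval step. It is an illustration under stated assumptions, not the pipeline used in the study: the word-overlap ranking, the function names (`tokenize`, `retrieve`, `build_prompt`), and the placeholder passages are all hypothetical. Real systems typically rank passages by embedding similarity instead.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages sharing the most word tokens with the query.

    Word overlap is a stand-in for a real similarity measure (assumption).
    """
    query_tokens = tokenize(query)
    ranked = sorted(
        passages,
        key=lambda passage: len(query_tokens & tokenize(passage)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, passages: list[str], k: int = 2) -> str:
    """Prepend the retrieved passages to the question as context for the model."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Placeholder passages for illustration only; not content from the study.
PASSAGES = [
    "Guideline passage on gout flare management",
    "Guideline passage on rheumatoid arthritis classification criteria",
    "Passage on an unrelated topic",
]

if __name__ == "__main__":
    print(build_prompt("classification criteria for rheumatoid arthritis", PASSAGES, k=1))
```

The resulting prompt would then be sent to the language model, so the model answers with the retrieved reference text in view rather than from its parameters alone.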

Key Highlights

  • Mixtral-8x7b-32768 with RAG achieved the highest diagnostic (72%) and therapeutic (73%) F1 scores.
  • Nemotron-70b demonstrated strong diagnostic capability without RAG (71%).
  • Qwen-Turbo excelled in therapeutic suggestions without retrieval (72%).
  • Mixtral with RAG recorded the highest RAGAS score (81%).
  • Performance varied significantly across models and configurations.
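
For readers unfamiliar with the F1 metric cited above, the following sketch shows the standard definition: the harmonic mean of precision and recall. The scoring procedure used in the study is not described in this summary, so the set-based comparison, the function name `f1_score`, and the example labels are assumptions made for illustration.

```python
def f1_score(predicted: set[str], reference: set[str]) -> float:
    """Compute F1 (harmonic mean of precision and recall) over two label sets."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case, avoiding division by zero.
        return 0.0
    precision = true_positives / len(predicted)  # share of predictions that are correct
    recall = true_positives / len(reference)     # share of reference labels that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for illustration only; not data from the study.
predicted = {"rheumatoid arthritis", "gout", "psoriatic arthritis"}
reference = {"rheumatoid arthritis", "psoriatic arthritis"}

if __name__ == "__main__":
    # Precision is 2/3 and recall is 1, so F1 is 0.80.
    print(f"F1 = {f1_score(predicted, reference):.2f}")
```

Because F1 balances precision and recall, a model that lists many diagnoses to catch the right one is penalized for the incorrect extras.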

Guideline-Based Recommendations

Diagnosis

  • Consider RAG-enhanced models such as Mixtral-8x7b-32768, which achieved the highest diagnostic F1 score (72%), to support diagnostic accuracy.

Management

  • Incorporate smaller language models in clinical decision support to enhance therapeutic suggestions.

Monitoring & Follow-up

  • Regularly assess model performance and accuracy in real-world settings.

Risks

  • Clinically pertinent errors were noted, necessitating expert oversight.

Patient & Prescribing Data

The relevant population is patients with rheumatologic conditions who require diagnosis and management.

Smaller language models can provide effective clinical decision support with lower computational demands.

Clinical Best Practices

  • Implement expert oversight when utilizing language models for clinical decision support.
  • Validate model performance in real-world clinical scenarios.
