Low-energy Small Language Models with Retrieval-Augmented Generation can Surpass Large-Model Performance in Rheumatology - Takeaways - MDSpire

Low-energy Small Language Models with Retrieval-Augmented Generation can Surpass Large-Model Performance in Rheumatology

By Felde, Sabine; Buchkremer, Rüdiger; Chehab, Gamal; Thielscher, Christian; Distler, Jörg HW; Schneider, Matthias; Richter, Jutta G

April 23, 2026

  1. Smaller language models enhanced with retrieval-augmented generation (RAG) may outperform larger models in rheumatology applications.

  2. Mixtral-8x7b-32768 with RAG achieved the highest diagnostic and therapeutic F1 scores, at 72% and 73%, respectively.

  3. Nemotron-70b showed strong diagnostic capability without RAG, with an F1 score of 71%, while Qwen-Turbo excelled in therapeutic suggestions.

  4. The highest Retrieval-Augmented Generation Assessment Score (RAGAS), 81%, was recorded for Mixtral-8x7b-32768 with RAG.

  5. Despite promising results, all models exhibited clinically relevant errors, underscoring the need for expert oversight and further validation.
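
For readers less familiar with the F1 metric quoted in the takeaways, here is a minimal sketch of how an F1 score is computed from counts of correct, spurious, and missed suggestions. The function name `f1_score` and the counts are hypothetical illustrations. They are not taken from the study, and the authors' exact scoring procedure is not described in this summary.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return F1, the harmonic mean of precision and recall.

    Returns 0.0 when there are no counts at all, where the metric is
    otherwise undefined (a convention assumed for this sketch).
    """
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        return 0.0
    return 2 * true_positives / denominator


# Hypothetical counts, for illustration only (not study data):
# 8 correct suggestions, 3 spurious suggestions, 3 missed suggestions.
example = f1_score(true_positives=8, false_positives=3, false_negatives=3)
print(f"F1 = {example:.2f}")
```

Because F1 penalizes both spurious and missed suggestions, a model cannot reach a high score simply by listing many candidate diagnoses or therapies.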
