LLM response quality prediction on ID Claude 3.5 Haiku 20241022 (test)

0.45 RMSE

MIRT-Router

[Chart: RMSE (y-axis, 0.4432 to 0.5809) by date (x-axis, Jun 1, 2025)]
Updated 4d ago

Evaluation Results

Method    Links
2025.06   0.45   0.436267
2025.06   0.45   0.426268
2025.06   0.62   0.555034