LLM response quality prediction on ID Claude 3.5 Haiku 20241022 (test)

0.45 RMSE

MIRT-Router

[Chart: RMSE, axis ticks 0.4432 to 0.5809; Jun 1, 2025]
Updated 1mo ago

Evaluation Results

Method    Links
2025.06    0.45    0.436267
2025.06    0.45    0.426268
2025.06    0.62    0.555034