LLM response quality prediction on ID Claude 3.5 Haiku 20241022 (test)
[Chart: RMSE over time. Top entry: MIRT-Router, RMSE 0.45 (Jun 1, 2025). Selectable metrics: RMSE, MAE, AUC, Acc.]
Evaluation Results
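| Method      | Settings                  | Date    | RMSE | MAE  | AUC (%) | Acc (%) |
|-------------|---------------------------|---------|------|------|---------|---------|
| MIRT-Router | alpha=0.8, Target LLM=... | 2025.06 | 0.45 | 0.43 | 62      | 67      |
| NIRT-Router | alpha=0.8, Target LLM=... | 2025.06 | 0.45 | 0.42 | 62      | 68      |
| RouterBench | alpha=0.8, Target LLM=... | 2025.06 | 0.62 | 0.55 | 50      | 34      |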
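
The page does not say how the four reported metrics are computed. As a rough illustration only, the sketch below shows the standard definitions applied to predicted versus observed response-quality scores. The function names, the toy data, and the 0.5 threshold used to binarise scores for AUC and accuracy are all assumptions made for this example, not details taken from the benchmark.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted scores."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of items where prediction and observation fall on the
    same side of the threshold (the threshold is an assumption)."""
    hits = sum((t >= threshold) == (p >= threshold) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

def auc(y_true, y_pred, threshold=0.5):
    """Pairwise AUC: probability that a positive item is scored above a
    negative one, counting ties as half a win."""
    pos = [p for t, p in zip(y_true, y_pred) if t >= threshold]
    neg = [p for t, p in zip(y_true, y_pred) if t < threshold]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative item")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): observed quality and a router's predicted quality.
y_true = [1.0, 0.0, 1.0, 0.0]
y_pred = [0.9, 0.2, 0.4, 0.6]

print(f"RMSE={rmse(y_true, y_pred):.4f}  MAE={mae(y_true, y_pred):.4f}")
print(f"AUC={100 * auc(y_true, y_pred):.0f}  Acc={100 * accuracy(y_true, y_pred):.0f}")
```

Lower is better for RMSE and MAE; higher is better for AUC and Acc. Under these definitions an AUC of 50 corresponds to chance-level ranking.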