IRT Parameter Prediction on ELA (5-fold)
Top result, Pearson Corr (b, 1PL): 0.503 (Qwen-14B, Jan 5, 2026)
Metrics: Pearson Corr (b, 1PL), RMSE (b, 1PL), Pearson Corr (a, 2PL), RMSE (a, 2PL), Pearson Corr (b, 2PL), RMSE (b, 2PL)
Updated 3mo ago
Evaluation Results
| Method | Details | Date | Pearson Corr (b, 1PL) | RMSE (b, 1PL) | Pearson Corr (a, 2PL) | RMSE (a, 2PL) | Pearson Corr (b, 2PL) | RMSE (b, 2PL) |
|---|---|---|---|---|---|---|---|---|
| Qwen-14B | Model size=14B | 2026.01 | 0.503 | 0.721 | 0.446 | 0.169 | 0.485 | 0.936 |
| Qwen-1.7B | Model size=1.7B | 2026.01 | 0.409 | 0.766 | 0.238 | 0.191 | 0.41 | 1.041 |
| Qwen-8B | Model size=8B | 2026.01 | 0.408 | 0.75 | 0.336 | 0.191 | 0.429 | 1.014 |
| Qwen-4B | Model size=4B | 2026.01 | 0.404 | 0.764 | 0.332 | 0.186 | 0.391 | 1.008 |
| ModernBERT | Mode=Fine-tuned baseline | 2026.01 | 0.239 | 0.868 | 0.061 | 0.231 | 0.132 | 1.251 |
| Features | Type=Feature-based model | 2026.01 | 0.16 | 0.827 | 0.194 | 0.2 | 0.04 | 1.098 |
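
The page does not say how the reported metrics are computed, so the following is only a minimal sketch under the standard textbook definitions: Pearson correlation and RMSE between predicted and reference item parameters, plus the usual 2PL response function (1PL being the special case with discrimination a = 1). The function names and the toy parameter values are hypothetical, chosen for illustration; they are not taken from the benchmark, and how scores are aggregated across the 5 folds is not covered here.

```python
import math


def pearson_corr(pred, true):
    """Pearson correlation between predicted and reference parameters."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, true))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in true)
    return cov / math.sqrt(var_p * var_t)


def rmse(pred, true):
    """Root mean squared error between predicted and reference parameters."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred))


def prob_correct_2pl(theta, a, b):
    """Standard 2PL item response function; with a = 1 this is the 1PL form."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical difficulty (b) values for four items, for illustration only.
true_b = [-1.0, 0.0, 1.0, 2.0]
pred_b = [-0.5, 0.0, 0.5, 1.0]

corr_b = pearson_corr(pred_b, true_b)
rmse_b = rmse(pred_b, true_b)
print(f"Pearson corr (b): {corr_b:.3f}")
print(f"RMSE (b): {rmse_b:.3f}")
```

In this toy case the predictions rank the items perfectly but compress the scale, so the correlation is high while the RMSE stays well above zero. That is one reason a leaderboard might report both metrics for each parameter.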