Linguistic Probing on MultiBLiMP (test)
[Chart: Accuracy (CA) over time; highlighted point: Full FT, 93.3, May 28, 2026. Selectable metrics: Accuracy (CA), Accuracy (ET), Accuracy (MR), Accuracy (SK), Accuracy (UK), Accuracy (UR).]
Updated 5d ago
Evaluation Results
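| Method  | Details                   | Date    | Accuracy (CA) | Accuracy (ET) | Accuracy (MR) | Accuracy (SK) | Accuracy (UK) | Accuracy (UR) |
|---------|---------------------------|---------|---------------|---------------|---------------|---------------|---------------|---------------|
| Full FT | Adaptation Strategy=Fu... | 2026.05 | 93.3          | 81.9          | 81.7          | 92.7          | 91.9          | 96.5          |
| AEFT    | Adaptation Strategy=Al... | 2026.05 | 89            | 74            | 74.6          | 89.9          | 85.5          | 94.4          |
| SSFT    | Adaptation Strategy=Se... | 2026.05 | 85.4          | 75.3          | 73.7          | 88.6          | 85.1          | 93.3          |
| SEFT    | Adaptation Strategy=Se... | 2026.05 | 82.9          | 68.4          | 66.5          | 86.3          | 81.7          | 86.5          |
| M7      | Adaptation Strategy=M7... | 2026.05 | 81.3          | 61.5          | 68.5          | 86            | 80.7          | 83.3          |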