Question Answering on DROP
[Chart: Score over time, May 2020 to Feb 2026. Highlighted point: ALBERT, Score 0.891]
Updated 1mo ago
Evaluation Results
| Method         | Details                   | Date    | Score |
|----------------|---------------------------|---------|-------|
| ALBERT         | seen_dataset_during_tr... | 2020.05 | 0.891 |
| PerCE          | Model=Qwen3-4B            | 2026.02 | 0.66  |
| CE             | Model=Qwen3-4B            | 2026.02 | 0.62  |
| PerCE          | Model=Llama3-8B           | 2026.02 | 0.612 |
| CE             | Model=Llama3-8B           | 2026.02 | 0.568 |
| UnifiedQA      | seen_dataset_during_tr... | 2020.05 | 0.325 |
| UnifiedQA [AB] | seen_dataset_during_tr... | 2020.05 | 0.307 |
| UnifiedQA [MC] | seen_dataset_during_tr... | 2020.05 | 0.289 |
| UnifiedQA [EX] | seen_dataset_during_tr... | 2020.05 | 0.246 |
| UnifiedQA [YN] | seen_dataset_during_tr... | 2020.05 | 0.004 |