
Reading Comprehension on DROP

92.2 F1 Score

DeepSeek-R1

Updated 5d ago

Evaluation Results

Date      F1 Score
2025.01   92.2
2025.01   91.6
2025.01   90.2
2020.05   89.1
2023.05   88.4
2025.01   88.3
2026.03   88.3
2026.01   86.6
2026.01   86.3
2026.03   85.5
2023.05   85
-         84.8
2026.01   84.7
2026.03   84.5
2025.01   83.9
2026.03   83.9
2025.01   83.7
2026.03   83.7
2026.01   83.6
2024.07   82.4
2023.05   80.9
2024.07   80.9
-         79.6
2024.05   79.4
2026.03   78.7
-         77.5
2026.03   77
2024.05   74.3
2026.03   71.2
2025.12   71.1
2025.12   71
2023.05   70.8
2024.05   69
2025.12   65.4
2025.12   65
2025.12   64.3
2024.05   61
2026.01   60.7
2025.12   60.5
2025.12   60
2024.05   59.6
2024.05   59.5
2024.07   59.5
2024.05   59.4
2024.05   59
2026.01   59
2024.05   58.8
2025.12   58.2
2025.12   57.7
2024.07   56.3
2024.05   56.2
2026.01   55.7
2025.12   55.4
2024.05   54.8
2024.05   54
2025.12   53.9
2024.07   53
2025.12   51.6
2025.12   47
2026.01   41.6
2020.05   36.5
2020.05   34.3
2020.05   23.6
2026.03   15.34
2026.03   15.31
2026.03   15.29
2026.03   15.28
2026.03   15.27
2026.03   15.25
2026.03   14.32
2026.03   14.25
2026.03   12.27
2026.03   4.25