Boolean Question Answering on BoolQ

91.26 Accuracy

Direct Fine-tuning

[Chart: accuracy over time. x-axis: May 11, 2023 to Apr 1, 2026; y-axis ticks: 75.9928, 79.9564, 83.92, 87.8836]
Updated 16d ago

Evaluation Results

Date     Accuracy  Other values  Method  Links
2026.02  91.26     --
2026.02  91        --
2026.02  90.92     --
2026.02  90.81     --
2026.02  90.79     --
2026.02  90.56     --
2026.02  90.29     --
2025.07  88.69     --
2023.07  87.8      --
2026.01  86.61     --
2025.07  86.61     --
2026.04  86.6      --
2026.04  86.6      --
2025.07  86.45     --
2026.02  86.27     -0.73
2026.04  86.1      --
2026.02  85.86     -0.32
2025.07  85.84     --
2026.02  85.72     -0.18
2026.02  85.54     --
2025.07  85.41     --
2026.01  85.29     --
2026.01  85.11     --
2026.04  85.1      --
2026.01  84.71     --
2026.04  84.7      --
2023.07  84.5      --
2025.08  84.5      --
2026.01  84.22     --
2025.08  84.13     --
2026.02  83.49     --2.05
2026.01  83.3      --
2026.04  82.6      --
2025.08  82.24     --
2026.02  82.22     -0.05
2025.05  82.2      --
2026.02  82.17     --
2026.02  81.35     --0.82
2025.12  81.3      --
2026.02  81.16     --
2025.07  80.99     --
2026.02  80.86     --1.31
2023.07  80.8      --
2024.07  80.7      --
2025.08  80.67     --
2026.01  80.49     --
2026.02  80.4      --0.76
2025.07  80.4      --
2024.02  80.2      --
2026.01  80        --
2025.07  79.91     --
2025.07  79.88     --
2026.02  79.85     --1.31
2025.12  79.8      --
2026.02  79.79     --
2026.01  79.48     --
2026.02  79.39     --
2026.02  79.3      --
2025.12  79.3      --
2026.02  79.24     --
2026.02  79.2      --
2026.02  79.14     --
2025.12  78.9      --
2025.12  78.9      --
2025.12  78.9      --
2025.12  78.8      --
2025.12  78.8      --
2023.05  78.78     --
2025.12  78.6      --
2025.12  78.6      --
2025.12  78.6      --
2025.12  78.6      --
2025.12  78.5      --
2026.01  78.41     --
2025.12  78.4      --
2026.02  78.35     --2.81
2025.12  78.3      --
2025.12  78.3      --
2025.12  78.2      --
2025.12  78.2      --
2025.12  78.1      --
2025.12  77.8      --
2026.02  77.74     --
2026.01  77.71     --
2024.03  77.5      --
2025.12  77.5      --
2026.02  77.46     --3.7
2026.02  77.3      --
2025.12  77.3      --
2024.03  77.23     --
2025.12  77.2      --
2025.12  77.2      --
2024.03  77.14     --
2026.04  77.1      --
2023.05  77.03     --
2024.03  76.9      --
2026.02  76.73     --1.01
2026.02  76.67     --1.07
2024.02  76.6      --
2026.02  76.58     --1.16
Showing 100 of 323 rows