
Boolean Question Answering on BoolQ

91.26 Accuracy

Direct Fine-tuning

[Chart: accuracy over time, May 11, 2023 to Feb 12, 2026]
Updated 4d ago

Evaluation Results

Method | Links
2026.02
91.26--
2026.02
91--
2026.02
90.92--
2026.02
90.81--
2026.02
90.79--
2026.02
90.56--
2026.02
90.29--
2025.07
88.69--
2023.07
87.8--
2026.01
86.61--
2025.07
86.61--
2025.07
86.45--
2026.02
86.27-0.73
2026.02
85.86-0.32
2025.07
85.84--
2026.02
85.72-0.18
2026.02
85.54--
2025.07
85.41--
2026.01
85.29--
2026.01
85.11--
2026.01
84.71--
2023.07
84.5--
2026.01
84.22--
2026.02
83.49--2.05
2026.01
83.3--
2026.02
82.22-0.05
2025.05
82.2--
2026.02
82.17--
2026.02
81.35--0.82
2025.12
81.3--
2026.02
81.16--
2025.07
80.99--
2026.02
80.86--1.31
2023.07
80.8--
2024.07
80.7--
2026.01
80.49--
2026.02
80.4--0.76
2025.07
80.4--
2024.02
80.2--
2026.01
80--
2025.07
79.91--
2025.07
79.88--
2026.02
79.85--1.31
2025.12
79.8--
2026.02
79.79--
2026.01
79.48--
2026.02
79.39--
2026.02
79.3--
2025.12
79.3--
2026.02
79.24--
2026.02
79.2--
2026.02
79.14--
2025.12
78.9--
2025.12
78.9--
2025.12
78.9--
2025.12
78.8--
2025.12
78.8--
2023.05
78.78--
2025.12
78.6--
2025.12
78.6--
2025.12
78.6--
2025.12
78.6--
2025.12
78.5--
2026.01
78.41--
2025.12
78.4--
2026.02
78.35--2.81
2025.12
78.3--
2025.12
78.3--
2025.12
78.2--
2025.12
78.2--
2025.12
78.1--
2025.12
77.8--
2026.02
77.74--
2026.01
77.71--
2024.03
77.5--
2025.12
77.5--
2026.02
77.46--3.7
2026.02
77.3--
2025.12
77.3--
2024.03
77.23--
2025.12
77.2--
2025.12
77.2--
2024.03
77.14--
2023.05
77.03--
2024.03
76.9--
2026.02
76.73--1.01
2026.02
76.67--1.07
2024.02
76.6--
2026.02
76.58--1.16
2024.03
76.43--
2023.05
76.24--
2023.05
76.21--
2023.05
76.18--
2025.07
76.12--
2025.12
76.1--
2026.02
76.02--1.72
2025.12
75.8--
2023.05
75.64--
2025.12
75.5--
2024.02
75.4--
Showing 100 of 307 rows