
Boolean Question Answering on BoolQ (test)

86.7 Accuracy (Avg)

QWEN3-14B

[Chart: Accuracy (Avg) over time, Dec 20, 2022 to Jan 30, 2026]
Updated 1mo ago

Evaluation Results

Date      Accuracy (Avg)   (unlabeled)   (unlabeled)   (unlabeled)
2026.01   86.7             -             -             -
2026.01   86.42            -             -             -
2026.01   84.92            -             -             -
2026.01   83.43            -             -             -
2026.01   83.33            -             -             -
2026.01   83.09            -             -             -
2026.01   78.87            -             -             -
2026.01   75.47            -             -             -
2026.01   74.04            -             -             -
2022.12   69.4             2.1           62.8          -
2022.12   69.3             3.8           57.3          -
2022.12   69               2.6           61.5          -
2026.01   68.65            -             -             -
2022.12   68.3             2.3           62.7          -
2022.12   68               2.5           59.8          -
2026.01   66.21            -             -             -
2022.12   66.2             3.4           54.6          -
2026.01   65.69            -             -             -
2022.12   65.5             4.9           51.8          -
2022.12   65.5             5.2           50.4          -
2022.12   65.2             0.9           63.4          -
2022.12   65.2             5.6           49.7          -
2022.12   65.1             1.6           61.1          -
2022.12   64.8             5.3           49.3          -
2022.12   64.7             6.4           49.3          -
2022.12   63.8             2.7           56.4          -
2022.12   63.7             2.2           56            -
2022.12   63.5             6.3           51            -
2022.12   62.6             3.3           55.6          -
2022.12   62.3             3             54.3          -
2022.12   61.2             3.9           50.4          -
2022.12   61.2             4             51.1          -
2022.12   61.2             3.3           51.9          -
2022.12   61               3.8           49.7          -
2022.12   60.8             3.5           49.6          -
2022.12   60               4.3           49.5          -
2026.01   57.4             -             -             -
2026.01   53.24            -             -             -
2026.03   -                -             -             74.7
2026.03   -                -             -             83.8
2026.03   -                -             -             55.48
2026.03   -                -             -             96.91
2026.03   -                -             -             58.82
2026.03   -                -             -             78.56
2026.03   -                -             -             56.79
2026.03   -                -             -             89.85
2026.03   -                -             -             83.65
2026.03   -                -             -             85.05
2026.03   -                -             -             58.24
2026.03   -                -             -             64.22
2026.03   -                -             -             50.31
2026.03   -                -             -             68.72
2026.03   -                -             -             66.5
2026.03   -                -             -             77.77
2026.03   -                -             -             55.77
2026.03   -                -             -             63.85