Question Answering on BIGBENCH II
[Chart: True WS Score by date; marked point: PromptCOS, 100 (Sep 3, 2025). Available metrics: True WS Score, False WS Score, Min Distance (WS).]
Updated 9d ago
Evaluation Results
Method     Model            Date     True WS Score  False WS Score  Min Distance (WS)
PromptCOS  TinyLlama-chat   2025.09  100            20              0.64
PromptCOS  Deepseek-d-qwen  2025.09  100            24              0.64
PCG        Gemma2-it        2025.09  100            22              0.71
PromptCOS  Gemma2-it        2025.09  95             29              0.6
PR         TinyLlama-chat   2025.09  83             75              -0.01
PR         Gemma2-it        2025.09  81             71              0.05
PR         Deepseek-d-qwen  2025.09  73             49              0.13
PCG        TinyLlama-chat   2025.09  67             54              0.06
PC*        Gemma2-it        2025.09  65             2               0.61
PC*        TinyLlama-chat   2025.09  64             8               0.52
PC*        Deepseek-d-qwen  2025.09  58             5               0.51
PCG        Deepseek-d-qwen  2025.09  11             3               0.02