Entity counting on Entity counting 200 prompts Qwen3-8B (test)
[Chart: Accuracy over time. Hard DPS: 98.7 accuracy, May 5, 2026.]
Updated 28d ago
Evaluation Results
Method            Details                    Date     Accuracy
Hard DPS          Intervention type=Hard...  2026.05  98.7
Probe-round       Intervention type=Prob...  2026.05  98.7
9-row repair      Intervention type=9-ro...  2026.05  60.7
Fullvocab repair  Intervention type=Full...  2026.05  60.3
Baseline          Intervention type=None...  2026.05  13.7