English Reading on NarrativeQA
Best result: Llama-3.3-70B-Instruct, 53.35 F1 Score (Apr 30, 2026)
Evaluation Results
Method                         Setting    Date       F1 Score
Llama-3.3-70B-Instruct         Shots=3    2026.04    53.35
Qwen3-14B                      Shots=3    2026.04    35.75
SecGPT-14B                     Shots=3    2026.04    35.35
Llama-3.1-8B-Instruct          Shots=3    2026.04    33.2
Qwen3.5-9B                     Shots=3    2026.04    30.1
Qwen3-8B                       Shots=3    2026.04    29.85
XekRung-8B                     Shots=3    2026.04    27.9
Llama-Primus-Reasoning-8B      Shots=3    2026.04    26.55
Foundation-Sec-8B-Reasoning    Shots=3    2026.04    25.7