Document-level Information Extraction on MUC
Top result: 46.48 F1 Score, THINKTWICE Qwen 3 (oracle), Jan 26, 2026
Evaluation Results
| Method | Configuration | Date | F1 Score |
| --- | --- | --- | --- |
| THINKTWICE Qwen 3 (oracle) | Selector=oracle, Backb... | 2026.01 | 46.48 |
| THINKTWICE Llama R1 (oracle) | Selector=oracle, Backb... | 2026.01 | 42.32 |
| THINKTWICE Qwen 3 | Selector=Majority, Bac... | 2026.01 | 26.18 |
| THINKTWICE Qwen 3 | Selector=F1 Voting, Ba... | 2026.01 | 24.82 |
| Greedy Qwen 3 | Selector=X, Backbone=Q... | 2026.01 | 22.99 |
| ChatGPT 3.5 | Selector=X, Backbone=G... | 2026.01 | 22.41 |
| THINKTWICE Llama R1 | Selector=Majority, Bac... | 2026.01 | 21.96 |
| THINKTWICE Llama R1 | Selector=F1 Voting, Ba... | 2026.01 | 21.23 |
| Greedy Llama R1 | Selector=X, Backbone=L... | 2026.01 | 18.68 |