Long-context answering with citations on GovReport
[Chart: Citation Recall. Best result: 82.8 (GLM-4, Sep 4, 2024). Selectable metrics: Citation Recall, Citation Precision, Citation F1.]
Updated 4d ago
Evaluation Results
| Method | Method details | Links | Citation Recall | Citation Precision | Citation F1 |
|---|---|---|---|---|---|
| GLM-4 | Strategy=LAC-S | 2024.09 | 82.8 | 93.4 | 87.1 |
| Claude-3-sonnet | Strategy=LAC-S | 2024.09 | 77.4 | 93.9 | 84.1 |
| LongCite-8B | Strategy=LAC-S | 2024.09 | 74 | 86.6 | 78.5 |
| GPT-4o | Strategy=LAC-S | 2024.09 | 73.4 | 90.4 | 79.8 |
| Mistral-Large-Instruct | Strategy=LAC-S | 2024.09 | 67.9 | 79.6 | 72.5 |
| LongCite-9B | Strategy=LAC-S | 2024.09 | 63.4 | 76.5 | 68.2 |
| Llama-3.1-70B-Instruct | Strategy=LAC-S | 2024.09 | 53.4 | 77.5 | 60.7 |
| Llama-3.1-8B-Instruct | Strategy=LAC-S | 2024.09 | 16.2 | 25.3 | 16.8 |
| GLM-4-9B-chat | Strategy=LAC-S | 2024.09 | 5.7 | 8.2 | 6.3 |