Long-context answering with citations on Dureader
[Chart: Citation Recall by date. Top result: GLM-4, 73.4 (Sep 4, 2024). Metrics available: Citation Recall, Citation Precision, Citation F1.]
Updated 4d ago
Evaluation Results

| Method | Method details | Date | Citation Recall | Citation Precision | Citation F1 |
|---|---|---|---|---|---|
| GLM-4 | Strategy=LAC-S | 2024.09 | 73.4 | 82.3 | 75 |
| LongCite-8B | Strategy=LAC-S | 2024.09 | 68.3 | 85.6 | 73.1 |
| Claude-3-sonnet | Strategy=LAC-S | 2024.09 | 67.7 | 89.2 | 75.5 |
| LongCite-9B | Strategy=LAC-S | 2024.09 | 67.6 | 89.2 | 74.4 |
| GPT-4o | Strategy=LAC-S | 2024.09 | 65.6 | 74.2 | 67.4 |
| Mistral-Large-Instruct | Strategy=LAC-S | 2024.09 | 58.3 | 67 | 60.1 |
| GLM-4-9B-chat | Strategy=LAC-S | 2024.09 | 45.4 | 48.3 | 40.9 |
| Llama-3.1-70B-Instruct | Strategy=LAC-S | 2024.09 | 38.2 | 46 | 35.4 |
| Llama-3.1-8B-Instruct | Strategy=LAC-S | 2024.09 | 22 | 25.1 | 17 |