Conversational Question Answering on QuAC-2 2,000
Top result: SINKTRACK, 58.05 Accuracy (Apr 11, 2026). Metrics reported: Accuracy, Macro-F1. Updated 4d ago.
Evaluation Results
Method      Configuration               Date      Accuracy   Macro-F1
SINKTRACK   Model=Llama3.1-8B-Inst...   2026.04   58.05      54.25
Direct      Model=Llama3.1-8B-Inst...   2026.04   54.25      51.31
SINKTRACK   Model=Qwen2.5-7B-Instr...   2026.04   52.98      48.08
Direct      Model=Qwen2.5-7B-Instr...   2026.04   52         48.06
CoT         Model=Llama3.1-8B-Inst...   2026.04   51.75      38.21
SINKTRACK   Model=MiniCPM3-4B, Pro...   2026.04   50.12      50.98
Direct      Model=MiniCPM3-4B, Pro...   2026.04   49.97      50.94
CoT         Model=MiniCPM3-4B, Pro...   2026.04   41.03      43.35
CoT         Model=Qwen2.5-7B-Instr...   2026.04   37.3       39.42