Dialogue Summarization on ToFuEval
Top result: NWCAD, ToFuEval Score 83.12 (Apr 17, 2026)
Updated 1mo ago
Evaluation Results
| Method | Backbone | Date | ToFuEval Score |
| --- | --- | --- | --- |
| NWCAD | Llama-3.1-70B | 2026.04 | 83.12 |
| AdaCAD | Llama-3.1-70B | 2026.04 | 81.4 |
| CAD | Llama-3.1-70B | 2026.04 | 81.04 |
| CoCoA | Llama-3.1-70B | 2026.04 | 81.04 |
| With-context | Llama-3.1-70B | 2026.04 | 78.84 |
| NWCAD | Llama-3.1-8B | 2026.04 | 77.44 |
| CAD | Llama-3.1-8B | 2026.04 | 77.4 |
| AdaCAD | Llama-3.1-8B | 2026.04 | 75.75 |
| CoCoA | Llama-3.1-8B | 2026.04 | 75.09 |
| With-context | Llama-3.1-8B | 2026.04 | 74.75 |
| NWCAD | Ministral-3 | 2026.04 | 70.99 |
| AdaCAD | Ministral-3 | 2026.04 | 69.86 |
| With-context | Ministral-3 | 2026.04 | 68.88 |
| CAD | Ministral-3 | 2026.04 | 64.57 |
| CoCoA | Ministral-3 | 2026.04 | 61.49 |
| Baseline | Llama-3.1-8B | 2026.04 | 40.44 |
| Baseline | Llama-3.1-70B | 2026.04 | 37.65 |
| Baseline | Ministral-3 | 2026.04 | 23.78 |