Flight Recommendation on Flight Recommendation 1st Round
[Chart: Accuracy by date (Apr 5, 2026). Top entry: Bayesian Teaching, 55.5 Accuracy.]
Evaluation Results
Method             Base Model         Date     Accuracy
Bayesian Teaching  Llama 3 8B         2026.04  55.5
Bayesian Teaching  Gemma 2 9B         2026.04  54.1
ADAPTFUSE          Gemma 2 9B         2026.04  53.1
Bayesian Teaching  Qwen 2.5 7B        2026.04  52.5
ADAPTFUSE          Llama 3 8B         2026.04  52.1
ADAPTFUSE          Qwen 2.5 7B        2026.04  50.4
Oracle Learning    Gemma 2 9B         2026.04  48.6
Oracle Learning    Llama 3 8B         2026.04  45.1
Oracle Learning    Qwen 2.5 7B        2026.04  40.6
Self-consistency   Gemma 2 9B,...     2026.04  40.4
Self-consistency   Llama 3 8B,...     2026.04  38.1
Self-consistency   Qwen 2.5 7B...     2026.04  37.2
CoT                Gemma 2 9B         2026.04  36.2
CoT                Llama 3 8B         2026.04  35.1
CoT                Qwen 2.5 7B        2026.04  34.5
Direct Prompting   Gemma 2 9B         2026.04  31.3
Direct Prompting   Qwen 2.5 7B        2026.04  31.2
Direct Prompting   Llama 3 8B         2026.04  30.2