Alignment on MT-Bench
[Chart: MT-Bench Score (1-turn) by date. Highlighted result: PSFT → DPO, 8.51 (Aug 25, 2025). Other available metrics: MT-Bench Score, MT-Bench Score (2-turn).]
Updated 5d ago
Evaluation Results
| Method | Details | Date | MT-Bench Score (1-turn) | MT-Bench Score | MT-Bench Score (2-turn) |
|---|---|---|---|---|---|
| PSFT → DPO | Backbone=Qwen3-4B-Base... | 2025.08 | 8.51 | - | 6.95 |
| SFT → DPO | Backbone=Qwen3-4B-Base... | 2025.08 | 7.91 | - | 6 |
| SFT | Backbone=Qwen3-4B-Base... | 2025.08 | 7.64 | - | 5.71 |
| PSFTprolong → DPO | Backbone=Qwen3-4B-Base... | 2025.08 | 7.63 | - | 6.74 |
| PSFTprolong | Backbone=Qwen3-4B-Base... | 2025.08 | 7.41 | - | 5.84 |
| PSFT | Backbone=Qwen3-4B-Base... | 2025.08 | 6.83 | - | 4.86 |
| Mixtral-8x22B-Instruct | Type=Instruction-tuned | 2024.07 | - | 8.66 | - |
| Llama-3-70B-Instruct | Type=Instruction-tuned | 2024.07 | - | 8.95 | - |
| Qwen1.5-72B-Chat | Type=Instruction-tuned | 2024.07 | - | 8.61 | - |
| Qwen1.5-110B-Chat | Type=Instruction-tuned | 2024.07 | - | 8.88 | - |
| Qwen2-72B-Instruct | Type=Instruction-tuned | 2024.07 | - | 9.12 | - |