Response Quality Evaluation on MT-Bench
Top score: 8.71 (Average Response Quality, No defense)

[Chart: Average Response Quality over time, Feb 26, 2024 – May 22, 2025. Updated 4d ago.]
Evaluation Results

| Method | Target Model | Date | Average Response Quality |
|---|---|---|---|
| No defense | GPT-3.5-t... | 2024.02 | 8.71 |
| Backtranslation | GPT-3.5-t... | 2024.02 | 8.6 |
| Response Check | GPT-3.5-t... | 2024.02 | 8.58 |
| Paraphrase | GPT-3.5-t... | 2024.02 | 8.43 |
| No defense | Llama-2-1... | 2024.02 | 7.36 |
| SmoothLLM | GPT-3.5-t... | 2024.02 | 7.35 |
| Response Check | Llama-2-1... | 2024.02 | 7.3 |
| Backtranslation | Llama-2-1... | 2024.02 | 7.26 |
| Paraphrase | Llama-2-1... | 2024.02 | 7.23 |
| No defense | Vicuna-13B | 2024.02 | 6.8 |
| MTSA-T3 | Zephyr-7B... | 2025.05 | 6.78 |
| Baseline | Zephyr-7B... | 2025.05 | 6.76 |
| Response Check | Vicuna-13B | 2024.02 | 6.74 |
| Paraphrase | Vicuna-13B | 2024.02 | 6.69 |
| Backtranslation | Vicuna-13B | 2024.02 | 6.34 |
| SmoothLLM | Vicuna-13B | 2024.02 | 5.89 |
| SmoothLLM | Llama-2-1... | 2024.02 | 5.81 |
| Baseline | Llama2-7B... | 2025.05 | 5.64 |
| MTSA-T3 | Llama2-7B... | 2025.05 | 5.57 |