Pairwise Comparison on DeepfakeJudge Meta-Human

Pairwise Accuracy: 99.4

Qwen-3-VL-235B-Instruct

Updated 5mo ago

Evaluation Results

Method	Date	Pairwise Accuracy
Qwen-3-VL-235B-Instruct	2026.02	99.4
DeepfakeJudge-7B	2026.02	98.9
Qwen-3-VL-30B-Thinking	2026.02	97.7
DeepfakeJudge-3B	2026.02	96.6
Qwen-3-VL-30B-Instruct	2026.02	96.3
Qwen-3-VL-235B-Thinking	2026.02	95.5
Gemini-Flash-2.5	2026.02	94.2
Qwen-3-VL-8B-Thinking	2026.02	93.2
GPT-4o-Mini	2026.02	89.8
Qwen-3-VL-8B-Instruct	2026.02	88.6
Qwen-3-VL-4B-Instruct	2026.02	72.7
Qwen-3-VL-2B-Instruct	2026.02	65.1