Forgery Reasoning on TFR Cross-Language
[Chart: Avg Reasoning Score. Best result: TextShield-R1, 46.2 (Feb 23, 2026).]
Evaluation Results

Method          Fine-tuning     Date     Avg Reasoning Score
TextShield-R1   Full train...   2026.02  46.2
Qwen2.5-VL-7B   Full train...   2026.02  43.1
SIDA*           Full train...   2026.02  43
FakeShield*     Full train...   2026.02  42.9
InternVL3-8B    Full train...   2026.02  42
MiniCPM_V_2.6   Full train...   2026.02  40.3
Qwen2.5-VL-3B   Full train...   2026.02  39.7
InternVL3-2B    Full train...   2026.02  39.5
FakeShield      Full train...   2026.02  34.8
SIDA            Full train...   2026.02  25
InternVL3-8B    None            2026.02  18.1
GPT4o           None            2026.02  14.2
Qwen2.5-VL-7B   None            2026.02  10.4
Qwen2.5-VL-3B   None            2026.02  7.9
InternVL3-2B    None            2026.02  6.9
MiniCPM_V_2.6   None            2026.02  1.9
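
Five models appear in the results twice, once with Fine-tuning=Full train... and once with Fine-tuning=None. As a rough illustration only, the sketch below computes the score difference between the two settings for each of those models. The scores are copied from the listing above; the dictionary layout, the `full`/`none` keys, and the `finetuning_gain` helper are hypothetical names chosen for this example and are not part of the benchmark.

```python
# Avg Reasoning Score values copied from the results listing.
# "full" = rows marked Fine-tuning=Full train..., "none" = rows marked Fine-tuning=None.
# The structure and key names are illustrative assumptions.
scores = {
    "Qwen2.5-VL-7B": {"full": 43.1, "none": 10.4},
    "InternVL3-8B": {"full": 42.0, "none": 18.1},
    "MiniCPM_V_2.6": {"full": 40.3, "none": 1.9},
    "Qwen2.5-VL-3B": {"full": 39.7, "none": 7.9},
    "InternVL3-2B": {"full": 39.5, "none": 6.9},
}


def finetuning_gain(entry: dict) -> float:
    """Score difference between the two settings, rounded to one decimal."""
    return round(entry["full"] - entry["none"], 1)


gains = {model: finetuning_gain(entry) for model, entry in scores.items()}

# Print models ordered by gain, largest first.
for model, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model:<15} +{gain}")
```

The difference is a plain subtraction of the listed scores; it says nothing about why the settings differ, only by how much.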