Forgery Reasoning on TFR (test)
Top result: TextShield-R1, 58.8 Avg Reasoning Score (Cosine/Rouge-L/BLEU), Feb 23, 2026
Updated 1mo ago
Evaluation Results
| Method | Method details | Links (date) | Avg Reasoning Score (Cosine/Rouge-L/BLEU) |
|---|---|---|---|
| TextShield-R1 | Fine-tuning=Full train... | 2026.02 | 58.8 |
| Qwen2.5-VL-3B | Fine-tuning=Full train... | 2026.02 | 42.9 |
| Qwen2.5-VL-7B | Fine-tuning=Full train... | 2026.02 | 42.9 |
| SIDA* | Fine-tuning=Full train... | 2026.02 | 42.9 |
| FakeShield* | Fine-tuning=Full train... | 2026.02 | 42.8 |
| InternVL3-8B | Fine-tuning=Full train... | 2026.02 | 41.7 |
| MiniCPM_V_2.6 | Fine-tuning=Full train... | 2026.02 | 41.1 |
| InternVL3-2B | Fine-tuning=Full train... | 2026.02 | 40.6 |
| SIDA | Fine-tuning=Full train... | 2026.02 | 35.7 |
| FakeShield | Fine-tuning=Full train... | 2026.02 | 35.6 |
| GPT4o | Fine-tuning=None | 2026.02 | 19.4 |
| InternVL3-8B | Fine-tuning=None | 2026.02 | 17.9 |
| Qwen2.5-VL-3B | Fine-tuning=None | 2026.02 | 9.5 |
| Qwen2.5-VL-7B | Fine-tuning=None | 2026.02 | 9.5 |
| InternVL3-2B | Fine-tuning=None | 2026.02 | 8.5 |
| MiniCPM_V_2.6 | Fine-tuning=None | 2026.02 | 3.2 |