AI-text explanation quality evaluation on AI-text detection 5000 samples (test)
[Chart: Win Rate with 95% CI for TELL; highlighted value 78.3, May 27, 2026]
Evaluation Results

Method  Judge              Date     Win Rate  95% CI
TELL    GPT-5.4-mini       2026.05  78.3      73.9
TELL    DeepSeek V4 Flash  2026.05  75.3      70.8
TELL    GPT-OSS 120B       2026.05  74.1      69.5
TELL    Panel mean         2026.05  72.3      68.3
TELL    Gemma 4 26B        2026.05  67.5      62.6
TELL    Nemotron Super     2026.05  66.3      61.5
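
The "Panel mean" Win Rate appears to be the plain arithmetic mean of the five per-judge Win Rates listed above. The page does not state how it is computed, so the following is only a consistency check under that assumption; the variable names are illustrative and not part of the benchmark.

```python
from statistics import mean

# Per-judge Win Rate values copied from the results listed above.
judge_win_rates = {
    "GPT-5.4-mini": 78.3,
    "DeepSeek V4 Flash": 75.3,
    "GPT-OSS 120B": 74.1,
    "Gemma 4 26B": 67.5,
    "Nemotron Super": 66.3,
}

# Value reported in the "Panel mean" row.
reported_panel_mean = 72.3

# Assumption: the panel score is an unweighted mean over the judges.
panel_mean = mean(list(judge_win_rates.values()))

# Allow for rounding to one decimal place in the published figure.
matches = abs(panel_mean - reported_panel_mean) < 0.05

print(f"computed panel mean: {panel_mean:.1f}, matches reported: {matches}")
```

The same check does not reproduce the reported 95% CI value for the panel row, so that column is presumably derived some other way.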