Self-doubt detection on Original Source Datasets
[Figure: Self-Doubt AUROC over time. Top result: SELFDOUBT, 84.06. Date shown: Apr 7, 2026. Updated 9d ago.]
Evaluation Results
Method      Model               Date      Self-Doubt AUROC
SELFDOUBT   gpt-oss-120b        2026.04   84.06
SELFDOUBT   Claude Sonnet 4.6   2026.04   83.58
SELFDOUBT   Grok 4.1 Fast       2026.04   81.01
SELFDOUBT   gpt-oss-20b         2026.04   79.08
SELFDOUBT   Qwen3-14B           2026.04   78.76
SELFDOUBT   Qwen3               2026.04   77.45
SELFDOUBT   Gemini 2.5 Flash    2026.04   68.71
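
For readers unfamiliar with the metric, here is a minimal sketch of how an AUROC value can be computed from binary labels and detector scores. It is not the benchmark's evaluation code. The function name `auroc`, the toy labels, and the toy scores are all hypothetical, and it assumes the leaderboard reports AUROC multiplied by 100.

```python
def auroc(labels, scores):
    """Pairwise AUROC: the fraction of (positive, negative) pairs in which
    the positive example gets the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the benchmark.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

toy_auroc = auroc(labels, scores)
# Scaled by 100 to match the assumed leaderboard convention.
print(f"{100 * toy_auroc:.2f}")
```

On this reading, 50 corresponds to chance-level ranking and 100 to perfect separation. The pairwise loop is quadratic in the number of examples, so a rank-based implementation would be preferable for large evaluation sets.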