Claim Verification on LLMAggreFact (test)

78.1 Binary Accuracy

ThinknCheck

Updated 15d ago

Evaluation Results

Method	Links	Binary Accuracy
ThinknCheck 2026.04		78.1
MiniCheck 2026.04		77.4
Claude-Sonnet-3.5 2026.04		77.2
GPT-4o 2026.04		75.9
GPT-4 2026.04		75.3
AlignScore 2026.04		70.4
ThinknCheck-nothink 2026.04		57.5
Gemma3 2026.04		55.7
Gemma3 + CoT 2026.04		51.4