Discrimination on PsyCLIENT-CP vanilla

Accuracy (A): 83.1

Claude-Sonnet-3.5

Updated 4mo ago

Evaluation Results

Method	Links	Accuracy (A)	
Claude-Sonnet-3.5 2026.01		83.1	0.622
DeepSeek-R1 2026.01		27.8	0.158
Qwen3-235B-A22B 2026.01		10.8	0.111
DeepSeek-V3-0324 2026.01		7.2	0.039
GPT-4o 2026.01		5.0	0.072
Qwen2.5-72B-Instruct 2026.01		1.4	0.008