Response Generation on HiCUPID

63.9 Accuracy

DeepSeek-R1-671B

Updated 4mo ago

Evaluation Results

Method	Links	Accuracy
DeepSeek-R1-671B 2026.01		63.9
Qwen3-32B (thinking) 2026.01		63.44
ALIGNXPLORE+ 2026.01		62.42
GPT-OSS-20B 2026.01		62
ALIGNXPLORE+ 2026.01		60.51
DeepSeek-R1-671B 2026.01		60.32
Qwen3-8B (thinking) 2026.01		60.05
DS-R1-Distill-Qwen-7B 2026.01		60.01
GPT-OSS-20B 2026.01		59.93
Qwen3-32B (thinking) 2026.01		59.83
DS-R1-Distill-Qwen-7B 2026.01		59.29
Qwen3-8B (thinking) 2026.01		59.17
ALIGNXPLORE 2026.01		53.5
ALIGNXPLORE 2026.01		50.34
TALLRec 2026.01		47.41
Qwen3-8B (non-thinking) 2026.01		47.02