Logic reasoning

Benchmarks

Dataset Name	SOTA Method	Metric	Value
Tracking Shuffled Objects BBH	Role-Play Prompting	Accuracy	71.33	59	2mo ago
ZebraLogic		Accuracy	96	49	1mo ago
Zebralogic	Qwen 3 VL 32B Think	Score	96.1	42	4mo ago
Causal Judgement	Self-discover	Accuracy	36	30	4mo ago
K&K Logic Puzzles OOD		Score Threshold 2 (OOD)	99	25	4mo ago
K&K Logic Puzzles In-domain		Accuracy (Level 3)	98	25	4mo ago
LogicVista	Qwen-8B-DeltaThinker	LogicVista Accuracy	61.97	16	1mo ago
Autologic en	DARL	Score	0.439	16	4mo ago
Autologic cn	DARL	Score	40.3	16	4mo ago
ZebraLogic	NPR	Avg Accuracy @1	0.817	11	4mo ago
ARC (eval)	NSA (ours)	Tasks Solved	75	10	1mo ago
ARC (train)	Ainooson Brute Force [2]	Tasks Solved	26	9	1mo ago
Sudoku 8B Instruct (test)		Accuracy	71.7	9	3mo ago
MMStar	BAGEL+Ours	MMStar Score	67.9	8	2mo ago
Zebra riddles (test)	Glauber-UL2 (N=3)	Accuracy	98.7	7	2mo ago
Riddle 1.0 (test)	INMS	F1 Score	69	7	4mo ago
Pun 1.0 (test)		F1 Score	41	7	4mo ago
Puzzle 1.0 (test)		F1 Score	19	7	4mo ago
Enigmata	RL_RACES	Reasoning Score	49.2	6	1mo ago
Logic Reasoning Suite LogicVista, VisuLogic	Uni-OPD	Accuracy on LogicVista	54	6	2mo ago
Zebralogic	SUPERNOVA-4B	Pass@8	77	6	3mo ago
Visu Logic	Uni-OPD	Visu Logic Score	28	4	2mo ago
ARC-Challenge & LogiQA OpenCompass (test)	CRITIQ	ARC-C Accuracy	38.31	4	4mo ago
Large-scale model pool Logic Reasoning 15 LLMs	RouteMoA	Accuracy	95.6	3	4mo ago
CommonsenseQA	MIG	Pass@1	69.8	3	4mo ago

Showing 25 of 28 rows