Compositional Reasoning Dataset

Correction Score (C): 43.3

CREME

Updated 3mo ago

Evaluation Results

Method	Links	Correction Score (C)
CREME 2024.02		43.3	23.71	3.61	1.24
CREME 2024.02		17	7.99	1.27	0.86
OpenAlpaca-3B 2024.02		7.2	7	13.5	0.6
LLAMA-2-7B 2024.02		3.2	2.3	13.1	0.3
Memory Injection 2024.02		2.21	0.3	0.32	26.72
CoT-PatchScopes 2024.02		1.2	-	-	-
Memory Injection 2024.02		0.98	0.45	0.75	2.93
CoT-PatchScopes 2024.02		0.91	-	-	-