
Logical Reasoning on BBH Web of Lies

Top result: 98 accuracy (evaluation-instructed prompt optimization)

Updated 1mo ago

Evaluation Results

Method	Links	Accuracy
evaluation-instructed prompt optimization 2025.11		98
LLM only 2025.11		96
Pro-Refine 2025.11		96
TextGrad 2025.11		96
APE 2025.11		95
Self-Refine 2025.11		94
TextGrad 2025.11		74
TextGrad 2025.11		73
APE 2025.11		72
Pro-Refine 2025.11		71
APE 2025.11		71
Pro-Refine 2025.11		70
evaluation-instructed prompt optimization 2025.11		69
LLM only 2025.11		69
evaluation-instructed prompt optimization 2025.11		68
Self-Refine 2025.11		67
LLM only 2025.11		66
Self-Refine 2025.11		66