Instruction Following on Multi-IF PT

Top result: Gemini-3 Pro, 88 accuracy

Updated 4mo ago

Evaluation Results

Method	Links	Accuracy
Gemini-3 Pro 2026.03		88
gpt-5.2 2026.03		87.2
Gemini-3 Pro 2026.03		86
kimi-k2 2026.03		86
gpt-5-mini 2026.03		85.8
Qwen3 2026.03		84.4
gpt-5.2 2026.03		83.7
gpt-4.1 2026.03		82.7
sabia-4 2026.03		82
gpt-oss-120b 2026.03		82
deepseek 2026.03		81.5
sabiazinho-4 2026.03		81
gemini-2.5-flash-lite 2026.03		80.8
sabia-3.1 2026.03		80.7
gpt-4.1-mini 2026.03		79.6