Multi-turn instruction following on MultiIF

68.93 Normalized Score

Qwen3-Max-Thinking

Mar 24, 2026
Updated 24d ago

Evaluation Results

Method  Links
2026.03  68.9343
2026.03  64.2343
2026.03  31.7643
2026.03  26.9443
2026.03  26.2543