Instruction Generation on Eval-400 In-house (test)

Correctness: 66

Gemini-3-Pro

[Chart: Correctness, Apr 9, 2026]
Updated 9d ago

Evaluation Results

Method  Links

Correctness                        Date
66           21      12      1     2026.04
66           23      10.4    0.6   2026.04
57           29      13.25   0.75  2026.04
48.75        42.25   8.25    0.75  2026.04
41.75        47.75   8.5     2
38.6         51.2    9.2     1     2026.04
37           51.5    11      0.5