Multi-turn Interaction-based Problem Solving on MINT-Bench 1.0 (test)

Code Generation Score: 11.76

LLAMA PRO - INSTRUCT

Updated 4mo ago

Evaluation Results

Method	Date	Code Generation Score			
LLAMA PRO - INSTRUCT	2024.01	11.76	29.1	9.81	14.68
Mistral-Instruct-v0.1	2024.01	6.62	34.33	8.54	13.99
CodeLLaMA-7B-Instruct	2024.01	2.21	17.16	7.91	8.7
AgentLM-7B	2024.01	1.47	9.7	8.86	7.34
LLaMA2-7B-Chat	2024.01	0	0	13.61	7.34