MHumanEval

Benchmarks

Task Name	Dataset Name	Metric	SOTA Result	
Hallucination Assessment	MHumanEval	Response Rate	72.6	20
Code Generation	mHumanEval	Pass@1	0.94	13
Object Hallucination Evaluation	MHumanEval	Hallucination Rate (%)	56	12
Multi-type Hallucination Evaluation	MHumanEval	Object Hallucination Rate	21.9	9
