MCEval

Benchmarks

Task Name	Dataset Name	SOTA Result
Factual Knowledge Retrieval	MCEval mLAMA 8K (test)	Accuracy 79.5	14
Hallucination Evaluation	MCEval HaluEval 8K (test)	Accuracy 82.2	14
Commonsense Question Answering	MCEval CSQA 8K (test)	Accuracy 84.6	14
Paraphrase Identification	MCEval PAWS 8K (test)	Accuracy 88.9	14
Topic Classification	MCEval Agnews 8K (test)	Accuracy 88.3	14
Named Entity Recognition	MCEval NER 8K (test)	Accuracy 0.877	14
Image Captioning Evaluation	MCEval 1.0 (test)	Real Style Score 87.8	12
Code Infilling	McEval Multi-line	JavaScript Pass@1 40	10
Code Infilling	McEval Single-line	JavaScript Pass@1 78.8	10
