
EVAL

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Instruction Synthesis for Image Editing | Eval-400 In-house | S Score 4.712 | 11 |
| Automatic Speech Recognition | Eval2000 Fisher-Switchboard 2300-h (test) | WER 10.9 | 9 |
| Instruction Generation | Eval-400 In-house (test) | Correctness 66 | 7 |
| Spoken Dialogue System (SDS) Semantic Quality Evaluation | Eval2000 (test) | ROUGE-L 12.1 | 6 |
| Audio Quality Evaluation | Eval2000 | UTMOS 3.34 | 6 |
| Speaking Style Consistency | Eval2000 (test) | Emotion Rank 4.92 | 5 |
| Intelligibility Evaluation | Eval2000 | WER 1 | 4 |
| Automatic Speech Recognition | Eval2000 Switchboard (SW) 300-hour (test) | WER 12.5 | 4 |
| 3D Human Pose Estimation | EVAL cross-view | Head 0.939 | 2 |