Fisher

Benchmarks

Task Name	Dataset Name	Metric	SOTA Result	Count
Conditional Dialogue Generation	Fisher (test)	GPT-4o Score	7.3	64
Speech-to-speech translation	Fisher Spanish-English (test)	BLEU (Speech Input)	90.5	55
Speech-to-speech translation	Fisher Spanish-English (dev)	BLEU (Speech)	88.5	48
Speech-to-speech translation	Fisher Spanish-English (dev2)	ASR BLEU	89.4	36
Unconditional Dialogue Generation	Fisher (test)	GPT-4o Score	9.33	32
Speech Translation	Fisher Monolingual (test)	BLEU	35.87	11
Speech Translation	Fisher Code-Switching (test)	BLEU	37.51	11
Speaker-Attributed Automatic Speech Recognition	Fisher (test)	WDER	0.9	11
Speech-to-Speech Translation	Fisher Es→En (test)	ASR chrF	70.2	10
Speech-to-Speech Translation	Fisher Es→En (dev)	ASR chrF	69.5	10
Conditional Turn-taking Evaluation	Fisher (test)	Occurrence Proportion	58	7
Unconditional Turn-taking Evaluation	Fisher (test)	Occurrence Rate	60	7
Backchanneling	Fisher	Init Rate	97.8	5
Window-level Turn-taking	Fisher	Onset MAE	0.69	5
Dialogue Generation	Fisher	M-MOS	4.25	4
ES-to-EN AST	Fisher (test)	BLEU	64.7	4
Speaker-Attributed Automatic Speech Recognition	Fisher Global Meeting-level	DER	15.21	4
Speaker-Attributed Automatic Speech Recognition	Fisher (local setting)	DER	8.18	4
Cross-lexical backchannel similarity	Fisher	Proportion of Correct Selections	66.3	3
Prosodic backchannel similarity	Fisher	Proportion Correct Selections	69.7	3
Fine-grained Score Accuracy	Fisher	Exact Accuracy	64.76	1
Binary classification (Human vs Machine speech)	Fisher (Human-Human) OOD (test)	Accuracy	98.44	1
