Full-Duplex-Bench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Interruption Handling | Full-Duplex-Bench | GPT-4o Score: 4.59 | 18 |
| Turn Taking | Full-Duplex-Bench | TOR: 99.2 | 17 |
| Pause Handling | Full-Duplex-Bench Candor | TOR: 1 | 13 |
| Backchanneling | Full-Duplex-Bench | TOR: 100 | 11 |
| Pause Handling | Full-Duplex-Bench Synthetic | TOR: 99 | 11 |
| Full-duplex Speech Interaction Latency Analysis | Full-Duplex-Bench v1.5 | Stop Latency (Mean): 0.68 | 8 |
| Duplex Dialogue Turn-Taking | Full-Duplex-Bench | Synthetic TOR for Pause Handling: 0.058 | 8 |
| Full-Duplex Speech Interaction | Full-Duplex-Bench Background Speech 1.5 | Respond Rate: 93 | 7 |
| Full-Duplex Speech Interaction | Full-Duplex-Bench 1.5 (Talking to Other) | Response Rate: 91 | 7 |
| Full-Duplex Speech Interaction | Full-Duplex-Bench User Backchannel 1.5 | Respond Rate: 7 | 7 |
| Full-Duplex Speech Interaction | Full-Duplex-Bench User Interruption 1.5 | Response Rate: 78 | 7 |
| Voice Cloning Speaker Similarity | Full-Duplex-Bench | SSIM: 57 | 5 |
| Dialog Naturalness | Full-Duplex-Bench | DMOS: 3.9 | 5 |
| User Interruption | Full-Duplex-Bench 1.0 | TOR: 1 | 2 |
| Backchannel | Full-Duplex-Bench 1.0 | TOR: 1 | 2 |
| Overlap Handling Evaluation | Full-Duplex-Bench User Interruption v1.5 | STOI: 0.97 | 2 |
| Overlap Handling Evaluation | Full-Duplex-Bench User Backchannel v1.5 | STOI: 91 | 2 |
| Overlap Handling Evaluation | Full-Duplex-Bench Talking to Other v1.5 | STOI: 0.96 | 2 |
| Overlap Handling Evaluation | Full-Duplex-Bench Background Speech v1.5 | STOI: 0.98 | 2 |
| Turn Taking | Full-Duplex-Bench Bilingual Chinese | TOR: 99.4 | 2 |
| Turn Taking | Full-Duplex-Bench EN | Latency (ms): 205 | 1 |
| Dialog Naturalness | Full-Duplex-Bench User Interruption category | Metric: - | 0 |