Interruption Handling on Full-Duplex-Bench

4.59 GPT-4o Score

Qwen-2.5-Omni

[Chart: score over time, Jan 14, 2026 to Apr 14, 2026]
Updated 4d ago

Evaluation Results

Method    Links
GPT-4o Score    TOR      Latency
4.59            -        2.74
2026.03
4.22            92       0.36
4.21            1        0.4
2026.04
4.19            0.95     0.52
2026.03
4.08            89       0.35
2026.04
3.75            0.85     1.02
3.62            86.7     1.41
3.62            0.87     1.41
3.615           0.867    1.409
3.38            89.1     1.18
2026.04
3.38            0.89     1.18
2026.01
3.376           0.891    1.183
2026.03
0.77            100      0.26
0.77            1        0.26
2026.01
0.765           1        0.257
2026.01
0.201           0.917    2.531
2026.03
0.2             91.7     2.53
2026.04
0.2             0.92     2.53