Ubuntu

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Response Selection | Ubuntu (test) | Recall@1 (Top 10): 0.886 | 58 |
| Response Selection | Ubuntu v2 (test) | MRR: 91.9 | 20 |
| Dialogue Response Selection | Ubuntu (test) | R@1 (R10): 0.867 | 18 |
| Multi-party dialogue generation | Ubuntu IRC-16 | BLEU-1: 16.04 | 12 |
| Answer Ranking | Ubuntu v2 (test) | Recall@1 (1/2 Pool): 91.5 | 11 |
| Dialogue Retrieval | Ubuntu v2 | R@1: 86.3 | 9 |
| Conversation Disentanglement | Ubuntu (test) | VI: 91.5 | 8 |
| Non-membership detection | Ubuntu | P-Value: 0.0001 | 5 |
| Response Selection (Subtask 1) | Ubuntu (dev) | Recall@1 (R100): 0.534 | 2 |
| Dialogue Thread Reconstruction (Subtask 4) | Ubuntu Track 2 DSTC 8 (test) | Precision: 44.3 | 1 |
| Response Selection with No Correct Response (Subtask 2) | Ubuntu Track 2 DSTC 8 (test) | Recall@1: 50.6 | 1 |
| Response Selection (Subtask 1) | Ubuntu Track 2 DSTC 8 (test) | Recall@1: 64.9 | 1 |
| Response Selection (Subtask 1) | Ubuntu (test) | Recall@1 (R100): 60.8 | 1 |