
SuperNI

Benchmarks

Task Name                     Dataset Name                          SOTA Result               Trend
Multi-Task Instruct-Tuning    SuperNI (test)                        ROUGE Score 57.35         72
Continual Learning            SuperNI (Order 2)                     AP 49.26                  20
Continual Learning            SuperNI (Order 1)                     AP 49.48                  20
Instruction Following         SuperNI Hold-In v1.0 (test)           ROUGE-L Score 62.47       18
Instruction Following         SuperNI Hold-Out v1.0 (test)          ROUGE-L 53.53             18
Continual Learning            SuperNI Benchmark                     Average Score 50.9        14
Continual Learning            SuperNI Large Number of Tasks (test)  Average Performance 82.1  13
Continual Learning            SuperNI Standard CL Benchmark (test)  Average Performance 81.9  13
Continual Learning            SuperNI                               AP 56.95                  13
Continual Learning            SuperNI (test)                        AP 56.23                  13
Instruction Following         SuperNI Unseen                        ROUGE-L 37.97             9
Instruction Following         SuperNI In-domain                     ROUGE-L 52.26             9
Continual Learning            SuperNI                               FWT (O1) 1.87             9
Unimodal Language Generation  SuperNI Order 2                       AP 51.54                  5
Unimodal Language Generation  SuperNI (Order 1)                     AP 50.84                  5
Continual Learning            SuperNI (unseen tasks)                Dialog Score 11.56        4
Continual Learning            SuperNI Benchmark                     Metric -                  0
Showing 17 of 17 rows