SOTA Generation benchmarks and papers with code | Wizwand
Generation
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
|---|---|---|---|---|---|
| SQuAD | QZO | F1 Score | 88.3 | 44 | 1mo ago |
| DVisRAG (Real) | RobustVisRAG | Accuracy | 69.03 | 32 | 1mo ago |
| DVisRAG Synthetic | RobustVisRAG | Accuracy | 57.87 | 32 | 1mo ago |
| VisRAG Clean | RobustVisRAG | Accuracy | 67.33 | 32 | 1mo ago |
| DROP | SubZero-GV (Prefix) | F1 Score | 32.9 | 29 | 1mo ago |
| ViDoRe AI v1 (test) | MO-PGD | ASR-G Similarity (mean) @-1 | 1 | 23 | 17d ago |
| Big-Bench Hard (test) | FLAN-PaLM 540B | Exact Match | 57.9 | 17 | 1mo ago |
| UniICL-Bench | UniICL | Perception | 86.5 | 15 | 22d ago |
| VPTT-Bench 1.0 (test) | Comb. (BRAG + VPRAG) | VPTT Score (Novelty Adjusted) | 0.644 | 15 | 1mo ago |
| ViDoRe AI Targeted Setting III V1 | MO-PGD | ASR-G Sim^-1 (Mean) | 93 | 10 | 17d ago |
| CIFAR 16x16 | NegBio-VAE | FID (5k) | 40.2788 | 8 | 9d ago |
| MNIST | NegBio-VAE | FID@5k | 79.6727 | 8 | 9d ago |
| Torus RNA (test) | RMF-MT | MMD | 0.07 | 7 | 1mo ago |
| Torus Protein PrePro (test) | G-LSD | MMD | 0.05 | 7 | 1mo ago |
| Torus Protein Proline (test) | G-LSD | MMD | 0.04 | 7 | 1mo ago |
| Torus Protein Glycine (test) | G-LSD | MMD | 0.03 | 7 | 1mo ago |
| Torus Protein General (test) | RCT | MMD | 0.01 | 7 | 1mo ago |
| MLLMU-Bench (Forget Set) | Vanilla | Rouge Score | 64.5 | 7 | 1mo ago |
| SQuAD | CurvZO | Accuracy | 0.837 | 6 | 25d ago |
| SAGEO Arena (test) | StageAware | Cite Score | 0.58 | 6 | 1mo ago |
| Urdu Generation | Alif-1.0-8B-Inst. | Urdu Generation Score | 90.2 | 5 | 1mo ago |
| LegalBench Rule-Application | Oracle (Upper Bound) | Exact Match | 59 | 4 | 4d ago |
| TruthfulQA | Oracle | Truthfulness x Informativeness Score | 81.5 | 4 | 4d ago |
| DVisRAG (real-degradation) | RobustVisRAG | Top-1 Accuracy | 55.39 | 4 | 1mo ago |
| DVisRAG (synthetic-degradation) | RobustVisRAG | Top-1 Accuracy | 48.02 | 4 | 1mo ago |
Showing 25 of 34 rows