SOTA Model Merging benchmarks and papers with code | Wizwand
Model Merging
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
|---|---|---|---|---|---|
| 8 Vision tasks (test) | Individual Fine-tuned | Accuracy | 95.8 | 77 | 20d ago |
| Average of 8 benchmarks | Pico | Average Accuracy | 52.79 | 72 | 1mo ago |
| GLUE CoLA, MRPC, RTE, SST-2 | Task arithmetic | Absolute Accuracy | 75.9 | 60 | 3mo ago |
| 20-task vision merging scenario (test) | Individual Fine-tuned | Accuracy | 94.7 | 44 | 20d ago |
| 14-task vision merging scenario (test) | Individual Fine-tuned | Accuracy | 94.3 | 44 | 20d ago |
| Language Benchmarks 5-task | Indiv. | Score | 0.63 | 44 | 20d ago |
| Large-scale tasks | DOGE AM | Average Normalized Accuracy | 98.2 | 36 | 1mo ago |
| Sustainability to large-scale tasks | DOGE AM | Average Normalized Accuracy | 91.4 | 24 | 1mo ago |
| Sustainability to large-scale tasks 2 tasks | DOGE AM | Average Normalized Accuracy | 101.2 | 24 | 1mo ago |
| 7 NLP tasks (test) | EXPERTS | Accuracy | 79.2 | 22 | 2mo ago |
| Large-scale tasks 16 tasks merged | DOGE AM | Average Normalized Accuracy | 91.5 | 12 | 1mo ago |
| Large-scale tasks 12 tasks merged | DOGE AM | Avg Normalized Acc | 94.3 | 12 | 1mo ago |
| Sustainability to large-scale tasks (20 tasks) | RegMean++ | Average Normalized Accuracy | 82.9 | 12 | 1mo ago |
| Sustainability to large-scale 8 tasks | DOGE AM | Avg Normalized Accuracy | 94.8 | 12 | 1mo ago |
| Sustainability to large-scale tasks 4 tasks | DOGE AM | Average Accuracy | 98.3 | 12 | 1mo ago |
| LLM Evaluation Suite | KARCHER | Normalized Score | 0.401 | 12 | 2mo ago |
| Vision, Language, and Multi-modal tasks | Multiple Models | Parameters | 8 | 11 | 14d ago |
| Qwen3-4B-Base Transfer 8 benchmarks | Pico | Math Accuracy | 32.65 | 6 | 1mo ago |
| LLM Benchmark Family MMLU, TruthfulQA, BBQ, CNN/DailyMail | SA-Merging | MMLU Score | 69.87 | 5 | 1d ago |
Showing 19 of 19 rows