Model Merging on Large-scale tasks (2, 4, 8, 12, 16, 20 tasks merged)

Average Normalized Accuracy: 98.2

DOGE AM

41.41656.15870.985.642
Aug 5, 2025
Updated 1mo ago

Evaluation Results

Date     Avg. Norm. Acc.  2 tasks  4 tasks  8 tasks  12 tasks  16 tasks  20 tasks  Method  Links
2025.08  98.2             -        -        -        -         -         -
2025.08  97               -        -        -        -         -         -
2025.08  96.7             -        -        -        -         -         -
2025.08  96.5             -        -        -        -         -         -
2025.08  96.4             -        -        -        -         -         -
2025.08  96.3             -        -        -        -         -         -
2025.08  96.2             -        -        -        -         -         -
2025.08  95.5             -        -        -        -         -         -
2025.08  95.4             -        -        -        -         -         -
2025.08  95.3             -        -        -        -         -         -
2025.08  95               -        -        -        -         -         -
2025.08  94.1             -        -        -        -         -         -
2025.08  94               -        -        -        -         -         -
2025.08  92.9             -        -        -        -         -         -
2025.08  91.7             -        -        -        -         -         -
2025.08  91.3             -        -        -        -         -         -
2025.08  90.8             -        -        -        -         -         -
2025.08  89.6             -        -        -        -         -         -
2025.08  87.7             -        -        -        -         -         -
2025.08  87.7             -        -        -        -         -         -
2025.08  86.9             -        -        -        -         -         -
2025.08  85.4             -        -        -        -         -         -
2025.08  84.6             -        -        -        -         -         -
2025.08  83.8             -        -        -        -         -         -
2025.08  83.7             -        -        -        -         -         -
2025.08  83.1             -        -        -        -         -         -
2025.08  82.4             -        -        -        -         -         -
2025.08  81.6             -        -        -        -         -         -
2025.08  80.2             -        -        -        -         -         -
2025.08  79.2             -        -        -        -         -         -
2025.08  78.2             -        -        -        -         -         -
2025.08  73.5             -        -        -        -         -         -
2025.08  71               -        -        -        -         -         -
2025.08  70.9             -        -        -        -         -         -
2025.08  66.3             -        -        -        -         -         -
2025.08  43.6             -        -        -        -         -         -
2025.08  -                97.8     93.2     84.5     81.8      84        76.1
2025.08  -                94.2     93.9     85       74.7      60        38.6
2025.08  -                91.2     92.1     88.7     85.1      80.9      67.4
2025.08  -                99.5     98       95.9     95        94.1      88.9
2025.08  -                98.6     96.8     93.8     91.9      89.7      79.4
2025.08  -                98.5     98.5     97.6     96.6      93.9      88.5
2025.08  -                99.1     98.9     98.3     96.8      95.5      92
2025.08  -                98.4     91.4     87       83.1      82.2      74.6
2025.08  -                99.3     97.7     95.7     94.5      93.7      86.9
2025.08  -                99.3     97.8     96.3     94.3      93.8      88
2025.08  -                99.5     98.1     96.5     93.3      93.7      84.9
2025.08  -                99.9     98.9     98       97.6      96.8      92.4