Multi-task model alignment and mixing on Math, Chat, IF, and General QA tasks Llama-3.1-8B (test)

[Chart: Math Accuracy over time. Labeled entry: Mod. Surgery, 36, Feb 2, 2026]
Updated 1mo ago

Evaluation Results

Method | Links
2026.02 | 36 | 30.5 | 30 | 33.1 | 32.6
2026.02 | 35.8 | 24.2 | 31.1 | 30.3 | 30.3
2026.02 | 35 | 22.9 | 25.4 | 25.2 | 27.3