
Math, Chat, IF, and General QA tasks

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Multi-task model alignment and mixing | Math, Chat, IF, and General QA tasks Llama-3.1-8B (test) | Math Accuracy 36 | 3