Multi-task Language Understanding on CEval

44.7 Accuracy

DeepSeek Chat 7B

[Chart: accuracy over time, Jan 11, 2024 to Feb 8, 2025]
Updated 3d ago

Evaluation Results

Date     Accuracy
2024.01  44.7
2025.02  44
2024.01  40.6
2024.01  40.3
2024.01  40
2024.01  37.1
2025.02  36.1
2024.01  35.1
2024.01  33.9
-        32.8
2025.02  27.2
2024.01  26.2
2025.02  25.5