Multi-domain language model evaluation on ODA benchmark suite (test)

71.2 General Accuracy

ODA-Mixture-500k

Dec 30, 2025
Updated 3d ago

Evaluation Results

Method  Date     General Accuracy                                     Links
        2025.12  71.2              77.2   73     69.7   72.8   0.042
        2025.12  65.9              79.7   59.5   63.2   67.1   0.028
        2025.12  64.9              71.8   59     63.6   64.8   0.168
        2025.12  64.8              64.9   75.8   59.3   66.2   0.045
        2025.12  64.5              77.2   63.6   65.8   67.8   0.023
        2025.12  63.4              72.8   66.7   59.6   65.6   0.039
        2025.12  61.7              46     52.7   54.1   53.6   0.49
        2025.12  61.1              77.3   73.2   64.7   69     0.177
        2025.12  60.7              44     57.9   53.8   54.1   9.92
        2025.12  59.5              75.4   56.1   66.6   64.4   0.107
        2025.12  58.7              51.2   52.4   50.6   53.2   -
        2025.12  57.7              77.4   39.5   44.8   54.8   0.016
        2025.12  56.8              71.2   64.4   51.5   61     0.149
        2025.12  55.8              78.3   68.1   66     67     0.043
        2025.12  55.5              64.4   38.8   51.9   52.7   0.084
        2025.12  52                71     26.3   51.5   50.2   0.006
        2025.12  51.4              39.8   50.1   42.7   46     -
        2025.12  51.3              69.8   40.1   58.9   55     0.086
        2025.12  49.9              52.3   68.7   44.4   53.8   0.024
        2025.12  47.1              71.2   47.6   57.2   55.8   0.027
        2025.12  45.5              71.8   67     54.3   59.6   0.011