General Language Modeling on BIG-Bench (test)

83.6 Accuracy

Best Model

[Chart: accuracy over time; y-axis ticks 76.112 to 81.944, x-axis date Oct 26, 2025]
Updated 22d ago

Evaluation Results

Method    Links
2025.10
83.65-14.4
2025.10
76.49-28