
Common

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Classification Probing | Common (test) | Probe Accuracy (Best Layer): 76.9 | 21
Commonsense Reasoning | Common | Accuracy: 65.69 | 4