
GUESS

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Confidence Estimation | GUESS | Accuracy: 0.1858 | 20 |
| Guess task with pre-loaded memory | Guess Hard | Success Rate: 85 | 2 |
| Guess task with pre-loaded memory | Guess Easy | Success Rate: 95 | 2 |