MemoTrap

Benchmarks

Task Name                      Dataset Name     SOTA Result            Trend
completion task                MemoTrap (test)  Accuracy: 61.83        30
Knowledge Retrieval            MemoTrap         Proverb Score: 42.5    12
Hallucination Prediction       MemoTrap         Accuracy: 65           6
Knowledge Conflict Resolution  MemoTrap         Micro Accuracy: 77.35  4