epbench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Episodic Memory | EpBench 200-Chapters Book (test) | Average Cost ($): 0.009 | 6 |
| Episodic Memory Retrieval | Epbench 200-Chapters Book (Overall) | Precision: 86.5 | 6 |
| Episodic Memory Retrieval | Epbench 6+ Cues 200-Chapters Book | Precision: 94 | 6 |
| Episodic Memory Retrieval | Epbench 3-5 Cues 200-Chapters Book | Precision: 87.8 | 6 |
| Episodic Memory Retrieval | Epbench 2 Cues 200-Chapters Book | Precision: 81.7 | 6 |
| Episodic Memory Retrieval | Epbench Chapters Book 200 (1 Cue) | Precision: 75.5 | 6 |
| Episodic Memory Retrieval | Epbench 0 Cues 200-Chapters Book | Precision: 97.8 | 6 |
| Long-context Question Answering | epbench ep_news | F1 Score: 51.23 | 6 |
| Long-context Question Answering | epbench ep_scifi | F1 Score: 52.04 | 6 |
| Long-context Question Answering | epbench ep_default | F1 Score: 53.25 | 6 |
| Episodic Memory Recall | Epbench Book 2000-Chapters (Overall) | Precision: 83 | 5 |
| Episodic Memory Recall | Epbench 6+ Cues 2000-Chapters | Precision: 91.1 | 5 |
| Episodic Memory Recall | Epbench 3-5 Cues 2000-Chapters | Precision: 84.1 | 5 |
| Episodic Memory Recall | Epbench 2 Cues 2000-Chapters | Precision: 84.5 | 5 |
| Episodic Memory Recall | Epbench 2000-Chapters Book 1 Cue | Precision: 76.1 | 5 |
| Episodic Memory Recall | Epbench 0 Cues 2000-Chapters Book | Precision: 94.3 | 5 |
| Episodic Memory | Epbench 2000-Chapters Book (test) | Precision: 83 | 5 |