Formal Verification on Memory Scheduler
[Chart: Assertion Count by submission date. Highlighted point: Saarthi, 29 (Mar 3, 2026). Selectable metrics: Assertion Count, 1st Generation Attempts, Fix Attempts Count, % Proven, % Coverage.]
Evaluation Results
| Method | Configuration | Date | Assertion Count | 1st Generation Attempts | Fix Attempts Count | % Proven | % Coverage |
|---|---|---|---|---|---|---|---|
| Saarthi | Backbone=GPT-5, Pass l... | 2026.03 | 29 | - | 1 | 32.01 | 48.86 |
| Saarthi | Backbone=GPT-5, Pass l... | 2026.03 | 28 | - | 0 | 32.14 | 58.21 |
| Saarthi | Backbone=GPT-4.1, Pass... | 2026.03 | 24 | - | 2 | 35.71 | 42.93 |
| Saarthi | Backbone=GPT-4.1, Pass... | 2026.03 | 23 | - | 1 | 21.74 | 46.39 |
| Saarthi | Backbone=GPT-5, Pass l... | 2026.03 | 22 | - | 2 | 40.91 | 54.73 |
| Saarthi | Backbone=Llama3.3, Pas... | 2026.03 | 17 | - | 0 | 17.65 | 37.3 |
| Saarthi | Backbone=GPT-4.1, Pass... | 2026.03 | 16 | - | 0 | 50 | 43.32 |
| Saarthi | Backbone=Llama3.3, Pas... | 2026.03 | 13 | - | 3 | 30.77 | 39.47 |
| Saarthi | Backbone=Llama3.3, Pas... | 2026.03 | 7 | - | 0 | 71.43 | 44.2 |