
SAFIM

Benchmarks

Task Name                                   Dataset Name   SOTA Result    Trend
Block Infilling                             SAFIM          Pass@1 69.47   6
Code Completion                             SAFIM          Pass@1 54.1    3
Reasoning failure prediction and recovery   SAFIM L3       Accuracy 81    2
Reasoning failure prediction and recovery   SAFIM L1       Accuracy 79    2