
BiPO

Benchmarks

Task Name            Dataset Name  SOTA Result                      Trend
Behavior selection   BiPO (test)   Hallucination Pair Accuracy: 64  14
Behavior Generation  BiPO          Hallucination Score: 1.59        5