
VISE

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Sycophancy Evaluation | VISE | Strong Bias: 58.04 | 9 |
| Sycophancy Evaluation | VISE 1.0 (test) | Strong Bias: 64.84 | 3 |
| Consistent Response Score | VISE | Strong Bias Score: 3.64 | 3 |