FURINA-Bench

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Role-playing Dialogue Evaluation | FURINA-Bench English | Context Reliance: 42.99 | 15
Role-playing Dialogue Evaluation | FURINA-Bench Chinese | Context Reliance: 71.39 | 12