MuST-Bench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Style Translation Fidelity Evaluation | MuST-Bench | GPT-4V Fidelity (EN): 3.89 | 4 |
| Stylized Text Generation | MuST-Bench Korean | Style Fidelity: 40.69 | 4 |
| Stylized Text Generation | MuST-Bench Chinese | Style Fidelity: 0.4042 | 4 |
| Stylized Text Generation | MuST-Bench English | Style Fidelity: 0.4054 | 4 |
| Visual Text Generation | MuST-Bench Korean | OCR Accuracy: 65.77 | 4 |
| Visual Text Generation | MuST-Bench Chinese | OCR Accuracy: 74.81 | 4 |
| Visual Text Generation | MuST-Bench English | OCR Accuracy: 71.63 | 4 |