MPT

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Latent Preference Modeling | MPT Context-Free Average | F1 Score: 58.5 | 19 |
| Latent Preference Modeling | MPT Context-Free, Preference Transfer | Precision: 30.92 | 19 |
| Latent Preference Modeling | MPT Context-Free Preference Induction | Precision: 0.5487 | 19 |
| Latent Preference Modeling | MPT Context-Free, Preference Recall | Precision: 76.1 | 19 |
| Preference-driven Tool Calling | MPT Context-Guided Average | OA-F1: 67.18 | 19 |
| Preference-driven Tool Calling | MPT Context-Guided, Preference Transfer | P-EM: 26.19 | 19 |
| Preference-driven Tool Calling | MPT Context-Guided, Preference Induction | P-EM: 37.95 | 19 |
| Preference-driven Tool Calling | MPT Context-Guided, Preference Recall | P-EM: 64.88 | 19 |
| Automated Speech Recognition | MPT (test) | WER: 0.2507 | 6 |
| Jailbreak Attack | MPT-7B | Attack Success Rate: 15 | 4 |