DeepSeek

Benchmarks

Task Name	Dataset Name	Metric	SOTA Result
Jailbreak Attack	DeepSeek	NR Score	0	20
Jailbreak attack	DeepSeek-7b five finetuned variants	Average ASR	3.8	16
Jailbreak Attack	deepseek-7b v1 (pretrained)	ASR (%)	100	13
Constrained LLM Decoding	DeepSeek-V2-Lite-Chat 15.7B	Inference Time (ms)	49.91	10
Jailbreaking	DeepSeek V3.2	Attack Success Rate	78.5	9
Detection of paraphrased text	DeepSeek Paraphrased V3	ROC AUC (1% FPR)	0.4178	8
Watermarking Detection	DeepSeek-7B	AUC	100	7
Watermark Detection	DeepSeek	Detection Rate	99.3	7
Contribution and Evidence Generation	DeepSeek-V4-Pro generated SFT targets	Entity Fidelity	0.977	6
Output conformance to revised specification	DeepSeek-V3 primary grid (1,008 balanced runs)	Quality Score	3.79	5
Policy Corruption Evaluation	DeepSeek V3	Compliance	4.12	5
Training Throughput	DeepSeek-V2-Lite workload	Training Throughput (tokens/s)	114,600	3
CPU Inference Performance Evaluation	DeepSeek Lite V2	Memory Usage (GB)	8.8	3
Weight Reconstruction Fidelity	DeepSeek-V3 Weights	Weight ΔW L2 Distance	0	3
Optimizer state memory measurement	DeepSeek-V2-Lite (16B) (train)	Average Optimizer State Memory (MB)	55.3	2
Optimizer state memory measurement	DeepSeek-V2-Lite (16B) (train)	Average Optimizer State Memory (MB)55.3	2
