Data Contamination Detection on K&K
[Chart: F1 Score and AUC by method. Highlighted entry: Min-K%, F1 Score 70, Oct 10, 2025. Updated 1mo ago.]
Evaluation Results
| Method | Target Model | Date | F1 Score | AUC |
|---|---|---|---|---|
| Min-K% | Qwen2.5-7... | 2025.10 | 70 | 54 |
| Self-Critique | Qwen2.5-7... | 2025.10 | 69 | 66 |
| Entropy-Noise | Qwen2.5-7... | 2025.10 | 68 | 52 |
| PPL | Qwen2.5-7... | 2025.10 | 67 | 47 |
| Min-K%++ | Qwen2.5-7... | 2025.10 | 67 | 40 |
| Recall | Qwen2.5-7... | 2025.10 | 67 | 47 |
| CDD | Qwen2.5-7... | 2025.10 | 67 | 47 |
| Min-K% | DeepSeek-... | 2025.10 | 67 | 46 |
| Self-Critique | DeepSeek-... | 2025.10 | 60 | 63 |
| Entropy-Temp | Qwen2.5-7... | 2025.10 | 59 | 49 |
| Entropy-Noise | DeepSeek-... | 2025.10 | 48 | 52 |
| Recall | DeepSeek-... | 2025.10 | 39 | 54 |
| PPL | DeepSeek-... | 2025.10 | 34 | 54 |
| Entropy-Temp | DeepSeek-... | 2025.10 | 23 | 43 |
| Min-K%++ | DeepSeek-... | 2025.10 | 15 | 47 |
| CDD | DeepSeek-... | 2025.10 | 8 | 48 |
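The F1 Score and AUC values above appear to be reported on a 0-100 scale. As a reading aid, here is a minimal sketch of how these two metrics are typically computed for a contamination detector that assigns each sample a score. The toy labels, scores, and threshold are hypothetical and are not taken from the benchmark, and the benchmark's own evaluation protocol (for example, how its threshold is chosen) may differ.

```python
from typing import Sequence


def f1_score(labels: Sequence[int], preds: Sequence[int]) -> float:
    """F1 for binary labels, where 1 means 'contaminated'."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a win. This is threshold-free, unlike F1.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 contaminated and 3 clean samples.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.7, 0.5, 0.2, 0.6]

# Hypothetical decision threshold; F1 depends on it, AUC does not.
threshold = 0.45
preds = [1 if s >= threshold else 0 for s in scores]

print(f"F1  = {100 * f1_score(labels, preds):.1f}")
print(f"AUC = {100 * auc(labels, scores):.1f}")

# Baseline: flag every sample as contaminated.
all_positive = [1] * len(labels)
print(f"F1 (all flagged) = {100 * f1_score(labels, all_positive):.1f}")
```

Because F1 depends on a threshold while AUC depends only on ranking, the two can diverge. On a set with equal numbers of contaminated and clean samples, flagging everything gives an F1 of about 66.7 with no ranking ability at all, so an F1 near 67 paired with an AUC near 50 should be read with that baseline in mind. Whether the K&K evaluation set is balanced is an assumption here, not something the table states.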