Data Contamination Detection on DETCON Logical Reasoning
[Chart: best result over time. CDD, 70.6 Accuracy (Feb 24, 2024). Selectable metrics: Accuracy, F1 Score, AUC, Average Score.]
Updated 1mo ago
Evaluation Results
Method                                         Date     Accuracy  F1 Score  AUC   Average Score
CDD                                            2024.02  70.6      76.5      84.6  77.3
N-gram (level=token-level)                     2024.02  65.6      49.8      -     57.7
Embedding similarity                           2024.02  59.2      64.5      66.8  63.5
N-gram (level=char-level)                      2024.02  56.4      67        -     61.7
Min-k% Prob                                    2024.02  52.7      67.7      69.8  63.4
LLM Decontaminator (note=needs additional...)  2024.02  50.9      43.3      -     47.1
Perplexity                                     2024.02  49.7      66.4      69.9  62
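
The page does not say how Average Score is derived. The sketch below assumes it is the plain mean of whichever of Accuracy, F1 Score, and AUC are reported for a method, skipping any metric shown as "-". That assumption reproduces six of the seven listed averages. For CDD it gives 77.2 rather than the listed 77.3, which may come from averaging unrounded values. The `results` dictionary and `average_score` helper are illustrative names, not part of the benchmark.

```python
from statistics import mean

# (Accuracy, F1 Score, AUC) copied from the results above.
# None marks a metric that is not reported ("-").
results = {
    "CDD": (70.6, 76.5, 84.6),
    "N-gram (level=token-level)": (65.6, 49.8, None),
    "Embedding similarity": (59.2, 64.5, 66.8),
    "N-gram (level=char-level)": (56.4, 67.0, None),
    "Min-k% Prob": (52.7, 67.7, 69.8),
    "LLM Decontaminator": (50.9, 43.3, None),
    "Perplexity": (49.7, 66.4, 69.9),
}


def average_score(scores):
    """Mean of the reported metrics, rounded to one decimal.

    Assumed definition: missing metrics (None) are skipped rather
    than counted as zero.
    """
    present = [s for s in scores if s is not None]
    return round(mean(present), 1)


if __name__ == "__main__":
    for method, scores in results.items():
        print(f"{method}: {average_score(scores)}")
```

Skipping missing metrics instead of treating them as zero is what makes the two-metric rows (both N-gram variants and LLM Decontaminator) match their listed averages.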