Medical Logical Reasoning (Rule 1: is_malignant) on HAM 5000 samples (test)
[Chart: F1-score (NS-CL) and F1-score (Decision Tree) by date (May 5, 2026); y-axis: F1-score]
Updated 22d ago
Evaluation Results
| Method | Supervision (%) | Date | F1-score (NS-CL) | F1-score (Decision Tree) |
|---|---|---|---|---|
| GroundTruth (Oracle) | Oracle/GT | 2026.05 | 100 | 100 |
| GlobalVAE | 75% | 2026.05 | 92.9 | 92.8 |
| SlotVAE (k=1) | 75% | 2026.05 | 75.9 | 75.1 |
| DINOv2 | 1% | 2026.05 | 37 | 29.8 |
| ResNet+SA (k=2) | 1% | 2026.05 | 29 | 18.8 |
| GlobalVAE | 1% | 2026.05 | 16.6 | 13.7 |
| SlotVAE (k=1) | 1% | 2026.05 | 13.6 | 12.5 |
| SlotVAE (k=2) | 1% | 2026.05 | 8.3 | 15.7 |
| SlotVAE (k=10) | 1% | 2026.05 | 0.1 | 1.7 |
| SlotVAE (k=5) | 1% | 2026.05 | 0 | 11.7 |