Cross-task Generalization on Painting (test)
[Chart: Similarity Score by method. Top result: OOWM 3-Stage, 0.6503 (Feb 25, 2026). Metrics available: Similarity Score, Precision, Recall, F1 Score.]
Evaluation Results
Method                 Details                    Date     Similarity Score  Precision  Recall  F1 Score
OOWM 3-Stage           Training Configuration...  2026.02  0.6503            18.92      27.69   17.15
OOWM 2-Stage           Training Configuration...  2026.02  0.6156            15.66      14.98   15.31
Unstructured Baseline  -                          2026.02  0.604             14.71      36.65   20.87
Hybrid Strategy        -                          2026.02  0.575             17.5       15.55   16.43