T3

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Safety | T3 | T3 Score: 85.1 | 21 |
| Clustering | T3 | Clustering Accuracy (CA): 58.63 | 12 |
| Clustering | T3 | ARI: 0.0338 | 12 |
| Stitched image rectangling | T3 (test) | PSNR: 25.1 | 4 |
| Research Assistant | T3 Research 1.0 (test) | Task Completion Rate: 88 | 4 |
| Task T3 | T3 | Token Usage (Input + Output): 2,156 | 4 |
| Predictive Modeling | T3 | Loss: 0.063 | 3 |