
GPT-5-pro

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Text Detectability | GPT-5-pro Experiment 2 averaged over n ∈ {3, ..., 10} (2025-10-06) | BERT Score: 99 | 5 |
| Agreement analysis of causal graph metrics | GPT-5-pro Experiment 3 (2025-10-06) | Pearson Correlation Coefficient: 0.929 | 4 |