Scene understanding on 40 harbor scenes 1.0 (test)

Latency (s): 5.69

gpt-4.1

Updated 5mo ago

Evaluation Results

Method	Links	Latency (s)
gpt-4.1 2025.12		5.69	83	83	83	84
gpt-5-minimal 2025.12		7.07	83.3	82	82	86
gpt-5-low 2025.12		17.21	84.5	85	81	86
gpt-5-medium 2025.12		30.04	86.6	87	85	88
gpt-5-high 2025.12		59.22	86.6	87	85	87