Scene understanding on 40 harbor scenes 1.0 (test)
[Chart: Latency (s) by method. Selectable metrics: Latency (s), Awareness, Hazard Recognition, Implication, Action. Highlighted point: gpt-4.1, 5.69 s, Dec 30, 2025.]
Evaluation Results
Method         Configuration              Date     Latency (s)  Awareness  Hazard Recognition  Implication  Action
gpt-4.1        Model=gpt-4.1              2025.12  5.69         83         83                  83           84
gpt-5-minimal  Model=gpt-5, variant=m...  2025.12  7.07         83.3       82                  82           86
gpt-5-low      Model=gpt-5, variant=low   2025.12  17.21        84.5       85                  81           86
gpt-5-medium   Model=gpt-5, variant=m...  2025.12  30.04        86.6       87                  85           88
gpt-5-high     Model=gpt-5, variant=high  2025.12  59.22        86.6       87                  85           87