Multi-modal Reasoning on NuPlanQA EVAL

Traffic Light Accuracy: 68.5

GPT-4o

Oct 20, 2024
Updated 22d ago

Evaluation Results

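| Method | Date | Traffic Light Accuracy |  |  |  |  |  |  |  |  |  |  |  |  |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | 2024.10 | 68.5 | 91.7 | 95 | 85.1 | 79.6 | 76.5 | 67.7 | 74.6 | 86.8 | 85.1 | 82.4 | 84.8 | 81.5 |
|  | 2024.10 | 66.5 | 92.2 | 94.4 | 84.4 | 81.2 | 77 | 68.5 | 75.6 | 86.4 | 86.6 | 81.4 | 84.8 | 81.6 |
|  | 2024.10 | 64.5 | 93.5 | 95.8 | 84.6 | 69.1 | 78.6 | 73.2 | 73.6 | 73.8 | 73.8 | 80.4 | 76 | 78.1 |
|  | 2024.10 | 64.5 | 93.1 | 95.8 | 84.5 | 71.1 | 79.6 | 72.7 | 74.5 | 79.6 | 76.7 | 80.9 | 79.1 | 79.4 |
|  | 2024.10 | 61.1 | 89.4 | 89.6 | 80 | 78.5 | 75.5 | 68.2 | 74.1 | 79.1 | 83.2 | 83.4 | 81.9 | 78.7 |
|  | 2024.10 | 58.1 | 87.1 | 84.9 | 76.7 | 75.1 | 75 | 66.8 | 72.3 | 76 | 81.2 | 81.4 | 79.5 | 76.2 |
|  | 2024.10 | 54.2 | 59.4 | 93.8 | 69.1 | 70.7 | 78.6 | 66.4 | 71.9 | 72.3 | 82.7 | 81.4 | 78.8 | 73.3 |
|  | 2024.10 | 53.2 | 89.4 | 96.4 | 79.7 | 77.9 | 77.6 | 73.2 | 74.3 | 75.4 | 76.2 | 82.9 | 77.5 | 77.8 |
|  | 2024.10 | 52.2 | 93.5 | 96.4 | 80.7 | 72.9 | 74 | 68.6 | 71.8 | 72.3 | 72.3 | 79.9 | 74.8 | 75.8 |
|  | 2024.10 | 51.7 | 71.9 | 69.3 | 64.3 | 49.2 | 23 | 30.9 | 34.4 | 30.4 | 39.6 | 10.1 | 26.7 | 41.8 |
|  | 2024.10 | 51.2 | 70.5 | 62.5 | 61.4 | 47 | 16.8 | 23.2 | 29 | 27.2 | 40.6 | 12.1 | 26.6 | 39 |
|  | 2024.10 | 50.2 | 64.1 | 93.8 | 69.4 | 71.3 | 79.6 | 64.1 | 71.7 | 69.6 | 84.2 | 81.4 | 78.4 | 73.2 |
|  | 2024.10 | 48.8 | 41.5 | 63.5 | 51.3 | 51.9 | 50.5 | 41.8 | 48.1 | 56.5 | 60.4 | 69.8 | 62.2 | 53.9 |
|  | 2024.10 | 47.8 | 74.7 | 93.8 | 72.1 | 69.1 | 74 | 63.6 | 68.9 | 72.8 | 69.3 | 76.9 | 73 | 71.3 |
|  | 2024.10 | 46.8 | 73.3 | 91.7 | 70.6 | 70.7 | 71.9 | 58.2 | 66.9 | 71.2 | 68.8 | 82.9 | 74.3 | 70.6 |
|  | 2024.10 | 44.3 | 46.5 | 72.4 | 54.4 | 57.5 | 52 | 44.1 | 51.2 | 57.6 | 65.8 | 71.4 | 64.9 | 56.8 |
|  | 2024.10 | 40.9 | 70.5 | 75.5 | 62.3 | 48.6 | 50 | 44.5 | 47.7 | 57.1 | 55 | 60.8 | 57.6 | 55.9 |
|  | 2024.10 | 39.4 | 68.7 | 72.4 | 60.2 | 54.7 | 48.5 | 41.4 | 48.2 | 62.3 | 58.9 | 58.8 | 60 | 56.1 |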