Temporal Transfer Learning for Traffic Optimization with Coarse-grained Advisory Autonomy

About

The recent development of connected and automated vehicle (CAV) technologies has spurred investigations to optimize dense urban traffic to maximize vehicle speed and throughput. This paper explores advisory autonomy, in which real-time driving advisories are issued to human drivers, thereby approximating the performance of automated vehicles in the near term. Due to the complexity of traffic systems, recent studies of coordinating CAVs have resorted to deep reinforcement learning (RL). Coarse-grained advisory is formalized as zero-order holds, and we consider a range of hold durations from 0.1 to 40 seconds. However, despite the similarity to higher-frequency CAV tasks, a direct application of deep RL fails to generalize to advisory autonomy tasks. To overcome this, we utilize zero-shot transfer, training policies on a set of source tasks--specific traffic scenarios with designated hold durations--and then evaluating the efficacy of these policies on different target tasks. We introduce Temporal Transfer Learning (TTL) algorithms to select source tasks for zero-shot transfer, systematically leveraging the temporal structure to solve the full range of tasks. TTL selects the most suitable source tasks to maximize performance across the range of tasks. We validate our algorithms on diverse mixed-traffic scenarios, demonstrating that TTL solves the tasks more reliably than baselines. This paper underscores the potential of coarse-grained advisory autonomy with TTL in traffic flow optimization.
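The zero-order hold mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `rollout_with_hold`, the 0.1-second simulation step, and the toy policy are all assumptions made for illustration. The idea shown is only that a new advisory is requested once per hold period and repeated unchanged on the steps in between.

```python
from typing import Callable, List


def rollout_with_hold(
    policy: Callable[[float], float],
    observations: List[float],
    hold_duration: float,
    sim_step: float = 0.1,  # assumed simulation step in seconds (hypothetical)
) -> List[float]:
    """Apply `policy` under a zero-order hold.

    A fresh advisory is computed only every `hold_duration` seconds;
    on all other simulation steps the previous advisory is repeated.
    """
    # Number of simulation steps covered by one advisory (at least one).
    steps_per_hold = max(1, round(hold_duration / sim_step))
    applied: List[float] = []
    current = 0.0
    for t, obs in enumerate(observations):
        if t % steps_per_hold == 0:
            current = policy(obs)  # refresh the advisory at the hold boundary
        applied.append(current)    # otherwise keep holding the last advisory
    return applied


# Toy policy (hypothetical): advise twice the observed value.
obs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
print(rollout_with_hold(lambda o: 2 * o, obs, hold_duration=0.3))
# prints [0.0, 0.0, 0.0, 6.0, 6.0, 6.0]
```

With `hold_duration=0.1` the advisory is refreshed on every step, which corresponds to the fine-grained end of the 0.1 to 40 second range; larger values give the coarser advisories a human driver could follow.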

Jung-Hoon Cho, Sirui Li, Jeongyun Kim, Cathy Wu • 2023

Related benchmarks

Task	Dataset	Result
Traffic Signal Control	Traffic Signal Speed Limit variation	Normalized Reward: 88.74	6
Advisory autonomy	Advisory Autonomy Highway ramp (Acceleration guidance)	Normalized Reward: 0.657	6
Dynamic eco-driving	Eco-Driving Inflow variation	Normalized Reward: 0.5299	6
Advisory autonomy	Advisory Autonomy Single lane ring (Speed guidance)	Normalized Reward: 0.9819	6
Advisory autonomy	Advisory Autonomy Highway ramp (Speed guidance)	Normalized Reward: 64.61	6
Dynamic eco-driving	Eco-Driving Penetration Rate variation	Normalized Reward: 0.5992	6
Dynamic eco-driving	Eco-Driving Green Phase variation	Normalized Reward: 0.4678	6
Traffic Signal Control	Traffic Signal Inflow variation	Normalized Reward: 0.8682	6
Advisory autonomy	Advisory Autonomy Single lane ring (Acceleration guidance)	Normalized Reward: 90.21	6
Traffic Signal Control	Traffic Signal Road Length variation	Normalized Reward: 0.9349	6

