
ROMAN: Reward-Orchestrated Multi-Head Attention Network for Autonomous Driving System Testing

About

The Automated Driving System (ADS) acts as the brain of an autonomous vehicle and is responsible for its safety and efficiency. Safe deployment requires thorough testing in diverse real-world scenarios and compliance with traffic laws such as speed limits, signal obedience, and right-of-way rules. Violations such as running red lights or speeding pose severe safety risks. However, current testing approaches face two significant challenges: they have limited ability to generate complex, high-risk law-breaking scenarios, and they fail to account for complex interactions involving multiple vehicles and critical situations. To address these challenges, we propose ROMAN, a novel scenario generation approach for ADS testing that combines a multi-head attention network with a traffic law weighting mechanism. ROMAN is designed to generate high-risk violation scenarios to enable more thorough and targeted ADS evaluation. The multi-head attention mechanism models interactions among vehicles, traffic signals, and other factors. The traffic law weighting mechanism implements a workflow that uses an LLM-based risk weighting module to evaluate violations along two dimensions: severity and occurrence. We evaluated ROMAN by testing the Baidu Apollo ADS within the CARLA simulation platform and conducting extensive experiments to measure its performance. Experimental results show that ROMAN surpassed the state-of-the-art tools ABLE and LawBreaker, achieving a 7.91% higher average violation count than ABLE and a 55.96% higher count than LawBreaker, while also maintaining greater scenario diversity. In addition, only ROMAN generated violation scenarios for every clause of the input traffic laws, enabling it to identify more high-risk violations than existing approaches.
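To make the two-dimensional risk weighting concrete, the following is a minimal sketch, not ROMAN's actual implementation. The abstract does not say how severity and occurrence are combined, so the product rule (as in a classic risk matrix), the 0-to-1 scales, the normalization step, the clause names, and the scores are all assumptions for illustration. In ROMAN the scores would come from the LLM-based module; here they are hand-written placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClauseRisk:
    clause: str        # hypothetical clause identifier
    severity: float    # assumed scale: 0 (harmless) to 1 (catastrophic)
    occurrence: float  # assumed scale: 0 (rare) to 1 (frequent)


def risk_weights(clauses):
    """Return per-clause weights that sum to 1.

    Assumption: raw risk is severity * occurrence; the source does not
    specify the combination rule.
    """
    raw = {c.clause: c.severity * c.occurrence for c in clauses}
    if not raw:
        return {}
    total = sum(raw.values())
    if total == 0:
        # No clause carries any risk: fall back to uniform weights.
        return {name: 1.0 / len(raw) for name in raw}
    return {name: value / total for name, value in raw.items()}


# Placeholder scores, not taken from the paper.
example = [
    ClauseRisk("run_red_light", severity=0.9, occurrence=0.5),
    ClauseRisk("speeding", severity=0.6, occurrence=0.5),
    ClauseRisk("fail_to_yield", severity=0.5, occurrence=0.3),
]
weights = risk_weights(example)
```

Under these assumptions, a clause that is both severe and frequently violated receives the largest weight, so a scenario generator using the weights as a reward signal would be steered toward that clause first.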

Jianlei Chi, Yuzhen Wu, Jiaxuan Hou, Xiaodong Zhang, Ming Fan, Suhui Sun, Weijun Dai, Bo Li, Jianguo Sun, Jun Sun • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Scenario Generation | Scenario S2 | Inference Time (s): 47 | 3 |
| Scenario Generation | Scenario S3 | Inference Time (s): 52 | 3 |
| Scenario Generation | Scenario S1 | Inference Time (s): 44 | 3 |
| Scenario Generation | Scenario S4 | Inference Time (s): 53 | 3 |
| Violation Scenario Generation | Scenario S1 | Mean Score: 9.86 | 3 |
| Violation Scenario Generation | Scenario S2 | Mean Violation: 8.07 | 3 |
| Violation Scenario Generation | Scenario S3 | Mean Score: 6.41 | 3 |
| Violation Scenario Generation | Scenario S4 | Mean Violation: 7.15 | 3 |