Scheduling That Speaks: An Interpretable Programmatic Reinforcement Learning Framework

About

Deep reinforcement learning (DRL) has recently emerged as a promising approach for solving combinatorial optimization problems such as job shop scheduling. However, the policies learned by DRL are typically represented by deep neural networks (DNNs), whose opaque architectures and non-interpretable policy decisions can raise critical trust and usability concerns for human decision makers. In addition, the computational requirements of DNNs can further hinder practical deployment in resource-constrained environments. In this work, we propose ProRL, a novel interpretable programmatic reinforcement learning framework that achieves high-performance scheduling with human-readable and editable programmatic policies (i.e., programs). We first introduce a domain-specific language for scheduling (DSL-S) to represent scheduling strategies as structured programs. ProRL then explores the program space defined by DSL-S using local search to identify incomplete programs, which are subsequently completed by learning their parameters via Bayesian optimization. ProRL learns which scheduling heuristic rules to select, and hence naturally incorporates existing heuristics already used in industrial scenarios. Experiments on widely used benchmark instances demonstrate the strong performance of ProRL against existing heuristics and DRL baselines. Furthermore, ProRL performs well under strongly constrained computational resources, such as training with only 100 episodes. Our code is available at https://github.com/HcPlu/ProRL.
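
To make the idea of a parameterized programmatic policy concrete, the following is a minimal sketch and not the ProRL implementation or the actual DSL-S. It assumes a hypothetical one-rule program ("if the fraction of scheduled operations is below theta, use most-work-remaining, otherwise use shortest-processing-time"), a toy three-job instance, and a plain grid search standing in for the Bayesian optimization step described above. All names (`policy`, `makespan`, `JOBS`, `theta`) are illustrative assumptions.

```python
from typing import List, Tuple

# A job is an ordered list of (machine, duration) operations.
Job = List[Tuple[int, int]]

# Toy instance (hypothetical, for illustration only).
JOBS: List[Job] = [
    [(0, 3), (1, 2), (2, 2)],
    [(0, 2), (2, 1), (1, 4)],
    [(1, 4), (2, 3)],
]


def spt(cands: List[int], jobs: List[Job], next_op: List[int]) -> int:
    """Shortest processing time: job whose next operation is shortest."""
    return min(cands, key=lambda j: jobs[j][next_op[j]][1])


def mwkr(cands: List[int], jobs: List[Job], next_op: List[int]) -> int:
    """Most work remaining: job with the largest remaining total duration."""
    return max(cands, key=lambda j: sum(d for _, d in jobs[j][next_op[j]:]))


def policy(theta: float, cands: List[int], jobs: List[Job],
           next_op: List[int]) -> int:
    """Hypothetical readable program with one learnable parameter theta."""
    total_ops = sum(len(job) for job in jobs)
    progress = sum(next_op) / total_ops
    if progress < theta:
        return mwkr(cands, jobs, next_op)
    return spt(cands, jobs, next_op)


def makespan(jobs: List[Job], theta: float) -> int:
    """Build a schedule by repeatedly dispatching the job the policy picks."""
    n_machines = 1 + max(m for job in jobs for m, _ in job)
    machine_free = [0] * n_machines   # time at which each machine is free
    job_free = [0] * len(jobs)        # time at which each job is free
    next_op = [0] * len(jobs)         # index of each job's next operation
    while True:
        cands = [j for j in range(len(jobs)) if next_op[j] < len(jobs[j])]
        if not cands:
            break
        j = policy(theta, cands, jobs, next_op)
        machine, duration = jobs[j][next_op[j]]
        start = max(machine_free[machine], job_free[j])
        machine_free[machine] = job_free[j] = start + duration
        next_op[j] += 1
    return max(job_free)


# Grid search over theta: a simple stand-in for Bayesian optimization.
GRID = [i / 10 for i in range(11)]
best_theta = min(GRID, key=lambda t: makespan(JOBS, t))
```

Because the policy is an ordinary if-then-else over named dispatching rules, a practitioner can read it, edit the threshold by hand, or swap in a rule already used on the shop floor, which is the kind of interpretability the abstract argues for.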

Chengpeng Hu, Yingqian Zhang, Hendrik Baier • 2026

Related benchmarks

Task                         Dataset                              Metric                       Result  Rank
Job-Shop Scheduling Problem  DMU benchmark of JSSP                Average Gap (Instance-wise)  9.34    104
Job Shop Scheduling          ta                                   Gap to BKS                   1.02    72
Job-Shop Scheduling Problem  Taillard 30 x 15                     Optimality Gap (%)           11.29   26
Job-Shop Scheduling Problem  Taillard 15 x 15                     Optimality Gap               9.14    17
Job-Shop Scheduling Problem  Taillard 20 x 15                     Optimality Gap (%)           12      17
Job-Shop Scheduling Problem  Taillard 20 x 20                     Gap (%)                      11.32   17
Job-Shop Scheduling Problem  Taillard 50 x 15                     Optimality Gap (%)           5.81    17
Job Shop Scheduling          Fisher-Thompson 10 × 10 FT10 (test)  Optimality Gap (%)           8.28    16
Job Shop Scheduling          TA 100 x 20                          Optimality Gap (BKS)         1.02    9
Job Shop Scheduling          ABZ 10 × 10                          Makespan Gap                 3.56    9
Showing 10 of 43 rows
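
The page does not define its gap metrics. A common convention in the job shop scheduling literature, assumed here, is the percentage by which a schedule's makespan exceeds the best-known solution (BKS). The sketch below shows that calculation; the function name and the example numbers are hypothetical and are not taken from the benchmarks listed above.

```python
def optimality_gap(makespan: float, best_known: float) -> float:
    """Percentage gap of a makespan relative to the best-known solution."""
    if best_known <= 0:
        raise ValueError("best_known must be positive")
    return 100.0 * (makespan - best_known) / best_known


# Hypothetical example: a makespan of 1100 against a BKS of 1000.
example_gap = optimality_gap(1100, 1000)
```

Under this convention a lower value is better, and a gap of 0 means the schedule matches the best-known solution.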
