Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning
About
Priority dispatching rules (PDRs) are widely used for solving real-world job-shop scheduling problems (JSSP). However, designing effective PDRs is a tedious task that requires substantial specialized knowledge and often delivers limited performance. In this paper, we propose to automatically learn PDRs via an end-to-end deep reinforcement learning agent. We exploit the disjunctive graph representation of JSSP and propose a Graph Neural Network based scheme to embed the states encountered during solving. The resulting policy network is size-agnostic, effectively enabling generalization to large-scale instances. Experiments show that the agent can learn high-quality PDRs from scratch with elementary raw features and that it demonstrates strong performance against the best existing PDRs. The learned policies also perform well on much larger instances that are unseen in training.
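To make the notion of a priority dispatching rule concrete, the following is a minimal sketch of dispatching-based scheduling with a hand-written rule. It is not the paper's method: the names `dispatch` and `spt`, the instance encoding (each job as an ordered list of `(machine, duration)` pairs), and the tiny example instance are all assumptions made for illustration. In the paper's setting, a learned policy would take the place of the hand-written priority function.

```python
from typing import Callable, List, Tuple

Operation = Tuple[int, int]  # (machine index, processing time); assumed encoding
Job = List[Operation]        # operations must run in list order


def dispatch(jobs: List[Job], priority: Callable[[int, Operation], float]) -> int:
    """Build a schedule by repeatedly dispatching the next operation of the
    job with the lowest priority value; return the resulting makespan."""
    n_machines = 1 + max(m for job in jobs for m, _ in job)
    next_op = [0] * len(jobs)          # index of each job's next unscheduled operation
    job_ready = [0] * len(jobs)        # time at which each job's previous operation ends
    machine_ready = [0] * n_machines   # time at which each machine becomes free
    remaining = sum(len(job) for job in jobs)

    while remaining:
        # Only the first unscheduled operation of each unfinished job is eligible.
        candidates = [j for j in range(len(jobs)) if next_op[j] < len(jobs[j])]
        # Lowest priority value wins; ties are broken by job index.
        j = min(candidates, key=lambda k: (priority(k, jobs[k][next_op[k]]), k))
        machine, duration = jobs[j][next_op[j]]
        start = max(job_ready[j], machine_ready[machine])
        job_ready[j] = machine_ready[machine] = start + duration
        next_op[j] += 1
        remaining -= 1

    return max(job_ready)


def spt(job_index: int, op: Operation) -> float:
    """Shortest Processing Time rule: prefer the shortest eligible operation."""
    return op[1]


# Hypothetical 2-job, 2-machine instance.
example_jobs = [[(0, 3), (1, 2)], [(1, 2), (0, 4)]]
example_makespan = dispatch(example_jobs, spt)
```

Passing a different `priority` function changes the rule without touching the scheduling loop, which is the slot a learned policy would fill.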
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Job Shop Scheduling | Taillard's benchmark Avg | Performance Gap (PG) | 20.5 | 20 |
| Job-Shop Scheduling Problem | JSSP 100 instances 10x10 (test) | Objective Value | 871.7 | 19 |
| Job Shop Scheduling | Demirkol's benchmark Avg | PG | 38.7 | 17 |
| Job Shop Scheduling | Lawrence's benchmark Avg 40 instances | Average PG | 17.4 | 17 |
| Job-Shop Scheduling Problem | JSSP 10x10 (train) | Objective Value | 871.7 | 14 |
| Job Shop Scheduling | Stochastic Job-Shop Scheduling Instance 5x10, level 1 | Avg Objective Value | 701.9 | 10 |
| Job Shop Scheduling | Stochastic Job-Shop Scheduling Instance 5x15, level 2 | Average Objective Value | 974.5 | 10 |
| Job Shop Scheduling | Stochastic Job-Shop Scheduling Instance 5x20, level 3 | Avg Objective Value | 1.26e+3 | 10 |
| Job Shop Scheduling | Stochastic Job-Shop Scheduling Instance 10x10, level 1 | Average Objective Value | 931 | 10 |
| Job Shop Scheduling | Stochastic Job-Shop Scheduling Instance 10x15, level 2 | Avg Objective | 1.18e+3 | 10 |