Boolean Satisfiability via Imitation Learning
About
We propose ImitSAT, a branching policy for conflict-driven clause learning (CDCL) solvers based on imitation learning for the Boolean satisfiability problem (SAT). Unlike previous methods that predict instance-level signals to improve CDCL branching indirectly, or rely on reinforcement learning and insufficient CDCL information to enhance branching, ImitSAT learns from expert KeyTrace that collapses a full run into the sequence of surviving decisions. Replaying a KeyTrace on the same instance is nearly conflict-free, providing dense decision-level supervision and directly reducing propagations -- the dominant contributor to wall-clock time. This prefix-conditioned supervision enables ImitSAT to reproduce high-quality branches without exploration, yielding faster convergence, stable training, and seamless integration into CDCL. Extensive experiments demonstrate that ImitSAT reduces propagation counts and runtime, outperforming state-of-the-art learned approaches. We released the source code and trained model at https://github.com/zewei-Zhang/ImitSAT
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| SAT solving | JNH structured SAT family | MRPP r-tilde1.46 | 15 | |
| SAT solving | PARITY structured SAT family | MRPP r-tilde0.9 | 15 | |
| SAT solving | random 3-SAT 100 | MRPP r~0.83 | 12 | |
| SAT solving | random 3-SAT 50 | MRPP r~0.74 | 12 | |
| SAT solving | random 3-SAT 61–100 | MRPP r~0.8 | 12 | |
| SAT solving | random 3-SAT 5–15 | MRPP r~75 | 12 | |
| SAT solving | random 3-SAT (16–30) | MRPP r~0.83 | 12 | |
| SAT solving | random 3-SAT 31–60 | MRPP r~75 | 12 | |
| SAT solving | PHOLE | MRPP r˜1 | 9 | |
| SAT solving | AIM | MRPP r˜0.88 | 9 |