Epsilon-Lexicase Selection for Regression

About

Lexicase selection is a parent selection method that considers test cases separately, rather than in aggregate. It performs well in discrete error spaces but not on the continuous-valued problems that compose most system identification tasks. In this paper, we develop a new form of lexicase selection for symbolic regression, named epsilon-lexicase selection, that redefines the pass condition for individuals on each test case in a more effective way. We run a series of experiments on real-world and synthetic problems with several treatments of epsilon and quantify how epsilon affects parent selection and model performance. Epsilon-lexicase selection is shown to be effective for regression, producing better-fitting models than other techniques such as tournament selection and age-fitness Pareto optimization. We demonstrate that epsilon can be adapted automatically for individual test cases based on the population performance distribution. Our experiments show that epsilon-lexicase selection with automatic epsilon produces the most accurate models across tested problems with negligible computational overhead. We show that behavioral diversity is exceptionally high in lexicase selection treatments, and that epsilon-lexicase selection makes use of more fitness cases when selecting parents than lexicase selection, which helps explain the performance improvement.
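The selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes that an individual "passes" a test case when its error is within epsilon of the best error in the current pool, and that the automatic epsilon for each case is the median absolute deviation (MAD) of the population's errors on that case. The names `mad` and `epsilon_lexicase_select` and the layout of `errors` are hypothetical.

```python
import random
from statistics import median


def mad(values):
    """Median absolute deviation of a list of numbers."""
    m = median(values)
    return median([abs(v - m) for v in values])


def epsilon_lexicase_select(errors, rng=random):
    """Pick one parent index.

    errors[i][t] is the absolute error of individual i on test case t
    (hypothetical layout, assumed for this sketch).
    """
    n_cases = len(errors[0])
    # Assumption: one epsilon per test case, computed from the whole population.
    eps = [mad([ind[t] for ind in errors]) for t in range(n_cases)]

    pool = list(range(len(errors)))
    cases = list(range(n_cases))
    rng.shuffle(cases)  # test cases are considered in random order

    for t in cases:
        best = min(errors[i][t] for i in pool)
        # Keep only individuals within epsilon of the best error on this case.
        pool = [i for i in pool if errors[i][t] <= best + eps[t]]
        if len(pool) == 1:
            break

    # Ties that survive every case are broken at random.
    return rng.choice(pool)
```

With epsilon fixed at zero the filter reduces to an exact-best test, which in continuous error spaces tends to eliminate nearly everyone on the first case; the tolerance lets more cases take part in each selection event.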

William La Cava, Lee Spector, Kourosh Danai • 2019

Related benchmarks

Task | Dataset | Metric | Result | Rank
Symbolic Regression | SRBench black-box (test) | R^2 | 0.7372 | 28
Symbolic Regression | Feynman Dataset epsilon=0.01 (test) | R^2 | 0.991 | 20
Symbolic Regression | Feynman Dataset epsilon=0.1 (test) | R^2 | 0.9901 | 20
Symbolic Regression | Feynman Dataset epsilon=0.0 (test) | R^2 | 0.9869 | 20
Symbolic Regression | Feynman Dataset epsilon=0.001 (test) | R^2 | 98.66% | 20
Symbolic Regression | Strogatz Dataset epsilon=0.001 (test) | R^2 | 0.8488 | 20
Symbolic Regression | Strogatz Dataset epsilon=0.01 (test) | R^2 | 0.8562 | 20
Symbolic Regression | Strogatz Dataset epsilon=0.1 (test) | R^2 | 88.22% | 20
Symbolic Regression | Strogatz Dataset epsilon=0.0 (test) | R^2 | 0.8125 | 20