Safe Neuro-symbolic Automated Decision Making with Mathematical Optimisation

Primary supervisor

Buser Say

Co-supervisors

Michael Burke

Research area

Data Science and Artificial Intelligence

Planning is the reasoning side of acting in Artificial Intelligence: it automates the selection and organisation of actions to reach desired states of the world as well as possible. For many real-world planning problems, however, it is difficult to obtain a transition model that governs state evolution under complex dynamics. Fortunately, as visualised in Figure 1, recent works [1,2,3,4,5,6] have shown that unknown transition models can be accurately approximated by neuro-symbolic (deep neural network) models, which can then be compiled into mathematical optimisation models (e.g., MILP, Weighted Partial MaxSAT, pseudo-Boolean optimisation) over a fixed horizon and solved optimally using off-the-shelf solvers.
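As a rough illustration of what compiling a neural network into a MILP can involve, the sketch below checks the standard big-M encoding of a single ReLU unit y = max(0, x) by brute force. It is a minimal example under stated assumptions, not the encoding used in the cited works: the bound M, the candidate grid, and the helper names satisfies_big_m and feasible_outputs are all hypothetical. A real pipeline would pass these constraints to a MILP solver rather than enumerate candidates.

```python
# Big-M MILP encoding of one ReLU unit y = max(0, x), checked by brute force.
# Assumption: M is a known bound with -M <= x <= M; z is a binary indicator.

def satisfies_big_m(x, y, z, M, tol=1e-9):
    """Return True if (x, y, z) satisfies the four big-M ReLU constraints."""
    return (
        y >= x - tol                    # y >= x
        and y >= -tol                   # y >= 0
        and y <= x + M * (1 - z) + tol  # y <= x + M(1 - z)
        and y <= M * z + tol            # y <= M z
    )

def feasible_outputs(x, M, candidates):
    """All candidate y values that are feasible for some z in {0, 1}."""
    return sorted({y for y in candidates for z in (0, 1)
                   if satisfies_big_m(x, y, z, M)})

M = 10.0
grid = [i / 2 for i in range(-20, 21)]  # candidate y values in [-10, 10]

for x in (-3.0, 0.0, 2.5):
    # For each x, the only feasible y should be max(0, x).
    print(x, feasible_outputs(x, M, grid))
```

With z = 1 the constraints force y = x (feasible only when x is non-negative), and with z = 0 they force y = 0 (feasible only when x is non-positive), so the linear constraints reproduce the ReLU exactly within the assumed bounds.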

One important limitation of this learning and planning framework is the optimiser’s curse: the suboptimal decision-making that results from planning optimally with respect to an incorrectly learned neuro-symbolic (neural network) model. This project will study the following two fundamental research questions, which are at the core of overcoming the optimiser’s curse:

1. How can we plan robustly with respect to the prediction errors (i.e., interpolation and extrapolation errors) of the learned neuro-symbolic (neural network) models?

2. How can neuro-symbolic (neural network) models be trained so that they are robust to the optimiser’s curse?
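The effect behind these questions can be seen in a toy example. The sketch below is purely illustrative: the reward functions, the training range [0, 2], and the action grid are hypothetical and not taken from the cited works. The learned model matches the true model on the training range but extrapolates wrongly beyond it, and a planner that optimises against the learned model is drawn to exactly that region.

```python
# Toy illustration of planning against an imperfectly learned model.
# All functions and numbers here are hypothetical.

def true_reward(a):
    """Ground-truth reward, unknown to the planner; maximised at a = 1."""
    return -(a - 1.0) ** 2

def learned_reward(a):
    """Hypothetical learned model: exact on the training range [0, 2],
    but extrapolating linearly (and wrongly) for a > 2."""
    if a <= 2.0:
        return true_reward(a)
    return -1.0 + 0.5 * (a - 2.0)

actions = [i / 10 for i in range(0, 51)]  # candidate actions in [0, 5]

planned = max(actions, key=learned_reward)  # optimal w.r.t. the learned model
best = max(actions, key=true_reward)        # optimal w.r.t. the true model

# The planner's chosen action, its predicted reward, and its actual reward.
print(planned, learned_reward(planned), true_reward(planned))
# The action that is actually best, and its reward.
print(best, true_reward(best))
```

Here the planner selects a = 5.0 with a predicted reward of 0.5, while the true reward of that action is -16.0; the truly best action is a = 1.0. The plan is optimal for the learned model and poor for the real system, which is the failure mode the two research questions target.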

Figure 1: Visualisation of the neuro-symbolic learning and planning framework presented in [1] where red circles represent action variables, blue circles represent state variables, grey circles represent the activation units and w's represent the weights of the neural network.

References:

[1] Buser Say, Ga Wu, Yu Qing Zhou and Scott Sanner. Nonlinear hybrid planning with deep net learned transition models and mixed-integer linear programming. In 26th IJCAI, pages 750–756, 2017.

[2] Ga Wu, Buser Say and Scott Sanner. Scalable planning with Tensorflow for hybrid nonlinear domains. In 31st NeurIPS, pages 6273–6283, 2017.

[3] Ga Wu, Buser Say and Scott Sanner. Scalable planning with deep neural network learned transition models. In JAIR, Volume 68, pages 571–606, 2020.

[4] Buser Say and Scott Sanner. Planning in Factored State and Action Spaces with Learned Binarized Neural Network Transition Models. In 27th IJCAI, pages 4815–4821, 2018.

[5] Buser Say and Scott Sanner. Compact and Efficient Encodings for Planning in Factored State and Action Spaces with Learned Binarized Neural Network Transition Models. In AIJ, Volume 285, Article 103291, 2020.

[6] Buser Say, Jo Devriendt, Jakob Nordström and Peter Stuckey. Theoretical and Experimental Results for Planning with Learned Binarized Neural Network Transition Models. In CP 2020, pages 917–934, 2020.

Required knowledge

A successful candidate should have strong programming skills (e.g., in Python) as well as a background in at least one of the following:

(deep) neural networks, and/or
automated planning, and/or
mathematical optimisation.
