LB4TL: Smooth Semantics for Temporal Logic for Scalable Training of Neural Feedback Controllers.
Published in IFAC-PapersOnLine, 2024
This paper presents a model-based framework for training neural network (NN) feedback controllers for autonomous agents with deterministic nonlinear dynamics, so that the closed-loop system satisfies task objectives and safety constraints expressed in discrete-time Signal Temporal Logic (DT-STL). DT-STL uses a robustness function to quantify the degree to which a system trajectory satisfies a given formula. This function serves a dual purpose: system verification through robustness minimization, and control synthesis through robustness maximization. However, its recursive definition, non-convexity, and non-differentiability make it difficult to use directly when training NN feedback controllers. This paper introduces a smooth computation graph that encodes DT-STL robustness in a neurosymbolic framework. The computation graph is a smooth approximation of the robustness function, enabling powerful optimization algorithms based on stochastic gradient descent and backpropagation to train NN feedback controllers. Our approximation is guaranteed to lower-bound the true robustness of a given DT-STL formula, and it yields orders-of-magnitude improvements over existing smooth approximations in the literature when applied to NN feedback control synthesis. We demonstrate the efficacy of this approach on various planning applications that require satisfying complex spatio-temporal and sequential tasks, and show that it scales well with formula complexity and the size of the state space.
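The LB4TL computation graph described in the paper has its own construction; purely as an illustrative sketch of the sound-smoothing idea (smooth surrogates for the min/max operators in robustness semantics that never overestimate the true value), here is a standard log-sum-exp under-approximation. The function names, the sharpness parameter `beta`, and the example signal are hypothetical choices for this sketch, not the paper's actual implementation:

```python
import numpy as np

def _logsumexp(z):
    # Numerically stable log(sum(exp(z))).
    m = np.max(z)
    return m + np.log(np.sum(np.exp(z - m)))

def smooth_min(x, beta=10.0):
    # Soft-min: -(1/beta) * logsumexp(-beta * x) <= min(x).
    # Always a lower bound, so a positive smoothed robustness
    # still certifies satisfaction of an "always" subformula.
    x = np.asarray(x, dtype=float)
    return -(1.0 / beta) * _logsumexp(-beta * x)

def smooth_max(x, beta=10.0):
    # Shifted soft-max: (1/beta) * (logsumexp(beta*x) - log n) <= max(x).
    # The -log(n)/beta shift turns the usual upper bound into a lower bound,
    # matching an "eventually" subformula conservatively.
    x = np.asarray(x, dtype=float)
    return (1.0 / beta) * (_logsumexp(beta * x) - np.log(x.size))

# Hypothetical example: robustness of "always (s > 0)" over a trajectory is
# min over time of the per-step margins; the smooth version under-approximates it.
margins = np.array([0.5, 1.2, -0.3])
rho_exact = margins.min()
rho_smooth = smooth_min(margins, beta=50.0)
```

Both surrogates are differentiable everywhere, and the gap to the exact min/max shrinks as `beta` grows (at rate log(n)/beta), which is the trade-off a gradient-based controller-training loop exploits.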
Recommended citation: Hashemi, Navid, et al. "Scaling Learning based Policy Optimization for Temporal Tasks via Dropout." arXiv preprint arXiv:2403.15826 (2024).
Download Paper