
Neural Network Training and Inversion with a Bregman Learning Framework

Abstract: Deep Neural Networks (DNNs) are powerful computing systems that have revolutionised a wide range of research domains and achieved remarkable success in various real-world applications over the past decade. Despite these advances, training DNNs remains a challenging task due to the non-convex and (potentially) non-smooth nature of the objective function. Back-propagation in combination with gradient-based minimisation has been the predominant strategy for training DNNs for decades. Yet the popular error back-propagation algorithm has well-known drawbacks and limitations, for example its non-parallelisability, its biological implausibility, and vanishing or exploding gradients. Inverting DNNs to infer likely inputs of the system from given outputs is the other side of the same coin. Early ideas for DNN inversion trace back to the 1990s, but interest in more generic network inversion problems has been rekindled in recent years, driven primarily by rapid advances in generative modelling. While several approaches for the inversion of DNNs have been proposed, the stability of the inversion is a crucial but often neglected aspect. The neural network inversion problem is ill-posed, as the solution does not depend continuously on the input datum and can therefore be highly sensitive to perturbations. The core theme of this thesis is the training of DNNs. Building on distributed optimisation approaches, this work contributes to both the learning and the inversion problems of DNNs. In particular, we propose a lifted Bregman learning framework that goes beyond the classical back-propagation approach and aims to address unresolved and overlooked issues in the training and inversion of DNNs. More specifically, we propose a family of loss (penalty) functions that are based on a tailored Bregman distance.
We provide a detailed mathematical analysis of the derived Bregman learning framework and propose a range of deterministic and stochastic optimisation strategies for solving the learning problem. Drawing on techniques and tools from Inverse Problems and Regularisation Theory, we provide theoretical guarantees as well as computational optimisation strategies for the stable, model-based inversion of neural networks.
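
For readers unfamiliar with the term, the following is a minimal sketch of the standard (generic) Bregman distance, D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>, for a differentiable convex function phi. It is not the tailored Bregman distance proposed in the thesis, which the abstract does not specify. The function names and the two example choices of phi are illustrative assumptions.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def bregman_distance(
    phi: Callable[[Vector], float],
    grad_phi: Callable[[Vector], List[float]],
    x: Vector,
    y: Vector,
) -> float:
    """Generic Bregman distance D_phi(x, y) for a differentiable convex phi."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner


# Illustrative choice 1: phi(u) = 0.5 * ||u||^2.
# The Bregman distance is then half the squared Euclidean distance.
def half_sq_norm(u: Vector) -> float:
    return 0.5 * sum(ui * ui for ui in u)


def grad_half_sq_norm(u: Vector) -> List[float]:
    return list(u)


# Illustrative choice 2: negative entropy, phi(u) = sum(u_i * log(u_i)),
# defined here for strictly positive entries only.
# The Bregman distance is then the generalised Kullback-Leibler divergence.
def neg_entropy(u: Vector) -> float:
    return sum(ui * math.log(ui) for ui in u)


def grad_neg_entropy(u: Vector) -> List[float]:
    return [math.log(ui) + 1.0 for ui in u]


x = [1.0, 2.0]
y = [0.5, 1.0]

d_euclid = bregman_distance(half_sq_norm, grad_half_sq_norm, x, y)
d_kl = bregman_distance(neg_entropy, grad_neg_entropy, x, y)

print(d_euclid)  # half the squared Euclidean distance between x and y
print(d_kl)      # generalised KL divergence between x and y
```

Both examples are non-negative and vanish when x equals y, but a Bregman distance is in general not symmetric, so the order of the arguments matters.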