## Inertial, corrected, primal--dual proximal splitting

Abstract: We study inertial versions of primal--dual proximal splitting, also known as the Chambolle--Pock method. Our starting point is the preconditioned proximal point formulation of this method. By adding correctors corresponding to the anti-symmetric part of the relevant monotone operator, and using a FISTA-style gap unrolling argument, we are able to derive non-ergodic gap estimates instead of merely ergodic ones. Moreover, by adding a diagonal component to this corrector, we are able to combine strong-convexity-based acceleration with inertial acceleration. We test our proposed method on image processing and inverse problems, obtaining convergence improvements for sparse Fourier inversion and Positron Emission Tomography.
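The inertial, corrected variants are developed in the paper itself; as background, here is a minimal sketch of the underlying Chambolle--Pock (PDHG) iteration the abstract refers to, applied to a 1-D total-variation denoising toy problem. Function and parameter names are illustrative, not taken from the paper:

```python
import numpy as np

def pdhg_tv1d(b, lam=1.0, n_iter=500, tau=0.25, sigma=0.25, theta=1.0):
    """Basic Chambolle-Pock (PDHG) iteration for 1-D TV denoising:
        min_x 0.5*||x - b||^2 + lam*||D x||_1,
    with D the forward-difference operator. Since ||D||^2 <= 4, the
    step condition tau*sigma*||D||^2 <= 1 holds for tau = sigma = 0.25."""
    n = len(b)
    x = b.copy()
    x_bar = x.copy()
    y = np.zeros(n - 1)            # dual variable, one entry per difference
    D = lambda v: np.diff(v)       # forward differences
    Dt = lambda w: np.concatenate(([-w[0]], -np.diff(w), [w[-1]]))  # adjoint
    for _ in range(n_iter):
        # dual ascent step, then projection onto the l-inf ball of radius lam
        # (the proximal map of the conjugate of lam*||.||_1)
        y = np.clip(y + sigma * D(x_bar), -lam, lam)
        # primal descent step; the prox of 0.5*||x - b||^2 is in closed form
        x_new = (x - tau * Dt(y) + tau * b) / (1.0 + tau)
        # over-relaxation / extrapolation (the "bar" step of Chambolle-Pock)
        x_bar = x_new + theta * (x_new - x)
        x = x_new
    return x
```

On a constant signal the iteration is stationary at the data itself, while on a noisy step signal the output has markedly lower total variation than the input.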

## A bilevel learning approach for optimal observation placement in variational data assimilation

Abstract: In this paper we propose a bilevel optimization approach for the placement of observations in variational data assimilation problems. Within the framework of supervised learning, we consider a bilevel problem where the lower level task is the variational reconstruction of the initial condition of the system, and the upper level problem solves the optimal placement with the help of a sparsity-inducing norm. Due to the pointwise nature of the observations, an optimality system with regular Borel measures on the right-hand side is obtained as a necessary and sufficient optimality condition for the lower level problem. The latter is then considered as a constraint for the upper level instance, yielding an optimization problem constrained by time-dependent PDEs with measures. After proving some additional regularity results, we demonstrate the existence of Lagrange multipliers and derive a necessary optimality system characterizing the optimal solution of the bilevel problem. The numerical solution is also carried out on two levels. The lower level problem is solved using a standard BFGS method, while the upper level one is solved by means of a projected BFGS algorithm based on the estimation of $\varepsilon$-active sets. A penalty function is also considered to enhance the sparsity of the location weights. Finally, some numerical experiments are presented to illustrate the main features of our approach.

## Testing and non-linear preconditioning of the proximal point method

Abstract: Employing the ideas of non-linear preconditioning and testing of the classical proximal point method, we formalise common arguments in convergence rate and convergence proofs of optimisation methods as the verification of a simple iteration-wise inequality. When applied to fixed point operators, the latter can be seen as a generalisation of firm non-expansivity or the $\alpha$-averaged property. The main purpose of this work is to provide the abstract background theory for our companion paper "Block-proximal methods with spatially adapted acceleration". In the present account we demonstrate the effectiveness of the general approach on several classical algorithms, as well as their stochastic variants. Besides, of course, the proximal point method, these methods include gradient descent, forward--backward splitting, Douglas--Rachford splitting, Newton's method, as well as several methods for saddle-point problems, such as the Alternating Direction Method of Multipliers and the Chambolle--Pock method.
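As background for the abstract above, here is a minimal sketch of the classical proximal point iteration $x^{k+1} = (I + \tau \partial f)^{-1}(x^k)$ on a toy quadratic. The example and all names are illustrative, not from the paper:

```python
def proximal_point(prox, x0, tau=1.0, n_iter=100):
    """Classical proximal point iteration x_{k+1} = prox_{tau f}(x_k),
    where prox(v, tau) = argmin_x f(x) + (1/(2*tau))*(x - v)^2."""
    x = x0
    for _ in range(n_iter):
        x = prox(x, tau)
    return x

# Toy example: f(x) = (x - 3)^2. Its proximal map has the closed form
#   prox_{tau f}(v) = (v + 6*tau) / (1 + 2*tau),
# so the iteration contracts towards the minimiser x* = 3 with factor
# 1/(1 + 2*tau) per step.
prox_quad = lambda v, tau: (v + 6 * tau) / (1 + 2 * tau)
```

Starting from any initial point, the iterates converge linearly to the minimiser of the quadratic; this contraction is exactly the kind of iteration-wise inequality the testing framework verifies.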

## Interior-proximal primal-dual methods

Abstract: We study preconditioned proximal point methods for a class of saddle point problems, where the preconditioner decouples the overall proximal point method into an alternating primal--dual method. This is akin to the Chambolle--Pock method or the ADMM. In our work, we replace the squared distance in the dual step by a barrier function on a symmetric cone, while using a standard (Euclidean) proximal step for the primal variable. We show that under non-degeneracy and simple linear constraints, such a hybrid primal--dual algorithm can achieve linear convergence on originally strongly convex problems involving the second-order cone in their saddle point form. On general symmetric cones, we are only able to show an $O(1/N)$ rate. These results are based on estimates of strong convexity of the barrier function, extended with a penalty to the boundary of the symmetric cone.

## Error estimates for the FEM approximation of state-constrained elliptic optimal control problems with sparsity in a finite-dimensional control space

Abstract: In this work, we derive an a priori error estimate of order $h^2|\log(h)|$ for the finite element approximation of a sparse optimal control problem governed by an elliptic equation, which is controlled in a finite-dimensional space. Furthermore, box constraints on the control are considered and finitely many pointwise state constraints are imposed at specific points in the domain. With this choice for the control space, the achieved order of approximation for the optimal control is optimal, in the sense that the error for the optimal control is of the same order as that of the approximation of the state equation.

## Acceleration and global convergence of a first-order primal--dual method for nonconvex problems

Abstract: The primal--dual hybrid gradient method (PDHGM, also known as the Chambolle--Pock method) has proved very successful for convex optimization problems involving linear operators arising in image processing and inverse problems. In this paper, we analyze an extension to nonconvex problems that arise if the operator is nonlinear. Based on the idea of testing, we derive new step length parameter conditions for the convergence in infinite-dimensional Hilbert spaces and provide acceleration rules for suitably (locally and/or partially) monotone problems. Importantly, we prove linear convergence rates as well as global convergence in certain cases. We demonstrate the efficacy of these step length rules for PDE-constrained optimization problems.

## Preconditioned proximal point methods and notions of partial subregularity

Abstract: Based on the needs of convergence proofs of preconditioned proximal point methods, we introduce notions of partial strong submonotonicity and partial (metric) subregularity of set-valued maps. We study relationships between these two concepts, neither of which is generally weaker or stronger than the other one. For our algorithmic purposes, the novel submonotonicity turns out to be easier to employ than more conventional error bounds obtained from subregularity. Using strong submonotonicity, we demonstrate the a posteriori linear convergence of the Primal--Dual Hybrid Gradient Method (Modified) to some strictly complementary solutions of example problems from image processing and data science. This is achieved without the conventional assumption that all the objective functions of the involved saddle point problem are strongly convex.

## Total Generalized Variation Regularization in Variational Data Assimilation for Burgers' Equation

Abstract: We propose a second-order total generalized variation (TGV) regularization for the reconstruction of the initial condition in variational data assimilation problems. After showing the equivalence between TGV regularization and the Bayesian method for the MAP estimator, we focus on a detailed study of the inviscid Burgers' data assimilation problem. Due to the difficult structure of the governing hyperbolic conservation law, we consider a discretize-then-optimize approach and derive first-order optimality conditions for the problem. For the numerical solution, we propose a globalized reduced Newton-type method and prove convergence of the algorithm to stationary points. The paper finishes with some numerical experiments where, among other things, the performance of TGV regularization is compared with that of TV regularization.