%0 Thesis %A Schnoor, Ekkehard %T Iterative soft-thresholding from a statistical learning perspective %I RWTH Aachen University %V Dissertation %C Aachen %M RWTH-2025-04943 %P 1 online resource : illustrations %D 2024 %Z Published on the publication server of RWTH Aachen University, 2025 %Z Dissertation, RWTH Aachen University, 2024 %X This dissertation explores connections between the areas of compressive sensing and machine learning. It is centered around the iterative soft-thresholding algorithm (ISTA), an iterative method for solving the l1-regularized least-squares problem, also known as the LASSO (least absolute shrinkage and selection operator), which has various applications in statistics and signal processing. We investigate two statistical learning problems that can be regarded as two different interpretations of the same underlying optimization problem and its solution through ISTA. Although the two problems differ, they share a generalization perspective: in both, we aim to find performance guarantees at inference time, i.e., when the trained model is applied to new data samples that were not used for training but can be regarded as samples from the same underlying (in practice typically unknown) distribution. Thus, the contribution of this thesis lies in providing novel investigations of the iterative soft-thresholding algorithm from the viewpoint of statistical learning theory. We rely heavily on tools from high-dimensional probability theory to prove our results. The first problem we consider deals with an interpretation of ISTA as a neural network, a topic that attracted attention with the rise of deep learning in the past decade. As a first step toward introducing trainable parameters, we address a rather simple model in which a dictionary is learned implicitly.
Then, we extend our results to a substantially more general setup covering a variety of ISTA-inspired neural networks, ranging from recurrent architectures to ones more similar to feedforward neural networks. Based on estimates of the Rademacher complexity of the corresponding hypothesis classes, we derive the first generalization error bounds for such specific neural network architectures and compare our theoretical findings to numerical experiments. While previous work has focused strongly on the generalization of deep learning in the context of classification tasks, we provide theoretical results in the context of inverse problems, which are much less explored in the literature. The second problem considers the application of the LASSO in a classification context, where the solution found through ISTA plays the role of a sparse linear classifier. Under realistic assumptions on the training data, we show that this induces a concentration of the distribution over the corresponding hypothesis class. This enables us to derive an algorithm that predicts the classification accuracy based solely on statistical properties of the training data, which we confirm through numerical experiments. %F PUB:(DE-HGF)11 %9 Dissertation / PhD Thesis %R 10.18154/RWTH-2025-04943 %U https://publications.rwth-aachen.de/record/1012246
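For readers unfamiliar with the algorithm the abstract centers on: ISTA solves the LASSO problem min_x 0.5*||Ax - b||^2 + lam*||x||_1 by alternating a gradient step on the least-squares term with componentwise soft-thresholding. The following minimal NumPy sketch illustrates the standard textbook form of the iteration (it is not code from the thesis itself; function names and the fixed step size 1/L are choices made here for illustration):

```python
import numpy as np

def soft_threshold(x, tau):
    # Componentwise soft-thresholding: S_tau(x) = sign(x) * max(|x| - tau, 0).
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista(A, b, lam, n_iter=500):
    # Minimizes 0.5*||Ax - b||^2 + lam*||x||_1 via iterative soft-thresholding.
    # Step size 1/L, where L = ||A||_2^2 is the Lipschitz constant of the
    # gradient of the least-squares term.
    L = np.linalg.norm(A, ord=2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        # Gradient step on the smooth part, then proximal (shrinkage) step.
        x = soft_threshold(x - (A.T @ (A @ x - b)) / L, lam / L)
    return x
```

Each iteration is exactly the affine map plus nonlinearity that motivates the "ISTA as a neural network" interpretation in the thesis: unrolling the loop for a fixed number of iterations yields a recurrent network whose activation function is the soft-thresholding operator.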