% IMPORTANT: The following is UTF-8 encoded. This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.
@PHDTHESIS{Schnoor:1012246,
      author       = {Schnoor, Ekkehard},
      othercontributors = {Rauhut, Holger and Führ, Hartmut},
      title        = {Iterative Soft-Thresholding from a Statistical Learning
                      Perspective},
      school       = {RWTH Aachen University},
      type         = {Dissertation},
      address      = {Aachen},
      publisher    = {RWTH Aachen University},
      reportid     = {RWTH-2025-04943},
      pages        = {1 Online-Ressource : Illustrationen},
      year         = {2024},
      note         = {Veröffentlicht auf dem Publikationsserver der RWTH Aachen
                      University 2025; Dissertation, RWTH Aachen University, 2024},
      abstract     = {This dissertation explores connections between the areas of
                      compressive sensing and machine learning. It is centered
                      around the so-called iterative soft-thresholding algorithm
                      (ISTA), an iterative algorithm to solve the
                      {$\ell_1$}-regularized least squares problem also known as
                      LASSO (least absolute shrinkage and selection operator)
                      that has various applications in statistics and signal
                      processing. We will investigate two statistical learning
                      problems that can be regarded as two different
                      interpretations of the same underlying optimization problem
                      and its solution through ISTA. While both are different, in
                      common they have a generalization perspective, i.e., we aim
                      for finding performance guarantees at inference, that is
                      when applying the trained model to new data samples that
                      have not been used for training, but can be regarded as
                      samples from the same underlying (but in practice,
                      typically unknown) distribution. Thus, the contribution of
                      this thesis lies in providing novel investigations of the
                      iterative soft-thresholding algorithm from the viewpoint of
                      statistical learning theory. We heavily rely on tools from
                      high-dimensional probability theory to prove our results.
                      The first of the problems we consider deals with an
                      interpretation of ISTA as a neural network, a topic which
                      attracted attention with the rise of deep learning in the
                      past decade. As a first step to introduce trainable
                      parameters, we address a rather simple model, where a
                      dictionary is learned implicitly. Then, we extend our
                      results to a greatly generalized setup including a variety
                      of ISTA-inspired neural networks, ranging from recurrent
                      ones to architectures more similar to feedforward neural
                      networks. Based on estimates of the Rademacher complexity of
                      the corresponding hypothesis classes, we derive the first
                      generalization error bounds for such specific neural network
                      architectures and compare our theoretical findings to
                      numerical experiments. While previous works strongly focused
                      on generalization of deep learning in the context of
                      classification tasks, we provide theoretical results in the
                      context of inverse problems, which is much less explored in
                      the literature. The second problem considers the application
                      of LASSO in a classification context, where the solution
                      found through ISTA plays the role of a sparse linear
                      classifier. Under realistic assumptions on the training
                      data, we show that this induces a concentration on the
                      distribution over the corresponding hypothesis class. This
                      enables us to derive an algorithm to predict the
                      classification accuracy based solely on statistical
                      properties of the training data, which we confirm with the
                      help of numerical experiments.},
      cin          = {112710 / 110000},
      ddc          = {510},
      cid          = {$I:(DE-82)112710_20190828$ / $I:(DE-82)110000_20140620$},
      pnm          = {DFG project G:(GEPRIS)404374102 - Strukturiertes
                      Compressive Sensing mittels gelernten neuronalen Netzen
                      (SCoSNeL) (404374102) / SPP 1798: Compressed Sensing in der
                      Informationsverarbeitung (255450733)},
      pid          = {G:(GEPRIS)404374102 / G:(GEPRIS)255450733},
      typ          = {PUB:(DE-HGF)11},
      doi          = {10.18154/RWTH-2025-04943},
      url          = {https://publications.rwth-aachen.de/record/1012246},
}