% IMPORTANT: The following is UTF-8 encoded. This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.
@PHDTHESIS{BaldaCanizares:780519,
author = {Balda Canizares, Emilio Rafael},
othercontributors = {Mathar, Rudolf and Leibe, Bastian},
title = {{R}obustness analysis of deep neural networks in the
  presence of adversarial perturbations and noisy labels},
edition = {1},
school = {RWTH Aachen University},
type = {Dissertation},
address = {Aachen},
publisher = {Apprimus},
reportid = {RWTH-2020-00698},
isbn = {978-3-86359-802-0},
series = {Elektro- und Informationstechnik},
pages = {1 online resource (vi, 125 pages) : illustrations,
  diagrams},
year = {2019},
note = {Print edition: 2019. Also published on the publication
  server of RWTH Aachen University, 2020. Dissertation, RWTH
  Aachen University, 2019},
abstract = {In this thesis, we study the robustness and generalization
properties of Deep Neural Networks (DNNs) under various
noisy regimes arising from corrupted inputs or labels. Such
corruptions can be either random or intentionally crafted to
disturb the target DNN. Inputs corrupted by maliciously
designed perturbations are known as adversarial examples and
have been shown to severely degrade the performance of DNNs.
However, due to the non-linearity of DNNs, crafting such
perturbations is non-trivial. We first address the problem
of designing algorithms for generating adversarial examples,
known as adversarial attacks. We start with a general
formulation of this problem and, through successive convex
relaxations, propose a framework for computing adversarial
examples under various desired constraints. Using this
approach, we derive novel methods that consistently
outperform existing algorithms in the tested scenarios. In
addition, new algorithms are formulated for regression
problems. We show that adversarial vulnerability is also an
issue in various regression tasks, a problem that has so far
been overlooked in the literature. While there has been a
vast amount of work on designing and understanding DNNs
resistant to these attacks, the generalization properties of
such robust models are less understood. How well does
adversarial robustness
generalize from the training set to unseen data? We use
Statistical Learning Theory (SLT) to bound the so-called
adversarial risk of DNNs. Proving SLT bounds for deep
learning is ongoing research, with several existing
frameworks. Among these, we choose a compression-based
technique that established state-of-the-art results for
DNNs in the non-adversarial regime. Our
bound leverages the sparsity structures induced by
adversarial training and has no explicit dependence on the
input dimension or the number of classes. This result
constitutes an improvement over existing bounds. To complete
this work, we shift our focus from perturbed inputs to noisy
labels and analyze how DNNs learn when a portion of the
inputs is incorrectly labeled. In this setup, we use
information theory to characterize the behavior of
classifiers. Under noisy labels, we study the trajectory of
DNNs in the information plane, formed by the entropy of
estimated labels and the conditional entropy between given
and estimated labels. We analyze the trajectory in the
information plane and show the de-noising capabilities of
DNNs. Under simplified scenarios, we are able to
analytically characterize these trajectories for one-layer
neural networks trained with stochastic gradient descent.
This result yields a trajectory for properly trained
networks that appears consistent with those observed for
DNNs in real image classification tasks. In addition, we
show that underfitted, overfitted, and well-trained DNNs
exhibit significantly
different trajectories in the information plane. Such
phenomena are not visible when considering only training and
validation error. These results show that
information-theoretic quantities provide a richer view of
the learning process than standard training and validation
error.},
cin = {613410},
ddc = {621.3},
cid = {$I:(DE-82)613410_20140620$},
typ = {PUB:(DE-HGF)11 / PUB:(DE-HGF)3},
doi = {10.18154/RWTH-2020-00698},
url = {https://publications.rwth-aachen.de/record/780519},
}
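% Note on the "information plane" mentioned in the abstract: a minimal sketch
% of the two axes, in notation assumed here rather than taken verbatim from
% the thesis. With Y the given (possibly noisy) labels and \hat{Y} the labels
% estimated by the network, the plane plots the point
%   ( H(\hat{Y}), H(Y \mid \hat{Y}) )
% over the course of training, i.e. the entropy of the estimated labels
% against the conditional entropy linking given and estimated labels. The
% direction of the conditioning, H(Y \mid \hat{Y}) rather than
% H(\hat{Y} \mid Y), is an assumption of this sketch.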