% IMPORTANT: The following is UTF-8 encoded. This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.
@phdthesis{Quercia:1023964,
  author            = {Quercia, Alessio},
  othercontributors = {Morrison, Abigail Joanna Rhodes and Assent, Ira and Scharr,
                       Hanno},
  title             = {{On} data and knowledge transfer efficiency in deep
                       learning},
  school            = {RWTH Aachen University},
  type              = {Dissertation},
  address           = {Aachen},
  publisher         = {RWTH Aachen University},
  reportid          = {RWTH-2025-10885},
  pages             = {1 Online-Ressource : Illustrationen},
  year              = {2025},
  note              = {Veröffentlicht auf dem Publikationsserver der RWTH Aachen
                       University 2026; Dissertation, RWTH Aachen University, 2025},
  abstract          = {Deep Learning (DL) has seen rapid advancements,
                       characterized by the development of increasingly larger
                       models and datasets. This trend is driven by the belief that
                       bigger models yield better performance. However, the pursuit
                       of larger models and datasets has overshadowed
                       considerations of energy efficiency, leading to a race where
                       secondary factors, such as environmental impact, are
                       neglected. Efforts are underway to address these energy
                       inefficiencies. This work presents novel data and knowledge
                       transfer efficient training procedures alleviating the
                       energy inefficiencies introduced by the scaling up of models
                       and datasets in Computer Vision applications. In particular,
                       we introduce a data-efficient method that biases SGD towards
                       samples that are found to be more important after a few
                       training epochs. Compared to state-of-the-art methods, our
                       method does not require any additional overhead to estimate
                       sample importance. Moreover, we extend it to
                       super-resolution of Computer Tomography (CT) scans, recorded
                       at low resolution to avoid exposing patients to high
                       radiation and to reduce the costs. The super-resolved CT
                       images can then be used to predict the flow in the nasal
                       cavity through simulations. We explore ways to efficiently
                       transfer and re-use knowledge between similar vision tasks.
                       We propose an alternating training scheme leveraging
                       auxiliary non-MDE datasets from related vision tasks to
                       boost the MDE downstream task. This improves the MDE
                       performance by weighting MDE steps more than auxiliary ones.
                       Lastly, we propose ILoRA (Feature-Integral Low-Rank
                       Adaptation), a compute, parameter and memory efficient
                       fine-tuning method which uses the feature integral as fixed
                       compression and a single trainable vector as decompression.
                       Differently from state-of-the-art methods, ILoRA uses fewer
                       parameters per layer, reducing the memory footprint and the
                       computational cost. On one hand, this work shows that it is
                       possible to reduce the reliance on big datasets by using
                       carefully designed data efficient procedures, resulting in
                       faster model training with no performance drop. On the other
                       hand, it shows that training procedures can be boosted by
                       efficiently transferring knowledge from pre-trained
                       foundation models and by using additional auxiliary tasks.
                       Overall, efficient data and knowledge transfer strategies,
                       alongside advancements in hardware, can significantly reduce
                       the energy inefficiencies of scaling deep learning models
                       and datasets. By prioritizing these efficiencies, we can
                       develop sustainable AI systems that balance high performance
                       with minimal environmental impact.},
  cin               = {124920},
  ddc               = {004},
  cid               = {$I:(DE-82)124920_20200227$},
  typ               = {PUB:(DE-HGF)11},
  doi               = {10.18154/RWTH-2025-10885},
  url               = {https://publications.rwth-aachen.de/record/1023964},
}