TY  - THES
AU  - Quercia, Alessio
TI  - On data and knowledge transfer efficiency in deep learning
PB  - RWTH Aachen University
M3  - Dissertation
CY  - Aachen
M1  - RWTH-2025-10885
SP  - 1 online resource : illustrations
PY  - 2025
N1  - Published on the publication server of RWTH Aachen University 2026
N1  - Dissertation, RWTH Aachen University, 2025
AB  - Deep Learning (DL) has seen rapid advancements, characterized by the development of increasingly larger models and datasets. This trend is driven by the belief that bigger models yield better performance. However, the pursuit of larger models and datasets has overshadowed considerations of energy efficiency, leading to a race in which secondary factors, such as environmental impact, are neglected. Efforts are underway to address these energy inefficiencies. This work presents novel data- and knowledge-transfer-efficient training procedures that alleviate the energy inefficiencies introduced by the scaling up of models and datasets in Computer Vision applications. In particular, we introduce a data-efficient method that biases SGD towards samples found to be more important after a few training epochs. Compared to state-of-the-art methods, our method requires no additional overhead to estimate sample importance. Moreover, we extend it to super-resolution of Computed Tomography (CT) scans, which are recorded at low resolution to avoid exposing patients to high radiation and to reduce costs. The super-resolved CT images can then be used to predict the flow in the nasal cavity through simulations. We further explore ways to efficiently transfer and re-use knowledge between similar vision tasks. We propose an alternating training scheme that leverages auxiliary datasets from related vision tasks to boost a Monocular Depth Estimation (MDE) downstream task, improving MDE performance by weighting MDE steps more heavily than auxiliary ones. Lastly, we propose ILoRA (Feature-Integral Low-Rank Adaptation), a compute-, parameter-, and memory-efficient fine-tuning method that uses the feature integral as a fixed compression and a single trainable vector as decompression. Unlike state-of-the-art methods, ILoRA uses fewer parameters per layer, reducing both the memory footprint and the computational cost.
On the one hand, this work shows that reliance on big datasets can be reduced by carefully designed data-efficient procedures, resulting in faster model training with no performance drop. On the other hand, it shows that training procedures can be boosted by efficiently transferring knowledge from pre-trained foundation models and by using additional auxiliary tasks. Overall, efficient data and knowledge transfer strategies, alongside advancements in hardware, can significantly reduce the energy inefficiencies of scaling deep learning models and datasets. By prioritizing these efficiencies, we can develop sustainable AI systems that balance high performance with minimal environmental impact.
LB  - PUB:(DE-HGF)11
DO  - 10.18154/RWTH-2025-10885
UR  - https://publications.rwth-aachen.de/record/1023964
ER  -