TY  - THES
AU  - Zhang, Shutao
TI  - Memory-centric architectures for energy-efficient cryptographic hardware in edge applications
PB  - Rheinisch-Westfälische Technische Hochschule Aachen
VL  - Dissertation
CY  - Aachen
M1  - RWTH-2026-00599
SP  - 1 online resource : illustrations
PY  - 2026
N1  - Published on the publication server of RWTH Aachen University
N1  - Dissertation, Rheinisch-Westfälische Technische Hochschule Aachen, 2026
AB  - The rapid proliferation of digital systems has intensified security challenges and exposed critical vulnerabilities. This evolving threat landscape is further aggravated by the rise of quantum computing, which jeopardizes the security of traditional cryptographic algorithms, particularly those used in asymmetric cryptography. To ensure the security of digital systems in the quantum era, modern security protocols must integrate classical symmetric cryptography and hash functions with emerging post-quantum asymmetric schemes. In parallel, the shift toward resource-constrained edge devices demands highly energy-efficient hardware solutions. Notably, memory operations account for a substantial portion of total energy consumption in digital systems. Therefore, this dissertation focuses on the investigation and development of memory-centric architectures for cryptographic algorithms, optimized for high energy efficiency in edge applications. It investigates symmetric ciphers, hash functions, and lattice-based cryptographic algorithms, and systematically reviews memory-centric architectures. Key dataflow patterns of these algorithms are extracted to guide the design of energy-efficient memory-centric hardware. Symmetric ciphers and hash functions often incorporate dedicated permutation networks, such as diffusion layers in Substitution-Permutation Network (SPN) block ciphers and Linear Feedback Shift Register (LFSR)-based state updates in stream ciphers and hash functions. Traditional hardware designs rely heavily on homogeneous memory structures, such as shift registers or centralized register files with complex access patterns, which incur significant energy overhead.
To address this, this dissertation proposes distributed memory organizations tailored to the unique dataflows of these cryptographic algorithms, which are validated through Application-Specific Integrated Circuit (ASIC) implementations of the Advanced Encryption Standard (AES) and Secure Hash Algorithm-256 (SHA-256). For AES, the accelerator uses distributed scratchpad memories to enhance data locality and to provide enough memory bandwidth to improve hardware utilization. This design achieves a throughput of 1432 Mbps and an energy efficiency of 400 Gbps/W. For SHA-256, the accelerator integrates shift registers and First-In-First-Out (FIFO) buffers to balance write and read costs. This results in a throughput of 31.6 MHash/s and an energy efficiency of 2.7 GHash/J. For post-quantum lattice-based schemes, where polynomial multiplication is the dominant operation, two distinct techniques are introduced. For schemes where the Number Theoretic Transform (NTT) is applicable, a Local Horizontal Folding (LHF) algorithm is proposed to map the NTT workload onto a Compute-near-Memory (CNM) architecture with a minimal local buffer, exploiting the regularity of butterfly networks. The LHF-NTT approach reduces memory accesses by at least 50
LB  - PUB:(DE-HGF)11
DO  - DOI:10.18154/RWTH-2026-00599
UR  - https://publications.rwth-aachen.de/record/1025210
ER  -