TY - THES
AU - Nestler, Sandra
TI - A statistical perspective on learning of time series in neural networks
PB - RWTH Aachen University
VL - Dissertation
CY - Aachen
M1 - RWTH-2024-02247
SP - 1 online resource : illustrations
PY - 2024
N1 - Published on the publication server of RWTH Aachen University
N1 - Dissertation, RWTH Aachen University, 2024
AB - The notion of neural networks spans two major domains: in neuroscience, their source of inspiration is the brain, where billions of neurons combine external and internal information to continuously solve tasks; in machine learning, they are used to process vast amounts of data to solve complex tasks. The goal of this thesis is to provide a statistical perspective on learning in neural networks, considering both their biological inspiration and their machine learning applications. A statistical perspective implies stochasticity, which in our case comes from two sources: first, biological neural activity is intrinsically noisy, and highly complex background activity influences the processing of stimuli; second, natural stimuli themselves carry inherent stochastic features. Here, we target both scenarios using tools from statistical physics. While we take up both the biological and the machine learning viewpoint, we focus on the processing of stimuli as it occurs in the brain; that is, everything is time-dependent. In the brain, signals reverberate due to the recurrent neural connections, creating natural interactions between inputs stemming from different time points. Because of the non-linear nature of those interactions, however, understanding the resulting dynamical state poses an intricate challenge. For weakly non-linear interactions, we develop a method to unfold the recurrent dynamics into an effective feed-forward system. Using perturbation theory, we thereby obtain an analytically tractable approximation of the network state distribution. We use the solution of the network dynamics to find the optimal input and readout projections for classification in a random recurrent reservoir, improving the network performance. The optimal classifier in this framework changes, however, when independent background activity is present. For linear interactions, we derive the empirical risk minimizer for the input and output mappings under noisy dynamics. We find that the optimal solution trades off stability against performance, and we compare it to the noise-free case. But how does the non-linearity of the interactions shape the statistical processing of stimuli? To answer this question, we employ a single-layer feed-forward model and connect the statistical features of the input and output layers. Constructing a classifier with a trainable gain function, we find a direct relation between the non-linearity and the representation and processing of higher-order statistics. To conclude, we move from learning individual statistical features to learning the data distribution itself. Using an invertible type of feed-forward neural network, we learn the non-linear data manifold from samples and extract the most informative modes. In this way, we obtain a fully adaptable mechanism that uncovers structure, dimensionality, and meaningful latent features at once, in an unsupervised fashion.
LB - PUB:(DE-HGF)11
DO - DOI:10.18154/RWTH-2024-02247
UR - https://publications.rwth-aachen.de/record/980418
ER -