Novel jet flavour tagging algorithms exploiting adversarial deep learning techniques with efficient computing methods and preparation of open data for robustness studies

Stein, Annika; Schmidt, Alexander; Krämer, Michael
doi:10.18154/RWTH-2024-07840
000991721 001__ 991721
000991721 005__ 20240925101043.0
000991721 0247_ $$2HBZ$$aHT030831382
000991721 0247_ $$2Laufende Nummer$$a43546
000991721 0247_ $$2datacite_doi$$a10.18154/RWTH-2024-07840
000991721 037__ $$aRWTH-2024-07840
000991721 041__ $$aEnglish
000991721 082__ $$a530
000991721 1001_ $$0P:(DE-82)IDM05609$$aStein, Annika$$b0$$urwth
000991721 245__ $$aNovel jet flavour tagging algorithms exploiting adversarial deep learning techniques with efficient computing methods and preparation of open data for robustness studies$$cvorgelegt von Annika Stein, M.Sc. RWTH$$honline
000991721 260__ $$aAachen$$bRWTH Aachen University$$c2024
000991721 300__ $$a1 Online-Ressource : Illustrationen
000991721 3367_ $$02$$2EndNote$$aThesis
000991721 3367_ $$0PUB:(DE-HGF)11$$2PUB:(DE-HGF)$$aDissertation / PhD Thesis$$bphd$$mphd
000991721 3367_ $$2BibTeX$$aPHDTHESIS
000991721 3367_ $$2DRIVER$$adoctoralThesis
000991721 3367_ $$2DataCite$$aOutput Types/Dissertation
000991721 3367_ $$2ORCID$$aDISSERTATION
000991721 502__ $$aDissertation, RWTH Aachen University, 2024$$bDissertation$$cRWTH Aachen University$$d2024$$gFak01$$o2024-06-20
000991721 500__ $$aPublished on the publication server of RWTH Aachen University
000991721 5203_ $$aAlgorithmen des Maschinellen Lernens sind ein nicht mehr wegzudenkendes Werkzeug für die Wissenschaft. Präzisionstests des Standardmodells der Teilchenphysik und die Suche nach Prozessen, in welchen elementare Teilchen involviert sind, werden durch neue Rekonstruktionsalgorithmen erleichtert, die komplexe Architekturen Neuronaler Netzwerke nutzen. Solche Anwendungen hängen allerdings oft von simulierten Prozessen ab, wobei die Identifikation von Quark-Flavour oder Gluonen, welche Teilchenjets initiieren (Jet Flavour Tagging) ein Beispiel darstellt. Neben anderen experimentellen Unsicherheiten kann die Unsicherheit der Effizienz bei der Objektidentifizierung mithilfe Neuronaler Netzwerke in hohem Maße zu finalen Resultaten beitragen, welche zum Beispiel mit Signalstärken ausgedrückt werden. Tests mit Kontrollregionen machen Unterschiede in der Performance für Stichproben, die durch Simulation gewonnen wurden, und solche, die aus Detektordaten stammen, sichtbar, woraus sich die Notwendigkeit der Kalibrierung ergibt. Dies im Sinn zielt diese Arbeit darauf ab, nicht nur effiziente Mittel bereitzustellen, die die Performancelücke zwischen Daten und Simulation von Grund auf schließen (besonders dann, wenn die Performance für die Simulation sehr gut ist), als auch Konzepte zu entwickeln, die helfen zu verstehen, weshalb die Ansätze wirken. Ausgehend von früheren Versionen für ``adversarial'' (feindliche) Attacks und Defenses, wird ein neuer Algorithmus, genannt Normed Gradient Method (NGM), eingeführt und zum ersten Mal für eine physikalische Anwendung verwendet. Dies stellt ebenso die Einführung modernster Transformer-Architekturen für Jets mit kleinem Radius in CMS dar. In Kombination mit NGM werden die momentan besten Performance-Metriken für diese Aufgabe in CMS erreicht, besser als mit früheren Algorithmen. Das Netzwerk behält hohe Performance auch dann bei, wenn es systematischen Modifikationen der Eingaben ausgesetzt wird. 
Es ist das erste Mal, dass ein (adversarial) robuster Algorithmus in der offiziellen Rekonstruktionssoftware eines Teilchendetektors in der Hochenergiephysik integriert wird. Der Durchsatz an Ereignissen steigt im Vergleich zu einem Algorithmus mit etwas geringerer Performance. Die effiziente Integration wurde signifikant durch eine neue, speziell auf Jet Flavour Tagging spezialisierte Software-Umgebung erleichtert, welche Performance-Studien mit Detektor-Daten bereits während der Entwicklungs- (Trainings-)Phase erlaubt. Die Zeit, die vom Training bis zur Verfügbarkeit von zuverlässigen Performance-Studien vergeht, wird signifikant reduziert. Dies ist möglich, da das Entwicklungssystem auf nur einer Stufe der Datenrepräsentation aufbaut, die mehreren Zwecken zur Untersuchung des Neuronalen Netzwerks dient. Neben der Arbeit im Kontext eines Experiments, ist ein anderer Fokus die Vorbereitung und Nutzung von CERN Open Data für Robustheits-Studien. Dieser letzte Teil der Arbeit widmet sich der Umwandlung bereits vorhandener offener Datensätze, welche ausschließlich mit Experiment-spezifischer Software verwendet werden können, zu Formaten, die sich für Maschinelles Lernen eignen. Das Ergebnis ist der erste offene Datensatz, mit welchem Jet Flavour Tagging Studien für Jets mit geringem Radius mit Simulation und aufgenommenen Detektordaten für ein breiteres Publikum von Data Scientists möglich sind, auch wenn diese nicht notwendigerweise über das Wissen verfügen, die Experiment-Software zu bedienen. Da jedes Experiment individuelle Daten, Hilfsmittel und Problemstellungen bereitstellt, wird das Konzept einer einhüllenden Struktur eingeführt, welche die Anwendung einer Grundmenge von Adversarial-Techniken in einem Werkzeugkasten für verschiedene Nutzungsmöglichkeiten erlaubt.$$lger
000991721 520__ $$aMachine learning algorithms are an indispensable tool for science. Precision tests of the standard model of particle physics and searches for processes involving elementary particles are facilitated by novel reconstruction algorithms that exploit complex neural network architectures. Such applications, however, often rely on simulated processes, one example being the identification of the flavour of the quarks or gluons that initiate particle jets (jet flavour tagging). Besides other experimental sources of uncertainty, the efficiency uncertainties stemming from object identification with neural networks can contribute significantly to final results, expressed for example as uncertainties in a signal strength. Tests with control regions reveal differences in performance between samples obtained through simulation and those from detector data, meaning that calibration is required. With this in mind, this thesis aims not only to provide efficient measures that mitigate this performance gap between data and simulation from the ground up (especially when the algorithm performs very well on simulation), but also to derive concepts that assist in understanding why the proposed approaches work. Building on early versions of adversarial attacks and defenses, a new algorithm, denoted Normed Gradient Method (NGM), is introduced and adapted for physics applications for the first time. This also marks the introduction of a state-of-the-art transformer architecture for small-radius jets in the CMS experiment. In combination with NGM, the currently best performance metrics for this task at CMS are achieved, improving over previous algorithms. The network maintains high performance even when exposed to systematic modifications of its inputs. It is thus the first time an (adversarially) robust algorithm is introduced into the official reconstruction software of a high-energy particle detector. 
Event throughput improves compared to an algorithm that achieves slightly worse performance. The efficient integration was significantly facilitated by a novel software framework developed specifically for jet flavour tagging, which allows performance studies with detector data while the neural network is still in the development (training) stage. The time-to-insight from neural network training to reliable performance studies is significantly reduced. This is possible because the framework is built around a single data tier that serves multiple purposes in studying the neural network. Besides the work within the context of one experiment, another focus is the preparation and use of CERN Open Data for robustness studies. The last part of this thesis is dedicated to converting already available open datasets, which can only be used with experiment-specific software, into machine-learning-friendly formats. The result is the first open dataset that allows small-radius jet flavour tagging studies with simulation and recorded detector data for a broader audience of data scientists who do not necessarily know how to operate the experiment software. As every experiment provides unique data, tools, and problem statements, the concept of a wrapper structure is introduced, which allows a core set of adversarial techniques, collected in a toolbox, to be applied to the different use cases.$$leng
000991721 588__ $$aDataset connected to Lobid/HBZ
000991721 591__ $$aGermany
000991721 653_7 $$aDaten
000991721 653_7 $$aDeep Learning
000991721 653_7 $$aKI
000991721 653_7 $$aKünstliche Intelligenz
000991721 653_7 $$aRobustheit
000991721 653_7 $$aTeilchenphysik
000991721 653_7 $$aadversarial
000991721 653_7 $$acomputing
000991721 653_7 $$adata
000991721 653_7 $$aflavor
000991721 653_7 $$aflavour
000991721 653_7 $$ajet
000991721 653_7 $$amachine learning
000991721 653_7 $$aparticle physics
000991721 7001_ $$0P:(DE-82)IDM03668$$aSchmidt, Alexander$$b1$$eThesis advisor$$urwth
000991721 7001_ $$0P:(DE-82)IDM00267$$aKrämer, Michael$$b2$$eThesis advisor$$urwth
000991721 8564_ $$uhttps://publications.rwth-aachen.de/record/991721/files/991721.pdf$$yOpenAccess
000991721 8564_ $$uhttps://publications.rwth-aachen.de/record/991721/files/991721_source.zip$$yRestricted
000991721 909CO $$ooai:publications.rwth-aachen.de:991721$$pdnbdelivery$$pdriver$$pVDB$$popen_access$$popenaire
000991721 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
000991721 9141_ $$y2024
000991721 9101_ $$0I:(DE-588b)36225-6$$6P:(DE-82)IDM05609$$aRWTH Aachen$$b0$$kRWTH
000991721 9101_ $$0I:(DE-588b)36225-6$$6P:(DE-82)IDM03668$$aRWTH Aachen$$b1$$kRWTH
000991721 9101_ $$0I:(DE-588b)36225-6$$6P:(DE-82)IDM00267$$aRWTH Aachen$$b2$$kRWTH
000991721 9201_ $$0I:(DE-82)133920_20180228$$k133920 ; 133910$$lLehrstuhl für Experimentalphysik III A$$x0
000991721 9201_ $$0I:(DE-82)130000_20140620$$k130000$$lFachgruppe Physik$$x1
000991721 961__ $$c2024-09-24T12:42:56.170123$$x2024-08-22T16:37:03.097439$$z2024-09-24T12:42:56.170123
000991721 9801_ $$aFullTexts
000991721 980__ $$aI:(DE-82)130000_20140620
000991721 980__ $$aI:(DE-82)133920_20180228
000991721 980__ $$aUNRESTRICTED
000991721 980__ $$aVDB
000991721 980__ $$aphd