TY - THES
AU - Schuster, Daniel
TI - Incremental process discovery
PB - RWTH Aachen University
VL - Dissertation
CY - Aachen
M1 - RWTH-2024-06483
SP - 1 online resource : illustrations
PY - 2024
N1 - Published on the publication server of RWTH Aachen University
N1 - Dissertation, RWTH Aachen University, 2024
AB - Many organizational processes rely on information systems to support operational functions such as administration, finance, production, and logistics. These systems track process executions in great detail, generating event data that contain valuable information about process executions. Process mining analyzes these event data and yields crucial insights into the processes, such as process models, conformance diagnostics, and performance metrics. Process analysts and owners can use the derived insights to understand how processes are executed in practice and ultimately optimize them, for example, by reducing cycle times, improving resource allocation, and enhancing conformity. Overall, process mining aims to improve processes through data-driven approaches. Process discovery is concerned with learning process models from event data and is a fundamental task within process mining. However, most existing process discovery algorithms are fully automated, i.e., they operate as black boxes from the users’ perspective, discover process models in a one-shot fashion, devoid of user interaction, and often discover subpar models, particularly when applied to real-world data. Moreover, these process discovery algorithms fail to exploit domain knowledge beyond event data. This thesis presents a framework for incremental process discovery that allows users to learn and refine process models from event data iteratively. Thereby, users can observe intermediate process models learned so far. Further, users can manually edit intermediate process models before they are fed back into the incremental process discovery framework for further learning. 
Moreover, users can selectively incorporate process behaviors from event data. In short, we propose an incremental process discovery framework that allows users to interact and steer the discovery phase of a process model. We further extend the incremental process discovery framework as follows. First, we allow the gradual addition of process execution fragments alongside complete process executions. Most automated process discovery algorithms assume complete process executions that span the process from start to end. In contrast, process execution fragments describe a small part of an entire process execution. Second, we allow users to freeze model components, which constrains the incremental discovery approach by preventing it from altering frozen model parts during incremental process discovery. Given users' pivotal role in gradually selecting process behaviors for inclusion in the process model, we introduce novel visualizations for process execution variants. Central to process mining, these variants group individual process executions that have identical arrangements of the activities executed. Considering that activities within a process can run concurrently and overlap, yielding partially ordered event data, we propose visualizations to illustrate such activity relationships. Additionally, this thesis contributes to the field of process querying. We propose a query language for process execution variants that allows the specification of complex control flow patterns among activities. When executing a query, process execution variants satisfying the specified constraints are returned. In short, the proposed query language supports the handling of large event data volumes, enhances the filtering and selection of process execution variants, and, thus, supports users during incremental process discovery. 
Next to process discovery and event data handling, this thesis contributes to conformance checking, a further fundamental process mining task. Conformance checking techniques are used to compare observed with modeled process behavior and are crucial to incremental process discovery, providing information and diagnostics on how well the so-far learned process model aligns with the provided event data. We extend the concept of alignments, i.e., a state-of-the-art conformance checking technique, to accommodate process execution fragments. We define infix and postfix alignments and show their computation. Infix and postfix alignments are critical as they enable incremental process discovery with trace fragments. Moreover, we present Cortado, an open-source process mining software tool that implements the algorithms and techniques proposed in this thesis in an integrated and comprehensive fashion. Through Cortado, we showcase how the methods and algorithms presented in this thesis serve the overall goal of incremental process discovery. Finally, this thesis presents a case study applying Cortado and, therefore, the various contributions of this thesis in a real-life scenario.
LB - PUB:(DE-HGF)11
DO - DOI:10.18154/RWTH-2024-06483
UR - https://publications.rwth-aachen.de/record/988919
ER -