TY  - THES
AU  - Samsonov, Vladimir
TI  - Deep reinforcement learning for ad-hoc optimization on process and manufacturing level
PB  - Rheinisch-Westfälische Technische Hochschule Aachen
VL  - Dissertation
CY  - Aachen
M1  - RWTH-2022-08469
SP  - 1 online resource: illustrations, diagrams
PY  - 2022
N1  - Published on the publication server of RWTH Aachen University
N1  - Dissertation, Rheinisch-Westfälische Technische Hochschule Aachen, 2022
AB  - Setting up and fine-tuning manufacturing once is no longer enough to stay competitive on global markets and satisfy increasingly strict sustainability regulations. Well-functioning processes on all production levels become constantly moving, competing targets that depend on the current goals and challenges of the manufacturing company. This creates a need for automated optimization approaches that can be deployed on different levels of the manufacturing chain, learn from experience, accommodate repeatedly changing problem conditions, and update solutions faster than the manufacturing system changes. This thesis investigates the capabilities of new optimization methods relying on Deep Reinforcement Learning (DRL) for ensuring high levels of manufacturing efficiency and flexibility. Two practical Use Cases (UCs) representing optimization tasks of different nature in manufacturing are considered: order release and sequencing in job shop manufacturing, and workpiece setup for 5-axis milling. The resulting DRL-based solutions demonstrate the performance levels such methods can reach, as well as their flexibility and production readiness. Within this work, a number of novel action-space, state-space, and reward design solutions are developed to achieve state-of-the-art performance in both applications.
The DRL solution for scheduling and order release learns strategies that scale to much larger problem instances than those used for training and generates new solutions within seconds, making it possible to react quickly to production deviations. The DRL-based optimization method developed for workpiece setup optimization can work with complex 3D concepts related to the milling strategy and the resulting workpiece geometry. It generates near-optimal solutions within a few minutes of computation time for unseen workpieces, demonstrating the ability to speed up the setup process for complex milling applications. To facilitate the work conducted in this thesis and foster future research, an experimentation meta-framework is developed to ensure comparability between multiple studies with considerable differences in reported evaluation procedures, test problems, and performance metrics. As a result, this work contributes towards a systematic understanding of the possible advantages, challenges, and implementation approaches for learning-based optimization in manufacturing. A set of well-structured implementations with isolated design elements can serve as building blocks for future optimization methods in the manufacturing domain and facilitate the transfer of DRL-based solutions to new optimization contexts in manufacturing.
LB  - PUB:(DE-HGF)11
DO  - DOI:10.18154/RWTH-2022-08469
UR  - https://publications.rwth-aachen.de/record/852921
ER  -