Deep reinforcement learning for ad-hoc optimization on process and manufacturing level

Samsonov, Vladimir; Schmitt, Robert H.; Meisen, Tobias
doi:HT021475974
TY  - THES
AU  - Samsonov, Vladimir
TI  - Deep reinforcement learning for ad-hoc optimization on process and manufacturing level
PB  - Rheinisch-Westfälische Technische Hochschule Aachen
VL  - Dissertation
CY  - Aachen
M1  - RWTH-2022-08469
SP  - 1 online resource : illustrations, diagrams
PY  - 2022
N1  - Published on the publication server of RWTH Aachen University
N1  - Dissertation, Rheinisch-Westfälische Technische Hochschule Aachen, 2022
AB  - Nowadays, it is not enough to set up and fine-tune manufacturing once to keep a competitive edge on global markets and satisfy increasingly strict sustainability regulations. Well-functioning processes on all production levels turn into constantly moving, competing targets conditioned on the current goals and challenges of the given manufacturing company. This creates the need for automated optimization approaches deployed on different levels of the manufacturing chain, capable of learning from experience, accommodating repeatedly changing problem conditions, and updating solutions at a speed surpassing the pace of change in the manufacturing system. This thesis investigates the capabilities of new optimization methods relying on Deep Reinforcement Learning (DRL) for ensuring high levels of manufacturing efficiency and flexibility. Two practical Use Cases (UCs) representing optimization tasks of different natures in manufacturing are considered: order release and sequencing in job shop manufacturing, and workpiece setup for 5-axis milling. The resulting DRL-based solutions demonstrate the performance levels such methods are capable of, as well as their flexibility and production readiness. Within this work, a number of novel action-space, state-space, and reward design solutions are developed to achieve state-of-the-art performance in both applications. The resulting DRL solution for scheduling and order release is shown to learn solution strategies that scale to much larger problem instances than those used for training, while generating new solutions within seconds and thus making it possible to react quickly to production deviations. The DRL-based optimization method developed for workpiece setup optimization is capable of working with complex 3D concepts related to the milling strategy and the resulting workpiece geometry. 
It can generate near-optimal solutions within a few minutes of computation time for unseen workpieces, demonstrating the ability to speed up the setup process for complex milling applications. To facilitate the work conducted in this thesis and foster future research, an experimentation meta-framework is developed to ensure comparability between multiple studies with considerable differences in reported evaluation procedures, test problems, and performance metrics. As a result, this work contributes to a systematic understanding of the possible advantages, challenges, and implementation approaches for learning-based optimization in manufacturing. A set of well-structured implementations with isolated design elements can serve as building blocks for future optimization methods in the manufacturing domain and facilitate the transfer of DRL-based solutions to new optimization contexts in manufacturing.
LB  - PUB:(DE-HGF)11
DO  - 10.18154/RWTH-2022-08469
UR  - https://publications.rwth-aachen.de/record/852921
ER  -