Reinforcement learning-based optimization of the job shop problem with transportation resources

Gannouni, Aymen; Schmitt, Robert H.; Kowalski, Julia

doi:10.18154/RWTH-2025-06208

Reinforcement learning-based optimization of the job shop problem with transportation resources = Reinforcement Learning-basierte Optimierung des Job Shop Problems mit Transportressourcen

Gannouni, Aymen (RWTH)

2025

Statement of responsibility: submitted by Aymen Gannouni

Imprint: Aachen : RWTH Aachen University 2025

Extent: 1 online resource : illustrations

Dissertation, Rheinisch-Westfälische Technische Hochschule Aachen, 2025

Published on the publication server of RWTH Aachen University

Approving faculty
Fak04

Principal examiner/Reviewers
Schmitt, Robert H. (Thesis advisor, RWTH) ; Kowalski, Julia (Thesis advisor, RWTH)

Date of oral examination/habilitation
2025-05-22

Online
DOI: 10.18154/RWTH-2025-06208
URL: https://publications.rwth-aachen.de/record/1015030/files/1015030.pdf

Institutions

Subject description (keywords)
digital twins (free) ; job shop problem with transportation resources (free) ; optimization (free) ; production planning and control (free) ; reinforcement learning (free)

Subject classification
DDC: 620

Abstract
The evolution of production paradigms through the industrial revolutions has led to more individualized production. This shift is driven by the increasing use of industrial robots for order processing and intralogistics. Consequently, the need for combined optimization of production and transportation planning has intensified, driven by the demand for more resilient and cost-efficient production. The combined optimization of production and transportation tasks in multi-stage manufacturing environments poses complex challenges, particularly in the context of the job shop problem with transportation resources (JSPTR). The JSPTR is an NP-hard combinatorial optimization problem that combines the job shop problem (JSP) from production planning with combinatorial routing problems from multi-robot task allocation. Conventional approaches to solving combinatorial optimization problems, including the JSPTR, often rely on exact methods or heuristics. These approaches are limited in their ability to generalize and must be reapplied whenever the problem setting changes. In contrast, reinforcement learning (RL) has emerged as a promising alternative that offers competitive solution times during inference and generalizability to unseen problem variations. Current approaches to optimizing the JSPTR rely heavily on benchmark instances with fixed routes for transportation robots and neglect the influence of modern intralogistics with autonomous mobile robots (AMRs). This work closes this gap by modeling a simulation environment that enables dynamic routing of AMRs and serves as an RL environment for optimizing the JSPTR. The tests of the trained RL agents demonstrate the effectiveness of RL, as they outperformed classical heuristics such as priority rules.
Ultimately, the integration of realistic simulation through dynamic routing of AMRs with RL-based optimization forms a strong foundation for the further development of digital twins of production systems. This advance helps decision-makers such as production planners and fleet managers to respond more quickly to dynamic changes.

The evolution of manufacturing paradigms through the industrial revolutions has led to increasingly individualized production. This shift is characterized by a growing trend toward automation, driven by the use of industrial robots for both manufacturing and material handling. Consequently, the need to optimize production and transportation scheduling simultaneously has intensified, owing to the demand for more resilient and cost-efficient production. The joint optimization of production and transportation in multi-stage manufacturing environments poses highly complex optimization challenges, particularly in the job shop problem with transportation resources (JSPTR). The JSPTR is an NP-hard combinatorial optimization problem that combines the job shop problem (JSP) from production scheduling with combinatorial routing problems, such as the multiple traveling salesmen problem (mTSP), from multi-robot task allocation. Conventional approaches to solving combinatorial optimization problems, including the JSPTR, often rely on exact methods or heuristics. These approaches are limited in their ability to generalize and must be reapplied when problem settings change. In contrast, reinforcement learning (RL) has emerged as a promising alternative, offering competitive solution times during inference and generalizability to unseen problem variations. Current state-of-the-art optimization approaches for the JSPTR depend heavily on benchmark instances with fixed routes for transportation robots, neglecting the influence of modern intralogistics involving autonomous mobile robots (AMRs). This work addresses the gap by modeling a simulation environment that enables dynamic routing of AMRs and serves as the environment for RL-based optimization of the JSPTR.
The research gap is further addressed by training RL agents to optimize the JSPTR and testing them in the created simulation environment, where they outperform classical heuristics such as priority dispatching rules, demonstrating the effectiveness of RL. Ultimately, integrating higher-fidelity simulation through dynamic routing of AMRs with RL-based optimization lays a strong foundation for further developing digital twins of production systems. This advancement supports stakeholders, such as production planners and fleet managers, in reacting promptly to dynamic changes.

OpenAccess: PDF (additional files)