% IMPORTANT: The following is UTF-8 encoded. This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.
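%
% A minimal usage sketch (not part of the record; the file name "references.bib"
% and the choice of biblatex with the biber backend are assumptions — the note
% above only requires a UTF-8-capable implementation such as bibtex8 or biber):
%
%   \documentclass{article}
%   \usepackage[backend=biber]{biblatex}
%   \addbibresource{references.bib}  % assumed name of this .bib file
%   \begin{document}
%   The JSPTR is studied in~\cite{Gannouni:1015030}.
%   \printbibliography
%   \end{document}
%
% Compile sequence: pdflatex, then biber, then pdflatex again.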
@PHDTHESIS{Gannouni:1015030,
author = {Gannouni, Aymen},
othercontributors = {Schmitt, Robert H. and Kowalski, Julia},
title = {{R}einforcement learning-based optimization of the job shop
problem with transportation resources},
school = {Rheinisch-Westfälische Technische Hochschule Aachen},
type = {Dissertation},
address = {Aachen},
publisher = {RWTH Aachen University},
reportid = {RWTH-2025-06208},
pages = {1 online resource : illustrations},
year = {2025},
note = {Published on the publication server of RWTH Aachen
University; Dissertation, Rheinisch-Westfälische Technische
Hochschule Aachen, 2025},
abstract = {The evolution of manufacturing paradigms through the
industrial revolutions has led to increasingly
individualized production. This shift is characterized by a
growing trend toward automation, driven by the use of industrial
robots for both manufacturing and material handling.
Consequently, the need for simultaneously optimizing
production and transportation scheduling has intensified due
to the urge for more resilient and cost-efficient
production. The joint optimization of production and
transportation in multi-stage manufacturing environments
poses highly complex optimization challenges, particularly
in the job shop problem with transportation resources
(JSPTR). The JSPTR is an NP-hard combinatorial optimization
problem that combines the job shop problem (JSP) from
production scheduling with combinatorial routing problems,
such as the multiple traveling salesman problem (mTSP), from
multi-robot task allocation. Conventional approaches to
solving combinatorial optimization problems, including the
JSPTR, often rely on exact methods or heuristics. These
approaches are limited in their ability to generalize and
require reapplication when problem settings change. In
contrast, reinforcement learning (RL) has emerged as a
promising alternative, offering competitive solution times
during inference and generalizability to unseen problem
variations. Current state-of-the-art optimization approaches
for the JSPTR heavily depend on benchmark instances with
fixed routes for transportation robots, neglecting the
influence of modern intralogistics involving autonomous
mobile robots (AMRs). This work addresses this gap by
modeling a simulation environment that enables dynamic
routing of AMRs and serves as the basis for the RL-based
optimization of the JSPTR. The research gap is further
addressed by training RL
agents to optimize the JSPTR and testing them in the created
simulation environment, demonstrating the effectiveness of
RL in outperforming classical heuristics, such as priority
dispatching rules. Ultimately, integrating higher-fidelity
simulation through dynamic routing of AMRs and RL-based
optimization lays a strong foundation for further developing
digital twins of production systems. This advancement
supports stakeholders, such as production planners and fleet
managers, in promptly reacting to dynamic changes.},
cin = {417510 / 417200},
ddc = {620},
cid = {$I:(DE-82)417510_20140620$ / $I:(DE-82)417200_20140620$},
typ = {PUB:(DE-HGF)11},
doi = {10.18154/RWTH-2025-06208},
url = {https://publications.rwth-aachen.de/record/1015030},
}