TY  - JOUR
A1  - Panzer, Marcel
A1  - Bender, Benedict
T1  - Deep reinforcement learning in production systems: a systematic literature review
JF  - International Journal of Production Research
N2  - Shortening product development cycles and fully customizable products pose major challenges for production systems. These systems must not only cope with increased product diversity but also enable high throughput and provide high adaptability and robustness to process variations and unforeseen incidents. To overcome these challenges, deep reinforcement learning (RL) has been increasingly applied for the optimization of production systems. Unlike other machine learning methods, deep RL operates on recently collected sensor data in direct interaction with its environment and enables real-time responses to system changes. Although deep RL is already being deployed in production systems, a systematic review of the results has not yet been conducted. The main contribution of this paper is to provide researchers and practitioners with an overview of applications and to motivate further implementations of and research on deep RL-supported production systems. Findings reveal that deep RL is applied in a variety of production domains, contributing to data-driven and flexible processes. In most applications, conventional methods were outperformed and implementation efforts or dependence on human experience were reduced. Nevertheless, future research must focus more on transferring the findings to real-world systems to analyze safety aspects and demonstrate reliability under prevailing conditions.
KW  - Machine learning
KW  - reinforcement learning
KW  - production control
KW  - production planning
KW  - manufacturing processes
KW  - systematic literature review
Y1  - 2021
U6  - https://doi.org/10.1080/00207543.2021.1973138
SN  - 1366-588X
SN  - 0020-7543
VL  - 60
IS  - 13
PB  - Taylor & Francis
CY  - London
ER  - 
TY  - JOUR
A1  - Panzer, Marcel
A1  - Bender, Benedict
A1  - Gronau, Norbert
T1  - A deep reinforcement learning based hyper-heuristic for modular production control
JF  - International Journal of Production Research
N2  - In today's production, fluctuations in demand, shortening product life-cycles, and highly configurable products require an adaptive and robust control approach to maintain competitiveness. This approach must not only optimise desired production objectives but also cope with unforeseen machine failures, rush orders, and changes in short-term demand. Previous control approaches were often implemented using a single operations layer and a standalone deep learning approach, which may not adequately address the complex organisational demands of modern manufacturing systems. To address this challenge, we propose a hyper-heuristic control model within a semi-heterarchical production system, in which multiple manufacturing and distribution agents are spread across pre-defined modules. The agents employ a deep reinforcement learning algorithm to learn a policy for selecting low-level heuristics in a situation-specific manner, thereby improving system performance and adaptability. We tested our approach in simulation and transferred it to a hybrid production environment. In doing so, we demonstrated its multi-objective optimisation capabilities compared to conventional approaches in terms of mean throughput time, tardiness, and processing of prioritised orders in a multi-layered production system. The modular design promises to reduce overall system complexity and facilitates quick and seamless integration into other scenarios.
KW  - production control
KW  - modular production
KW  - multi-agent system
KW  - deep reinforcement learning
KW  - deep learning
KW  - multi-objective optimisation
Y1  - 2023
U6  - https://doi.org/10.1080/00207543.2023.2233641
SN  - 0020-7543
SN  - 1366-588X
SP  - 1
EP  - 22
PB  - Taylor & Francis
CY  - London
ER  - 
TY  - JOUR
A1  - Panzer, Marcel
A1  - Gronau, Norbert
T1  - Enhancing economic efficiency in modular production systems through deep reinforcement learning
JF  - Procedia CIRP
N2  - In times of increasingly complex production processes and volatile customer demands, production adaptability is crucial for a company's profitability and competitiveness. The ability to cope with rapidly changing customer requirements and unexpected internal and external events guarantees robust and efficient production processes, requiring a dedicated control concept at the shop floor level. Yet in today's practice, conventional control approaches remain in use, which may not keep up with the dynamic behaviour due to their scenario-specific and rigid properties. To address this challenge, deep learning methods have been increasingly deployed due to their optimization and scalability properties. However, these approaches were often tested in specific operational applications and focused on technical performance indicators such as order tardiness or total throughput. In this paper, we propose a deep reinforcement learning based production control to optimize combined techno-financial performance measures. Based on pre-defined manufacturing modules that are supplied and operated by multiple agents, positive effects were observed in terms of increased revenue and reduced penalties due to lower throughput times and fewer delayed products. The combined modular and multi-staged approach as well as the distributed decision-making further support scalability and transferability to other scenarios.
KW  - modular production
KW  - production control
KW  - multi-agent system
KW  - deep reinforcement learning
KW  - discrete event simulation
Y1  - 2024
U6  - https://doi.org/10.1016/j.procir.2023.09.229
SN  - 2212-8271
VL  - 121
SP  - 55
EP  - 60
PB  - Elsevier
CY  - Amsterdam
ER  -