A multi-agent reinforcement learning (MARL) framework for designing an optimal state-specific hybrid maintenance policy for a series k-out-of-n load-sharing system

  • Sangqi Zhao
  • YIAN WEI
  • Yang Li
  • Yao Cheng*
  • *Corresponding author for this work
  • The University of Hong Kong
  • Singapore University of Social Sciences

Research output: Contribution to journal (article, peer-reviewed)

Abstract

The series k-out-of-n: G load-sharing structure is widely adopted in engineering. During operation, system components deteriorate, leading to system failures and shutdowns. Although maintenance reduces failure-associated costs, it also requires a system shutdown and incurs considerable costs of its own. This calls for a maintenance policy that minimizes the overall long-term cost rate. The task becomes especially challenging when the components have continuous, load-dependent deterioration processes and the maintenance duration is non-negligible. In this paper, we propose a Markov decision process (MDP)-based multi-agent reinforcement learning (MARL) framework to obtain an optimal state-specific hybrid maintenance policy that determines the maintenance timing and levels for all components holistically. First, we define the policy that dictates whether each component undergoes imperfect repair or replacement at periodic decision epochs. Second, we establish an MDP-based multi-agent framework to quantify the system’s cost rate by defining the state and action spaces, modeling the stochastic transitions of the components’ dependent deterioration processes, and formulating a well-calibrated penalty function. Third, we customize a MARL algorithm that leverages neural networks to handle the large state space and integrates the Branching Dueling Network structure to decompose the high-dimensional action space, thereby improving scalability. A heuristic-enhanced penalty function is designed to avoid suboptimal policies. A power plant case study demonstrates the effectiveness of the proposed policy and underscores the importance of accounting for maintenance duration in policy design.
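
To illustrate the action-space decomposition mentioned in the abstract, the following is a minimal sketch of how a branching dueling structure typically combines one shared state value with per-component advantage estimates. It is not the authors' implementation: the action labels, the function names, and the numeric values are hypothetical, and the mean-subtracted aggregation is the commonly used form for branching dueling networks, assumed here rather than taken from the paper. In a trained agent the state value and advantages would come from a neural network; here they are fixed numbers.

```python
from typing import List

# Hypothetical per-component action labels; the paper's exact action set may differ.
ACTIONS = ("do_nothing", "imperfect_repair", "replace")


def branch_q_values(state_value: float, advantages: List[List[float]]) -> List[List[float]]:
    """Combine a shared state value with per-branch advantages.

    Each branch (one per component) gets
        Q_d(s, a) = V(s) + A_d(s, a) - mean_a' A_d(s, a'),
    the mean-subtracted dueling aggregation, applied branch by branch.
    """
    q = []
    for adv in advantages:
        mean_adv = sum(adv) / len(adv)
        q.append([state_value + a - mean_adv for a in adv])
    return q


def greedy_joint_action(q: List[List[float]]) -> List[int]:
    """Pick the best action index independently in every branch."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


# Made-up numbers for a 3-component system (stand-ins for network outputs).
advantages = [
    [0.0, 1.0, 2.0],  # component 1
    [3.0, 0.0, 0.0],  # component 2
    [0.5, 0.5, 2.0],  # component 3
]
q_values = branch_q_values(state_value=-10.0, advantages=advantages)
joint = greedy_joint_action(q_values)
labels = [ACTIONS[i] for i in joint]
print(labels)
```

With n components and 3 actions each, this structure needs 3n outputs instead of one output per joint action (3^n), which is the scalability benefit the abstract refers to.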
Original language: American English
Journal: Reliability Engineering and System Safety
Volume: 265
State: Published - 19 Aug 2025
