Multi-task Offline Reinforcement Learning for Online Advertising in Recommender Systems

  • Langming Liu
  • Wanyu Wang
  • Chi Zhang
  • Bo Li
  • Hongzhi Yin
  • Xuetao Wei*
  • Wenbo Su
  • Bo Zheng
  • Xiangyu Zhao*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Online advertising in recommendation platforms has gained significant attention, with a predominant focus on channel recommendation and budget allocation strategies. However, current offline reinforcement learning (RL) methods face substantial challenges when applied to sparse advertising scenarios, primarily due to severe overestimation, distributional shifts, and overlooking budget constraints. To address these issues, we propose MTORL, a novel multi-task offline RL model that targets two key objectives. First, we establish a Markov Decision Process (MDP) framework specific to the nuances of advertising. Then, we develop a causal state encoder to capture dynamic user interests and temporal dependencies, facilitating offline RL through conditional sequence modeling. Causal attention mechanisms are introduced to enhance user sequence representations by identifying correlations among causal states. We employ multi-task learning to decode actions and rewards, simultaneously addressing channel recommendation and budget allocation. Notably, our framework includes an automated system for integrating these tasks into online advertising. Extensive experiments on offline and online environments demonstrate MTORL's superiority over state-of-the-art methods. The code is available online at https://github.com/Applied-Machine-Learning-Lab/MTORL.

Original language: English
Title of host publication: KDD 2025 - Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 4635-4646
Number of pages: 12
ISBN (Electronic): 9798400714542
DOIs
State: Published - 3 Aug 2025
Externally published: Yes
Event: 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2025 - Toronto, Canada
Duration: 3 Aug 2025 → 7 Aug 2025

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume: 2
ISSN (Print): 2154-817X

Conference

Conference: 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2025
Country/Territory: Canada
City: Toronto
Period: 3/08/25 → 7/08/25

Keywords

  • advertising
  • multi-task learning
  • offline reinforcement learning

