
Multi-task Offline Reinforcement Learning for Online Advertising in Recommender Systems

  • Langming Liu
  • Wanyu Wang
  • Chi Zhang
  • Bo Li
  • Hongzhi Yin
  • Xuetao Wei*
  • Wenbo Su
  • Bo Zheng
  • Xiangyu Zhao*
  • *Corresponding author for this work
  • City University of Hong Kong
  • Southern University of Science and Technology
  • Harbin Engineering University
  • University of Queensland
  • Alibaba Group Holding Ltd.

Research output: Chapter in book/report/conference proceeding; conference contribution; peer-reviewed

Abstract

Online advertising in recommendation platforms has gained significant attention, with a predominant focus on channel recommendation and budget allocation strategies. However, current offline reinforcement learning (RL) methods face substantial challenges when applied to sparse advertising scenarios, primarily due to severe overestimation, distributional shifts, and overlooking budget constraints. To address these issues, we propose MTORL, a novel multi-task offline RL model that targets two key objectives. First, we establish a Markov Decision Process (MDP) framework specific to the nuances of advertising. Then, we develop a causal state encoder to capture dynamic user interests and temporal dependencies, facilitating offline RL through conditional sequence modeling. Causal attention mechanisms are introduced to enhance user sequence representations by identifying correlations among causal states. We employ multi-task learning to decode actions and rewards, simultaneously addressing channel recommendation and budget allocation. Notably, our framework includes an automated system for integrating these tasks into online advertising. Extensive experiments on offline and online environments demonstrate MTORL's superiority over state-of-the-art methods. The code is available online at https://github.com/Applied-Machine-Learning-Lab/MTORL.
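
The abstract names two building blocks, causal attention over a user's state sequence and multi-task decoding, without giving details. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes "causal" means the standard autoregressive mask (each step attends only to itself and earlier steps), uses the states directly as queries, keys, and values with no learned projections, and the names `causal_attention` and `linear_head`, the toy numbers, and the two head weights are all hypothetical. For the actual model, see the repository linked in the abstract.

```python
import math


def causal_attention(states):
    """Single-head scaled dot-product self-attention with a causal mask.

    `states` is a list of equal-length float vectors, one per time step.
    Queries, keys, and values are the states themselves (no learned
    projections), which is a simplification for illustration.
    """
    dim = len(states[0])
    outputs, weights = [], []
    for t, query in enumerate(states):
        # Causal mask: step t may only look at steps 0..t.
        scores = [
            sum(q * k for q, k in zip(query, states[s])) / math.sqrt(dim)
            for s in range(t + 1)
        ]
        # Numerically stable softmax over the visible steps.
        peak = max(scores)
        exps = [math.exp(x - peak) for x in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        # Weighted sum of the visible value vectors.
        out = [
            sum(w * states[s][d] for s, w in enumerate(row))
            for d in range(dim)
        ]
        # Pad masked (future) positions with zero weight.
        weights.append(row + [0.0] * (len(states) - t - 1))
        outputs.append(out)
    return outputs, weights


def linear_head(vector, weight_rows, bias):
    """Apply one linear layer: returns one output per weight row."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weight_rows, bias)
    ]


# Toy sequence of three 2-dimensional user states (made-up numbers).
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
encoded, attn = causal_attention(states)

# Two hypothetical task heads sharing the last encoded state:
# one scores two channels, the other emits a single budget value.
channel_scores = linear_head(encoded[-1], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
budget = linear_head(encoded[-1], [[0.5, 0.5]], [0.0])[0]
```

Both heads read the same encoded state, which is the usual way a shared encoder serves several tasks; how MTORL actually couples channel recommendation and budget allocation is not specified in this record.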

Original language: English
Title of host publication: KDD 2025 - Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 4635-4646
Number of pages: 12
ISBN (electronic): 9798400714542
DOI
Publication status: Published - 3 Aug 2025
Externally published
Event: 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2025 - Toronto, Canada
Duration: 3 Aug 2025 to 7 Aug 2025

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume: 2
ISSN (print): 2154-817X

Conference

Conference: 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2025
Country/Territory: Canada
City: Toronto
Period: 3 Aug 2025 to 7 Aug 2025
