
Personalize Before Retrieve: LLM-based Personalized Query Expansion for User-Centric Retrieval

  • Yingyi Zhang
  • Pengyue Jia
  • Derong Xu
  • Yi Wen
  • Xianneng Li*
  • Yichao Wang*
  • Wenlin Zhang
  • Xiaopeng Li
  • Weinan Gan
  • Huifeng Guo
  • Yong Liu
  • Xiangyu Zhao*
  • *Corresponding author for this work
  • Dalian University of Technology
  • City University of Hong Kong
  • University of Science and Technology of China
  • Huawei Technologies Co., Ltd.

Research output: Contribution to journal; Conference article; Peer-reviewed

Abstract

Retrieval-Augmented Generation (RAG) critically depends on effective query expansion to retrieve relevant information. However, existing expansion methods adopt uniform strategies that overlook user-specific semantics, ignoring individual expression styles, preferences, and historical context. In practice, textually identical queries can express vastly different intentions across users. This representational rigidity limits the ability of current RAG systems to generalize effectively in personalized settings. Specifically, we identify two core challenges for personalization: 1) user expression styles are inherently diverse, making it difficult for standard expansions to preserve personalized intent; and 2) user corpora induce heterogeneous semantic structures, varying in topical focus and lexical organization, which hinders the effective anchoring of expanded queries within the user's corpus space. To address these challenges, we propose Personalize Before Retrieve (PBR), a framework that incorporates user-specific signals into query expansion prior to retrieval. PBR consists of two components: P-PRF, which uses user history to generate stylistically aligned pseudo feedback that simulates the user's expression style, and P-Anchor, which performs graph-based structure alignment over the user corpus to capture its structure. Together, they produce personalized query representations tailored for retrieval. Experiments on two personalized benchmarks show that PBR consistently outperforms strong baselines, with up to 10% gains on PersonaBench across retrievers. Our findings demonstrate the value of modeling personalization before retrieval to close the semantic gap in user-adaptive RAG systems.

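Original language: English
Pages (from-to): 16406-16414
Number of pages: 9
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Volume: 40
Issue number: 19
DOI
Publication status: Published - 2026
Externally published
Event: 40th AAAI Conference on Artificial Intelligence, AAAI 2026 - Singapore, Singapore
Duration: 20 Jan 2026 to 27 Jan 2026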
