LLM4CGDS: Large language model-based agents for Chinese graded document simplification

  • Dengzhao Fang
  • Jipeng Qiang*
  • Wenjie Hou
  • Yi Zhu
  • Jingtong Gao
  • Xiangyu Zhao
  • *Corresponding author for this work
  • Yangzhou University
  • School of Artificial Intelligence
  • City University of Hong Kong

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Graded reading tailors text difficulty to learners' proficiency by producing multiple versions of the same content, an approach long embraced in language education but still dependent on labor-intensive, expert-driven adaptation. In this paper, we introduce the task of Chinese Graded Document Simplification (CGDS) for non-native learners, which seeks to automate the creation of multi-level reading materials in accordance with established proficiency standards. Guided by the three stages of the Hanyu Shuiping Kaoshi (HSK) 3.0 framework (Levels 1–3 for Beginner, Levels 4–6 for Intermediate, and Levels 7–9 for Advanced learners), we propose Large Language Model for Chinese Graded Document Simplification (LLM4CGDS), a rule-guided, large language model (LLM)-based framework that integrates HSK-level readability constraints and external knowledge retrieval to control document-level simplification without requiring supervised fine-tuning. To foster further research, we construct two complementary datasets, Journey to the West Document Simplification (JWDS) and Multi-Domain Document Simplification (MDDS), covering diverse genres and difficulty levels. Experimental evaluation on both datasets demonstrates that LLM4CGDS substantially outperforms direct prompting of state-of-the-art LLMs in both readability control and meaning preservation.

Original language: English
Article number: 113905
Journal: Engineering Applications of Artificial Intelligence
Volume: 169
DOI
Publication status: Published - 1 Apr 2026
Externally published: Yes
