TY - GEN
T1 - AI4Reading
T2 - 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
AU - Huang, Minjiang
AU - Qiang, Jipeng
AU - Zhu, Yi
AU - Zhang, Chaowei
AU - Zhao, Xiangyu
AU - Yu, Kui
N1 - Publisher Copyright: © 2025 Association for Computational Linguistics.
PY - 2025
Y1 - 2025
N2 - Audiobook interpretations are attracting increasing attention, as they provide accessible and in-depth analyses of books that offer readers practical insights and intellectual inspiration. However, their manual creation process remains time-consuming and resource-intensive. To address this challenge, we propose AI4Reading, a multi-agent collaboration system leveraging large language models (LLMs) and speech synthesis technology to generate podcast-like audiobook interpretations. The system is designed to meet three key objectives: accurate content preservation, enhanced comprehensibility, and a logical narrative structure. To achieve these goals, we develop a framework composed of 11 specialized agents (including topic analysts, case analysts, editors, a narrator, and proofreaders) that work in concert to explore themes, extract real-world cases, refine content organization, and synthesize natural spoken language. A comparison of expert interpretations with our system’s output shows that, although AI4Reading still lags in speech generation quality, its generated interpretative scripts are simpler and more accurate.
UR - https://www.scopus.com/pages/publications/105020387944
U2 - 10.18653/v1/2025.acl-demo.21
DO - 10.18653/v1/2025.acl-demo.21
M3 - Conference contribution
AN - SCOPUS:105020387944
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 211
EP - 220
BT - System Demonstrations
A2 - Mishra, Pushkar
A2 - Muresan, Smaranda
A2 - Yu, Tao
PB - Association for Computational Linguistics (ACL)
Y2 - 27 July 2025 through 1 August 2025
ER -