TY - GEN
T1 - XES3G5M
T2 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
AU - Liu, Zitao
AU - Liu, Qiongqiong
AU - Guo, Teng
AU - Chen, Jiahao
AU - Huang, Shuyan
AU - Zhao, Xiangyu
AU - Tang, Jiliang
AU - Luo, Weiqi
AU - Weng, Jian
N1 - Publisher Copyright: © 2023 Neural information processing systems foundation. All rights reserved.
PY - 2023
Y1 - 2023
AB - Knowledge tracing (KT) is a task that predicts students' future performance based on their historical learning interactions. With the rapid development of deep learning techniques, existing KT approaches follow a data-driven paradigm that uses massive problem-solving records to model students' learning processes. However, although educational contexts contain various factors that may influence student learning outcomes, existing public KT datasets mainly consist of anonymized ID-like features, which may hinder research advances in this field. Therefore, in this work, we present XES3G5M, a large-scale dataset with rich auxiliary information about questions and their associated knowledge components (KCs). The XES3G5M dataset is collected from a real-world online math learning platform and contains 7,652 questions and 865 KCs, with 5,549,635 interactions from 18,066 students. To the best of our knowledge, the XES3G5M dataset not only has the largest number of KCs in the math domain but also contains the richest contextual information, including tree-structured KC relations, question types, textual contents and analysis, and student response timestamps. Furthermore, we build a comprehensive benchmark on 19 state-of-the-art deep learning based knowledge tracing (DLKT) models. Extensive experiments demonstrate the effectiveness of leveraging the auxiliary information in XES3G5M with DLKT models. We hope the proposed dataset can effectively facilitate KT research.
UR - https://www.scopus.com/pages/publications/85188503890
M3 - Conference contribution
AN - SCOPUS:85188503890
T3 - Advances in Neural Information Processing Systems
BT - Advances in Neural Information Processing Systems 36 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
A2 - Oh, A.
A2 - Neumann, T.
A2 - Globerson, A.
A2 - Saenko, K.
A2 - Hardt, M.
A2 - Levine, S.
PB - Neural information processing systems foundation
Y2 - 10 December 2023 through 16 December 2023
ER -