the playlists used. Additionally, we addressed ethical
concerns that users must consider when using this
tool, including potential license issues.
For future work, we plan to enhance the tool in
various ways. For example, we aim to incorporate an
automatic license detection feature that informs the
user about the license used for each video, thereby
helping to avoid legal issues. The segmentation of
chapters can also be significantly improved.
Specifically, we can integrate a topic detection layer,
enabling the tool to identify changes in topics and
segment the transcripts accordingly (Vayansky and
Kumar, 2020). Additionally, we intend to enhance text
preprocessing and give users more options for
choosing how the text is preprocessed, to align
with their specific needs.
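To make the idea of topic-based segmentation concrete, the following is a minimal sketch of how a transcript could be split where the topic appears to change. It is not the tool's implementation: it swaps in a simple lexical-cohesion heuristic (in the spirit of TextTiling) rather than a topic model of the kind reviewed by Vayansky and Kumar (2020). The function names, the `window` size, and the `threshold` value are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(sentences):
    """Count lowercased word tokens across a list of sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(re.findall(r"[a-z']+", sentence.lower()))
    return counts


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def segment_by_topic(sentences, window=2, threshold=0.1):
    """Split sentences wherever adjacent windows share too little vocabulary.

    `window` and `threshold` are hypothetical defaults, not tuned values.
    """
    segments, current = [], []
    for i, sentence in enumerate(sentences):
        if current:
            # Compare the sentences just before the gap with those just after.
            left = bag_of_words(sentences[max(0, i - window):i])
            right = bag_of_words(sentences[i:i + window])
            if cosine(left, right) < threshold:
                segments.append(current)
                current = []
        current.append(sentence)
    if current:
        segments.append(current)
    return segments


# Toy transcript (invented for illustration) with one topic change.
transcript = [
    "Python lists store items in order.",
    "Lists in Python support append and items can repeat.",
    "Gradient descent minimizes a loss function.",
    "The loss decreases as gradient steps continue.",
]
for segment in segment_by_topic(transcript):
    print(segment)
```

A real topic detection layer would likely replace the word-count vectors with topic distributions or sentence embeddings, but the boundary rule (cut where similarity across a gap drops) could stay the same.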
ACKNOWLEDGEMENTS
This work received French government support
granted to the Labex Cominlabs excellence laboratory
and managed by the National Research Agency in the
“Investing for the Future” program under reference
ANR-10-LABX-07-01.
We extend our sincere appreciation to Omonliwi
Graciela Thoo for her contribution to implementing a
first version of the code.
REFERENCES
Adomavicius, G. and Tuzhilin, A. (2005). Toward the
next generation of recommender systems: A survey
of the state-of-the-art and possible extensions.
IEEE Transactions on Knowledge and Data
Engineering, 17(6):734–749.
Bazouzi, A. A., Foursov, M., Le Capitaine, H., and Miklos,
Z. (2023). EMBEDD-ER: EMBEDDing Educational
Resources Using Linked Open Data. In Proceedings
of the 15th International Conference on Computer
Supported Education, pages 439–446, Prague, Czech
Republic.
Connes, V., de La Higuera, C., and Le Capitaine, H. (2021).
What should I learn next? Ranking educational
resources. In 2021 IEEE 45th Annual Computers,
Software, and Applications Conference (COMPSAC),
pages 109–114. IEEE.
Devlin, J., Chang, M.-W., Lee, K., and Toutanova,
K. (2019). BERT: Pre-training of deep bidirectional
transformers for language understanding.
arXiv:1810.04805 [cs].
Ferreira-Mello, R., Andre, M., Pinheiro, A., Costa, E., and
Romero, C. (2019). Text mining in education. WIREs
Data Mining and Knowledge Discovery, 9(6):e1332.
Hoffart, J., Yosef, M. A., Bordino, I., Furstenau, H., Pinkal,
M., Spaniol, M., Taneva, B., Thater, S., and Weikum,
G. (2011). Robust disambiguation of named entities
in text. In Proceedings of the 2011 Conference on
Empirical Methods in Natural Language Processing,
pages 782–792.
Li, Y. H. and Jain, A. K. (1998). Classification of text
documents. The Computer Journal, 41(8):537–546.
Liang, C., Wu, Z., Huang, W., and Giles, C. L. (2015).
Measuring prerequisite relations among concepts. In
Proceedings of the 2015 Conference on Empirical Methods
in Natural Language Processing, pages 1668–1674.
Nabizadeh, A. H., Leal, J. P., Rafsanjani, H. N., and Shah,
R. R. (2020). Learning path personalization and
recommendation methods: A survey of the
state-of-the-art. Expert Systems with Applications, 159:113596.
Romero, C. and Ventura, S. (2013). Data mining in
education. WIREs Data Mining and Knowledge Discovery,
3(1):12–27.
Sun, C., Qiu, X., Xu, Y., and Huang, X. (2019). How
to fine-tune BERT for text classification? In Sun, M.,
Huang, X., Ji, H., Liu, Z., and Liu, Y., editors, Chinese
Computational Linguistics, Lecture Notes in
Computer Science, pages 194–206, Cham. Springer
International Publishing.
Urdaneta-Ponte, M. C., Mendez-Zorrilla, A., and
Oleagordia-Ruiz, I. (2021). Recommendation
systems for education: Systematic review.
Electronics, 10(14):1611.
Vayansky, I. and Kumar, S. A. P. (2020). A review of topic
modeling methods. Information Systems, 94:101582.
Yang, Y., Liu, H., Carbonell, J., and Ma, W. (2015).
Concept graph learning from educational data. In
Proceedings of the Eighth ACM International Conference
on Web Search and Data Mining, pages 159–168.
APPENDIX
6.1 Dataset Attributes
Tables 2, 3, and 4 present the attributes of the created
datasets.