Article
Machine Learning–Based Behavioral Analysis and Natural Language Mining for Computer Learning Development


This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright
The authors shall retain the copyright of their work but allow the Publisher to publish, copy, distribute, and convey the work.
License
Digital Technologies Research and Applications (DTRA) publishes accepted manuscripts under Creative Commons Attribution 4.0 International (CC BY 4.0). Authors who submit their papers for publication by DTRA agree to have the CC BY 4.0 license applied to their work, and that anyone is allowed to reuse the article or part of it free of charge for any purpose, including commercial use. As long as the author and original source are properly cited, anyone may copy, redistribute, reuse, and transform the content.
Received: 21 January 2026; Revised: 22 February 2026; Accepted: 10 March 2026; Published: 11 May 2026
Programming education continues to face significant challenges, with failure and dropout rates exceeding 30% in many introductory courses. Existing learning analytics approaches largely rely on static behavioral indicators derived from the Felder–Silverman Learning Style Model (FSLSM), which often fail to capture the temporal dynamics of learning and the syntactic complexity involved in programming activities. These limitations are particularly evident in detecting the Sequential/Global learning dimension and understanding how students interact with programming tasks over time. This study aims to address these limitations by proposing CAFNet (Crossmodal Attention Fusion Network), a multimodal learning analytics framework that integrates behavioral machine learning with natural language and code analysis. The proposed architecture combines Temporal Convolutional Networks to model behavioral indicators, CodeBERT for forum discourse representation, and Tree-Transformer models for Abstract Syntax Tree-based code analysis. A hierarchical cross-modal attention mechanism aligns these heterogeneous data sources, while Federated Supervised Contrastive Learning ensures privacy-preserving deployment across institutions under differential privacy constraints (ε = 0.5). The framework was evaluated using three heterogeneous datasets comprising 14,308 learners from programming education environments. Experimental results show that CAFNet achieved 91.7% classification accuracy with an AUC-ROC of 0.947, outperforming classical machine learning and deep learning baselines by 17.5%. The model achieved 94.1% accuracy for the Sequential/Global dimension, representing a major improvement over previous studies. Additionally, early at-risk prediction reached 88.9% accuracy at week four of the course. 
These findings demonstrate that integrating behavioral, linguistic, and programming data provides a scalable and privacy-compliant approach for intelligent educational systems supporting personalized learning and early academic intervention.
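The cross-modal attention step named in the abstract can be illustrated with a minimal sketch. It assumes a single-level scaled dot-product attention in which a behavioral embedding acts as the query over forum-text and code embeddings. The function names, the modality labels, and the toy vectors are hypothetical and are not taken from the CAFNet implementation, which the abstract describes as hierarchical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(query, modality_vectors):
    """Fuse modality embeddings with scaled dot-product attention.

    query: list of floats, a hypothetical behavioral embedding.
    modality_vectors: dict mapping a modality name to an embedding that
        serves as both key and value (an assumption of this sketch).
    Returns (fused_vector, attention_weight_per_modality).
    """
    dim = len(query)
    names = list(modality_vectors)
    # Similarity between the query and each modality, scaled by sqrt(dim).
    scores = [
        sum(q * k for q, k in zip(query, modality_vectors[name])) / math.sqrt(dim)
        for name in names
    ]
    weights = softmax(scores)
    # Weighted sum of the modality embeddings, one coordinate at a time.
    fused = [
        sum(w * modality_vectors[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Toy 4-dimensional embeddings (made-up numbers, not data from the study).
behavior = [0.2, 0.9, -0.1, 0.4]
others = {
    "forum_text": [0.1, 0.8, 0.0, 0.3],
    "code_ast": [-0.5, 0.2, 0.7, 0.1],
}
fused, weights = cross_modal_attention(behavior, others)
```

In this toy case the forum-text embedding is more similar to the behavioral query than the code embedding, so it receives the larger attention weight. A trained model would use learned query, key, and value projections rather than raw embeddings.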
Keywords:
Cross-Modal Learning; Behavioral Indicators; Felder–Silverman Model; Federated Learning; Differential Privacy; Programming Education

