Digital Technologies Research and Applications

Article

Machine Learning–Based Behavioral Analysis and Natural Language Mining for Computer Learning Development

Huang, D., Ma, Y., & Saltanat, B. (2026). Machine Learning–Based Behavioral Analysis and Natural Language Mining for Computer Learning Development. Digital Technologies Research and Applications, 5(2), 198–224. https://doi.org/10.54963/dtra.v5i2.2327

Authors

  • Da Huang

    Institute of the New Information Technologies, Kyrgyz State University named after I. Arabaev, Bishkek 720001, Kyrgyzstan
  • Yiming Ma

    Institute of the New Information Technologies, Kyrgyz State University named after I. Arabaev, Bishkek 720001, Kyrgyzstan
  • Biibosunova Saltanat

    Institute of the New Information Technologies, Kyrgyz State University named after I. Arabaev, Bishkek 720001, Kyrgyzstan

Received: 21 January 2026; Revised: 22 February 2026; Accepted: 10 March 2026; Published: 11 May 2026

Programming education continues to face significant challenges, with failure and dropout rates exceeding 30% in many introductory courses. Existing learning analytics approaches largely rely on static behavioral indicators derived from the Felder–Silverman Learning Style Model (FSLSM), which often fail to capture the temporal dynamics of learning and the syntactic complexity involved in programming activities. These limitations are particularly evident in detecting the Sequential/Global learning dimension and understanding how students interact with programming tasks over time. This study aims to address these limitations by proposing CAFNet (Crossmodal Attention Fusion Network), a multimodal learning analytics framework that integrates behavioral machine learning with natural language and code analysis. The proposed architecture combines Temporal Convolutional Networks to model behavioral indicators, CodeBERT for forum discourse representation, and Tree-Transformer models for Abstract Syntax Tree-based code analysis. A hierarchical cross-modal attention mechanism aligns these heterogeneous data sources, while Federated Supervised Contrastive Learning ensures privacy-preserving deployment across institutions under differential privacy constraints (ε = 0.5). The framework was evaluated using three heterogeneous datasets comprising 14,308 learners from programming education environments. Experimental results show that CAFNet achieved 91.7% classification accuracy with an AUC-ROC of 0.947, outperforming classical machine learning and deep learning baselines by 17.5%. The model achieved 94.1% accuracy for the Sequential/Global dimension, representing a major improvement over previous studies. Additionally, early at-risk prediction reached 88.9% accuracy at week four of the course. These findings demonstrate that integrating behavioral, linguistic, and programming data provides a scalable and privacy-compliant approach for intelligent educational systems supporting personalized learning and early academic intervention.
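
The cross-modal attention idea described in the abstract can be illustrated with a minimal sketch. It is not the CAFNet implementation: it assumes each modality (behavior, forum text, code) has already been encoded into a vector of the same size, and it uses a single scaled dot-product attention step rather than the hierarchical mechanism the authors describe. The function names, the query vector, and the toy embeddings are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(query, modalities):
    """Fuse same-sized modality embeddings with scaled dot-product attention.

    query      -- list of floats, length d (hypothetical task/context vector)
    modalities -- dict mapping a modality name to a list of floats, length d
    Returns (attention weight per modality, fused embedding of length d).
    """
    d = len(query)
    names = list(modalities)
    # One relevance score per modality: dot(query, embedding) / sqrt(d).
    scores = [
        sum(q * k for q, k in zip(query, modalities[name])) / math.sqrt(d)
        for name in names
    ]
    weights = softmax(scores)
    # The fused vector is the attention-weighted sum of the modality embeddings.
    fused = [
        sum(w * modalities[name][i] for w, name in zip(weights, names))
        for i in range(d)
    ]
    return dict(zip(names, weights)), fused


# Toy embeddings (assumed values, for illustration only).
embeddings = {
    "behavior": [1.0, 0.0, 0.0, 0.0],
    "forum_text": [0.0, 1.0, 0.0, 0.0],
    "code_ast": [0.0, 0.0, 1.0, 0.0],
}
query = [2.0, 0.0, 0.0, 0.0]

weights, fused = cross_modal_attention(query, embeddings)
print(weights)
print(fused)
```

In this toy case the query aligns with the behavioral embedding, so that modality receives the largest weight and dominates the fused vector. A trained model would learn the query and the projections instead of fixing them by hand.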

Keywords:

Cross-Modal Learning; Behavioral Indicators; Felder–Silverman Model; Federated Learning; Differential Privacy; Programming Education
