Securing Online Platforms: Hybrid Machine Learning Approaches for URL Phishing Detections

Ugochukwu Onwudebelu; John O. Ugah; Samuel Elochukwu Ezeh; Olusanjo Olugbemi

doi:10.54963/jic.v5i1.1693

Authors

Ugochukwu Onwudebelu
Department of Computer Science/Informatics, Alex Ekwueme Federal University Ndufu Alike (FUNAI), Abakaliki P.M.B. 1010, Nigeria
John O. Ugah
Department of Computer Science/Informatics, Alex Ekwueme Federal University Ndufu Alike (FUNAI), Abakaliki P.M.B. 1010, Nigeria
Department of Computer Science, Ebonyi State University, Abakaliki P.M.B. 053, Nigeria
Samuel Elochukwu Ezeh
Department of Computer Science/Informatics, Alex Ekwueme Federal University Ndufu Alike (FUNAI), Abakaliki P.M.B. 1010, Nigeria
Olusanjo Olugbemi
Department of Cybersecurity, School of Information and Communication Technology, Federal University of Technology, Minna P.M.B 65, Nigeria

Received: 12 October 2025; Revised: 12 December 2025; Accepted: 16 December 2025; Published: 4 January 2026

Abstract:

Phishing attacks pose significant risks in the digital landscape, causing financial losses and breaches of sensitive information. Traditional detection methods often struggle to keep pace with evolving threats, which limits their effectiveness. This study addresses these limitations by developing a detection system based on a hybrid machine learning approach that combines random forest, gradient boosting, and logistic regression algorithms to improve phishing detection accuracy. A labeled dataset of URLs from Kaggle is used, with feature engineering extracting key attributes for model training. The model is developed following the CRISP-DM framework and Object-Oriented Programming principles. It achieves an accuracy of 84%, with precision, recall, and F1-score values of 85%, 86%, and 84%, respectively, and an ROC AUC score of 91%, indicating a strong ability to differentiate between phishing and legitimate URLs. These results suggest the model's potential as a reliable phishing detection tool, capable of identifying phishing URLs effectively while limiting false positives. Our research contributes to the development of more effective phishing detection strategies that safeguard users and organizations from economic and reputational harm, and shows how machine learning can support more robust cybersecurity systems. The proposed model can be integrated into existing security frameworks to improve the detection of phishing threats.
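
As a rough illustration of the pipeline the abstract describes, the sketch below extracts a few lexical features from a URL, combines per-model phishing probabilities by averaging (soft voting), and computes the reported metric types from a set of predictions. The abstract does not state which features were engineered or how the three models are combined, so the feature names, the `soft_vote` rule, and the example URLs and probabilities are all assumptions made for illustration, not the authors' implementation.

```python
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Hypothetical lexical features; the paper's actual feature set is not given here."""
    # Add a scheme when missing so urlparse can locate the host.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
    }


def soft_vote(probabilities: list, threshold: float = 0.5) -> int:
    """Assumed combination rule: average the per-model phishing probabilities."""
    mean_probability = sum(probabilities) / len(probabilities)
    return int(mean_probability >= threshold)  # 1 = phishing, 0 = legitimate


def classification_metrics(y_true: list, y_pred: list) -> dict:
    """Accuracy, precision, recall and F1 for the positive (phishing) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up URL on a reserved example domain.
    print(extract_url_features("http://secure-login.example.com/verify?id=123"))
    # Made-up probabilities standing in for the random forest, gradient
    # boosting and logistic regression outputs for one URL.
    print(soft_vote([0.9, 0.6, 0.2]))
    print(classification_metrics([1, 1, 0, 0], [1, 0, 0, 1]))
```

In practice the three probabilities would come from trained random forest, gradient boosting, and logistic regression models; averaging is only one plausible way to combine them, and stacking or weighted voting would be equally consistent with the abstract.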

Keywords:

Hybrid Machine Learning; URL Classification; Cybersecurity; Random Forest; Gradient Boosting; Phishing Detection


Journal of Intelligent Communication
