Article
Securing Online Platforms: Hybrid Machine Learning Approaches for URL Phishing Detections


This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright
The authors shall retain the copyright of their work but allow the Publisher to publish, copy, distribute, and convey the work.
License
Journal of Intelligent Communication (JIC) publishes accepted manuscripts under Creative Commons Attribution 4.0 International (CC BY 4.0). Authors who submit their papers for publication in the Journal of Intelligent Communication (JIC) agree to have the CC BY 4.0 license applied to their work and agree that anyone may reuse the article, or part of it, free of charge for any purpose, including commercial use. As long as the author and the original source are properly cited, anyone may copy, redistribute, reuse, and transform the content.
Received: 12 October 2025; Revised: 12 December 2025; Accepted: 16 December 2025; Published: 4 January 2026
Phishing attacks pose significant risks in the digital landscape, causing financial losses and breaches of sensitive information. Traditional detection methods often fail to keep pace with evolving threats, which limits their effectiveness. This study addresses that limitation with a hybrid machine learning approach that combines random forest, gradient boosting, and logistic regression algorithms to improve phishing detection accuracy. A labeled dataset of URLs from Kaggle is used, and feature engineering extracts the key attributes for model training. The system is developed following the CRISP-DM framework and Object-Oriented Programming principles. The model achieves an accuracy of 84%, with precision, recall, and F1-score values of 85%, 86%, and 84%, respectively, and an ROC AUC score of 91%, indicating that it separates phishing URLs from legitimate ones well. These results suggest that the model can serve as a reliable phishing detection tool that identifies phishing URLs while limiting false positives. The proposed model can be integrated into existing security frameworks to improve the detection of phishing threats and to help protect users and organizations from economic and reputational harm.
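The pipeline summarized above (URL feature extraction, a combination of three classifiers, and evaluation by accuracy, precision, recall, and F1-score) can be illustrated with a minimal sketch. The abstract does not list the engineered features or say how the three models are combined, so everything here is an assumption made for illustration: the feature names in `extract_url_features`, the use of soft voting (averaging per-model phishing probabilities) in `soft_vote`, and the 0.5 threshold are hypothetical, not the authors' implementation.

```python
from typing import Dict, Sequence
from urllib.parse import urlparse


def extract_url_features(url: str) -> Dict[str, int]:
    """Hypothetical lexical features; the paper's actual feature set is not given."""
    # Add a scheme if missing so that urlparse can locate the host.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
        # Crude IPv4 check: four dot-separated groups made only of digits.
        "host_is_ip": int(host.count(".") == 3 and host.replace(".", "").isdigit()),
    }


def soft_vote(probabilities: Sequence[float], threshold: float = 0.5) -> int:
    """Assumed combination rule: average the per-model phishing probabilities
    (e.g. from random forest, gradient boosting, logistic regression) and
    label the URL as phishing (1) when the mean reaches the threshold."""
    mean_prob = sum(probabilities) / len(probabilities)
    return int(mean_prob >= threshold)


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for the positive (phishing = 1) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

In practice the three probabilities passed to `soft_vote` would come from trained models. For example, scikit-learn's `VotingClassifier` with `voting="soft"` can wrap `RandomForestClassifier`, `GradientBoostingClassifier`, and `LogisticRegression`, although whether the study used that mechanism is not stated.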
Keywords: Hybrid Machine Learning; URL Classification; Cybersecurity; Random Forest; Gradient Boosting; Phishing Detection

