Identifying Mobile Malware Attacks on Android Devices Using Machine Learning Algorithms

Identificación de ataques de Malware móvil en dispositivos Android mediante algoritmos de aprendizaje automático

Víctor Guzmán-Brand
Laura Gélvez-García

Abstract

Objective: To identify mobile malware attacks on Android devices using machine learning algorithms. Methodology: The study follows the KDD (Knowledge Discovery in Databases) process, a structured approach that organizes data mining into clearly defined stages. This model provides control at each stage and supports efficient extraction, transformation, and analysis of information. Results: The confusion matrix for the LightGBM algorithm reflects its ability to process large volumes of data with many features efficiently, which contributes to more accurate classification. The model also leads on the evaluation metrics, achieving the best performance among the machine learning approaches compared. Discussion: Machine learning is key in cybersecurity because it improves the detection of threats such as malware and DDoS attacks. LightGBM stands out for its efficiency and achieves the best accuracy (94.4%), followed by XGBoost, which performs well but requires longer computation time. Random Forest, although faster, is less accurate. Conclusions: Machine learning has transformed cybersecurity by strengthening the detection of threats such as malware and intrusions. In identifying mobile malware on Android, LightGBM stood out for its accuracy and its efficiency in handling imbalanced data, outperforming the other models on key metrics. However, balancing accuracy against resource consumption remains a challenge, especially on mobile devices with limited hardware.
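The abstract evaluates the models through a confusion matrix and derived metrics, and notes that the data are imbalanced. The sketch below illustrates, under stated assumptions, how such metrics can be computed and why accuracy alone may be misleading when one class dominates. It is not the authors' code: the binary framing (1 = malware, 0 = benign), the function names, and the toy labels are all hypothetical and chosen only for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary task; `positive` marks the malware class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical toy labels (not from the study): 3 malware samples, 7 benign.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
    print(confusion_counts(y_true, y_pred))
    print(classification_metrics(y_true, y_pred))

    # A trivial classifier that always predicts "benign" still looks accurate
    # on imbalanced data, but its recall on the malware class is zero.
    print(classification_metrics(y_true, [0] * len(y_true)))
```

On imbalanced data, precision, recall, and F1 for the malware class are usually more informative than accuracy, which is one reason a comparison of classifiers typically reports several metrics side by side.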
