The revolution of transformer models in natural language processing: A comparative analysis of architectures and applications

Carlos Alberto Gómez Cano
Carlos Alberto Pacheco Sánchez

Abstract

This article analyzes the impact of transformer models on natural language processing by comparing their main architectures and applications. To that end, a documentary review was conducted of scientific articles in Spanish and English indexed in the Scopus database between 2018 and 2022. Studies were selected that addressed theoretical advances, practical implementations, and challenges associated with these models. The methodology consisted of a qualitative analysis organized around four thematic axes: architectural evolution, computational efficiency, applications in machine translation and text generation, and ethical limitations and biases. The results show that these models have revolutionized natural language processing thanks to their ability to capture linguistic context efficiently. Nevertheless, challenges related to scalability and algorithmic fairness persist. The article concludes that, despite the superiority of these models over earlier ones, further work is needed on optimization techniques and on ethical frameworks to guide their responsible deployment in industrial and academic settings.
