Una evaluacion comparativa de ChatGPT, DeepSeek y Gemini en la generacion automatica de pruebas unitarias: un analisis de la tasa de exito

  • Marlen Trevino-Villalobos
  • Christian Quesada-Lopez
  • Efren Jimenez-Delgado
  • Rocio Quiros-Oviedo
  • Ignacio Diaz-Oreiro

Research output: Chapter in book/report/conference proceeding › Conference contribution › Peer-reviewed

Abstract

Advances in large language models (LLMs) have opened up new possibilities for automating unit test generation, a traditionally manual and expensive task. This quantitative study evaluates the performance of three LLMs (ChatGPT 4o mini, DeepSeek v3, and Gemini 2.5 Flash Pro) in generating test cases for C# methods developed in Unity. The execution success rate of the generated tests was measured using real and synthetic data. The synthetic data was created deliberately to represent common structures, while the real data came from existing project functions. The controlled experimental design used LLM and data type as factors and cyclomatic complexity and contextual memory as blocks, with four replicates per combination, for a total of 96 experimental treatments. The results show that LLMs have high potential to support the automatic generation of unit tests. Furthermore, the choice of model had a significant effect on the success rate of the generated tests.
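
The design described in the abstract can be sketched as a simple enumeration. This is only an illustration, not the authors' procedure. The abstract names the factors and blocks but not the block levels, so the two levels per block used here are an assumption, chosen only because they reproduce the reported total of 96. The `success_rate` helper is likewise hypothetical: it assumes the rate is the share of generated tests that execute successfully.

```python
from itertools import product

# Factors named in the abstract.
llms = ["ChatGPT 4o mini", "DeepSeek v3", "Gemini 2.5 Flash Pro"]
data_types = ["real", "synthetic"]

# ASSUMPTION: the abstract does not state the block levels. Two levels per
# block is a guess that happens to reproduce the reported total of 96.
complexity_levels = ["low", "high"]
memory_levels = ["without", "with"]

replicates = 4  # stated in the abstract

# One entry per experimental run: every factor/block combination, replicated.
runs = [
    {"llm": m, "data": d, "complexity": c, "memory": k, "replicate": r}
    for m, d, c, k in product(llms, data_types, complexity_levels, memory_levels)
    for r in range(1, replicates + 1)
]


def success_rate(passed: int, generated: int) -> float:
    """Hypothetical metric: fraction of generated tests that ran successfully."""
    if generated <= 0:
        raise ValueError("generated must be positive")
    return passed / generated


print(len(runs))  # 3 * 2 * 2 * 2 * 4 = 96
```

Under these assumptions the count matches the abstract's 96 treatments. Other block structures could give the same total, so the levels should be checked against the full paper.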

Translated title of the contribution: A Comparative Evaluation of ChatGPT, DeepSeek and Gemini in Automatic Unit Test Generation: A Success Rate Analysis
Original language: Spanish
Title of host publication: 8th Congreso Internacional en Inteligencia Ambiental, Ingenieria de Software y Salud Electronica y Movil, AmITIC 2025
Editors: Vladimir Villarreal, Lilia Munoz
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9798331578466
DOI
Publication status: Published - 2025
Event: 8th International Congress on Environmental Intelligence, Software Engineering, and Electronic and Mobile Health, AmITIC 2025 - Neiva, Colombia
Duration: 24 Sept 2025 to 26 Sept 2025

Publication series

Name: 8th Congreso Internacional en Inteligencia Ambiental, Ingenieria de Software y Salud Electronica y Movil, AmITIC 2025

Conference

Conference: 8th International Congress on Environmental Intelligence, Software Engineering, and Electronic and Mobile Health, AmITIC 2025
Country/Territory: Colombia
City: Neiva
Period: 24/09/25 to 26/09/25

Keywords

  • automatic testing
  • LLM
  • prompt
  • unit testing
  • Unity
