
Una evaluacion comparativa de ChatGPT, DeepSeek y Gemini en la generacion automatica de pruebas unitarias: un analisis de la tasa de exito

Translated title of the contribution: A Comparative Evaluation of ChatGPT, DeepSeek and Gemini in Automatic Unit Test Generation: A Success Rate Analysis
  • Marlen Trevino-Villalobos
  • Christian Quesada-Lopez
  • Efren Jimenez-Delgado
  • Rocio Quiros-Oviedo
  • Ignacio Diaz-Oreiro

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Advances in large language models (LLMs) have opened new possibilities for automating unit test generation, a traditionally manual and expensive task. This quantitative study evaluates the performance of three LLMs (ChatGPT 4o mini, DeepSeek v3, and Gemini 2.5 Flash Pro) in generating test cases for C# methods developed in Unity. The execution success rate of the generated tests was measured using real and synthetic data. The synthetic data was created deliberately to represent common structures, while the real data came from functions in existing projects. The controlled experimental design used LLM and data type as factors and cyclomatic complexity and contextual memory as blocks, with four replicates per combination, for a total of 96 experimental treatments. The results show that LLMs have high potential to support the automatic generation of unit tests. They also show that the choice of model has a significant effect on the success rate of the generated tests.

Original language: Spanish
Title of host publication: 8th Congreso Internacional en Inteligencia Ambiental, Ingenieria de Software y Salud Electronica y Movil, AmITIC 2025
Editors: Vladimir Villarreal, Lilia Munoz
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331578466
State: Published - 2025
Event: 8th International Congress on Environmental Intelligence, Software Engineering, and Electronic and Mobile Health, AmITIC 2025 - Neiva, Colombia
Duration: 24 Sep 2025 to 26 Sep 2025

Publication series

Name: 8th Congreso Internacional en Inteligencia Ambiental, Ingenieria de Software y Salud Electronica y Movil, AmITIC 2025

Conference

Conference: 8th International Congress on Environmental Intelligence, Software Engineering, and Electronic and Mobile Health, AmITIC 2025
Country/Territory: Colombia
City: Neiva
Period: 24/09/25 to 26/09/25
