Automating Test Design Using LLM: Results from an Empirical Study on the Public Sector
Conference on Digital Government Research (DGO2025)
"An efficient test process can detect failures earlier in software development, contributing to the quality of software produced by governmental entities and bringing the potential to improve government service delivery. Due to the high cost and the reduced resources available, the automation of the test activities plays a strategic role in improving testing efficiency. Designing test cases from user stories is a common approach to assessing the quality of a system under testing. This paper reports on the research, implementation, and evaluation of a tool that automatically generates system test designs from user stories with the support of Generative Pre-trained Transformer 4 (GPT-4) in the context of a public sector organization. The tool has been conceived to match the needs and particularities of the organization’s test process. Such a tool reads user stories from the Redmine tool, interacts with GPT-4 using a prompt that outputs test cases, and stores the automatically designed tests in the Squash TM test management tool. Organization test analysts stated that the tool produces good quality tests and reduces the effort to create tests. As a consequence, analysts can put more energy into other activities related to testing. Moreover, comparing the tests designed manually by test analysts with the tests designed by the tool shows that both have the same functional coverage. The paper discusses the impacts of the approach in the process, limitations, and related and future work."