This repository contains the code and tests for the AI-Driven Testing project (AMOS SS 2025).
The goal of this project is to develop or customize an LLM-based (Large Language Model) AI that can automatically generate test code for existing software. The AI is controlled through a chat-based interface and can be provided with information about the target software in various ways.
- **Test Code Generation**: The AI can generate test code for arbitrary software using methods such as Retrieval-Augmented Generation (RAG), fine-tuning, or prompting.
- **Incremental Test Extension**: The AI can recognize and intelligently extend existing test code.
- **Understanding of Test Types**: The AI can distinguish between different layers and types of tests (a short example follows after this list):
  - Layers: user interface, domain/business logic, persistence layer
  - Test types: unit test, integration test, acceptance test
- **On-Premise Operation**: The solution can run fully offline, making it suitable for on-premise environments.
- **Docker Support**: The backend can run inside a Docker container and be accessed via an API.
- **IDE Integration**: The solution can be embedded into existing open-source development environments.
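To make the distinction concrete, here is a small illustrative sketch in Python (all names are invented for this example, they are not part of the project): the first test exercises a single business-logic function in isolation, while the second checks that the logic and a persistence component (an in-memory SQLite database) work together.

```python
import sqlite3
import unittest


def apply_discount(price, rate):
    """Business-logic function used by both tests."""
    return round(price * (1 - rate), 2)


class OrderStore:
    """Minimal persistence-layer component for the integration test."""

    def __init__(self, connection):
        self.connection = connection
        self.connection.execute("CREATE TABLE orders (total REAL)")

    def save(self, total):
        self.connection.execute("INSERT INTO orders (total) VALUES (?)", (total,))

    def totals(self):
        return [row[0] for row in self.connection.execute("SELECT total FROM orders")]


class UnitTestExample(unittest.TestCase):
    def test_apply_discount(self):
        # Unit test: one function, no external dependencies.
        self.assertEqual(apply_discount(100.0, 0.2), 80.0)


class IntegrationTestExample(unittest.TestCase):
    def test_discounted_order_is_persisted(self):
        # Integration test: business logic plus a real (in-memory) database.
        store = OrderStore(sqlite3.connect(":memory:"))
        store.save(apply_discount(100.0, 0.2))
        self.assertEqual(store.totals(), [80.0])


if __name__ == "__main__":
    unittest.main()
```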
A typical workflow looks like this:

- Provide the software (source code or API/documentation)
- Start the AI and interact with it through the chat interface
- Generate and review test code
- Integrate the test code into your existing test suite
The frontend is a Node.js application. Follow the steps below to set it up and run it locally.
- Node.js (Ensure it is installed on your machine)
1. Clone the repository (if you haven't already) and change into it:

   ```bash
   git clone <repository-url>
   cd amos2025ss04-ai-driven-testing
   ```

2. Navigate to the frontend directory:

   ```bash
   cd frontend/
   ```

3. Install dependencies:

   ```bash
   npm install
   ```

4. Start the server:

   ```bash
   npm run start
   ```

5. Open your browser and go to http://localhost:3000/
The backend lets you run a local Ollama container, send prompts to a language model via a Dockerized API, and save the structured response as Markdown.
- Docker (for running the Ollama container) → Install Docker
- Conda (optional, for managing the Python environment) → Install Anaconda
- `environment.yml` → Conda environment definition
- `main.py` → Main script to run a single model: starts the container, sends the prompt, and stops the container
- `example_all_models.py` → Example script that sends the same prompt to all allowed models
- `llm_manager.py` → Handles Docker container management, pulling models/images, sending prompts, and progress reporting
- `allowed_models.json` → Config that defines the allowed language models (see the example below)
- `prompt.txt` → Default input prompt file
- `output-<MODEL_ID>.md` → Output file produced for each model

All files are located inside the `backend/` directory.
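For illustration, the config could be read and a model picked by index roughly like this. This is a sketch that assumes `allowed_models.json` holds a JSON list; the real schema is whatever the file actually defines, and the helper names below are made up.

```python
import json
from pathlib import Path


def load_allowed_models(config_path="backend/allowed_models.json"):
    """Load the allowed-model config; the exact schema is defined by the project."""
    with Path(config_path).open(encoding="utf-8") as f:
        return json.load(f)


def pick_model(models, index=0):
    """Select one entry by position, mirroring the --model index flag of main.py."""
    return models[index]


if __name__ == "__main__":
    models = load_allowed_models()
    print("Selected model:", pick_model(models, index=0))
```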
(Optional) Create and activate a Conda environment:

```bash
conda env create -f backend/environment.yml
conda activate backend
```

Make sure Docker is running on your machine.
Simply run the `main.py` script:

```bash
python backend/main.py
```

By default, it reads the prompt from `backend/prompt.txt`, uses the Mistral LLM, and writes the response to `backend/output-mistral_7b-instruct-v0.3-q3_K_M.md`.

You can also specify a custom model, prompt file, and output file:

```bash
python backend/main.py --model 0 --prompt_file ./your_prompt.txt --output_file ./your_output.md
```
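These flags suggest a small `argparse`-based command line. The following is a minimal sketch of how such an entry point might be wired up; the defaults and help texts are assumptions based on the behaviour described above, not the actual `main.py`.

```python
import argparse


def parse_args():
    """Build a CLI mirroring the flags described above (illustrative only)."""
    parser = argparse.ArgumentParser(description="Send a prompt to one allowed model.")
    parser.add_argument("--model", type=int, default=0,
                        help="Index of the model in allowed_models.json")
    parser.add_argument("--prompt_file", default="backend/prompt.txt",
                        help="Path to the input prompt file")
    parser.add_argument("--output_file", default=None,
                        help="Path to the Markdown output file "
                             "(defaults to output-<MODEL_ID>.md)")
    return parser.parse_args()


if __name__ == "__main__":
    args = parse_args()
    print(args.model, args.prompt_file, args.output_file)
```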
To send the same prompt to all allowed models, run:

```bash
python backend/example_all_models.py
```

This script does the following:

- Starts each model's container
- Sends the provided prompt (from `prompt.txt`)
- Saves each response into its own `output-<MODEL_ID>.md`
- Stops all containers after completion
- The project uses the Docker image `ollama/ollama` to run language models locally.
- The `LLMManager` class in `llm_manager.py` (see the sketch after this list):
  - Pulls the required Docker image with progress indication.
  - Selects a free port for each container.
  - Waits until the container's API becomes available.
  - Pulls the selected model inside the container.
  - Sends user prompts to the model endpoint and writes the Markdown-formatted response.
- `allowed_models.json` provides the list of allowed models.
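The following is a rough, hypothetical sketch of the plumbing described above, not the actual `LLMManager` implementation: it picks a free port, waits until the Ollama API answers, sends a prompt to Ollama's standard `/api/generate` endpoint, and writes the response to a Markdown file. Container start/stop is left out, and all function names are made up for illustration.

```python
import json
import socket
import time
import urllib.request


def find_free_port():
    """Ask the OS for an unused TCP port to map the container onto."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def wait_for_api(port, timeout=60):
    """Poll the Ollama HTTP endpoint until it responds or the timeout expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            urllib.request.urlopen(f"http://localhost:{port}/", timeout=2)
            return True
        except OSError:
            time.sleep(1)
    raise TimeoutError("Ollama API did not become available in time")


def send_prompt(port, model, prompt):
    """Send a non-streaming generate request and return the model's text."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        f"http://localhost:{port}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    port = find_free_port()
    # A container would be started here with this port mapped to 11434.
    wait_for_api(port)
    answer = send_prompt(port, "mistral", "Write unit tests for add_numbers.")
    with open("output-example.md", "w", encoding="utf-8") as f:
        f.write(answer)
```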
- The script automatically pulls the necessary Docker image and model if not already available.
- Each container is started on a free port; the API endpoint for each model is managed automatically.
- On completion, each container is stopped to free up system resources (a short container-lifecycle sketch follows after these notes).
- The response is formatted as clean Markdown.
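For illustration, the container lifecycle described in these notes could look roughly like this with the `docker` Python SDK (docker-py). This is a sketch under assumptions, not the project's actual code.

```python
import docker  # docker-py, i.e. `pip install docker`


def run_ollama_container(host_port, image="ollama/ollama"):
    """Start an Ollama container with its API port mapped to host_port."""
    client = docker.from_env()
    client.images.pull(image)  # no-op if the image is already present
    return client.containers.run(
        image,
        detach=True,
        ports={"11434/tcp": host_port},  # Ollama listens on 11434 inside the container
    )


if __name__ == "__main__":
    container = run_ollama_container(11500)
    try:
        print("Container started:", container.short_id)
        # ... pull the model and send prompts here ...
    finally:
        # Stop and remove the container to free resources, as described above.
        container.stop()
        container.remove()
```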
For example, if your `prompt.txt` contains the following:
Write unit tests for the following Python function:
```python
def add_numbers(a, b):
    """
    Adds two numbers together and returns the result.

    Args:
        a (int or float): The first number.
        b (int or float): The second number.

    Returns:
        int or float: The sum of a and b.

    Examples:
        >>> add_numbers(2, 3)
        5
        >>> add_numbers(-1, 1)
        0
        >>> add_numbers(0.5, 0.5)
        1.0
    """
    return a + b
```
then your output file (`output-<MODEL_ID>.md`) will look something like this:
Here is how you can write unit tests for the `add_numbers` function using Python's built-in `unittest` module, with assertions based on the examples provided in the docstring.

Make sure that all import statements are correct, including `unittest`. This example assumes you want a simple set of tests for this specific function:
```python
import unittest
from add_numbers import add_numbers  # assumes the function is saved as add_numbers.py in the same directory


class TestAddNumbers(unittest.TestCase):
    def test_positive_integers(self):
        # 2 + 3 should return 5, matching the docstring example.
        self.assertEqual(add_numbers(2, 3), 5)

    def test_negative_integers(self):
        # -1 + 1 should return 0.
        self.assertEqual(add_numbers(-1, 1), 0)

    def test_decimal(self):
        # 0.5 + 0.5 should return 1.0.
        self.assertEqual(add_numbers(0.5, 0.5), 1.0)


if __name__ == "__main__":
    unittest.main()  # run all tests in this file
```

These unit tests assume that adding positive integers, negative integers, and decimals produces the expected sum. If your use case includes other kinds of inputs, additional test cases should be added to keep the tests robust against edge scenarios.
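Since the docstring already contains `>>>` examples, you could additionally verify them with Python's `doctest` module. A small sketch, assuming the function is saved as `add_numbers.py` next to the script:

```python
import doctest

import add_numbers  # hypothetical module name, matching the import in the generated tests

if __name__ == "__main__":
    # Executes the >>> examples embedded in the docstring and reports failures.
    results = doctest.testmod(add_numbers, verbose=True)
    print(f"{results.attempted} examples run, {results.failed} failed")
```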