Description
Is there an existing issue for this?
- There is no existing issue for this feature
What are you currently unable to do
I noticed that the system tests in Incus’ GitHub Actions workflow are particularly time-consuming. In a brief discussion, Stéphane explained that the goal is to improve the overall execution time of the Incus test suite on GitHub Actions. Currently, a full test run takes around 1 hour and 20 minutes, and the target is to reduce this to a more reasonable ~30 minutes.
This can be achieved by splitting the existing test jobs into smaller units and leveraging the parallel execution capabilities of GitHub Actions runners, thereby reducing the runtime of each individual job.
Problem Analysis
- Currently, most system-related tests take between 10 and 40 minutes, which is still within a reasonable range. However, there is a system test using Ceph as the backend that consumes approximately 1 hour and 20 minutes.
- At present, all tests are only roughly divided into two suites: cluster and standalone. This causes each GitHub Actions job to handle too many tests, resulting in extended execution times.
Improvement Approach
1. Split test jobs into finer-grained units
- Refine existing suite structure
  - Building upon the original architecture with all, standalone, and cluster as suites, further divide the existing test cases into finer-grained categories (the exact set still needs to be confirmed).
  - Categories: core, storage, network, security, misc, instances
  - Benefit: This approach reduces the number of test cases each individual job has to run, thereby lowering the execution time per job and improving overall CI efficiency.
| Dimension | Values |
|---|---|
| suite | cluster, standalone |
| test-category | core, storage, network, security, misc, instances |
| backend | dir, btrfs, lvm, zfs, ceph, linstor, random |
| go | oldstable, stable, tip |
| os | ubuntu-24.04, ubuntu-24.04-arm |
2. Reassess the test matrix
Introducing test-category in Step 1 to split tests into different groups has a side effect: a combinatorial explosion of job combinations in the matrix. Although GitHub Actions provides parallel runners, handling a large number of jobs at once remains challenging.
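
To get a feel for the size of the problem, the following Python sketch counts the jobs that a full cross product of the dimensions in the table above would produce. It assumes every combination is actually scheduled, which the real workflow may not do (for example through exclusion rules), so the result is an upper bound rather than the actual job count. The names `MATRIX`, `count_jobs`, and `expand` are made up for this illustration.

```python
from itertools import product
from math import prod

# Dimensions copied from the table above. Treating them as a full
# cross product is an assumption; the real workflow may prune combinations.
MATRIX = {
    "suite": ["cluster", "standalone"],
    "test-category": ["core", "storage", "network", "security", "misc", "instances"],
    "backend": ["dir", "btrfs", "lvm", "zfs", "ceph", "linstor", "random"],
    "go": ["oldstable", "stable", "tip"],
    "os": ["ubuntu-24.04", "ubuntu-24.04-arm"],
}


def count_jobs(matrix):
    """Number of jobs in the full cross product of all dimensions."""
    return prod(len(values) for values in matrix.values())


def expand(matrix):
    """List every combination as a dict, one dict per job."""
    keys = list(matrix)
    return [dict(zip(keys, combo)) for combo in product(*matrix.values())]


without_category = {k: v for k, v in MATRIX.items() if k != "test-category"}

print("jobs without test-category:", count_jobs(without_category))
print("jobs with test-category:   ", count_jobs(MATRIX))
```

Under that assumption, the full product is 504 jobs, compared with 84 without the test-category dimension.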
Therefore, I believe it is necessary to carefully review and optimize the test grouping so that the number of jobs stays manageable. Otherwise, individual jobs may wait in the queue for a long time, which extends the overall test workflow duration.
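
The trade-off between shorter jobs and longer queues can be sketched with a small simulation. This is only a rough model: the runner limit, the per-job setup time, and the test time below are hypothetical figures chosen for illustration, not measurements from the Incus workflow, and `makespan` is a made-up helper that assumes jobs are picked up in order by whichever runner becomes free first.

```python
import heapq


def makespan(durations, runners):
    """Total wall-clock time when jobs run in order on a fixed runner pool."""
    if runners < 1:
        raise ValueError("need at least one runner")
    # Min-heap of the times at which each runner becomes free.
    free_at = [0] * runners
    heapq.heapify(free_at)
    for minutes in durations:
        start = heapq.heappop(free_at)  # earliest available runner
        heapq.heappush(free_at, start + minutes)
    return max(free_at)


# All values below are assumptions for illustration only.
RUNNERS = 20               # hypothetical cap on concurrent jobs
SETUP_MIN = 5              # hypothetical fixed setup cost per job (minutes)
TEST_MIN = 30              # hypothetical test time of one unsplit job (minutes)
BASE_JOBS = 2 * 7 * 3 * 2  # suite x backend x go x os, from the table
CATEGORIES = 6             # test-category values from the table

coarse = makespan([SETUP_MIN + TEST_MIN] * BASE_JOBS, RUNNERS)
split = makespan(
    [SETUP_MIN + TEST_MIN / CATEGORIES] * (BASE_JOBS * CATEGORIES), RUNNERS
)
print(f"coarse matrix: {coarse} min, split matrix: {split} min")
```

With these made-up numbers the split matrix finishes later (260 minutes) than the coarse one (175 minutes), because 504 short jobs need 26 waves on 20 runners while 84 longer jobs need only 5. The real figures will differ, but the sketch shows why the grouping needs to be chosen together with the size of the matrix.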