Commit 1190f66

feat: langmgr and python (#1091)
Co-authored-by: Robbie Cronin <robert.owen.cronin@gmail.com>
1 parent 4f51759 commit 1190f66

60 files changed: +4527 -369 lines changed

.github/copilot-instructions.md

Lines changed: 54 additions & 9 deletions
@@ -9,12 +9,13 @@ Copacetic (Copa) is a CLI tool that patches container image vulnerabilities usin
 ## Folder Structure
 - `pkg/patch/`: CLI commands and core patching logic
 - `pkg/buildkit/`: BuildKit integration and platform discovery
-- `pkg/pkgmgr/`: Package manager adapters (dpkg, rpm, apk)
+- `pkg/pkgmgr/`: OS package manager adapters (dpkg, rpm, apk)
+- `pkg/langmgr/`: Language (application/library) package managers (e.g. Python/pip)
 - `pkg/report/`: Vulnerability report parsing and scanner plugin interface
 - `pkg/imageloader/`: Container engine integration (Docker, Podman)
 - `pkg/types/`: Type definitions and configurations
 - `website/docs/`: Project documentation and user guides
-- `integration/`: Integration tests for multi-arch and single-arch scenarios
+- `integration/`: Integration tests for multi-arch and single-arch scenarios (OS + language flows)
 - `main.go`: Root CLI setup with Cobra framework
 
 ## Libraries and Frameworks
@@ -36,19 +37,63 @@ Copacetic (Copa) is a CLI tool that patches container image vulnerabilities usin
 ## Key Architecture Concepts
 - **Patching modes**: Targeted (with vulnerability reports) or comprehensive (all available updates)
 - **Multi-platform support**: Handles amd64, arm64, and other architectures with QEMU emulation
-- **Package managers**: Debian (apt/dpkg), RHEL (yum/rpm), Alpine (apk), Azure Linux (tdnf)
+- **OS package managers**: Debian (apt/dpkg), RHEL family (yum/rpm/dnf/microdnf/tdnf), Alpine (apk), Azure Linux / CBL-Mariner (tdnf)
+- **Language (application/library) patching**: Experimental support for upgrading vulnerable application dependencies (currently Python via pip) with semantic version / patch-level controls.
 - **BuildKit integration**: Uses LLB operations for image building and manipulation
 - **Scanner plugins**: Supports custom vulnerability scanners via `customParseScanReport` interface
+- **Selective package types**: User can choose to patch only OS, only library, or both via `--pkg-types`.
 
-## Supported Operating Systems & Package Managers
-- **Debian/Ubuntu**: Uses `dpkg` and `apt`
-- **RHEL/CentOS/Rocky/Alma/Oracle/Amazon**: Uses `rpm`, `yum`, and `dnf`
-- **Alpine**: Uses `apk`
-- **CBL-Mariner/Azure Linux**: Uses `rpm` and `tdnf`
+## Supported Targets
+### Operating Systems (OS package layer)
+- **Debian/Ubuntu**: `dpkg` + `apt`
+- **RHEL/CentOS/Rocky/Alma/Oracle/Amazon**: `rpm`, `yum`, `dnf`, `microdnf`, `tdnf` (as available)
+- **Alpine**: `apk`
+- **CBL-Mariner/Azure Linux**: `rpm` + `tdnf`
+
+### Language / Application Dependencies (Experimental)
+- **Python**: pip-based site-packages upgrades with version validation and patch-level selection.
+  - Controlled via `--pkg-types library` (or `os,library`) and `--library-patch-level`.
+  - Patch levels: `patch` (default), `minor`, `major`; influences chosen fixed version when multiple are available.
+  - Special per-package overrides supported (see `getSpecialPackagePatchLevels()` inside Python manager for curated exceptions).
 
 ## Key Functions
+
+### CLI / Orchestration
 - `Patch()`: Main entry point for patching operations
 - `patchSingleArchImage()` / `patchMultiPlatformImage()`: Core patching logic
 - `DiscoverPlatformsFromReference()` / `DiscoverPlatformsFromReport()`: Platform discovery
-- `InstallUpdates()`: Package manager interface for applying updates
 - `InitializeBuildkitConfig()`: Initializes BuildKit configuration for patching operations
+
+### OS Package Layer
+- `pkgmgr.GetPackageManager()` and concrete managers' `InstallUpdates()` methods
+- `GetUniqueLatestUpdates()` (OS) for deduplicating + selecting latest OS package versions
+
+### Language / Library Layer
+- `langmgr.GetLanguageManagers()` returns appropriate language managers based on manifest content
+- `pythonManager.InstallUpdates()` coordinates: selecting versions, performing pip upgrades, validating results
+- `langmgr.GetUniqueLatestUpdates()` (libraries) similar to OS but tolerant of empty sets & patch-level filtering
+- `FindOptimalFixedVersionWithPatchLevel()` (Trivy parsing path) chooses best fixed version under patch-level constraint
+
+### Report Parsing & Filtering
+- `report.TryParseScanReport()` / Trivy parser: builds unified UpdateManifest (OS + Lang)
+- Filtering by `--pkg-types` occurs early and again before build execution for safety.
+
+### Validation & VEX
+- `vex.TryOutputVexDocument()` generates optional VEX documents (only if report + updates applied)
+
+## Language Patching
+Language/library patching is gated behind `COPA_EXPERIMENTAL=1`.
+
+- `--pkg-types`: comma list of `os`, `library` (default `os`). Determines which sections of the UpdateManifest are acted upon.
+- `--library-patch-level`: one of `patch|minor|major` (default `patch`). Sets semantic version boundary for chosen upgrade version.
+- Behavior when report only has library vulns:
+  - If `--pkg-types` includes `library`, proceed (even if no OS updates). Empty OS set no longer triggers an error.
+  - If `--pkg-types os` only, library updates are ignored (manifest language section cleared early).
+
+## Library Patching Flow (Python)
+1. Trivy report parsed -> vulnerable Python packages aggregated with all candidate fixed versions.
+2. Patch level rule applied to select optimal fixed version per package (with per-package override map for exceptions).
+3. Upgrade executed in ephemeral tooling container (derives base Python image tag when possible; fallback tag `3-slim`).
+4. Post-upgrade validation via `pip freeze` subset matching: ensures requested versions actually installed.
+5. Failed installs or mismatches collected; errors either propagate or are logged based on `--ignore-errors`.
+6. Validated updates merged into final manifest fed to VEX generation.
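
The patch-level rule in step 2 of the flow above can be sketched in a few lines of Go. This is a minimal illustration, not Copa's `FindOptimalFixedVersionWithPatchLevel()`: the name `pickFixedVersion` is hypothetical, versions are assumed to be plain `MAJOR.MINOR.PATCH` triples (real Python versions follow PEP 440, which is presumably why this commit adds `go-pep440-version` to `go.mod`), and picking the highest allowed candidate is an assumption, since the instructions file only says the patch level "influences" the choice.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// version is a simplified MAJOR.MINOR.PATCH triple.
// Assumption: real-world Python versions need a full PEP 440 parser.
type version [3]int

// parse converts "1.2.3" into a version triple.
func parse(s string) (version, error) {
	var v version
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return v, fmt.Errorf("expected MAJOR.MINOR.PATCH, got %q", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return v, fmt.Errorf("invalid component %q in %q: %w", p, s, err)
		}
		v[i] = n
	}
	return v, nil
}

// less reports whether a sorts strictly before b.
func less(a, b version) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// withinLevel reports whether moving from cur to cand stays inside the
// boundary named by level ("patch", "minor" or "major").
func withinLevel(cur, cand version, level string) bool {
	switch level {
	case "patch":
		return cand[0] == cur[0] && cand[1] == cur[1]
	case "minor":
		return cand[0] == cur[0]
	case "major":
		return true
	default:
		return false // unknown level: allow nothing
	}
}

// pickFixedVersion (hypothetical helper) returns the highest candidate that
// is newer than installed and within level, or "" if none qualifies.
func pickFixedVersion(installed string, candidates []string, level string) string {
	cur, err := parse(installed)
	if err != nil {
		return ""
	}
	best, bestStr := cur, ""
	for _, c := range candidates {
		v, err := parse(c)
		if err != nil {
			continue // skip candidates this simplified parser cannot read
		}
		if withinLevel(cur, v, level) && less(best, v) {
			best, bestStr = v, c
		}
	}
	return bestStr
}

func main() {
	fixes := []string{"2.31.1", "2.32.0", "3.0.0"}
	for _, level := range []string{"patch", "minor", "major"} {
		fmt.Printf("%s -> %s\n", level, pickFixedVersion("2.31.0", fixes, level))
	}
}
```

With an installed version of `2.31.0`, the three levels select `2.31.1`, `2.32.0` and `3.0.0` respectively. The per-package override map mentioned in step 2 would sit in front of this function and swap in a different `level` for specific package names.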

.github/workflows/build.yml

Lines changed: 17 additions & 20 deletions
@@ -33,7 +33,7 @@ permissions:
 jobs:
   unit-test:
     name: Unit Test
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-latest-16-cores
     timeout-minutes: 5
     steps:
       - name: Harden Runner
@@ -61,14 +61,11 @@ jobs:
           CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
   build:
     name: Build
-    runs-on: ${{ matrix.os }}
+    runs-on: ubuntu-latest-16-cores
     timeout-minutes: 5
     permissions:
       packages: write
       contents: read
-    strategy:
-      matrix:
-        os: [ubuntu-latest]
     steps:
       - name: Harden Runner
         uses: step-security/harden-runner@f4a75cfd619ee5ce8d5b864b0d183aff3c69b55a # v2.3.1
@@ -104,7 +101,7 @@ jobs:
   test-patch-trivy:
     needs: build
     name: Test patch with trivy ${{ matrix.buildkit_mode }}
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-latest-16-cores
     timeout-minutes: 30
     strategy:
       fail-fast: false
@@ -150,7 +147,7 @@ jobs:
   test-patch-no-report:
     needs: build
     name: Test patch no report ${{ matrix.buildkit_mode }}
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-latest-16-cores
     timeout-minutes: 30
     strategy:
       fail-fast: false
@@ -202,11 +199,11 @@ jobs:
           if [[ -n "${COPA_BUILDKIT_ADDR}" && "${COPA_BUILDKIT_ADDR}" == docker://* ]]; then
             export DOCKER_HOST="${COPA_BUILDKIT_ADDR#docker://}"
           fi
-
+
           docker create --name test ghcr.io/project-copacetic/copacetic/test/openssl:test-rpm-patched /bin/sh
           tmp="$(mktemp)"
           docker cp test:/etc/pki/tls/openssl.cnf "${tmp}"
-
+
           if ! grep -q foo "${tmp}"; then
             echo "Error: openssl.cnf content replaced" >&2
             rm "${tmp}"
@@ -256,11 +253,11 @@ jobs:
             export DOCKER_HOST="${COPA_BUILDKIT_ADDR#docker://}"
             trap '_cleanup' EXIT
           fi
-
+
           docker create --name test ghcr.io/project-copacetic/copacetic/test/openssl:test-debian-patched /bin/sh
           tmp="$(mktemp)"
           docker cp test:/etc/ssl/openssl.cnf "${tmp}"
-
+
           if ! grep -q foo "${tmp}"; then
             echo "Error: openssl.cnf content replaced" >&2
             rm "${tmp}"
@@ -274,7 +271,7 @@ jobs:
   test-plugin:
     needs: build
     name: Test plugin
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-latest-16-cores
     timeout-minutes: 5
     steps:
       - name: Harden Runner
@@ -321,7 +318,7 @@ jobs:
   test-push:
     needs: build
     name: Test push
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-latest-16-cores
     timeout-minutes: 5
     steps:
       - name: Check out code
@@ -364,12 +361,12 @@ jobs:
         with:
           paths: "test-results.xml"
 
-
+
   test-patch-multiplatform:
     needs: build
     name: Test patch with multiplatform ${{ matrix.buildkit_mode }}
-    runs-on: ubuntu-latest
-    timeout-minutes: 30
+    runs-on: ubuntu-latest-16-cores
+    timeout-minutes: 40
     strategy:
       fail-fast: false
       matrix:
@@ -434,15 +431,15 @@ jobs:
         shell: bash
         run: |
           set -eu -o pipefail
-
+
           if [[ "${{ matrix.buildkit_mode }}" == "docker" ]]; then
             # For docker mode, use the default docker daemon
             export COPA_BUILDKIT_ADDR="docker://"
           else
             # For other modes, source the corresponding script in the same shell session
             . .github/workflows/scripts/buildkitenvs/${{ matrix.buildkit_mode }}
           fi
-
+
           gotestsum --format testname --junitfile test-results.xml -- ./integration/multiarch --addr="${COPA_BUILDKIT_ADDR}" --copa="$(pwd)/copa" -timeout 0
       - name: Test Summary
         uses: test-summary/action@v2
@@ -452,8 +449,8 @@ jobs:
   test-patch-multiplatform-plugin:
     needs: build
     name: Test multiplatform with plugin
-    runs-on: ubuntu-latest
-    timeout-minutes: 30
+    runs-on: ubuntu-latest-16-cores
+    timeout-minutes: 50
     steps:
       - name: Change docker daemon config
         run: |

go.mod

Lines changed: 2 additions & 0 deletions
@@ -3,6 +3,7 @@ module github.com/project-copacetic/copacetic
 go 1.24.4
 
 require (
+	github.com/aquasecurity/go-pep440-version v0.0.1
 	github.com/aquasecurity/trivy v0.66.0
 	github.com/containerd/errdefs v1.0.0
 	github.com/containerd/platforms v1.0.0-rc.1
@@ -38,6 +39,7 @@ require (
 	github.com/Masterminds/semver v1.5.0 // indirect
 	github.com/Microsoft/go-winio v0.6.2 // indirect
 	github.com/apparentlymart/go-textseg/v15 v15.0.0 // indirect
+	github.com/aquasecurity/go-version v0.0.1 // indirect
 	github.com/aquasecurity/trivy-db v0.0.0-20250731052236-c7c831e2254d // indirect
 	github.com/aws/aws-sdk-go-v2 v1.38.3 // indirect
 	github.com/aws/aws-sdk-go-v2/config v1.31.6 // indirect

go.sum

Lines changed: 4 additions & 0 deletions
@@ -12,6 +12,10 @@ github.com/anchore/go-struct-converter v0.0.0-20221118182256-c68fdcfa2092 h1:aM1
 github.com/anchore/go-struct-converter v0.0.0-20221118182256-c68fdcfa2092/go.mod h1:rYqSE9HbjzpHTI74vwPvae4ZVYZd1lue2ta6xHPdblA=
 github.com/apparentlymart/go-textseg/v15 v15.0.0 h1:uYvfpb3DyLSCGWnctWKGj857c6ew1u1fNQOlOtuGxQY=
 github.com/apparentlymart/go-textseg/v15 v15.0.0/go.mod h1:K8XmNZdhEBkdlyDdvbmmsvpAG721bKi0joRfFdHIWJ4=
+github.com/aquasecurity/go-pep440-version v0.0.1 h1:8VKKQtH2aV61+0hovZS3T//rUF+6GDn18paFTVS0h0M=
+github.com/aquasecurity/go-pep440-version v0.0.1/go.mod h1:3naPe+Bp6wi3n4l5iBFCZgS0JG8vY6FT0H4NGhFJ+i4=
+github.com/aquasecurity/go-version v0.0.1 h1:4cNl516agK0TCn5F7mmYN+xVs1E3S45LkgZk3cbaW2E=
+github.com/aquasecurity/go-version v0.0.1/go.mod h1:s1UU6/v2hctXcOa3OLwfj5d9yoXHa3ahf+ipSwEvGT0=
 github.com/aquasecurity/trivy v0.66.0 h1:eU0M0PXQ6F0UxKYxMjI7b0ue07il3l1eqcBGTliE+tY=
 github.com/aquasecurity/trivy v0.66.0/go.mod h1:DFk2QdnR/ZXSfgy2xR7sPPWM/mJvmnVUzqtowsQJSVo=
 github.com/aquasecurity/trivy-db v0.0.0-20250731052236-c7c831e2254d h1:Lc+p2CLARivVF48o7uRoFPaahNCvNFyBfeby0JqAMXo=

integration/multiarch/fixtures/test-images.json

Lines changed: 54 additions & 0 deletions
@@ -39,5 +39,59 @@
       "linux/arm/v7",
       "linux/arm64"
     ]
+  },
+  {
+    "originalImage": "quay.io/jupyter/base-notebook",
+    "localImage": "localhost:5000/jupyter-base-notebook",
+    "tag": "2025-01-27",
+    "distro": "ubuntu",
+    "description": "Multi-arch Jupyter base-notebook 2025-01-27, with push",
+    "ignoreErrors": false,
+    "push": true,
+    "skipAnnotations": true,
+    "platforms": [
+      "linux/amd64",
+      "linux/arm64"
+    ]
+  },
+  {
+    "originalImage": "quay.io/jupyter/base-notebook",
+    "localImage": "localhost:5000/jupyter-base-notebook",
+    "tag": "2025-01-27",
+    "distro": "ubuntu",
+    "description": "Multi-arch Jupyter base-notebook 2025-01-27, no push",
+    "ignoreErrors": false,
+    "push": false,
+    "platforms": [
+      "linux/amd64",
+      "linux/arm64"
+    ]
+  },
+  {
+    "originalImage": "mcr.microsoft.com/azure-cli",
+    "localImage": "localhost:5000/azure-cli",
+    "tag": "2.50.0",
+    "distro": "debian",
+    "description": "Multi-arch Azure CLI 2.50.0 for Python package testing, with push",
+    "ignoreErrors": false,
+    "push": true,
+    "skipAnnotations": true,
+    "platforms": [
+      "linux/amd64",
+      "linux/arm64"
+    ]
+  },
+  {
+    "originalImage": "mcr.microsoft.com/azure-cli",
+    "localImage": "localhost:5000/azure-cli",
+    "tag": "2.50.0",
+    "distro": "debian",
+    "description": "Multi-arch Azure CLI 2.50.0 for Python package testing, no push",
+    "ignoreErrors": false,
+    "push": false,
+    "platforms": [
+      "linux/amd64",
+      "linux/arm64"
+    ]
   }
 ]
]

pkg/buildkit/buildkit.go

Lines changed: 14 additions & 2 deletions
@@ -19,6 +19,7 @@ import (
 
 	"github.com/project-copacetic/copacetic/pkg/report"
 	"github.com/project-copacetic/copacetic/pkg/types"
+	"github.com/project-copacetic/copacetic/pkg/utils"
 	log "github.com/sirupsen/logrus"
 
 	"github.com/google/go-containerregistry/pkg/name"
@@ -135,7 +136,7 @@ func DiscoverPlatformsFromReport(reportDir, scanner string) ([]types.PatchPlatfo
 		if file.IsDir() {
 			continue
 		}
-		report, err := report.TryParseScanReport(filePath, scanner)
+		report, err := report.TryParseScanReport(filePath, scanner, utils.PkgTypeOS, utils.PatchTypePatch)
 		if err != nil {
 			return nil, fmt.Errorf("error parsing report %w", err)
 		}
@@ -168,7 +169,18 @@ func DiscoverPlatformsFromReport(reportDir, scanner string) ([]types.PatchPlatfo
 
 func isSupportedOsType(osType string) bool {
 	switch osType {
-	case "alpine", "debian", "ubuntu", "cbl-mariner", "azurelinux", "centos", "oracle", "redhat", "rocky", "amazon", "alma":
+	case utils.OSTypeAlpine,
+		utils.OSTypeDebian,
+		utils.OSTypeUbuntu,
+		utils.OSTypeCBLMariner,
+		utils.OSTypeAzureLinux,
+		utils.OSTypeCentOS,
+		utils.OSTypeOracle,
+		utils.OSTypeRedHat,
+		utils.OSTypeRocky,
+		utils.OSTypeAmazon,
+		utils.OSTypeAlma,
+		utils.OSTypeAlmaLinux:
 		return true
 	default:
 		return false
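
The last hunk swaps string literals for named constants from `pkg/utils` and adds one extra case, `OSTypeAlmaLinux`. A self-contained sketch of the resulting check is shown below. The constant values are assumptions: the first eleven are copied from the removed literal list, and `"almalinux"` for `OSTypeAlmaLinux` is a guess, because the definitions in `pkg/utils` are not part of this excerpt.

```go
package main

import "fmt"

// Assumed values: the first eleven match the literals removed by the diff;
// OSTypeAlmaLinux is a guess, as its definition is not shown in this excerpt.
const (
	OSTypeAlpine     = "alpine"
	OSTypeDebian     = "debian"
	OSTypeUbuntu     = "ubuntu"
	OSTypeCBLMariner = "cbl-mariner"
	OSTypeAzureLinux = "azurelinux"
	OSTypeCentOS     = "centos"
	OSTypeOracle     = "oracle"
	OSTypeRedHat     = "redhat"
	OSTypeRocky      = "rocky"
	OSTypeAmazon     = "amazon"
	OSTypeAlma       = "alma"
	OSTypeAlmaLinux  = "almalinux" // assumption
)

// isSupportedOsType reports whether osType names a distribution
// that has an OS package manager adapter.
func isSupportedOsType(osType string) bool {
	switch osType {
	case OSTypeAlpine,
		OSTypeDebian,
		OSTypeUbuntu,
		OSTypeCBLMariner,
		OSTypeAzureLinux,
		OSTypeCentOS,
		OSTypeOracle,
		OSTypeRedHat,
		OSTypeRocky,
		OSTypeAmazon,
		OSTypeAlma,
		OSTypeAlmaLinux:
		return true
	default:
		return false
	}
}

func main() {
	for _, os := range []string{"alma", "almalinux", "windows"} {
		fmt.Printf("%s supported: %t\n", os, isSupportedOsType(os))
	}
}
```

Centralizing the names as constants means a typo in a distribution name becomes a compile error rather than a silently unmatched case.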

pkg/buildkit/buildkit_test.go

Lines changed: 15 additions & 2 deletions
@@ -19,6 +19,7 @@ import (
 
 	"github.com/project-copacetic/copacetic/mocks"
 	"github.com/project-copacetic/copacetic/pkg/types"
+	"github.com/project-copacetic/copacetic/pkg/utils"
 
 	"github.com/stretchr/testify/mock"
 
@@ -96,7 +97,7 @@ func newMockBuildkitAPI(t *testing.T, caps ...apicaps.CapID) string {
 	var sockPath string
 	if runtime.GOOS == goosDarwin {
 		// On macOS, use /tmp directly for shorter paths to avoid socket path length limits
-		sockPath = filepath.Join("/tmp", fmt.Sprintf("bk-%d.sock", time.Now().UnixNano()))
+		sockPath = filepath.Join(utils.DefaultTempWorkingFolder, fmt.Sprintf("bk-%d.sock", time.Now().UnixNano()))
 	} else {
 		// On other platforms, use temp dir but with shorter name
 		tmp := t.TempDir()
@@ -445,7 +446,19 @@ func TestMapGoArch(t *testing.T) {
 }
 
 func TestIsSupportedOsType(t *testing.T) {
-	supported := []string{"alpine", "debian", "ubuntu", "cbl-mariner", "azurelinux", "centos", "oracle", "redhat", "rocky", "amazon", "alma"}
+	supported := []string{
+		utils.OSTypeAlpine,
+		utils.OSTypeDebian,
+		utils.OSTypeUbuntu,
+		utils.OSTypeCBLMariner,
+		utils.OSTypeAzureLinux,
+		utils.OSTypeCentOS,
+		utils.OSTypeOracle,
+		utils.OSTypeRedHat,
+		utils.OSTypeRocky,
+		utils.OSTypeAmazon,
+		utils.OSTypeAlma,
+	}
 	for _, os := range supported {
 		if !isSupportedOsType(os) {
 			t.Errorf("expected %s to be supported", os)
