-
Notifications
You must be signed in to change notification settings - Fork 1
Adding ECR compatibility and fixing deadlock #7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughImplements registry-aware image name parsing and ECR-specific Basic auth handling in Docker download logic; adds unit tests for image parsing; ensures NoseyParker scanner always signals completion via deferred send; extends NP provenance types and populates RepoPath and BlobPath during layer/file processing. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant C as Caller
participant D as DockerDownloadLink
participant A as Auth Endpoint
participant R as Registry
C->>D: Download(imageRef)
D->>D: parseImageName(imageRef)
D->>D: Detect registry (ECR vs non-ECR)
alt ECR or public.ecr.aws
D->>D: Build Basic token (base64 username:password)
note right of D: Token stored as dd.token
else Non-ECR
D->>A: GET /auth (optional Basic using creds)
A-->>D: Bearer token
D->>D: Store dd.token
end
D->>R: GET /manifests/:ref
alt ECR
note over D,R: Authorization: Basic dd.token
else Non-ECR
note over D,R: Authorization: Bearer dd.token
end
R-->>D: Manifest
loop For each blob
D->>R: GET /blobs/:digest
R-->>D: Blob content
end
D-->>C: Content/links
sequenceDiagram
participant Caller
participant NP as NoseyParker
participant G as Goroutine
participant Done as n.done (chan bool)
Caller->>NP: startNpScanner()
NP->>G: launch scan goroutine
activate G
G->>G: defer send Done <- true
G->>G: run scan command (may error)
G-->>Done: (deferred) true
deactivate G
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Pre-merge checks (2 passed, 1 warning)❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Poem
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal). Please share your feedback with us on this Discord post. ✨ Finishing Touches
🧪 Generate unit tests
Comment |
Claude encountered an error —— View job I'll analyze this and get back to you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/links/docker/download.go (1)
376-378
: Potential issue with token refresh retry logicAfter refreshing the token in line 371, the code sets a Bearer token header regardless of registry type. This could fail for ECR registries which require Basic auth.
Apply this fix to handle ECR registries correctly during retry:
// Update the request with the new token if dd.token != "" { - req.Header.Set("Authorization", "Bearer "+dd.token) + registryBase := strings.Split(dd.dockerImage.Image, "/")[0] + if (strings.Contains(registryBase, "ecr") && strings.Contains(registryBase, "amazonaws.com")) || strings.Contains(registryBase, "public.ecr.aws") { + req.Header.Set("Authorization", "Basic "+dd.token) + } else { + req.Header.Set("Authorization", "Bearer "+dd.token) + } }
🧹 Nitpick comments (6)
pkg/types/docker.go (1)
194-197
: Provenance enrichment looks good, but consider initializing the commit metadata fullyWhile adding
RepoPath
andFirstCommit
metadata is valuable for tracking provenance, theNPCommitMetadata
struct is only partially initialized withBlobPath
. Consider whether other fields in the commit metadata should be initialized or if a comment should clarify why onlyBlobPath
is relevant here.pkg/links/docker/download.go (1)
151-162
: ECR authentication handling looks correctThe special case for ECR registries correctly uses Basic auth with base64-encoded credentials. The debug logging will be helpful for troubleshooting authentication issues.
Consider extracting the ECR detection logic into a helper function to reduce code duplication:
+func isECRRegistry(registryBase string) bool { + return (strings.Contains(registryBase, "ecr") && strings.Contains(registryBase, "amazonaws.com")) || + strings.Contains(registryBase, "public.ecr.aws") +} func (dd *DockerDownloadLink) refreshToken(dockerImage *types.DockerImage) error { imageName, tag := dd.parseImageName(dockerImage.Image) // Special handling for ECR - don't reset token, just recreate it registryBase := strings.Split(dockerImage.Image, "/")[0] - if (strings.Contains(registryBase, "ecr") && strings.Contains(registryBase, "amazonaws.com")) || strings.Contains(registryBase, "public.ecr.aws") { + if isECRRegistry(registryBase) { if dockerImage.AuthConfig.Username != "" && dockerImage.AuthConfig.Password != "" {This helper function could be reused in lines 274, 327, and 544.
pkg/links/docker/download_test.go (4)
7-9
: Enable test parallelism at the suite level.Add t.Parallel() to speed up and avoid future accidental shared-state issues.
func TestDockerDownloadLink_parseImageName(t *testing.T) { - dd := &DockerDownloadLink{} + t.Parallel() + dd := &DockerDownloadLink{}
125-137
: Make subtests parallel-safe (capture loop var) and run them in parallel.Prevents the classic loop-var capture pitfall and speeds up execution.
- for _, tt := range tests { - t.Run(tt.name, func(t *testing.T) { + for _, tt := range tests { + tt := tt + t.Run(tt.name, func(t *testing.T) { + t.Parallel() image, tag := dd.parseImageName(tt.input)
16-123
: Broaden coverage: add explicit docker.io, index.docker.io, port+2-part+tag, and common registries.These pass under typical parsing rules and help guard regressions.
// Edge cases { name: "complex tag with dash", input: "user/repo:v1.0-alpine", expectedImage: "user/repo", expectedTag: "v1.0-alpine", }, { name: "complex registry domain", input: "my-registry.internal.company.com/team/project:sha-abc123", expectedImage: "team/project", expectedTag: "sha-abc123", }, + // Additional common forms + { + name: "explicit docker.io official no tag", + input: "docker.io/nginx", + expectedImage: "library/nginx", + expectedTag: "latest", + }, + { + name: "explicit index.docker.io official with tag", + input: "index.docker.io/library/busybox:1.36", + expectedImage: "library/busybox", + expectedTag: "1.36", + }, + { + name: "registry with port 2-part with tag", + input: "localhost:5000/myimage:dev", + expectedImage: "myimage", + expectedTag: "dev", + }, + { + name: "ghcr no tag", + input: "ghcr.io/owner/repo", + expectedImage: "owner/repo", + expectedTag: "latest", + }, + { + name: "quay with tag", + input: "quay.io/coreos/etcd:latest", + expectedImage: "coreos/etcd", + expectedTag: "latest", + }, + { + name: "gcr multi-segment path", + input: "gcr.io/k8s-artifacts-prod/kube-apiserver:v1.31.0", + expectedImage: "k8s-artifacts-prod/kube-apiserver", + expectedTag: "v1.31.0", + },
140-185
: Parallelize and dedupe the focused edge-case tests.Parallelize subtests; consider folding these into the main table to avoid duplication.
func TestDockerDownloadLink_parseImageName_EdgeCases(t *testing.T) { - dd := &DockerDownloadLink{} + t.Parallel() + dd := &DockerDownloadLink{} t.Run("grafana/grafana should remain unchanged", func(t *testing.T) { + t.Parallel() image, tag := dd.parseImageName("grafana/grafana") @@ t.Run("registry.com/image should extract image name", func(t *testing.T) { + t.Parallel() image, tag := dd.parseImageName("registry.com/myapp") @@ t.Run("ECR registry should extract repo name", func(t *testing.T) { + t.Parallel() image, tag := dd.parseImageName("123456789012.dkr.ecr.us-east-2.amazonaws.com/nebula-test-secrets-repo-nyx0")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
pkg/links/docker/download.go
(5 hunks)pkg/links/docker/download_test.go
(1 hunks)pkg/links/noseyparker/noseyparker.go
(1 hunks)pkg/types/docker.go
(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
pkg/links/docker/download_test.go (1)
pkg/links/docker/download.go (1)
DockerDownloadLink
(24-29)
pkg/types/docker.go (1)
pkg/types/noseyparker.go (1)
NPCommitMetadata
(121-133)
🔇 Additional comments (7)
pkg/links/noseyparker/noseyparker.go (1)
32-35
: Good fix for the deadlock issue!The addition of the
defer
block ensures thatn.done
always receives a signal even if an error occurs during scanning. This prevents the goroutine from hanging indefinitely whenn.executeFn
returns an error.pkg/types/docker.go (1)
274-274
: LGTM!The addition of
RepoPath
for file processing maintains consistency with the layer processing approach.pkg/links/docker/download.go (4)
111-117
: Good improvement to tag extraction logicUsing
LastIndex
and checking for slashes in the suffix properly handles registry URLs with ports (e.g.,registry.example.com:5000/image:tag
).
124-143
: Registry normalization logic is comprehensiveThe implementation handles various registry formats well, including ECR, public.ecr.aws, and standard registries. The logic correctly preserves Docker Hub's 2-part format while extracting image names from registry URLs.
272-283
: ECR token generation in auth endpoint is properly implementedThe function correctly identifies ECR registries and returns Basic auth tokens directly without making HTTP requests to auth endpoints, which aligns with ECR's authentication model.
325-331
: Authorization header handling for ECR is consistentBoth
getManifest
andfetchBlob
correctly apply Basic auth for ECR registries and Bearer auth for others.Also applies to: 542-548
pkg/links/docker/download_test.go (1)
1-186
: Solid coverage and clear intent.The table-driven structure is good and catches many real-world formats. Nice work.
// ECR formats | ||
{ | ||
name: "ecr private registry", | ||
input: "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-repo", | ||
expectedImage: "my-repo", | ||
expectedTag: "latest", | ||
}, | ||
{ | ||
name: "ecr private registry with tag", | ||
input: "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-repo:v1.0", | ||
expectedImage: "my-repo", | ||
expectedTag: "v1.0", | ||
}, | ||
{ | ||
name: "ecr public registry", | ||
input: "public.ecr.aws/my-namespace/my-repo", | ||
expectedImage: "my-repo", | ||
expectedTag: "latest", | ||
}, | ||
{ | ||
name: "ecr public registry with tag", | ||
input: "public.ecr.aws/my-namespace/my-repo:latest", | ||
expectedImage: "my-repo", | ||
expectedTag: "latest", | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Confirm ECR path normalization assumptions.
Tests strip the namespace segment for public ECR (public.ecr.aws/namespace/repo → repo). Double-check this matches how the download code constructs the v2 manifest path; public ECR often expects namespace/repo in the path.
🌐 Web query:
For AWS ECR:
1) For public ECR, are repository references of the form public.ecr.aws/<namespace>/<repo>, and does the Docker Registry API path require v2/<namespace>/<repo>/manifests/<tag>?
2) For private ECR, can repository names include slashes (e.g., team/repo), and how should these be represented in the v2 path?
Please provide authoritative AWS docs references.
💡 Result:
Short answers
-
Yes. Public ECR repository image URIs use public.ecr.aws/<registry_alias_or_namespace>/. The Docker Registry HTTP API path for manifests is https://public.ecr.aws/v2/<registry_alias_or_namespace>//manifests/. [1][2]
-
Yes. Private ECR repository names may include forward slashes (namespaces), e.g. team/repo. When calling the Docker Registry HTTP API against a private registry use the full repository name in the v2 path, e.g. https://<aws_account_id>.dkr.ecr..amazonaws.com/v2/team/repo/manifests/. (Repository name rules and examples shown in the CreateRepository docs; private registry HTTP API example shown in the auth docs.) [3][4]
References
- Amazon ECR Public — public repo URI and pull examples. [1]
- Amazon ECR Public — HTTP API (curl) example showing v2///manifests/. [2]
- Amazon ECR (private) — creating repositories; repository-name pattern allows forward slashes (namespaces). [3]
- Amazon ECR (private) — HTTP API authentication example showing v2//tags/list (same v2/ path form applies for manifests). [4]
Preserve full ECR namespace in manifest paths and tests
Update pkg/links/docker/download_test.go:
- Public ECR cases must expect
<namespace>/<repo>
, not justrepo
, since the manifest endpoint is/v2/<namespace>/<repo>/manifests/<tag>
. - Add a private ECR test with a multi-segment repository name (e.g.
team/repo
) to ensure slashes are preserved in the v2 path.
🤖 Prompt for AI Agents
In pkg/links/docker/download_test.go around lines 84 to 108, the ECR test cases
incorrectly expect only the repository name for public ECR and lack a private
ECR case with a multi-segment repo; update the public ECR expectedImage values
to include the full namespace/repo (e.g. "my-namespace/my-repo") so the manifest
path matches /v2/<namespace>/<repo>/manifests/<tag>, and add a new private ECR
test case using a multi-segment repository name (e.g.
"123456789012.dkr.ecr.us-east-1.amazonaws.com/team/my-repo" with expectedImage
"team/my-repo" and appropriate expectedTag) to ensure slashes are preserved in
the v2 path.
Summary by CodeRabbit
New Features
Bug Fixes
Tests