-
Notifications
You must be signed in to change notification settings - Fork 5
build: Standardize asset containers to use official Ubuntu base images. #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
build: Standardize asset containers to use official Ubuntu base images. #3
Conversation
WalkthroughThe base image in four Dockerfiles under the Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~7 minutes Note 🔌 MCP (Model Context Protocol) integration is now available in Early Access!Pro users can now connect to remote MCP servers under the Integrations page to get reviews and chat conversations that understand additional development context. ✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
f332f29
to
7d1f01d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
assets/clickhouse/main.py
(2 hunks)assets/clp/main.py
(2 hunks)assets/presto_clp/main.py
(2 hunks)scripts/benchall.py
(5 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
scripts/benchall.py
41-41: Trailing comma missing
Add trailing comma
(COM812)
55-55: Missing return type annotation for public function run
Add return type annotation: None
(ANN201)
55-55: Missing type annotation for function argument bencher
(ANN001)
55-55: Missing type annotation for function argument kwargs
(ANN001)
55-55: Missing type annotation for function argument bench_target
(ANN001)
55-55: Boolean default positional argument in function definition
(FBT002)
55-55: Missing type annotation for function argument attach
(ANN001)
55-55: Boolean default positional argument in function definition
(FBT002)
55-55: Missing type annotation for function argument attach_on_error
(ANN001)
🔇 Additional comments (8)
assets/clp/main.py (2)
8-8
: Class renaming improves consistency.The renaming from
clp_s_bench
toclp_bench
enhances naming clarity and aligns with the standardization across benchmark modules.Also applies to: 64-64
1-69
: Inconsistency between PR objectives and actual changes.The PR objective mentions updating Docker images to use more popular Ubuntu images, but the actual changes involve class renaming for consistency. The file content doesn't show any Docker image updates.
Likely an incorrect or invalid review comment.
assets/presto_clp/main.py (1)
19-19
: Improved naming convention with logical ordering.The renaming from
clp_presto_bench
topresto_clp_bench
follows a more logical naming pattern by placing the primary service (presto) first, improving code readability and consistency.Also applies to: 105-105
assets/clickhouse/main.py (1)
16-16
: Simplified naming removes implementation details.The renaming from
clickhouse_native_json_bench
toclickhouse_bench
appropriately removes implementation-specific details from the class name, making it more concise and maintainable.Also applies to: 157-157
scripts/benchall.py (4)
3-4
: Import updates align with class renamings.The import statement updates correctly reflect the standardized class names from the benchmark modules.
Also applies to: 9-9
36-45
: Benchmark tuples updated consistently.The benchmark list properly references the renamed classes with appropriate arguments.
73-73
: Improved error handling logic.The updated condition
attach or attach_on_error
correctly implements the new debugging functionality, allowing container attachment on errors even whenattach
is False.
83-86
: Confirm deliberate dataset filteringWe ran
rg -i "todo|fixme|hack|temporary"
against scripts/benchall.py and found no markers indicating this restriction is temporary. Please verify that limiting the benchmark to the'mongod'
dataset on production runs is intentional and, if not, remove or document this filter.– File:
scripts/benchall.py
, Lines 83–86
'manual_column_names': False, | ||
'keys': [], | ||
'additional_order_by': [], | ||
'timestamp_key': True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Add trailing comma for consistency.
'timestamp_key': True
- }),
+ }),
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.12.2)
41-41: Trailing comma missing
Add trailing comma
(COM812)
🤖 Prompt for AI Agents
In scripts/benchall.py at line 41, add a trailing comma after 'timestamp_key':
True to maintain consistency in the dictionary formatting. This involves placing
a comma immediately after the True value.
scripts/benchall.py
Outdated
] | ||
|
||
def run(bencher, kwargs, bench_target, attach=False): | ||
def run(bencher, kwargs, bench_target, attach=False, attach_on_error=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Enhance function signature with type annotations.
The new attach_on_error
parameter improves error debugging capabilities. However, the function lacks proper type annotations and has boolean positional arguments.
-def run(bencher, kwargs, bench_target, attach=False, attach_on_error=False):
+def run(
+ bencher: type,
+ kwargs: dict,
+ bench_target: Path,
+ *,
+ attach: bool = False,
+ attach_on_error: bool = False,
+) -> None:
This addresses the static analysis hints by:
- Adding type annotations
- Making boolean parameters keyword-only to prevent positional confusion
- Adding return type annotation
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def run(bencher, kwargs, bench_target, attach=False, attach_on_error=False): | |
def run( | |
bencher: type, | |
kwargs: dict, | |
bench_target: Path, | |
*, | |
attach: bool = False, | |
attach_on_error: bool = False, | |
) -> None: |
🧰 Tools
🪛 Ruff (0.12.2)
55-55: Missing return type annotation for public function run
Add return type annotation: None
(ANN201)
55-55: Missing type annotation for function argument bencher
(ANN001)
55-55: Missing type annotation for function argument kwargs
(ANN001)
55-55: Missing type annotation for function argument bench_target
(ANN001)
55-55: Boolean default positional argument in function definition
(FBT002)
55-55: Missing type annotation for function argument attach
(ANN001)
55-55: Boolean default positional argument in function definition
(FBT002)
55-55: Missing type annotation for function argument attach_on_error
(ANN001)
🤖 Prompt for AI Agents
In scripts/benchall.py at line 55, the run function lacks type annotations and
uses boolean positional arguments which can cause confusion. Add appropriate
type annotations for all parameters and the return type. Also, make the boolean
parameters attach and attach_on_error keyword-only by placing them after a * in
the parameter list to prevent positional argument misuse.
Oops, wrong commit being cherry picked. |
7d1f01d
to
6e27486
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🔭 Outside diff range comments (8)
assets/elasticsearch/Dockerfile (3)
3-7
: Optimise the first APT layer
Add--no-install-recommends
, pin versions, and cleanapt
lists to keep the image slim and reproducible.-RUN apt-get update \ - && DEBIAN_FRONTEND=noninteractive apt-get install -y \ - curl \ - gpg; +RUN set -eux; \ + apt-get update; \ + DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \ + curl=7.81.0-1ubuntu1.13 \ + gnupg=2.2.*; \ + rm -rf /var/lib/apt/lists/*
14-22
: Second APT layer could be merged & cleaned
Merging with the previous layer (or repeating the clean-up here) avoids leaving/var/lib/apt/lists/*
behind and reduces layers.Also consider
--no-install-recommends
unless the extra dependencies are required.
1-24
: Run as non-root and add a HEALTHCHECK
Static analysis flags CKV_DOCKER_2 & 3: create a dedicated user and configure a simple health-probe for Elasticsearch to improve container hardening.If desired, I can provide a patched snippet.
assets/overhead_test/Dockerfile (2)
8-11
: Trim APT install & clean cache
Same optimisation as above: add--no-install-recommends
, pin versions, andrm -rf /var/lib/apt/lists/*
after install to cut ~40 MB.
1-13
: Non-root user & HEALTHCHECK missing
Consider addingHEALTHCHECK --interval=30s CMD true USER nobody
or a project-specific probe to satisfy CKV_DOCKER_2/3.
assets/template/Dockerfile (2)
8-11
: Same APT clean-up recommendation
See comments in other Dockerfiles; slimming the image here yields faster CI pulls.
1-13
: Harden container (non-root + healthcheck)
Replicate the pattern suggested for the overhead_test image to keep base templates aligned.assets/gzip/Dockerfile (1)
1-14
: Add non-root user & HEALTHCHECK
Align with security best practices as discussed for the other images.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
assets/elasticsearch/Dockerfile
(1 hunks)assets/gzip/Dockerfile
(1 hunks)assets/overhead_test/Dockerfile
(1 hunks)assets/template/Dockerfile
(1 hunks)
🧰 Additional context used
🪛 Checkov (3.2.334)
assets/elasticsearch/Dockerfile
[LOW] 1-24: Ensure that HEALTHCHECK instructions have been added to container images
(CKV_DOCKER_2)
[LOW] 1-24: Ensure that a user for the container has been created
(CKV_DOCKER_3)
🪛 Hadolint (2.12.0)
assets/gzip/Dockerfile
[warning] 8-8: Pin versions in apt get install. Instead of apt-get install <package>
use apt-get install <package>=<version>
(DL3008)
[info] 8-8: Delete the apt-get lists after installing something
(DL3009)
[info] 8-8: Avoid additional packages by specifying --no-install-recommends
(DL3015)
assets/gzip/Dockerfile
Outdated
&& DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-downgrades \ | ||
tmux \ | ||
vim \ | ||
gzip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
--allow-downgrades
warrants justification & pinning
Allowing downgrades can silently pull older vulnerable packages. Pin explicit versions or document the requirement; otherwise drop the flag.
Additionally, apply the usual --no-install-recommends
and cache clean-up:
-RUN apt-get update \
- && DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-downgrades \
- tmux \
- vim \
- gzip
+RUN set -eux; \
+ apt-get update; \
+ DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
+ gzip=1.10-4ubuntu4 \
+ vim-tiny=2:8.* \
+ tmux=3.2a-4; \
+ rm -rf /var/lib/apt/lists/*
🏁 Script executed:
#!/bin/bash
# Display the beginning of the Dockerfile to review base image and context
sed -n '1,50p' assets/gzip/Dockerfile
Length of output: 426
Review Dockerfile: remove or justify --allow-downgrades
& improve apt-install hygiene
The --allow-downgrades
flag should either be documented (why a downgrade is needed and what version is targeted) or removed to avoid silently installing older, potentially vulnerable packages. Additionally, to minimise image size and surface area:
- Drop
--allow-downgrades
unless there’s a compelling, documented reason. - Add
--no-install-recommends
to avoid installing unnecessary packages. - Cleanup Apt cache after installation with
rm -rf /var/lib/apt/lists/*
. - (Optional) Pin package versions for reproducibility, e.g.
gzip=1.10-4ubuntu4
,tmux=3.2a-4
,vim-tiny=2:8.*
.
Suggested diff:
-RUN apt-get update \
- && DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-downgrades \
- tmux \
- vim \
- gzip
+RUN set -eux; \
+ apt-get update; \
+ DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
+ gzip \
+ vim-tiny \
+ tmux; \
+ rm -rf /var/lib/apt/lists/*
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
&& DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-downgrades \ | |
tmux \ | |
vim \ | |
gzip | |
RUN set -eux; \ | |
apt-get update; \ | |
DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \ | |
gzip \ | |
vim-tiny \ | |
tmux; \ | |
rm -rf /var/lib/apt/lists/* |
🧰 Tools
🪛 Checkov (3.2.334)
[LOW] 1-12: Ensure that HEALTHCHECK instructions have been added to container images
(CKV_DOCKER_2)
[LOW] 1-12: Ensure that a user for the container has been created
(CKV_DOCKER_3)
🤖 Prompt for AI Agents
In assets/gzip/Dockerfile around lines 9 to 12, remove the --allow-downgrades
flag unless there is a documented reason for downgrading packages. Add the
--no-install-recommends flag to the apt-get install command to avoid installing
unnecessary packages. After installation, add a command to clean up the apt
cache by removing /var/lib/apt/lists/* to reduce image size. Optionally, pin
package versions explicitly for reproducibility by specifying versions for gzip,
tmux, and vim packages.
… accidentially not included in the original commit)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
assets/gzip/Dockerfile (1)
1-14
: Consider adding a non-root user and a HEALTHCHECKRunning as
root
and lacking aHEALTHCHECK
both trigger low-severity findings (CKV_DOCKER_2/3). While not blocking, adopting a non-root user and a simple health probe (CMD gzip --version
) improves security posture and CI signal quality.
♻️ Duplicate comments (1)
assets/gzip/Dockerfile (1)
8-12
: Harden theapt-get
layer: document the downgrade, pin all packages, disable recommends, and clean the cacheThis is essentially the same feedback given in the previous review – it remains unaddressed.
--allow-downgrades
is fine for repeatable benchmarks, but please add an inline comment explaining that requirement so future maintainers don’t delete it by “tidying up”.- Only
gzip
is version-pinned;tmux
andvim
will still float. Pin them too for deterministic results.- Add
--no-install-recommends
to keep the image slim.- Remove
/var/lib/apt/lists/*
in the same layer to cut ~25 MB.- Prefer
set -eux;
to fail fast on package or network errors.Example patch:
-# Install necessary packages -RUN apt-get update \ - && DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-downgrades \ - tmux \ - vim \ - gzip=1.10-4ubuntu4.1 +# Install necessary packages +RUN set -eux; \ + # --allow-downgrades is required so benchmarks can target specific + # historical package versions for apples-to-apples comparisons. + apt-get update; \ + DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends --allow-downgrades \ + gzip=1.10-4ubuntu4.1 \ + tmux=3.2a-4 \ + vim-tiny=2:8.*; \ + rm -rf /var/lib/apt/lists/*
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
assets/gzip/Dockerfile
(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: PacificViking
PR: y-scope/log-archival-bench#5
File: assets/gzip/Dockerfile:0-0
Timestamp: 2025-07-30T13:55:45.493Z
Learning: In benchmarking Docker files, the --allow-downgrades flag in apt-get install commands is intentionally used to ensure benchmarks can run on specified package versions even when Ubuntu has newer versions, maintaining consistency for reliable benchmark results.
assets/gzip/Dockerfile (1)
Learnt from: PacificViking
PR: #5
File: assets/gzip/Dockerfile:0-0
Timestamp: 2025-07-30T13:55:45.493Z
Learning: In benchmarking Docker files, the --allow-downgrades flag in apt-get install commands is intentionally used to ensure benchmarks can run on specified package versions even when Ubuntu has newer versions, maintaining consistency for reliable benchmark results.
🪛 Checkov (3.2.334)
assets/gzip/Dockerfile
[LOW] 1-12: Ensure that HEALTHCHECK instructions have been added to container images
(CKV_DOCKER_2)
[LOW] 1-12: Ensure that a user for the container has been created
(CKV_DOCKER_3)
🪛 Hadolint (2.12.0)
assets/gzip/Dockerfile
[warning] 8-8: Pin versions in apt get install. Instead of apt-get install <package>
use apt-get install <package>=<version>
(DL3008)
[info] 8-8: Delete the apt-get lists after installing something
(DL3009)
[info] 8-8: Avoid additional packages by specifying --no-install-recommends
(DL3015)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
assets/gzip/Dockerfile (1)
1-14
: Optional hardening: non-root user and HEALTHCHECKLow-severity checks (CKV_DOCKER_2, CKV_DOCKER_3). If this image runs processes, add a non-root user and a simple healthcheck; if it’s only for one-off benchmarks, feel free to skip.
♻️ Duplicate comments (1)
assets/gzip/Dockerfile (1)
7-12
: Tighten apt install hygiene; document why downgrades are allowed
- Add --no-install-recommends and clean apt lists to reduce image size (Hadolint DL3009, DL3015).
- Keep --allow-downgrades (per benchmarking rationale) but document it inline to pre-empt security scanner noise.
Apply:
-# Install necessary packages -RUN apt-get update \ - && DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-downgrades \ - tmux \ - vim \ - gzip=1.10-4ubuntu4.1 +# Install necessary packages +# NOTE: --allow-downgrades is intentional to run benchmarks against specific versions for reproducibility. +RUN set -eux; \ + apt-get update; \ + DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends --allow-downgrades \ + tmux \ + vim \ + gzip=1.10-4ubuntu4.1; \ + rm -rf /var/lib/apt/lists/*
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
assets/gzip/Dockerfile
(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: PacificViking
PR: y-scope/log-archival-bench#5
File: assets/gzip/Dockerfile:0-0
Timestamp: 2025-07-30T13:55:45.517Z
Learning: In benchmarking Docker files, the --allow-downgrades flag in apt-get install commands is intentionally used to ensure benchmarks can run on specified package versions even when Ubuntu has newer versions, maintaining consistency for reliable benchmark results.
📚 Learning: 2025-07-30T13:55:45.517Z
Learnt from: PacificViking
PR: y-scope/log-archival-bench#5
File: assets/gzip/Dockerfile:0-0
Timestamp: 2025-07-30T13:55:45.517Z
Learning: In benchmarking Docker files, the --allow-downgrades flag in apt-get install commands is intentionally used to ensure benchmarks can run on specified package versions even when Ubuntu has newer versions, maintaining consistency for reliable benchmark results.
Applied to files:
assets/gzip/Dockerfile
🪛 Checkov (3.2.334)
assets/gzip/Dockerfile
[LOW] 1-12: Ensure that HEALTHCHECK instructions have been added to container images
(CKV_DOCKER_2)
[LOW] 1-12: Ensure that a user for the container has been created
(CKV_DOCKER_3)
🪛 Hadolint (2.12.0)
assets/gzip/Dockerfile
[warning] 8-8: Pin versions in apt get install. Instead of apt-get install <package>
use apt-get install <package>=<version>
(DL3008)
[info] 8-8: Delete the apt-get lists after installing something
(DL3009)
[info] 8-8: Avoid additional packages by specifying --no-install-recommends
(DL3015)
🔇 Additional comments (1)
assets/gzip/Dockerfile (1)
5-5
: Base image pinned to immutable digest — good for reproducibilityPinning to a digest is the right call for stable, comparable benchmarks. Consider scheduling periodic bumps to pick up security fixes.
&& DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-downgrades \ | ||
tmux \ | ||
vim \ | ||
gzip | ||
gzip=1.10-4ubuntu4.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Reproducibility: consider pinning tmux and vim as well
You’ve pinned gzip; for full reproducibility, pin tmux and vim to Jammy archive versions too. If size matters more than full Vim, switch to vim-tiny.
🧰 Tools
🪛 Checkov (3.2.334)
[LOW] 1-12: Ensure that HEALTHCHECK instructions have been added to container images
(CKV_DOCKER_2)
[LOW] 1-12: Ensure that a user for the container has been created
(CKV_DOCKER_3)
🤖 Prompt for AI Agents
In assets/gzip/Dockerfile around lines 9 to 12, tmux and vim are installed
without version pins, which can cause reproducibility issues. Update the apt-get
install command to specify exact versions of tmux and vim from the Jammy
archive, similar to how gzip is pinned. Alternatively, if image size is a
concern, replace vim with vim-tiny and pin its version accordingly to ensure
consistent builds.
Description
As title. Other assets do not need to depend on CLP images.
Checklist
breaking change.
Validation performed
Summary by CodeRabbit