1 change: 1 addition & 0 deletions yscope-k8s/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
demo-assets/clp-config.yml
🧹 Nitpick (assertive)

Add companion ignore rules for derived or editor artefacts

You already ignore the generated clp-config.yml. It may be worth also excluding its backup variants and common editor swap files to keep the repo clean:

 demo-assets/clp-config.yml
+demo-assets/clp-config.yml.bak
+*.swp
+*.tmp
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-demo-assets/clp-config.yml
+demo-assets/clp-config.yml
+demo-assets/clp-config.yml.bak
+*.swp
+*.tmp
🤖 Prompt for AI Agents
In yscope-k8s/.gitignore at line 1, add ignore rules for backup and editor swap
files related to clp-config.yml, such as clp-config.yml~, .clp-config.yml.swp,
and other common temporary files, to keep the repository clean from derived or
editor artefacts.

27 changes: 27 additions & 0 deletions yscope-k8s/.helmignore
@@ -0,0 +1,27 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~

Comment on lines +4 to +19
🧹 Nitpick (assertive)

Minor style nit – drop the leading dots for Helm-relative paths

Unlike .gitignore, Helm’s ignore patterns are evaluated from the chart root, so foo/ is sufficient—there’s no need for ./foo or .foo. Tidying this up is optional but keeps the list consistent.

🤖 Prompt for AI Agents
In yscope-k8s/.helmignore between lines 4 and 19, remove the leading dots from
the directory and file patterns to match Helm's ignore pattern style, which
evaluates paths relative to the chart root. For example, change '.git/' to
'git/' and '.gitignore' to 'gitignore'. This adjustment tidies up the file and
aligns it with Helm's expected pattern format.

# Various IDEs
.project
.idea/
*.tmproj
.vscode/

# Demo assets
demo-assets/
24 changes: 24 additions & 0 deletions yscope-k8s/Chart.yaml
@@ -0,0 +1,24 @@
apiVersion: v2
name: presto-velox
description: A Helm chart for Kubernetes

# A chart can be either an 'application' or a 'library' chart.
#
# Application charts are a collection of templates that can be packaged into versioned archives
# to be deployed.
#
# Library charts provide useful utilities or functions for the chart developer. They're included as
# a dependency of application charts to inject those utilities and functions into the rendering
# pipeline. Library charts do not define any templates and therefore cannot be deployed.
type: application

# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.1.0

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: "1.16.0"
104 changes: 104 additions & 0 deletions yscope-k8s/README.md
@@ -0,0 +1,104 @@
# Setup local K8s cluster for presto + clp
🛠️ Refactor suggestion

Keep a single H1 and demote the others
The doc currently defines five #-level headings, which violates MD025 and breaks the logical outline. Only the title should stay at #; demote the rest to ##.

-# Launch clp-package
+## Launch clp-package
...
-# Create k8s Cluster
+## Create k8s Cluster
...
-# Working with helm chart
+## Working with helm chart
...
-# Delete k8s Cluster
+## Delete k8s Cluster

Also applies to: 26-26, 53-53, 59-59, 94-94

🤖 Prompt for AI Agents
In yscope-k8s/README.md at lines 1, 26, 53, 59, and 94, the markdown headings
use multiple H1 (#) levels which violates MD025 and disrupts the document
structure. Keep only the main title at H1 (#) and change all other headings to
H2 (##) or lower as appropriate to maintain a proper logical outline.
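
A minimal sketch of automating the demotion, assuming the README uses ATX headings (`# Title`) and backtick fences. `demote_extra_h1` is a hypothetical helper, not part of the PR; it keeps the first H1, turns every later H1 into an H2, and leaves `#` comment lines inside fenced code blocks untouched.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads Markdown on stdin and writes it to stdout with
# every H1 after the first demoted to H2.
demote_extra_h1() {
    awk '
        # 96 is the ASCII code for a backtick, so fence is a run of three.
        BEGIN { fence = sprintf("%c%c%c", 96, 96, 96) }
        # Toggle the fence state so shell comments in code blocks are skipped.
        index($0, fence) == 1 { in_fence = !in_fence; print; next }
        !in_fence && /^# / {
            if (seen_h1) { print "#" $0; next }
            seen_h1 = 1
        }
        { print }
    '
}
```

It could be run as `demote_extra_h1 <README.md >README.md.new`, followed by a manual review of the result before replacing the file.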


## Install docker

Follow the guide here: [docker]

## Install kubectl

`kubectl` is the command-line tool for interacting with Kubernetes clusters. You will use it to
manage and inspect your k3d cluster.

Follow the guide here: [kubectl]

## Install k3d

k3d is a lightweight wrapper to run k3s (Rancher Lab's minimal Kubernetes distribution) in docker.

Follow the guide here: [k3d]

## Install Helm

Helm is the package manager for Kubernetes.

Follow the guide here: [helm]

# Launch clp-package
1. Find the clp-package for test on our official website [clp-json-v0.4.0]. We also put the dataset for demo here: `mongod-256MB-presto-clp.log.tar.gz`.

2. Untar it.

3. Replace the content of `/path/to/clp-json-package/etc/clp-config.yml` with the output of `demo-assets/init.sh <ip_addr>` where the `<ip_addr>` is the IP address of the host that you are running the clp-package.

4. Launch:
```bash
# You probably want to run in a 3.11 python environment
sbin/start-clp.sh
```
Comment on lines +31 to +37
🧹 Nitpick (assertive)

Insert blank lines around list items and fenced blocks
The ordered list (items 1 to 6) and the fenced bash blocks that follow its items are missing the required blank lines before and after, triggering MD031/MD032 and rendering inconsistently.

 3. Replace the content …
-4. Launch:
-```bash
+
+4. Launch:
+
+```bash

Apply the same pattern to the other ordered-list sections.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

33-33: Lists should be surrounded by blank lines

(MD032, blanks-around-lists)


34-34: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

🤖 Prompt for AI Agents
In yscope-k8s/README.md around lines 31 to 37, the ordered list items and the
fenced bash code blocks lack the required blank lines before and after, causing
markdown linting errors MD031 and MD032 and inconsistent rendering. Add a blank
line before and after each list item and fenced code block, including the one
before the "4. Launch:" line and the fenced bash block following it. Apply this
spacing pattern consistently to all other ordered list sections in the file.


5. Compress:
```bash
# You can also use your own dataset
sbin/compress.sh --timestamp-key 't.dollar_sign_date' datasets/mongod-256MB-processed.log
```

6. Use the following command to update the CLP metadata database so that the worker can find the archives in right place:
```bash
# Install mysql-client if necessary
sudo apt update && sudo apt install -y mysql-client
Comment on lines +33 to +48
🧹 Nitpick (assertive)

Surround list items and fenced blocks with blank lines

Several ordered-list steps (e.g., 3–6) butt directly against code fences, triggering MD031/MD032 and rendering oddly. Insert a blank line before and after each fenced block and between list items.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~45-~45: Possible missing article found.
Context: ...hat the worker can find the archives in right place: ```bash # Install mysql-client i...

(AI_HYDRA_LEO_MISSING_THE)

🪛 markdownlint-cli2 (0.17.2)

33-33: Lists should be surrounded by blank lines

(MD032, blanks-around-lists)


34-34: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


39-39: Ordered list item prefix
Expected: 1; Actual: 5; Style: 1/1/1

(MD029, ol-prefix)


39-39: Lists should be surrounded by blank lines

(MD032, blanks-around-lists)


40-40: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


45-45: Ordered list item prefix
Expected: 1; Actual: 6; Style: 1/1/1

(MD029, ol-prefix)


45-45: Lists should be surrounded by blank lines

(MD032, blanks-around-lists)


46-46: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

🤖 Prompt for AI Agents
In yscope-k8s/README.md between lines 33 and 48, the ordered list items and
fenced code blocks are not separated by blank lines, causing markdown rendering
issues. Add a blank line before and after each fenced code block and between
each list item to ensure proper spacing and correct markdown formatting.

# Find the user and password in /path/to/clp-json-package/etc/credential.yml
mysql -h ${REPLACE_IP} -P 6001 -u ${REPLACE_USER} -p'${REPLACE_PASSWORD}' clp-db -e "UPDATE clp_datasets SET archive_storage_directory = '/var/data/archives/default';"
```
Comment on lines +45 to +51
🧹 Nitpick (assertive)

Minor grammar fix for clarity

-6. Use the following command to update the CLP metadata database so that the worker can find the archives in right place:
+6. Use the following command to update the CLP metadata database so that the worker can find the archives in the right place:

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~45-~45: Possible missing article found.
Context: ...hat the worker can find the archives in right place: ```bash # Install mysql-client i...

(AI_HYDRA_LEO_MISSING_THE)

🪛 markdownlint-cli2 (0.17.2)

45-45: Ordered list item prefix
Expected: 1; Actual: 6; Style: 1/1/1

(MD029, ol-prefix)


45-45: Lists should be surrounded by blank lines

(MD032, blanks-around-lists)


46-46: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

🤖 Prompt for AI Agents
In yscope-k8s/README.md around lines 45 to 51, update the phrase "find the
archives in right place" to "find the archives in the right place" by adding the
missing article "the" for grammatical correctness and clarity.


# Create k8s Cluster
Create a local k8s cluster with port forwarding
```bash
k3d cluster create yscope --servers 1 --agents 1 -v $(readlink -f /path/to/clp-json-package/var/data/archives):/var/data/archives
```

# Working with helm chart
## Install
In `yscope-k8s/templates/presto/presto-coordinator-config.yaml` replace the `${REPLACE_IP}` in `clp.metadata-db-url=jdbc:mysql://${REPLACE_IP}:6001` with the IP address of the host you are running the clp-package (basically match the IP address that you configured in the `etc/clp-config.yml` of the clp-package).

```bash
cd yscope-k8s

helm template .

helm install demo .
```

## Use cli:
After all containers are in "Running" states (check by `kubectl get pods`):
```bash
kubectl port-forward service/presto-coordinator 8080:8080
```

Then you can further forward the 8080 port to your local laptop, to access the Presto's WebUI by e.g., http://localhost:8080

To use presto-cli:
```bash
./presto-cli-0.293-executable.jar --catalog clp --schema default --server localhost:8080
```

Example query:
```
SELECT * FROM default LIMIT 1;
```
Comment on lines +88 to +90
🧹 Nitpick (assertive)

Add language hint to SQL block

The example query lacks a language tag, tripping MD040 and losing syntax highlighting.

-```
+```sql
 SELECT * FROM default LIMIT 1;

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

88-88: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


88-88: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In yscope-k8s/README.md around lines 88 to 90, the SQL code block is missing a
language hint, causing markdown linting errors and no syntax highlighting. Add
the language tag "sql" immediately after the opening triple backticks to enable
proper syntax highlighting and fix the MD040 lint error.


## Uninstall
```bash
helm uninstall demo
```

# Delete k8s Cluster
```bash
k3d cluster delete yscope
```


[clp-json-v0.4.0]: https://github.com/y-scope/clp/releases/tag/v0.4.0
[docker]: https://docs.docker.com/engine/install
[k3d]: https://k3d.io/stable/#installation
[kubectl]: https://kubernetes.io/docs/tasks/tools/#kubectl
[helm]: https://helm.sh/docs/intro/install/
38 changes: 38 additions & 0 deletions yscope-k8s/demo-assets/clp-config.yml.bak
@@ -0,0 +1,38 @@
package:
storage_engine: "clp-s"
database:
type: "mariadb"
host: "${REPLACE_IP}"
port: 6001
name: "clp-db"
query_scheduler:
host: "${REPLACE_IP}"
port: 6002
jobs_poll_delay: 0.1
num_archives_to_search_per_sub_job: 16
logging_level: "INFO"
queue:
host: "${REPLACE_IP}"
port: 6003
redis:
host: "${REPLACE_IP}"
port: 6004
query_backend_database: 0
compression_backend_database: 1
reducer:
host: "${REPLACE_IP}"
base_port: 6100
logging_level: "INFO"
upsert_interval: 100
results_cache:
host: "${REPLACE_IP}"
port: 6005
db_name: "clp-query-results"
stream_collection_name: "stream-files"
webui:
host: "localhost"
port: 6000
logging_level: "INFO"
log_viewer_webui:
host: "localhost"
port: 6006
Comment on lines +32 to +38
🧹 Nitpick (assertive)

Consider parameterizing webui and log_viewer_webui hosts as well

Most services rely on the ${REPLACE_IP} placeholder, but the two UI blocks are hard-coded to localhost. If the cluster is accessed remotely (e.g., via kubectl port-forward or an ingress), these addresses may become invalid. Exposing them as the same placeholder keeps the template fully relocatable.

-webui:
-  host: "localhost"
+webui:
+  host: "${REPLACE_IP}"
 ...
-log_viewer_webui:
-  host: "localhost"
+log_viewer_webui:
+  host: "${REPLACE_IP}"
🤖 Prompt for AI Agents
In yscope-k8s/demo-assets/clp-config.yml.bak around lines 32 to 38, the webui
and log_viewer_webui host values are hardcoded as "localhost". To make the
configuration relocatable and consistent with other services using the
${REPLACE_IP} placeholder, replace the "localhost" strings with the
${REPLACE_IP} placeholder for both webui and log_viewer_webui host entries.

17 changes: 17 additions & 0 deletions yscope-k8s/demo-assets/init.sh
@@ -0,0 +1,17 @@
#!/usr/bin/env bash

Comment on lines +1 to +2
🛠️ Refactor suggestion

Enable safer Bash defaults

Fail fast on any unexpected error or unset variable to avoid silently producing a partially-edited config.

 #!/usr/bin/env bash
+set -euo pipefail
🤖 Prompt for AI Agents
In yscope-k8s/demo-assets/init.sh at lines 1 to 2, the script lacks safer Bash
defaults that ensure it fails fast on errors or unset variables. Add "set -euo
pipefail" after the shebang line to enable these safer defaults, which will
cause the script to exit immediately on errors, unset variables, or failed
pipeline commands.
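
To illustrate what the `-u` part of that suggestion changes, here is a small sketch comparing an unset-variable expansion with and without strict mode. `UNSET_DEMO_VAR` is a hypothetical name used only for this demonstration, and the child shells are assumed to be `bash`.

```shell
#!/usr/bin/env bash

# Illustration only: UNSET_DEMO_VAR is a hypothetical variable, forced to be unset.
unset UNSET_DEMO_VAR

# Without `set -u`, the unset variable silently expands to an empty string,
# which is how a config could end up with a blank host value.
unguarded="$(bash -c 'echo "host: ${UNSET_DEMO_VAR}"')"

# With `set -u`, the same expansion is an error and the child shell exits
# with a non-zero status before printing anything.
if bash -c 'set -euo pipefail; echo "host: ${UNSET_DEMO_VAR}"' 2>/dev/null; then
    guarded_status=0
else
    guarded_status=$?
fi

echo "unguarded output: '${unguarded}'"
echo "guarded exit status: ${guarded_status}"
```

In `init.sh` itself the argument count is already checked, so the main benefit there would be `-e` stopping the script if `cp` or `sed` fails.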

if [ "$#" -ne 1 ]; then
echo "Usage: $0 <ip-address>"
exit 1
fi

SCRIPT_PATH="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

IP="$1"
FILE="${SCRIPT_PATH}/clp-config.yml"

cp "${FILE}.bak" "$FILE"

sed -i "s|\${REPLACE_IP}|$IP|g" "$FILE"
Comment on lines +13 to +15
🧹 Nitpick (assertive)

sed -i is not portable across macOS vs GNU sed

The current one-argument form works on GNU sed; macOS requires an empty string after -i. Consider a small portability shim:

if sed --version >/dev/null 2>&1; then
  sed -i "s|\${REPLACE_IP}|$IP|g" "$FILE"
else
  sed -i '' "s|\${REPLACE_IP}|$IP|g" "$FILE"
fi

This prevents the script from failing on contributors’ Macs.

🤖 Prompt for AI Agents
In yscope-k8s/demo-assets/init.sh around lines 13 to 15, the sed command uses
the -i option without a backup suffix, which is not portable between GNU sed and
macOS sed. To fix this, add a conditional check for GNU sed by running sed
--version and use the one-argument -i option if true; otherwise, use the
macOS-compatible form with an empty string after -i. Replace the current sed
line with this conditional logic to ensure compatibility on both Linux and macOS
systems.
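
An alternative that sidesteps the `-i` difference altogether is to let `sed` write to a temporary file and copy the result back. This is a sketch under that assumption, not what the PR does; `replace_ip_placeholder` is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the PR): replaces every literal
# ${REPLACE_IP} in a file without relying on `sed -i`.
#
# @param $1: IP address to substitute
# @param $2: File to edit
replace_ip_placeholder() {
    local -r ip="$1"
    local -r file="$2"
    local tmp
    tmp="$(mktemp)"
    # Plain `sed` writing to stdout takes the same arguments on GNU and BSD/macOS.
    # The backslash keeps the shell from expanding ${REPLACE_IP} itself.
    sed "s|\${REPLACE_IP}|${ip}|g" "$file" >"$tmp"
    # Copy the contents back instead of using `mv` so the target keeps its
    # original permissions (mktemp creates files readable by the owner only).
    cat "$tmp" >"$file"
    rm -f "$tmp"
}
```

In `init.sh` this would be called as `replace_ip_placeholder "$IP" "$FILE"` after the `cp` step.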


echo "Replaced \${REPLACE_IP} with $IP in $FILE"
Binary file not shown.
Binary file not shown.
8 changes: 8 additions & 0 deletions yscope-k8s/templates/object-store/aws-credentials.yaml
@@ -0,0 +1,8 @@
apiVersion: "v1"
kind: "Secret"
metadata:
name: "aws-credentials"
namespace: "default"
type: "Opaque"
data:
credentials: "W2RlZmF1bHRdCmF3c19hY2Nlc3Nfa2V5X2lkID0gbWluaW9hZG1pbgphd3Nfc2VjcmV0X2FjY2Vzc19rZXkgPSBtaW5pb2FkbWluCg=="
⚠️ Potential issue

Static credentials committed to VCS – potential secret leak

The secret contains real base64-encoded keys (minioadmin). Once merged, the data is permanently recorded in git history. Replace with templated placeholders and document generating the secret locally.

-  credentials: "W2RlZmF1bHRdCmF3c19hY2Nlc3Nfa2V5X2lkID0gbWluaW9hZG1pbgphd3Nfc2VjcmV0X2FjY2Vzc19rZXkgPSBtaW5pb2FkbWluCg=="
+  # Base64 of a generated ~/.aws/credentials file
+  credentials: {{ .Values.objectStore.minio.awsCredentials | b64enc | quote }}

Add the key in values.yaml and mark it with # pragma: allowlist secret to silence secret-scanner noise.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-data:
-  credentials: "W2RlZmF1bHRdCmF3c19hY2Nlc3Nfa2V5X2lkID0gbWluaW9hZG1pbgphd3Nfc2VjcmV0X2FjY2Vzc19rZXkgPSBtaW5pb2FkbWluCg=="
+data:
+  # Base64 of a generated ~/.aws/credentials file
+  credentials: {{ .Values.objectStore.minio.awsCredentials | b64enc | quote }}
🧰 Tools
🪛 Checkov (3.2.334)

[LOW] 1-8: The default namespace should not be used

(CKV_K8S_21)

🤖 Prompt for AI Agents
In yscope-k8s/templates/object-store/aws-credentials.yaml at lines 7-8, the
base64-encoded static credentials containing real keys are committed, risking
secret leaks. Replace the hardcoded base64 string with templated placeholders
that reference values from values.yaml. Then, add the actual secret keys in
values.yaml with the comment # pragma: allowlist secret to document and allow
secret scanning exceptions. Also, update documentation to instruct generating
and injecting secrets locally instead of committing them.
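
For context on what is actually exposed, the sketch below decodes the committed value. It assumes a `base64` binary that accepts `--decode` (GNU coreutils and macOS both do); the variable names are illustrative.

```shell
#!/usr/bin/env bash

# The value of `data.credentials` from the Secret in this diff.
encoded='W2RlZmF1bHRdCmF3c19hY2Nlc3Nfa2V5X2lkID0gbWluaW9hZG1pbgphd3Nfc2VjcmV0X2FjY2Vzc19rZXkgPSBtaW5pb2FkbWluCg=='

# Base64 is an encoding, not encryption: decoding needs no key.
decoded="$(printf '%s' "$encoded" | base64 --decode)"

printf '%s\n' "$decoded"
```

The output is an AWS credentials file with a `[default]` profile whose `aws_access_key_id` and `aws_secret_access_key` are both `minioadmin`, so anyone with read access to the repository can recover the keys.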

157 changes: 157 additions & 0 deletions yscope-k8s/templates/object-store/bucket-creation.yaml
@@ -0,0 +1,157 @@
kind: "Job"
metadata:
name: "bucket-creation"
spec:
template:
spec:
containers:
# Container to deploy the log viewer. To inspect logs, use the following command:
# `kubectl logs job.batch/bucket-creation`
- name: "bucket-creation"
image: "amazon/aws-cli:latest"
command:
- "/bin/bash"
args:
- "/scripts/bucket-creation.sh"
env:
- name: "AWS_ENDPOINT_URL"
value: "http://{{ .Values.objectStore.minio.serviceName }}.default.svc.cluster.local:{{ .Values.objectStore.minio.apiPort }}"
- name: "BUCKET_NAME"
value: "{{ .Values.objectStore.bucketCreation.bucketName }}"
- name: "PUBLIC"
value: "{{ .Values.objectStore.bucketCreation.public }}"
volumeMounts:
- name: "aws-credentials-volume"
mountPath: "/root/.aws"
- name: "scripts-volume"
mountPath: "/scripts"
imagePullPolicy: "IfNotPresent"
🛠️ Refactor suggestion

Fix YAML formatting and pin container image

  1. YAMLlint flags the double space after the colon – keep one space only.
  2. Using amazon/aws-cli:latest makes builds non-reproducible and risks breaking when the image is updated. Pin to a specific tag (e.g., amazon/aws-cli:2.15.15) or digest.
-          image: "amazon/aws-cli:latest"
+          image: "amazon/aws-cli:2.15.15"

-            - name:  "scripts-volume"
+            - name: "scripts-volume"

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 27-27: too many spaces after colon

(colons)

🤖 Prompt for AI Agents
In yscope-k8s/templates/object-store/bucket-creation.yaml around lines 24 to 29,
fix the YAML formatting by ensuring there is only one space after each colon,
removing any double spaces. Additionally, replace the container image tag
"latest" with a specific version tag like "2.15.15" or use a digest to pin the
image, ensuring reproducible and stable builds.
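
A quick way to spot other unpinned images is to scan the rendered manifests for `:latest` tags. This is a rough sketch: `find_unpinned_images` is a hypothetical helper, and it only catches an explicit `:latest`, not images that omit the tag entirely.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads YAML on stdin and prints (with line numbers)
# every `image:` line whose tag is explicitly `latest`.
find_unpinned_images() {
    # `|| true` keeps the exit status at 0 when nothing matches, so the
    # helper is safe to use under `set -e`.
    grep -nE 'image:[[:space:]]*"?[^"[:space:]]+:latest"?[[:space:]]*$' || true
}
```

From the chart directory it could be used as `helm template . | find_unpinned_images`; an empty result means no explicit `latest` tags were found.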

restartPolicy: "Never"
volumes:
- name: "aws-credentials-volume"
secret:
secretName: "aws-credentials"
🧹 Nitpick (assertive)

Consider setting backoffLimit on the Job

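With the default backoffLimit (6), a failing bucket-creation pod is recreated up to six times with an exponential back-off delay (10s, 20s, 40s, and so on, capped at six minutes), so a misconfiguration takes more than ten minutes to surface as a failed Job. Explicitly setting a lower backoffLimit (e.g., 1) shortens feedback loops when credentials or endpoint values are wrong.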

🤖 Prompt for AI Agents
In yscope-k8s/templates/object-store/bucket-creation.yaml around lines 31 to 34,
the Job resource does not specify a backoffLimit, causing Kubernetes to retry
failed jobs multiple times over a long period. Add a backoffLimit field with a
lower value such as 1 to the Job spec to reduce retry attempts and shorten
feedback loops when the bucket-creation script fails.
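
As a rough cross-check on how long the retries can take, the sketch below sums the delays between attempts. It assumes the back-off schedule given in the Kubernetes Job documentation (10s, doubling on each retry, capped at six minutes) and ignores the time each pod spends running. `total_backoff_secs` is a hypothetical helper, not part of the PR.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints the total back-off delay, in seconds, that
# accumulates over the given number of retries.
#
# @param $1: Number of retries (the Job's backoffLimit)
total_backoff_secs() {
    local -r retries="$1"
    local delay=10
    local total=0
    local i
    for ((i = 0; i < retries; i++)); do
        total=$((total + delay))
        # Double the delay for the next retry, capped at six minutes.
        delay=$((delay * 2))
        if ((delay > 360)); then
            delay=360
        fi
    done
    echo "$total"
}

total_backoff_secs 6 # prints 630
total_backoff_secs 1 # prints 10
```

Under these assumptions the default of six retries adds about ten and a half minutes of waiting, while a limit of 1 adds ten seconds.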

- name: "scripts-volume"
configMap:
name: "bucket-creation"
---
apiVersion: v1
kind: "ConfigMap"
metadata:
name: "bucket-creation"
data:
bucket-creation.sh: |-
#!/usr/bin/env bash

# Creates a bucket and optionally configures it with public read access
# on an S3-compatible object store such as MinIO
#
# Requirements:
#
# * AWS CLI authentication configured using any supported method---for example:
# * A credentials file in $HOME/.aws/credentials
# * AWS_CONFIG_FILE pointing to a custom credentials file
# * Environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
# * Environment variables:
# * AWS_ENDPOINT_URL: The S3-compatible object store endpoint URL
# * BUCKET_NAME: The name of the bucket where the log viewer should be deployed
# * NOTE: This script will make the bucket publicly readable.
# * PUBLIC (Optional): If set to "true", configures bucket with public read policy
set -e
set -o pipefail
set -u

# Emits a log event to stderr with an auto-generated ISO timestamp as well as the given level
# and message.
#
# @param $1: Level string
# @param $2: Message to be logged
log() {
local -r LEVEL=$1
local -r MESSAGE=$2
echo "$(date --utc --date="now" +"%Y-%m-%dT%H:%M:%SZ") [${LEVEL}] ${MESSAGE}" >&2
}

# Waits for the S3 endpoint to be available, or exits if it's unavailable.
wait_for_s3_availability() {
# Check availability by listing available buckets
log "INFO" "Waiting until ${AWS_ENDPOINT_URL} endpoint becomes available."
local -r MAX_RETRIES=10
local -r RETRY_DELAY_IN_SECS=6
for ((retries = 0; retries < MAX_RETRIES; retries++)); do
if aws s3 ls --endpoint-url "$AWS_ENDPOINT_URL" >/dev/null; then
return
fi
log "WARN" "S3 API endpoint unavailable. Retrying in ${RETRY_DELAY_IN_SECS} seconds."

sleep "$RETRY_DELAY_IN_SECS"
done

if [[ $retries -eq $MAX_RETRIES ]]; then
log "ERROR" "Maximum retries reached. S3 API endpoint ${AWS_ENDPOINT_URL} didn't respond."
exit 1
fi
}

# Creates a bucket
create_bucket() {
# Create log-viewer bucket if it doesn't already exist
log "INFO" "Creating ${BUCKET_S3_URI} bucket."
if ! aws s3api head-bucket --endpoint-url "$AWS_ENDPOINT_URL" --bucket "$BUCKET_NAME" \
2>/dev/null; then
aws s3api create-bucket --endpoint-url "$AWS_ENDPOINT_URL" --bucket "$BUCKET_NAME"
fi
}

# Configures a bucket with public read access
configure_bucket() {
# Define and apply the bucket policy for public read access
log "INFO" "Applying public read access policy to ${BUCKET_S3_URI}"
local -r POLICY=$(
cat <<EOP
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": "*",
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::${BUCKET_NAME}/*"
}
]
}
EOP
)
if ! aws s3api put-bucket-policy \
--endpoint-url "$AWS_ENDPOINT_URL" \
--bucket "$BUCKET_NAME" \
--policy "$POLICY"; then
log "ERROR" "Failed to set bucket policy for ${BUCKET_S3_URI}"
exit 1
fi
}

# Validate required environment variables
readonly REQUIRED_ENV_VARS=(
# Example: "http://minio:9000"
"AWS_ENDPOINT_URL"

# Example: "logs"
"BUCKET_NAME"
)
for var in "${REQUIRED_ENV_VARS[@]}"; do
if ! [[ -v "$var" ]]; then
log "ERROR" "$var environment variable must be set."
exit 1
fi
done

readonly BUCKET_S3_URI="s3://${BUCKET_NAME}"

wait_for_s3_availability
create_bucket
if [[ "${PUBLIC:-false}" = "true" ]]; then
configure_bucket
fi

log "INFO" "Bucket ${BUCKET_NAME} created and configured successfully."