Operating Sentry with chart in a kubernetes cluster #1721
Replies: 1 comment
-
We have now added some important cronjobs outside of the chart.

nodestore_node table clean up

We use Postgres to store the event details in the "nodestore_node" table. This table can grow extremely large if it's not cleaned up properly and continuously. So the Sentry cleanup job runs daily for us and we only keep the last 7 days of data (for now). But we also added a cronjob that runs pg_repack every 6 hours on the nodestore_node table, which seems to help keep it under control. pg_repack reclaims space much like "VACUUM FULL", but it only takes brief locks on the table instead of holding one for the whole run, so we can run it without affecting Sentry performance as a whole.

apiVersion: batch/v1
kind: CronJob
metadata:
  name: pg-repack-table-nodestore
  namespace: sentry
spec:
  schedule: "0 */6 * * *" # Every 6 hours
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: pg-repack
              image: CHANGE ME TO A POSTGRES IMAGE THAT HAS PG_REPACK
              command: ["/bin/bash", "-c"]
              args:
                - pg_repack -h postgres -U postgres --dbname=sentry --table=nodestore_node
              env:
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: postgres-auth
                      key: password
          restartPolicy: OnFailure

sentry data folder clean up

We have also discovered that attachments, replays and other files are not cleaned up by the Sentry cleanup job (many people using the docker-compose variant have written about this previously, and it is also mentioned in the documentation). So we added another cronjob that deletes everything older than 30 days. We probably want to improve this to use the "SENTRY_EVENT_RETENTION_DAYS" environment variable instead, but we started out like this and we cleaned up over 4 million files.

apiVersion: batch/v1
kind: CronJob
metadata:
  name: sentry-data-cleanup
  namespace: sentry
spec:
  schedule: "0 3 * * *" # run daily at 03:00 (adjust as needed)
  successfulJobsHistoryLimit: 2 # keep minimal history
  failedJobsHistoryLimit: 2
  concurrencyPolicy: Forbid # avoid overlapping runs
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: cleanup
              image: alpine:3.21 # Alpine includes 'find' via busybox
              command: ["/bin/sh", "-c"]
              args:
                - |
                  # apk add coreutils for better date support
                  apk add --quiet coreutils 2>&1 || echo "Failed to install coreutils"
                  echo "Starting Sentry data cleanup job..."
                  # Create temp file for listing files
                  FILENAMES_FILE=$(mktemp)
                  TODAY=$(date +%Y-%m-%d)
                  echo "Today's date: $TODAY"
                  echo "Finding files older than 30 days (this may take a while)..."
                  # Single find operation - much faster than spawning a shell for each file
                  find /var/lib/sentry/files -type f -mtime +30 > $FILENAMES_FILE
                  # Count total files
                  TOTAL_FILES=$(wc -l < $FILENAMES_FILE)
                  echo " --- "
                  echo "Number of files to be deleted: $TOTAL_FILES"
                  if [ $TOTAL_FILES -gt 0 ]; then
                    # Calculate date 30 days ago using GNU date (from coreutils)
                    THIRTY_DAYS_AGO=$(date -d "$TODAY - 30 days" +%Y-%m-%d)
                    echo "Files older than: $THIRTY_DAYS_AGO"
                    # Calculate additional date thresholds for better reporting
                    SIXTY_DAYS_AGO=$(date -d "$TODAY - 60 days" +%Y-%m-%d)
                    NINETY_DAYS_AGO=$(date -d "$TODAY - 90 days" +%Y-%m-%d)
                    # Take a small sample of files for date analysis
                    SAMPLE_FILE=$(mktemp)
                    head -50 $FILENAMES_FILE > $SAMPLE_FILE
                    tail -50 $FILENAMES_FILE >> $SAMPLE_FILE
                    # Use a random sampling for better distribution analysis
                    if [ $TOTAL_FILES -gt 1000 ]; then
                      echo "Taking random sample of files for date analysis..."
                      # Use awk to get random lines - compatible with BusyBox
                      awk 'BEGIN {srand()} {if (rand() <= 0.0001) print $0}' $FILENAMES_FILE > $SAMPLE_FILE
                    fi
                    # Analyze sampled files by date ranges (30-60 days, 60-90 days, 90+ days)
                    if [ -s "$SAMPLE_FILE" ]; then
                      echo "Sample file dates (may not represent full range):"
                      echo " Oldest files in sample:"
                      xargs -n 1 ls -la 2>/dev/null < $SAMPLE_FILE | sort -k 6,8 | head -5 | awk '{print $6, $7, $8}'
                      echo " Newest files in sample:"
                      xargs -n 1 ls -la 2>/dev/null < $SAMPLE_FILE | sort -k 6,8 | tail -5 | awk '{print $6, $7, $8}'
                      echo " Weekly distribution estimates based on sample:"
                      # Calculate declining distribution (more recent files tend to be more numerous)
                      TOTAL_WEEKS=9 # 9 weeks in our distribution
                      # Create declining weights for each week (newest has highest weight)
                      WEIGHTS="5 4 3 3 2 2 1 1 1"
                      TOTAL_WEIGHT=22 # sum of weights (5+4+3+3+2+2+1+1+1)
                      # Use explicit loop instead of brace expansion for BusyBox compatibility
                      i=4
                      weight_index=1
                      for weight in $WEIGHTS; do
                        START_DAYS=$((i*7))
                        END_DAYS=$(((i-1)*7))
                        START_DATE=$(date -d "$TODAY - $START_DAYS days" +%Y-%m-%d)
                        END_DATE=$(date -d "$TODAY - $END_DAYS days" +%Y-%m-%d)
                        # Calculate weighted estimate (current weight / total weight * total files)
                        FILES_ESTIMATE=$(expr $TOTAL_FILES \* $weight / $TOTAL_WEIGHT)
                        echo " $START_DATE to $END_DATE: Approximately $FILES_ESTIMATE files"
                        i=$((i+1))
                      done
                    fi
                    # List file extensions and counts
                    echo "File types to be deleted:"
                    cat $FILENAMES_FILE | awk -F. '{if (NF>1) {print $NF} else {print "no_extension"}}' | sort | uniq -c | sort -nr | head -10
                    # Actually delete the files
                    echo "Deleting files..."
                    xargs rm -f < $FILENAMES_FILE
                    echo "Files deleted."
                    # Clean up sample file
                    rm -f $SAMPLE_FILE
                  else
                    echo "No files found matching deletion criteria."
                  fi
                  # Clean up temp files
                  rm -f $FILENAMES_FILE
                  echo "Cleanup job completed."
              volumeMounts:
                - name: sentry-data
                  mountPath: /var/lib/sentry/files # mount path inside container
          volumes:
            - name: sentry-data
              persistentVolumeClaim:
                claimName: sentry-data
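The improvement mentioned above, taking the cutoff from "SENTRY_EVENT_RETENTION_DAYS" instead of the hard-coded 30 days, could look roughly like the sketch below. It is only a sketch under assumptions: the function names `retention_days` and `cleanup_old_files` and the `DEFAULT_RETENTION_DAYS` fallback are made up for illustration, the variable would have to be set in the cleanup container's `env`, and `find ... -exec rm -f {} +` is assumed to be supported by the image's `find` (it is standard POSIX syntax). Deleting through `-exec` also avoids problems with file names containing spaces.

```shell
#!/bin/sh
# Hypothetical helper names; not part of Sentry or the chart.
DEFAULT_RETENTION_DAYS=30

# Print the retention period in days. Falls back to the default when
# SENTRY_EVENT_RETENTION_DAYS is unset, empty, non-numeric, or below 1.
retention_days() {
  value="${SENTRY_EVENT_RETENTION_DAYS:-$DEFAULT_RETENTION_DAYS}"
  case "$value" in
    ''|*[!0-9]*) value="$DEFAULT_RETENTION_DAYS" ;;
  esac
  if [ "$value" -lt 1 ]; then
    value="$DEFAULT_RETENTION_DAYS"
  fi
  echo "$value"
}

# Delete regular files under directory $1 that match find's "-mtime +N",
# where N is the retention period returned by retention_days.
cleanup_old_files() {
  dir="$1"
  days="$(retention_days)"
  echo "Deleting files older than $days days in $dir"
  find "$dir" -type f -mtime "+$days" -exec rm -f {} +
}

# Example call (commented out so nothing is deleted by accident):
# cleanup_old_files /var/lib/sentry/files
```

Inside the CronJob this would replace the fixed `-mtime +30`. Falling back to a default keeps a typo in the variable from turning into an unexpected cutoff.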
Here is the resulting log from our first run of the sentry-data-cleanup job
I hope someone finds this helpful ... happy sentry'ing 😉
-
Hello everyone 👋
We have been struggling to get self-hosted Sentry working as intended.
etc. etc.
I think we could use a hand in understanding the system as a whole, what resources all the Sentry components need, and how they interact.
Is there a good example of this, i.e. how to run self-hosted Sentry for a medium amount of incoming data? What are the recommendations? We can't really figure those out from the chart alone.
It feels like we almost have this running as intended, but we are missing something somewhere.
Here are our kustomize / chart definitions (all the rest is defaults)
Any help, tips or tricks are very welcome. Thanks in advance 🙏