-
Notifications
You must be signed in to change notification settings - Fork 940
Gcs database #12817
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Gcs database #12817
Changes from 2 commits
5d7ffa4
950bd1a
bcb41e0
61732af
5cfcb44
1b9a20c
ff4e7f1
8065eee
b7ca03a
c3b0cc5
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,7 +8,7 @@ | |
import sys | ||
|
||
import requests | ||
from db_s3_utils import ( | ||
from db_utils import ( | ||
DATA_PATH, | ||
DB_FILE, | ||
JSON_DATA_FILE_NAME, | ||
|
@@ -21,9 +21,12 @@ | |
BUCKET_NAME = os.getenv("AWS_DB_S3_BUCKET", "bedrock-db-dev") | ||
REGION_NAME = os.getenv("AWS_DB_REGION", "us-west-2") | ||
S3_BASE_URL = f"https://s3-{REGION_NAME}.amazonaws.com/{BUCKET_NAME}" | ||
GCS_BASE_URL = f"https://storage.googleapis.com/{BUCKET_NAME}" | ||
DOWNLOAD_FROM_GCS = os.getenv("DOWNLOAD_FROM_GCS", 'False').lower() in ('true', '1', 't') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Our tests don't check There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sounds good, this line might be a little over the top too. Open to feedback on how you all like to handle boolean values as env variables. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In other parts of the code base we use everett. If we used it here this could be written as There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. oh cool, thanks @robhudson pushed up a change that pulls that in. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Welp, maybe that is why looks to fail the tests: Traceback (most recent call last):
File "/app/./bin/run-db-download.py", line 10, in <module>
from bedrock.base.config_manager import config
ModuleNotFoundError: No module named 'bedrock' |
||
|
||
|
||
def get_file_url(filename): | ||
base_url = GCS_BASE_URL if DOWNLOAD_FROM_GCS else S3_BASE_URL | ||
return "/".join([S3_BASE_URL, filename]) | ||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -10,7 +10,7 @@ | |
|
||
import boto3 | ||
from boto3.exceptions import Boto3Error | ||
from db_s3_utils import ( | ||
from db_utils import ( | ||
DB_FILE, | ||
JSON_DATA_FILE, | ||
JSON_DATA_FILE_NAME, | ||
|
@@ -19,10 +19,12 @@ | |
get_prev_db_data, | ||
set_db_data, | ||
) | ||
from google.cloud import storage | ||
|
||
CACHE = {} | ||
BUCKET_NAME = os.getenv("AWS_DB_S3_BUCKET", "bedrock-db-dev") | ||
REGION_NAME = os.getenv("AWS_DB_S3_REGION", "us-west-2") | ||
UPLOAD_TO_GCS = os.getenv("UPLOAD_TO_GCS", 'False').lower() in ('true', '1', 't') | ||
|
||
|
||
# Requires setting some environment variables: | ||
|
@@ -41,6 +43,15 @@ def s3_client(): | |
return s3 | ||
|
||
|
||
def gcs_client(): | ||
gcs = CACHE.get("gcs_client") | ||
if not gcs: | ||
gcs = storage.Client() | ||
CACHE["gcs_client"] = gcs | ||
|
||
return gcs | ||
|
||
|
||
def delete_s3_obj(filename): | ||
s3 = s3_client() | ||
s3.delete_object(Bucket=BUCKET_NAME, Key=filename) | ||
|
@@ -63,6 +74,18 @@ def upload_db_data(db_data): | |
except Boto3Error: | ||
return f"ERROR: Failed to upload the new database info file: {db_data}" | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we want anything around the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My idea was to do kind of a multi step deploy:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep, sounds good to me. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @bkochendorfer @pmac Can we get this rolled out to prod? Good to do ahead of the postgres/cms tango |
||
if UPLOAD_TO_GCS: | ||
gcs = gcs_client() | ||
bucket = gcs.bucket(BUCKET_NAME) | ||
|
||
# upload the database | ||
db_file = bucket.blob(db_data["file_name"]) | ||
db_file.upload_from_filename(DB_FILE, predefined_acl="public-read") | ||
|
||
# upload the json metadata | ||
db_file_info = bucket.blob(JSON_DATA_FILE_NAME) | ||
db_file_info.upload_from_filename(JSON_DATA_FILE, predefined_acl="public-read") | ||
|
||
return 0 | ||
|
||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we have
config
now, we could optionally convert these (or anywhere else in this file where we useos.getenv
). An example of a string value would just be:Sorry to tack on so much extra.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem at all!