
Feat: use PostgreSQL for the database backend #131


Open
lalver1 wants to merge 9 commits into main from feat/django-postgres-db

Conversation

lalver1 (Member) commented Jun 18, 2025

Closes #126, closes #138, closes #139

This PR migrates the Django database from SQLite to PostgreSQL. Django's settings.py file now uses the PostgreSQL backend, and redundant database-related shell scripts are removed.

Locally and in some CI workflows, the ensure_db command checks that the database is configured and creates it only if needed (i.e. the first time). The command also runs migrations and creates a superuser if needed, keeping all database setup in one place. Note that the PostgreSQL server holds 2 databases: POSTGRES_DB (for administrative and maintenance purposes) and DJANGO_DB (for the PeMS application).

The compose file also adds a healthcheck for the postgres service: web and pgweb wait until postgres reports healthy, so we don't get an "Admin connection to PostgreSQL fail" error for web or an "Error: connection refused" for pgweb when web or pgweb start before the postgres service is running. This usually happens when the containers are first created, such as in the tests-ui CI workflow.
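
For reference, a minimal sketch of what the healthcheck wiring in compose.yml can look like, assuming the standard pg_isready check (the image tag, intervals, and exact test command are illustrative, not necessarily what this PR uses):

services:
  postgres:
    image: postgres:16  # tag assumed for illustration
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER -d $$POSTGRES_DB"]
      interval: 5s
      timeout: 5s
      retries: 5

  web:
    depends_on:
      postgres:
        condition: service_healthy

  pgweb:
    depends_on:
      postgres:
        condition: service_healthy

With condition: service_healthy, Compose only starts web and pgweb after the postgres healthcheck passes, rather than merely after the container starts.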

On the Cloud, an Aurora Serverless database is created using

copilot storage init -l workload -t Aurora -w web -n postgres-web --engine PostgreSQL --initial-db postgres

and it is deployed each time the web copilot service is deployed. Once the environments are stabilized, the deployment of the database will be tied to the deployment of the environment instead (issue #137). The database is accessible from inside the service's container via the POSTGRESWEB_SECRET environment variable, which holds the name of the database and the credentials for accessing it. This environment variable is automatically injected into the web service's container and is used by Django's settings.py.
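
As a rough sketch of how settings.py can map that secret onto discrete settings, assuming the JSON follows the usual RDS secret shape (host, port, dbname, username, password); the target variable names below are illustrative, not necessarily the ones this PR uses:

import json
import os

def set_aws_db_credentials():
    # POSTGRESWEB_SECRET is the JSON secret injected by copilot
    secret = json.loads(os.environ["POSTGRESWEB_SECRET"])
    # map the secret's keys onto the variables the DATABASES setting reads
    os.environ.setdefault("POSTGRES_HOSTNAME", secret["host"])
    os.environ.setdefault("POSTGRES_PORT", str(secret["port"]))
    os.environ.setdefault("DJANGO_DB", secret["dbname"])
    os.environ.setdefault("DJANGO_DB_USER", secret["username"])
    os.environ.setdefault("DJANGO_DB_PASSWORD", secret["password"])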

Local behavior

Django App container

Compose's CMD (bin/setup.sh && exec bin/start.sh) overrides the Dockerfile's CMD. There are 2 distinct scenarios related to starting up the database (a sketch of this branch follows the two lists below):

should_reset=true

  • setup.sh
    • connect to POSTGRES_DB as admin, reset (delete) DJANGO_DB users and database, create DJANGO_DB users and database, run migrations, create Django superuser (ensure_db --reset)
    • load database JSON fixtures
  • start.sh
    • prepare static files (collectstatic)
    • start the web server (nginx)
    • start the application server (gunicorn)

should_reset=false

  • setup.sh
    • connect to POSTGRES_DB as admin, create DJANGO_DB users and database if they don't already exist, run migrations, create Django superuser if it doesn't already exist (ensure_db)
    • load database JSON fixtures
  • start.sh
    • prepare static files (collectstatic)
    • start the web server (nginx)
    • start the application server (gunicorn)
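
A minimal sketch of this branch in bin/setup.sh (the fixture variable name DJANGO_DB_FIXTURES is taken from start_aws.sh; the real script may differ):

#!/usr/bin/env bash
set -eu

if [[ "${should_reset:-false}" == "true" ]]; then
    # reset (delete) and recreate DJANGO_DB users and database, migrate, create superuser
    python manage.py ensure_db --reset
else
    # create DJANGO_DB users and database only if missing, migrate, ensure superuser exists
    python manage.py ensure_db
fi

# load database JSON fixtures
python manage.py loaddata "$DJANGO_DB_FIXTURES"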

Dev container

devcontainer.json runs bin/setup.sh when the container is started. Note that REMOTE_CONTAINERS=true by default in the dev container, so should_reset=true in bin/setup.sh:

  • connect to POSTGRES_DB as admin, reset DJANGO_DB users and database, create DJANGO_DB users and database, run migrations, create Django superuser (ensure_db --reset)
  • load database JSON fixtures

Cloud behavior

Django App container

The Dockerfile's CMD is bin/start.sh, but the web service's manifest replaces it with bin/start_aws.sh (see the sketch after this list):

  • download the database fixtures from the S3 bucket
  • setup.sh
    • connect to POSTGRES_DB as admin, create DJANGO_DB users and database if they don't already exist, run migrations, create Django superuser if it doesn't already exist (ensure_db)
    • load database JSON fixtures
  • start.sh
    • prepare static files (collectstatic)
    • start the web server (nginx)
    • start the application server (gunicorn)
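
A rough sketch of that flow, with the S3 bucket and key variable names invented for illustration:

#!/usr/bin/env bash
set -eu

# download the database fixtures from the S3 bucket
# (FIXTURES_BUCKET and FIXTURES_KEY are hypothetical names)
aws s3 cp "s3://${FIXTURES_BUCKET}/${FIXTURES_KEY}" "$DJANGO_DB_FIXTURES"

# ensure_db (create user/database if missing, migrate, ensure superuser), load fixtures
bin/setup.sh

# collectstatic, then start nginx and gunicorn
exec bin/start.sh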

Reviewing

Local

App container

  1. Run docker compose up and verify that web and pgweb wait to start until postgres is ready. The database is created from scratch; make sure the app works in your browser.
  2. Run docker compose down and delete the pems_pgdata volume if you need to remove the containers and start from scratch next time (see the commands below).
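
Roughly, the local review pass is:

docker compose up                # web and pgweb wait until postgres reports healthy
# ...verify the app in the browser...
docker compose down
docker volume rm pems_pgdata     # optional: remove the data volume to start fresh next time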

Dev container

There are no special instructions for the dev container, just start it as usual and use the debugger to ensure that the app is accessible.

Cloud

The PeMS app, running with Aurora Serverless storage, is deployed at http://pems-d-publi-tdvc9tnwbm5n-1785238641.us-west-2.elb.amazonaws.com

@lalver1 lalver1 self-assigned this Jun 18, 2025
@lalver1 lalver1 force-pushed the feat/django-postgres-db branch from d426a89 to 9346872 Compare June 18, 2025 20:41

github-actions bot commented Jun 18, 2025

Coverage report

Files in the report, with missing lines where reported:

  • pems/settings.py: 126-127
  • pems/urls.py
  • pems/core/context_processors.py
  • pems/core/middleware.py
  • pems/core/urls.py
  • pems/core/management/commands/ensure_db.py: 73, 87-89, 99
  • pems/districts/urls.py

This report was generated by python-coverage-comment-action

Coverage report

Files in the report, with missing lines where reported:

  • pems/settings.py
  • pems/core/management/commands/reset_db.py

This report was generated by python-coverage-comment-action

lalver1 added 3 commits June 20, 2025 19:10
- configure connection settings from the environment
- add the ability to configure root CA certificate(s) from the server
  to verify the client connection
- creates an admin connection
- resets database users and databases
- validates the database configuration
- checks if a user/database exists
- creates a user/database
- checks schema permissions
- runs migrations
- ensures a superuser exists
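
For orientation, a minimal sketch of a management command with this shape, assuming psycopg 3 and illustrative environment variable names (the actual ensure_db.py is more thorough):

import os

import psycopg
from psycopg import sql
from django.contrib.auth import get_user_model
from django.core.management import call_command
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = "Ensure the Django database, user, and superuser exist"

    def add_arguments(self, parser):
        parser.add_argument("--reset", action="store_true")

    def handle(self, *args, **options):
        db = os.environ["DJANGO_DB"]
        user = os.environ["DJANGO_DB_USER"]
        password = os.environ["DJANGO_DB_PASSWORD"]

        # admin connection to the POSTGRES_DB maintenance database
        with psycopg.connect(
            dbname=os.environ["POSTGRES_DB"],
            user=os.environ["POSTGRES_USER"],
            password=os.environ["POSTGRES_PASSWORD"],
            host=os.environ.get("POSTGRES_HOSTNAME", "postgres"),
            autocommit=True,  # CREATE/DROP DATABASE cannot run in a transaction
        ) as conn:
            if options["reset"]:
                conn.execute(sql.SQL("DROP DATABASE IF EXISTS {}").format(sql.Identifier(db)))
                conn.execute(sql.SQL("DROP ROLE IF EXISTS {}").format(sql.Identifier(user)))
            # create the Django user if it does not already exist
            if not conn.execute("SELECT 1 FROM pg_roles WHERE rolname = %s", (user,)).fetchone():
                conn.execute(
                    sql.SQL("CREATE ROLE {} LOGIN PASSWORD {}").format(
                        sql.Identifier(user), sql.Literal(password)
                    )
                )
            # create the Django database if it does not already exist
            if not conn.execute("SELECT 1 FROM pg_database WHERE datname = %s", (db,)).fetchone():
                conn.execute(
                    sql.SQL("CREATE DATABASE {} OWNER {}").format(
                        sql.Identifier(db), sql.Identifier(user)
                    )
                )

        call_command("migrate")

        # create a Django superuser if none exists
        # (relies on DJANGO_SUPERUSER_* environment variables)
        if not get_user_model().objects.filter(is_superuser=True).exists():
            call_command("createsuperuser", interactive=False)
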
@lalver1 lalver1 force-pushed the feat/django-postgres-db branch 2 times, most recently from 461bd51 to 824a263 Compare June 20, 2025 22:58
lalver1 added 2 commits June 20, 2025 23:56
since ensure_db implements the same functionality, the .sh commands
are cleaned up to avoid redundancy.
in the pytest workflow, running migrations is not needed since
tests will migrate the database, but we still need to collect static
files and generate the static files manifest so that the
workflow does not fail.
introduce a healthcheck for the postgres service to ensure the
web and pgweb services do not start until the postgres server is ready.
this ensures we don't get an "Admin connection to PostgreSQL fail" error
for web and an "Error: connection refused" error for pgweb when web or
pgweb are started before the postgres service is running.
This usually happens when the containers are first created.
@lalver1 lalver1 force-pushed the feat/django-postgres-db branch from 824a263 to efaa9a6 Compare June 23, 2025 16:15
- adds an Aurora Serverless storage to the copilot web service
- updates the Dockerfile to include certs for PostgreSQL verify-full
  (uses the AWS Global RDS CA Bundle in case us-west-2 or us-east-1
  is used)
- supports database creation for local and dev environments
  (the dev environment uses the copilot-injected POSTGRESWEB_SECRET
  environment variable to connect to the database)
- updates the start_aws.sh script
@lalver1 lalver1 marked this pull request as ready for review June 24, 2025 13:39
@lalver1 lalver1 requested a review from a team as a code owner June 24, 2025 13:39
thekaveman (Member) left a comment

Overall this is looking really good! I was able to run everything locally without any issues.

Some small changes requested. A few questions/notes as well.

compose.yml Outdated
- pgdata:/var/lib/postgresql/data

pgweb:
container_name: pgweb
Member:

I know you grabbed this as-is from the CDRC work, but I realized there's a conflict.

If both projects have this container_name attribute, then the second one fails to come up:

$ docker compose up pgweb
[+] Running 2/2
 ✔ Container pems-postgres-1  Running                                                                                                             0.0s 
 ✘ Container pgweb            Error response from daemon: Conflict. The container name "/pgweb" is already in use by contain...                   0.0s 
Error response from daemon: Conflict. The container name "/pgweb" is already in use by container "55c3ddad7fe81c74fbbc944894685f33bdf5041399c851e8c6ab62f94ed0f0f6". You have to remove (or rename) that container to be able to reuse that name.

If I remove this line, it comes up just fine. I'll make a ticket to remove it from CDRC as well.

Member Author:

Thanks for noticing this! Removed it in 332fb4a

@@ -0,0 +1,309 @@
import os
Member:

We should definitely find a way to share this command between projects! This is good for now 👍

lalver1 (Member Author) commented Jun 24, 2025

For sure! The tricky thing that I am seeing so far is that we may only need a subset of the ensure_db "underscore functions" and we may need to reorder how they are run in handle() for AWS copilot to be able to use it. For example, the copilot storage init command creates the PostgreSQL cluster, the django database, and a user (automatically named postgres by copilot but really it is the user that Django's settings.py uses to interact with the default database). So we don't need to interact with the postgres maintenance database or user (in fact, I think the maintenance database may not be reachable unless we change the CloudFormation template). So because of this slightly different setup, the ensure_db command may need to be refactored to something more general for it to be shared across projects, but I think it's definitely something we should do.

@@ -91,6 +91,9 @@ ENV GUNICORN_CONF "/$USER/run/gunicorn.conf.py"
# overwrite default nginx.conf
COPY appcontainer/nginx.conf /etc/nginx/nginx.conf

# copy certs for PostgreSQL verify-full
COPY appcontainer/certs/aws_global_postgres_ca_bundle.pem app/certs/aws_global_postgres_ca_bundle.pem
Member:

Just noting a small change I made to how the cert bundle is added to the CDRC image:

This was ultimately because of a change related to BuildKit caching, where I wasn't copying the entire source directory in during the Docker build anymore. That caused some cascading issues with the runtime directory being different from e.g. /cdt/app, and one of the problems was that this cert bundle couldn't be found by Django.

This solves it by not making Django construct the path; instead, the path is set at Docker build time and baked into the image as an env var.
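
Roughly, the CDRC approach looks like this (the destination path and variable name here are illustrative):

# copy certs for PostgreSQL verify-full and record the path at build time,
# so Django reads it from the environment instead of constructing it
COPY appcontainer/certs/aws_global_postgres_ca_bundle.pem /certs/aws_global_postgres_ca_bundle.pem
ENV POSTGRES_SSLROOTCERT=/certs/aws_global_postgres_ca_bundle.pem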

Member Author:

Thanks! If it's ok, I'll add this small change to #132, which is about implementing BuildKit caching.

bin/start_aws.sh Outdated
python manage.py loaddata $DJANGO_DB_FIXTURES
else
echo "No JSON fixtures to load"
fi
Member:

It seems like start_aws.sh should work similarly if it calls out to setup.sh after downloading the fixtures from S3? Rather than repeating all the migration/loading?

Otherwise, I don't see ensure_db being called in AWS?

Member Author:

Yep, I struggled with the organization of this shell script for a bit, and even though it works, I don't really like the setup because we are not using ensure_db. But because of how copilot sets up the postgres database (a few more details in this comment), I wasn't able to directly reuse ensure_db. Copilot essentially does a lot of the setup that ensure_db does (like creating the database and the user), and the only things start_aws.sh needed to do were download the fixtures, migrate, and load the JSON fixtures.

Member Author:

Following #131 (comment), 06d8ca9 fixes how Aurora Serverless is set up.

postgreswebDBName:
Type: String
Description: The name of the initial database to be created in the Aurora Serverless v2 cluster.
Default: django
Member:

Is the postgres management database also created by default here?

Since the job of our ensure_db command is to create the Django user/database, this might be interfering no?

Member Author:

Yep, I think copilot automatically creates the postgres management database, but we can't reach it, and probably don't need to due to how copilot sets up the postgres server. We only get access to the default Django database through the user (automatically named postgres, unfortunately, since it is not the maintenance/admin user) that copilot creates.

So because of how copilot sets up the database, we don't actually use ensure_db for AWS.

Member:

I think this might be the root of these issues.

It seems like this Default is actually the management database. You should be able to get to it using the admin user/password; there is no reason you shouldn't be able to. It's your database cluster!

We definitely do not want Django using the postgres user. This is another reason I think you should be relying on ensure_db for all Django database setup.

Member Author:

💯
There was a point while setting this up when I was approaching it the way you describe, but the docs seemed to suggest otherwise and I switched direction 😭. Your explanation makes sense, and I just confirmed it by exploring the database cluster, so I'll make a few more changes so that we actually use ensure_db and set it up correctly.

pgweb is used by other projects, and having a fixed container
name can lead to conflicts.
pems/settings.py Outdated
return env


def aws_db_credentials():
Member Author:

Added this small function since its behavior is used in this file and in ensure_db. If we end up with more functions like this one, we can make a utils-type file for runtime-environment-related functions. But since we only have this one, I decided to leave the definition here, although we can move it somewhere else if that makes more sense.

lalver1 added 2 commits June 26, 2025 22:04
this commit fixes the original Aurora Serverless setup by setting
the initial database that copilot creates to postgres, and using the
ensure_db.py script to configure the Django database.

- copilot creates an Aurora Serverless storage cluster with an initial
  database called 'postgres'
- the web service manifest defines secrets related to the Django
  database
- to create the Django database, settings.py is updated to set the
  database environment variables from the single environment variable
  automatically created by copilot
- the start_aws.sh script now uses ensure_db.py
- the Aurora Data API is enabled to allow us to interact with the
  Aurora database by running SQL queries on the AWS console
test_get_object() in TestDistrictView now follows the CBV testing
pattern.
the Pytest GH workflow is also updated to remove the static files
collection step, as client.get() is no longer used in the test;
previously, it was necessary to collect static files for the
workflow to succeed.
@lalver1 lalver1 force-pushed the feat/django-postgres-db branch from 06d8ca9 to 721edc3 Compare June 26, 2025 22:55
lalver1 (Member Author) commented Jun 27, 2025

When you get a chance, this is ready for re-review @thekaveman. I tested it locally and by redeploying to AWS ✅
This is a summary of the changes in the last 2 commits:

  • fc0c42e
    • the changes to ensure_db.py were removed
    • the logic to set up the environment variables for the database is all inside settings.py in set_aws_db_credentials(). The function was also simplified, it does not need to return the credentials anymore since they are being directly set in the environment variables. HOST and PORT are still set inside the conditional RUNTIME_ENVIRONMENT in settings.py since ensure_db gets these 2 values from settings.py and the rest of the credentials from the environment variables.
  • 721edc3
    • TestDistrictView now follows our CBV testing convention. Since client.get() was removed, I removed the Collect static files step in the tests-pytest workflow because it is not needed anymore.
