deployment
This document provides instructions for deploying the Semantic Medallion Data Platform to Digital Ocean. The deployment process has two parts: provisioning the infrastructure with Terraform, then configuring the application to use the provisioned resources.
Before deploying, ensure you have the following:
- Terraform (version 1.0.0 or later)
- Digital Ocean account
- Digital Ocean API token
- Python 3.9+
- Poetry
First, set up the infrastructure using Terraform:
- Navigate to the Terraform directory:
  cd infrastructure/terraform
- Create a terraform.tfvars file from the example:
  cp terraform.tfvars.example terraform.tfvars
- Edit the terraform.tfvars file to add your Digital Ocean API token:
  # Open with your favorite editor
  nano terraform.tfvars
- Initialize Terraform:
  terraform init
- Plan the infrastructure changes:
  terraform plan -out=tfplan
- Apply the infrastructure changes:
  terraform apply tfplan
- After a successful apply, Terraform outputs the connection details for your PostgreSQL database. Save these details for the next step.
Create a .env file in the project root with the connection details from the previous step:
# Database Connection (from Terraform outputs)
POSTGRES_HOST=<postgres_host output>
POSTGRES_PORT=<postgres_port output>
POSTGRES_USER=<postgres_user output>
POSTGRES_PASSWORD=<postgres_password output>
POSTGRES_DB=<postgres_database output>
# API Keys
NEWSAPI_KEY=your_newsapi_key_here # Get your key from https://newsapi.org/
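Copying the values by hand is error-prone, so the database part of the .env file can also be generated from the Terraform outputs. The following is a minimal sketch, not part of the project: it assumes the output names match the placeholders above (postgres_host, postgres_port, and so on) and that the input is the text printed by `terraform output -json`, where each output is an object with a `value` field. The helper name `render_env` is hypothetical.

```python
import json

# Mapping from .env variable names to Terraform output names.
# The output names are assumed from the placeholders in the .env template;
# adjust them if your Terraform configuration uses different names.
ENV_TO_OUTPUT = {
    "POSTGRES_HOST": "postgres_host",
    "POSTGRES_PORT": "postgres_port",
    "POSTGRES_USER": "postgres_user",
    "POSTGRES_PASSWORD": "postgres_password",
    "POSTGRES_DB": "postgres_database",
}


def render_env(terraform_output_json: str) -> str:
    """Turn the text printed by `terraform output -json` into .env lines."""
    outputs = json.loads(terraform_output_json)
    lines = []
    for env_name, output_name in ENV_TO_OUTPUT.items():
        # Each output is an object of the form
        # {"value": ..., "type": ..., "sensitive": ...}.
        value = outputs[output_name]["value"]
        lines.append(f"{env_name}={value}")
    return "\n".join(lines) + "\n"
```

One possible workflow is to run `terraform output -json > outputs.json` in infrastructure/terraform, pass the file contents to `render_env`, and paste the result into .env before adding NEWSAPI_KEY. Note that the JSON output contains sensitive values such as the password in plain text, so the intermediate file should be deleted afterwards.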
Run the database initialization script to create the necessary schemas and tables:
python -m semantic_medallion_data_platform.config.initialize_db
Load the initial data into the database:
# Load known entities
python -m semantic_medallion_data_platform.bronze.brz_01_extract_known_entities --raw_data_filepath data/known_entities/
# Extract news articles (optional)
python -m semantic_medallion_data_platform.bronze.brz_01_extract_newsapi --days_back 7
Process the data through the medallion architecture layers:
# Process known entities with NLP
python -m semantic_medallion_data_platform.silver.slv_02_transform_nlp_known_entities
# Process news articles with NLP (if you extracted news articles)
python -m semantic_medallion_data_platform.silver.slv_02_transform_nlp_newsapi
# Create entity mappings
python -m semantic_medallion_data_platform.silver.slv_03_transform_entity_to_entity_mapping
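The load and processing commands above have to run in a fixed order, and the news steps only make sense together. A small runner can make that order explicit. This is a sketch under stated assumptions rather than a script shipped with the project: the function names `build_commands` and `run_pipeline` are hypothetical, while the module names and flags are the ones listed above.

```python
import subprocess
import sys

PACKAGE = "semantic_medallion_data_platform"


def build_commands(include_news: bool = False, days_back: int = 7):
    """Return the bronze and silver steps, in order, as argument lists."""
    python = sys.executable  # use the interpreter of the active environment
    steps = [
        [python, "-m", f"{PACKAGE}.bronze.brz_01_extract_known_entities",
         "--raw_data_filepath", "data/known_entities/"],
    ]
    if include_news:
        # Optional: extract news articles from NewsAPI.
        steps.append([python, "-m", f"{PACKAGE}.bronze.brz_01_extract_newsapi",
                      "--days_back", str(days_back)])
    steps.append([python, "-m", f"{PACKAGE}.silver.slv_02_transform_nlp_known_entities"])
    if include_news:
        # Only meaningful if news articles were extracted in the bronze step.
        steps.append([python, "-m", f"{PACKAGE}.silver.slv_02_transform_nlp_newsapi"])
    steps.append([python, "-m", f"{PACKAGE}.silver.slv_03_transform_entity_to_entity_mapping"])
    return steps


def run_pipeline(include_news: bool = False, days_back: int = 7) -> None:
    """Run each step in order and stop at the first failure."""
    for command in build_commands(include_news, days_back):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)
```

Calling `run_pipeline(include_news=True)` from the project root would run all five steps. Stopping at the first failure avoids running silver transformations against incomplete bronze data.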
The infrastructure can be deployed to different environments by setting the environment variable in terraform.tfvars:
- Development: environment = "dev"
- Staging: environment = "staging"
- Production: environment = "prod"
Each environment will have its own separate database cluster.
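Because each value maps to a separate database cluster, a typo in the environment name could target the wrong cluster or fail late. A quick pre-flight check before running terraform plan can catch this. The sketch below is illustrative only: the helper `read_environment` is hypothetical, and it assumes the setting appears on its own line as a double-quoted string, as in the examples above.

```python
import re

# The three environment names listed in this document.
ALLOWED_ENVIRONMENTS = {"dev", "staging", "prod"}


def read_environment(tfvars_text: str) -> str:
    """Extract and validate the `environment` value from terraform.tfvars text."""
    match = re.search(r'^\s*environment\s*=\s*"([^"]*)"', tfvars_text, re.MULTILINE)
    if match is None:
        raise ValueError("no environment setting found in terraform.tfvars")
    value = match.group(1)
    if value not in ALLOWED_ENVIRONMENTS:
        raise ValueError(f"unexpected environment {value!r}")
    return value
```

For example, `read_environment(open("terraform.tfvars").read())` returns the configured name or raises `ValueError` if the setting is missing or not one of the three expected values.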
To update the infrastructure after making changes to the Terraform files:
cd infrastructure/terraform
terraform plan -out=tfplan # Preview changes
terraform apply tfplan # Apply changes
To update the application code:
- Pull the latest code from the repository
- Install any new dependencies:
poetry install
- Run any necessary database migrations or data processing scripts
You can monitor your database cluster through the Digital Ocean dashboard:
- Log in to your Digital Ocean account
- Navigate to Databases
- Select your database cluster
- View metrics such as CPU usage, memory usage, and disk usage
Digital Ocean automatically creates daily backups of your database cluster. You can also create manual backups:
- Log in to your Digital Ocean account
- Navigate to Databases
- Select your database cluster
- Click on "Backups"
- Click "Create Backup"
Digital Ocean handles most database maintenance tasks automatically, including:
- Security patches
- Minor version upgrades
- Failover testing
For major version upgrades, you will need to create a new database cluster and migrate your data.
- Database connection issues:
  - Verify that the database cluster is running
  - Check that you're using the correct connection details in your .env file
  - Ensure your firewall allows connections to the database port
- Application errors:
  - Check the application logs for error messages
  - Verify that all required environment variables are set
  - Ensure that the database schemas and tables are properly created
- Infrastructure provisioning issues:
  - Check the Terraform logs for error messages
  - Verify that your Digital Ocean API token has the necessary permissions
  - Ensure that you have sufficient quota in your Digital Ocean account
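Two of the checks above, missing environment variables and an unreachable database port, can be scripted. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and it assumes the variables from .env have already been loaded into the process environment (how the application loads .env is not covered here). NEWSAPI_KEY is left out of the required list because news extraction is optional.

```python
import os
import socket

# Database variables from the .env template in this document.
REQUIRED_VARS = [
    "POSTGRES_HOST",
    "POSTGRES_PORT",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_DB",
]


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def port_is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

If `missing_env_vars()` returns an empty list but `port_is_reachable(host, port)` is false, the problem is more likely the cluster status or a firewall rule than the .env file. A successful TCP connection only shows that the port is open; it does not verify the credentials.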
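If you need to roll back to a previous version of the infrastructure:
- Restore from a database backup (if necessary)
- Check out the earlier version of the Terraform files, then use Terraform to apply it. Terraform applies whatever the current configuration files describe; the -target option only limits the plan to a single resource:
  cd infrastructure/terraform
  terraform plan -out=tfplan -target=<resource_to_rollback>
  terraform apply tfplan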
If you need to completely destroy the infrastructure:
cd infrastructure/terraform
terraform destroy
Note that this will permanently delete all resources, including the database and its data.
© 2025 ByteMeDirk • Report Issues • Last updated: June 10, 2025