- 
                Notifications
    You must be signed in to change notification settings 
- Fork 3
          Version Control ECS Task Definition for graphql-engine in Moped
          #1707
        
          New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…orkflow - Introduced functions for determining and validating ECS task definitions based on the branch name. - Enhanced the migration metadata deployment workflow to include steps for checking out code, configuring AWS credentials, and validating the ECS task definition. - Updated comments for clarity and organization in the helper script.
- Replaced the previous AWS CLI installation method with the official installation instructions for AWS CLI v2. - Commented out the old installation and configuration steps for clarity and future reference.
- Renamed the function for validating ECS task definitions to better reflect its purpose, changing it to register the task definition. - Updated the logic to register the ECS task definition using AWS CLI instead of validating it. - Adjusted comments for clarity regarding the registration process and its success or failure.
- Introduced a new function to update the ECS service with the latest task definition. - Added environment variables for the ECS cluster and service. - Enhanced logging to provide more context during the update process. - Updated the main task definition update process to include the service update step.
…on comparison results with emojis
| ✅ Deploy Preview for atd-moped-main ready!
 To edit notification comments on pull requests, go to your Netlify project configuration. | 
| echo "GR: ${GITHUB_REF}" | ||
| echo "PWD: $(pwd)" | ||
| source $(pwd)/.github/workflows/migrations-metadata-deployment-helper.sh | ||
| # run_migration_process | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to avoid blasting main's database stack with the migrations and metadata in this branch as we dev and test, so this line is commented out. It needs to be commented back in just prior to merger (or as a direct push on main should we forget.)
| run: | | ||
| curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" | ||
| unzip awscliv2.zip | ||
| sudo ./aws/install --update | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was surprised to learn that we use the AWS cli version 1 in our old, existing CI here, but in retrospect, the timing is not too surprising. V2 came out in 2020. V1 is still supported, but V2 is the future.
Anyway, that's why I --update here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
H/T to @mddilley who spent some time with me looking as these task definitions until we were both satisfied that we were comfortable with them being released into the public. Thank you Mike 🙏
|  | ||
| # Extract just the taskDefinition object and remove AWS-managed fields | ||
| # These fields are added by AWS and shouldn't be compared | ||
| jq --sort-keys '.taskDefinition | del(.taskDefinitionArn, .revision, .status, .requiresAttributes, .compatibilities, .registeredAt, .registeredBy, .deregisteredAt, .tags)' \ | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sort keys is important here to make the ordering of KV pairs in the JSON deterministic, because we actually compare these json blobs with diff as if they were code, and not by doing a == inside JS or python or what-have-you.
Anyway - with jq here, we strip out things that are intrinsic (such as those items supplied by AWS) and we only look at what we're actually defining.
An upside of doing it this way is that it becomes trivial for us to include the output of that diff in the logs, which is a super easy way for a dev to figure out what change caused the new task definition to be deployed into the service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's really cool to see how you solved this. How are you feeling about how future proof this is, and also if concerns about false positives due to minor formatting differences?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@frankhereford I might be way overthinking, but I'm curious if we could consider deploying the task definition if the version-controlled file changes at all instead of comparing to what is currently deployed. Since we will be reviewing any code changes in a PR, we could give ourselves responsibility for making sure that we understand the changes we are pushing to be deployed in the service (like we do right now during the launch parties).
I'm also thinking about needing to maintain this list of fields that are added by AWS in the future. I really could be worrying to much but wanted to get some discussion going on this in case I am overlooking something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're doing some renaming of some files here to give them names that better represent what they do, not what they act upon.
| branches: | ||
| - main | ||
| - production | ||
| - frank/version-control-ecs-task-definitions | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line needs to be removed before merger, and it is what allows us to test this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements version control for ECS task definitions for the graphql-engine service in Moped. The changes enable automatic monitoring and deployment of task definition updates through CI/CD, replacing manual task definition management with a declarative approach.
Key changes:
- Adds new ECS task definition JSON files for staging and production environments
- Introduces automated task definition deployment workflow that detects changes, registers updates, and manages service deployments
- Consolidates and enhances the existing migration workflow to include task definition management
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description | 
|---|---|
| moped-database/ecs_task_definitions/staging.graphql-engine.ecs-td.json | New task definition for staging graphql-engine service with 2GB memory allocation | 
| moped-database/ecs_task_definitions/production.graphql-engine.ecs-td.json | New task definition for production graphql-engine service with 8GB memory allocation | 
| .github/workflows/migrations-metadata-deployment.yml | Enhanced workflow that now handles both migrations/metadata and ECS task definition deployments | 
| .github/workflows/migrations-metadata-deployment-helper.sh | New comprehensive helper script containing functions for migrations, task definition registration, cleanup, and service updates | 
| .github/workflows/aws-moped-migrations-helper.sh | Removed legacy helper script, functionality moved to migrations-metadata-deployment-helper.sh | 
| .github/workflows/atd_moped_database.yml | Removed legacy workflow, replaced by migrations-metadata-deployment.yml | 
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am releasing my comments before I test the workflow after John. 🥽 🧪 I mentioned you in one question that I'm still thinking through.
| }, | ||
| "memory": 2048, | ||
| "memoryReservation": 2048, | ||
| "mountPoints": [ ], | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't have a team-wide formatter for json that we're set on so I want to get feedback from others. I noticed that when I ran "Prettify JSON" from VSCode it remove the spaces between these brackets. I don't think it is blocking the merge of this since we will always review a code diff when updating this file anyways.
Mostly wondering if there is a good extension that I'm missing out on (or yet another one that doesn't work in my editor like with markdown 🫠 😅)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a side-by-side of the JSON in staging in the AWS console and noticed some small differences with compatibilities and requiresAttributes not present in this file but then I looked at the docs and see that we should be validating for Fargate only and that Fargate doesn't support the requiresAttributes parameter anyways. Makes sense to remove since we've always used Fargate and TIL! 🙏
| { | ||
| "name": "MOPED_API_ACTIONS_URL", | ||
| "valueFrom": "arn:aws:ssm:us-east-1:295525487728:parameter/MOPED_PRODUCTION_FARGATE_HASURA_ACTIONS_URL" | ||
| }, | ||
| { | ||
| "name": "MOPED_API_APIKEY", | ||
| "valueFrom": "arn:aws:ssm:us-east-1:295525487728:parameter/MOPED_PRODUCTION_FARGATE_HASURA_API_KEY" | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can make a spin-off issue snackoo to cover this but this reminds me that we have some cleanup to do in these secrets. I believe that these were used by our previous SQS-powered activity log events process.
| ################################################################################ | ||
|  | ||
| # | ||
| # Download the Hasura settings from the AWS Secrets Manager | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for cleaning up this last confusing reference to Zappa 😵💫🙏 and improving the comments ✨
|  | ||
| # Extract just the taskDefinition object and remove AWS-managed fields | ||
| # These fields are added by AWS and shouldn't be compared | ||
| jq --sort-keys '.taskDefinition | del(.taskDefinitionArn, .revision, .status, .requiresAttributes, .compatibilities, .registeredAt, .registeredBy, .deregisteredAt, .tags)' \ | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@frankhereford I might be way overthinking, but I'm curious if we could consider deploying the task definition if the version-controlled file changes at all instead of comparing to what is currently deployed. Since we will be reviewing any code changes in a PR, we could give ourselves responsibility for making sure that we understand the changes we are pushing to be deployed in the service (like we do right now during the launch parties).
I'm also thinking about needing to maintain this list of fields that are added by AWS in the future. I really could be worrying to much but wanted to get some discussion going on this in case I am overlooking something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This so cool, Frank. My tests work exactly as expected. The diff logging is super fancy. It is great getting to learn some dev ops magic through your PRs 🙌
Glad to see your big bold reminder to re-enable the migrations job and remove this branch from the paths trigger 😉
🚢 🚢
| # | ||
| # Deregisters old task definitions, keeping only the last 3 active ones | ||
| # | ||
| function cleanup_old_task_definitions() { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a really nice touch, and i'm mostly taking it for granted that this works as promised 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@frankhereford one last thought on this. I wonder if we might want to increase to preserving maybe 10 of the most recent revisions. Reason being that although the latest task definition may install successfully, it is still possible that the definition has a faulty configuration that causes the container deployment to fail. If something goes sideways, it might be a relief to have a working task def available in AWS so we can recover quickly while we triage the issue. This would be excessive caution, of course, because we do have task definitions version controlled in Github as well.
|  | ||
| # Extract just the taskDefinition object and remove AWS-managed fields | ||
| # These fields are added by AWS and shouldn't be compared | ||
| jq --sort-keys '.taskDefinition | del(.taskDefinitionArn, .revision, .status, .requiresAttributes, .compatibilities, .registeredAt, .registeredBy, .deregisteredAt, .tags)' \ | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's really cool to see how you solved this. How are you feeling about how future proof this is, and also if concerns about false positives due to minor formatting differences?

Associated issues
This PR aims to close cityofaustin/atd-data-tech#24398.
Intent
This PR extends our
graphql-engineCI such that it monitors the task definitions for ourgraphql-engineservices in our Moped clusters, and installs and deploys them when they change.Testing
As always, testing CI is a little different.
First, announce that you're testing to the devs. This is a one-person-testing-at-a-time type of PR.
When you are confident that it's just you working on this PR, then fire up your code editor, check out this branch and:
2048to4096. Remember to change it back when you're done please.Hard-ish Mode
Ship list