Skip to content

Add buildkite_add_trigger_step action #648

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
May 20, 2025
Merged

Conversation

AliSoftware
Copy link
Contributor

@AliSoftware AliSoftware commented May 15, 2025

Closes AINFRA-98

What does it do?

This implements a new fastlane action that allows to append a trigger step to the current build so that Buildkite would trigger a new, dependent sub-build of the current build.

How does this differ from other existing buildkite_* actions we already have?
  • This differs from the buildkite_trigger_build existing action—which uses the Buildkite REST API to trigger a build from outside a Buildkite build, typically when running a lane from your local terminal
  • This differs from calling buildkite-agent pipeline upload #{pipeline_file} to upload the steps of a YML pipeline file directly on the current Build, as instead it actually triggers a child build from the current build. The main different with this is that the triggered child build starts with a brand new build context, especially the BUILDKITE_BRANCH that the new build is started on will be the branch you asked the trigger to run on (typically the current branch you were checked out at the time you called that action), even if the current build started on a different branch and then switched to a new branch since, or pushed new commits since.

Why is this needed?

This will be needed especially during the release process of our apps, when we have tasks like the code-freeze that is responsible for creating a new branch + bumping the version, before compiling a beta build from the new branch and commit that just got pushed.

This action will be particularly useful for the case of our repos hosted in private GitHub Enterprise instances (e.g. Tumblr), for which their initial pipeline upload step to upload the root YAML pipeline file of a build also takes care of mirroring the latest commit of the BUILDKITE_BRANCH that that build starts on to an instance accessible by our other CI agents / builders, so that those CI agents can use that mirror repo as a source to checkout the source code they need to operate.

Details

Because of that particular mechanism, if a CI agent that runs in the LAN where the GH:E instance lives (and thus use the real repo) creates a new branch (e.g. release/* during code-freeze) and/or pushes a new commit (e.g. version bump), other agents that operate outside the LAN of the GH:E instance and that only use the GitHub.com mirror of the repo won't see the new branch or commits until the mirror is synchronized with the latest GH:E as part of a brand new CI build.

This is why just uploading the steps of the pipeline file directly on the current build won't work for those cases of repos hosted on GH:E; even if the commands that are in charge of building the beta were to git checkout release/${RELEASE_VERSION} and git pull (like we do for other repos hosted in GitHub.com), since the new branch and commit are only on the GH:E repo but not on the mirror at that time, that wouldn't help. Only triggering a separate CI build (as a child build of the current one) would allow our architecture to re-sync the git mirrors and the new jobs running on other agents to see the new branch and commit.

Testing

I created a use-trigger-action branch on Tumblr-iOS which:

  • Switched the Gemfile to point release-toolkit to this PR's branch
  • Update the Fastfile's trigger_buildkite_build helper to call this new buildkite_add_trigger_step action instead of invoking the trigger-build.sh custom script
  • Create a dummy fake_code_freeze lane that:
    • Creates a test/AINFRA-98 branch from the current branch
    • Simulates a dummy version bump and push it
    • Then calls trigger_buildkite_build helper method (and this this action) to trigger the fake-beta-build.yml pipeline
    • Which in turns just prints some git debug information instead of doing an actual build

Then I used the "New Build" button on Buildkite to trigger a build on that use-trigger-action branch using PIPELINE=release-pipelines/fake-code-freeze.yml

Notice how:

Checklist before requesting a review

  • Run bundle exec rubocop to test for code style violations and recommendations.
  • Add Unit Tests (aka specs/*_spec.rb) if applicable.
  • Run bundle exec rspec to run the whole test suite and ensure all your tests pass.
  • Make sure you added an entry in the CHANGELOG.md file to describe your changes under the appropriate existing ### subsection of the existing ## Trunk section.
  • If applicable, add an entry in the MIGRATION.md file to describe how the changes will affect the migration from the previous major version and what the clients will need to change and consider.

@dangermattic
Copy link
Collaborator

dangermattic commented May 15, 2025

1 Warning
⚠️ This PR is larger than 500 lines of changes. Please consider splitting it into smaller PRs for easier and faster reviews.

Generated by 🚫 Danger

When the tests are run from the Buildkite CI itself, the `BUILDKITE_PIPELINE_SLUG` env var would be set while the test runs, so the unit test case that is responsible for testing the case of `buildkite_pipeline_slug` parameter missing would not work as expected.

So we need to ensure we temporarily unset the env var for the duration of the test being run, so that it actually runs in a context we expect it to.
Defaulting to the current build's `BUILDKITE_STEP_KEY` if defined
@AliSoftware AliSoftware self-assigned this May 15, 2025
@AliSoftware AliSoftware marked this pull request as ready for review May 16, 2025 17:07
@AliSoftware AliSoftware requested review from iangmaia and a team May 16, 2025 17:07
pipeline_file = params[:pipeline_file]
build_name = File.basename(pipeline_file, '.yml')
message = params[:message] || build_name
branch = params[:branch] || `git rev-parse --abbrev-ref HEAD`.strip
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love the convenience and conciseness of using the backticks, but part of me says we'd be better off using the Git gem, specially thinking about error handling 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest I've been pulling back a bit about using the Git gem in release-toolkit, because since we started using it I've noticed a couple of things in its implementation details that caused issues.

In particular:

  • Using the Git gem to check if a branch exists locally has a side effect of the gem fetching and catching the list of all the branches in the repo, local and remote, before checking if the branch we asked about was in the array of cached branches
  • Which in some repos can take some time, at least more time than just checking if a local branch of a specific given name exists with a simple git branch --list "#{branch_name}"
  • And also causes errors in some cases because the implementation of the Git gem to parse the list of branches from git branch --list --all to cache the whole list uses some RegEx string parsing, and some of those fail to account for special character in branch names… including spaces.
  • We actually encountered errors not that long ago about some release automation failing just because it happened that one of the remote branches in a repo contained a space, even if that branch had nothing to do at all with what the release process was manipulating.

So after those experiences, performance issues, and bugs, my trust in this Git gem has lowered a bit since we first started considering it 😓

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and some of those fail to account for special character in branch names… including spaces

Indeed, I've seen it happening as well! I totally understand the losing of confidence in the lib.

I think it's really nice that the Git gem abstracts the details of git calls and the arguments... but at the same time we lose control and it might yield commands or have an internal implementation that might have side effects we don't know / cannot control.

One of my main worries is error handling -- I think using backticks will just "swallow" any eventual error? We can always also use sh (which is nice when we have arguments) but for simple cases it may be not strictly needed?

I'm fine on deciding to go in either direction as long as we're sure about the trade offs and keep it consistent 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's a fair point. And actually my initial implementation tried to use Open3.capture3 to run that command just like I use for running the buildkite-agent command… but then that overcomplicated the code and I figured it wasn't worth it.

I like the idea of using sh though. iinm we can even provide options to Fastlane's sh helper to tell it not to print the details of such shell command call in the output nor consider that call as being an action when it prints the summary table at the end of a lane… will check if that works.

Copy link
Contributor Author

@AliSoftware AliSoftware May 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched to using sh in 3568553, and tested how it looks in a real world use case via https://github.tumblr.net/Tumblr/ios/pull/29788/commits/147498ea3cd8f74aa3eddd762a1f018f7678d919 and https://buildkite.com/automattic/tumblr-ios/builds/33625/steps?sid=0196ed6b-2fad-4257-80d5-96ed9e40dab2#0196ed6b-2fe4-4e42-baad-c4a9f64dd40c/212-216:

[07:19:38 -0400]: ----------------------------------------
[07:19:38 -0400]: --- Step: buildkite_add_trigger_step ---
[07:19:38 -0400]: ----------------------------------------
[07:19:38 -0400]: $ git rev-parse --abbrev-ref HEAD
[07:19:38 -0400]: ▸ test/AINFRA-98
[07:19:39 -0400]: Added a trigger step to the current Buildkite build to start a new build for fake-beta-build.yml on branch test/AINFRA-98


+-----------------------------------------------------+
|                  fastlane summary                   |
+------+--------------------------------+-------------+
| Step | Action                         | Time (in s) |
+------+--------------------------------+-------------+
| 1    | setup_ci                       | 0           |
| 2    | git checkout -b test/AINFRA-98 | 0           |
| 3    | git_commit                     | 0           |
| 4    | push_to_git_remote             | 1           |
| 5    | is_ci                          | 0           |
| 6    | buildkite_add_trigger_step     | 1           |
+------+--------------------------------+-------------+

I guess it's ok for the command and its output to be printed in the logs then, since they are hidden under a collapsible group anyway (thanks to #638) and don't appear in the fastlane summary table 👍

Copy link
Contributor

@iangmaia iangmaia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this!
Just a couple of small comments but LGTM 👍

@AliSoftware AliSoftware merged commit b8748c6 into trunk May 20, 2025
6 checks passed
@AliSoftware AliSoftware deleted the buildkite-append-trigger-step branch May 20, 2025 11:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants