Add `buildkite_add_trigger_step` action #648

AliSoftware · 2025-05-15T18:01:49Z

Closes AINFRA-98

What does it do?

This implements a new fastlane action that allows to append a trigger step to the current build so that Buildkite would trigger a new, dependent sub-build of the current build.

How does this differ from other existing buildkite_* actions we already have?

This differs from the buildkite_trigger_build existing action—which uses the Buildkite REST API to trigger a build from outside a Buildkite build, typically when running a lane from your local terminal
This differs from calling buildkite-agent pipeline upload #{pipeline_file} to upload the steps of a YML pipeline file directly on the current Build, as instead it actually triggers a child build from the current build. The main different with this is that the triggered child build starts with a brand new build context, especially the BUILDKITE_BRANCH that the new build is started on will be the branch you asked the trigger to run on (typically the current branch you were checked out at the time you called that action), even if the current build started on a different branch and then switched to a new branch since, or pushed new commits since.

Why is this needed?

This will be needed especially during the release process of our apps, when we have tasks like the code-freeze that is responsible for creating a new branch + bumping the version, before compiling a beta build from the new branch and commit that just got pushed.

This action will be particularly useful for the case of our repos hosted in private GitHub Enterprise instances (e.g. Tumblr), for which their initial pipeline upload step to upload the root YAML pipeline file of a build also takes care of mirroring the latest commit of the BUILDKITE_BRANCH that that build starts on to an instance accessible by our other CI agents / builders, so that those CI agents can use that mirror repo as a source to checkout the source code they need to operate.

Details

Because of that particular mechanism, if a CI agent that runs in the LAN where the GH:E instance lives (and thus use the real repo) creates a new branch (e.g. release/* during code-freeze) and/or pushes a new commit (e.g. version bump), other agents that operate outside the LAN of the GH:E instance and that only use the GitHub.com mirror of the repo won't see the new branch or commits until the mirror is synchronized with the latest GH:E as part of a brand new CI build.

This is why just uploading the steps of the pipeline file directly on the current build won't work for those cases of repos hosted on GH:E; even if the commands that are in charge of building the beta were to git checkout release/${RELEASE_VERSION} and git pull (like we do for other repos hosted in GitHub.com), since the new branch and commit are only on the GH:E repo but not on the mirror at that time, that wouldn't help. Only triggering a separate CI build (as a child build of the current one) would allow our architecture to re-sync the git mirrors and the new jobs running on other agents to see the new branch and commit.

Testing

I created a use-trigger-action branch on Tumblr-iOS which:

Switched the Gemfile to point release-toolkit to this PR's branch
Update the Fastfile's trigger_buildkite_build helper to call this new buildkite_add_trigger_step action instead of invoking the trigger-build.sh custom script
Create a dummy fake_code_freeze lane that:
- Creates a test/AINFRA-98 branch from the current branch
- Simulates a dummy version bump and push it
- Then calls trigger_buildkite_build helper method (and this this action) to trigger the fake-beta-build.yml pipeline
- Which in turns just prints some git debug information instead of doing an actual build

Then I used the "New Build" button on Buildkite to trigger a build on that use-trigger-action branch using PIPELINE=release-pipelines/fake-code-freeze.yml

Notice how:

The "Trigger fake-beta-build on test/AINFRA-98" properly shows in the Canvas view as dependent on the job that created it, making it less confusing that the current setup
The child build that got triggered ran on the expected commit, being able to find on the GitHub.com mirror the commit that was pushed to GH:E by the parent build

Checklist before requesting a review

Run bundle exec rubocop to test for code style violations and recommendations.
Add Unit Tests (aka specs/*_spec.rb) if applicable.
Run bundle exec rspec to run the whole test suite and ensure all your tests pass.
Make sure you added an entry in the CHANGELOG.md file to describe your changes under the appropriate existing ### subsection of the existing ## Trunk section.
~~If applicable, add an entry in the MIGRATION.md file to describe how the changes will affect the migration from the previous major version and what the clients will need to change and consider.~~

dangermattic · 2025-05-15T18:02:28Z

	1 Warning
⚠️	This PR is larger than 500 lines of changes. Please consider splitting it into smaller PRs for easier and faster reviews.

Generated by 🚫 Danger

When the tests are run from the Buildkite CI itself, the `BUILDKITE_PIPELINE_SLUG` env var would be set while the test runs, so the unit test case that is responsible for testing the case of `buildkite_pipeline_slug` parameter missing would not work as expected. So we need to ensure we temporarily unset the env var for the duration of the test being run, so that it actually runs in a context we expect it to.

Defaulting to the current build's `BUILDKITE_STEP_KEY` if defined

iangmaia · 2025-05-19T19:35:59Z

lib/fastlane/plugin/wpmreleasetoolkit/actions/common/buildkite_add_trigger_step_action.rb

+        pipeline_file = params[:pipeline_file]
+        build_name = File.basename(pipeline_file, '.yml')
+        message = params[:message] || build_name
+        branch = params[:branch] || `git rev-parse --abbrev-ref HEAD`.strip


I love the convenience and conciseness of using the backticks, but part of me says we'd be better off using the Git gem, specially thinking about error handling 🤔

To be honest I've been pulling back a bit about using the Git gem in release-toolkit, because since we started using it I've noticed a couple of things in its implementation details that caused issues.

In particular:

Using the Git gem to check if a branch exists locally has a side effect of the gem fetching and catching the list of all the branches in the repo, local and remote, before checking if the branch we asked about was in the array of cached branches

Which in some repos can take some time, at least more time than just checking if a local branch of a specific given name exists with a simple git branch --list "#{branch_name}"

And also causes errors in some cases because the implementation of the Git gem to parse the list of branches from git branch --list --all to cache the whole list uses some RegEx string parsing, and some of those fail to account for special character in branch names… including spaces.

We actually encountered errors not that long ago about some release automation failing just because it happened that one of the remote branches in a repo contained a space, even if that branch had nothing to do at all with what the release process was manipulating.

So after those experiences, performance issues, and bugs, my trust in this Git gem has lowered a bit since we first started considering it 😓

and some of those fail to account for special character in branch names… including spaces

Indeed, I've seen it happening as well! I totally understand the losing of confidence in the lib.

I think it's really nice that the Git gem abstracts the details of git calls and the arguments... but at the same time we lose control and it might yield commands or have an internal implementation that might have side effects we don't know / cannot control.

One of my main worries is error handling -- I think using backticks will just "swallow" any eventual error? We can always also use sh (which is nice when we have arguments) but for simple cases it may be not strictly needed?

I'm fine on deciding to go in either direction as long as we're sure about the trade offs and keep it consistent 👍

Yeah that's a fair point. And actually my initial implementation tried to use Open3.capture3 to run that command just like I use for running the buildkite-agent command… but then that overcomplicated the code and I figured it wasn't worth it.

I like the idea of using sh though. iinm we can even provide options to Fastlane's sh helper to tell it not to print the details of such shell command call in the output nor consider that call as being an action when it prints the summary table at the end of a lane… will check if that works.

Switched to using sh in 3568553, and tested how it looks in a real world use case via https://github.tumblr.net/Tumblr/ios/pull/29788/commits/147498ea3cd8f74aa3eddd762a1f018f7678d919 and https://buildkite.com/automattic/tumblr-ios/builds/33625/steps?sid=0196ed6b-2fad-4257-80d5-96ed9e40dab2#0196ed6b-2fe4-4e42-baad-c4a9f64dd40c/212-216:

[07:19:38 -0400]: ---------------------------------------- [07:19:38 -0400]: --- Step: buildkite_add_trigger_step --- [07:19:38 -0400]: ---------------------------------------- [07:19:38 -0400]: $ git rev-parse --abbrev-ref HEAD [07:19:38 -0400]: ▸ test/AINFRA-98 [07:19:39 -0400]: Added a trigger step to the current Buildkite build to start a new build for fake-beta-build.yml on branch test/AINFRA-98 +-----------------------------------------------------+ | fastlane summary | +------+--------------------------------+-------------+ | Step | Action | Time (in s) | +------+--------------------------------+-------------+ | 1 | setup_ci | 0 | | 2 | git checkout -b test/AINFRA-98 | 0 | | 3 | git_commit | 0 | | 4 | push_to_git_remote | 1 | | 5 | is_ci | 0 | | 6 | buildkite_add_trigger_step | 1 | +------+--------------------------------+-------------+

I guess it's ok for the command and its output to be printed in the logs then, since they are hidden under a collapsible group anyway (thanks to #638) and don't appear in the fastlane summary table 👍

lib/fastlane/plugin/wpmreleasetoolkit/actions/common/buildkite_add_trigger_step_action.rb

spec/buildkite_add_trigger_step_action_spec.rb

CHANGELOG.md

See https://github.com/wordpress-mobile/release-toolkit/pull/648\#discussion_r2096368024

Co-authored-by: Ian G. Maia <iangmaia@users.noreply.github.com>

iangmaia

Thanks for this!
Just a couple of small comments but LGTM 👍

AliSoftware added 2 commits May 15, 2025 20:01

Action implementation

aabdd56

Add unit tests

2eeb61f

AliSoftware added 3 commits May 15, 2025 20:40

Add support for depends_on parameter

e959bd0

Defaulting to the current build's `BUILDKITE_STEP_KEY` if defined

Fix rubocop violation

e14bcf0

AliSoftware self-assigned this May 15, 2025

Fix pipeline upload error detection

48c9bc3

AliSoftware marked this pull request as ready for review May 16, 2025 17:07

AliSoftware requested review from iangmaia and a team May 16, 2025 17:07

Add CHANGELOG entry

296fc86

iangmaia reviewed May 19, 2025

View reviewed changes

lib/fastlane/plugin/wpmreleasetoolkit/actions/common/buildkite_add_trigger_step_action.rb Show resolved Hide resolved

iangmaia reviewed May 20, 2025

View reviewed changes

lib/fastlane/plugin/wpmreleasetoolkit/actions/common/buildkite_add_trigger_step_action.rb Outdated Show resolved Hide resolved

iangmaia reviewed May 20, 2025

View reviewed changes

lib/fastlane/plugin/wpmreleasetoolkit/actions/common/buildkite_add_trigger_step_action.rb Outdated Show resolved Hide resolved

iangmaia reviewed May 20, 2025

View reviewed changes

spec/buildkite_add_trigger_step_action_spec.rb Outdated Show resolved Hide resolved

iangmaia reviewed May 20, 2025

View reviewed changes

CHANGELOG.md Show resolved Hide resolved

AliSoftware and others added 2 commits May 20, 2025 13:00

Catch early if we're running the action outside a Buildkite job

eb52f77

See https://github.com/wordpress-mobile/release-toolkit/pull/648\#discussion_r2096368024

Apply suggestions from code review

b421157

Co-authored-by: Ian G. Maia <iangmaia@users.noreply.github.com>

iangmaia approved these changes May 20, 2025

View reviewed changes

Use sh instead of backticks to get current branch

3568553

AliSoftware merged commit b8748c6 into trunk May 20, 2025
6 checks passed

AliSoftware deleted the buildkite-append-trigger-step branch May 20, 2025 11:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add `buildkite_add_trigger_step` action #648

Add `buildkite_add_trigger_step` action #648

Uh oh!

AliSoftware commented May 15, 2025 •

edited

Loading

Uh oh!

dangermattic commented May 15, 2025 •

edited

Loading

Uh oh!

iangmaia May 19, 2025

Uh oh!

AliSoftware May 20, 2025

Uh oh!

iangmaia May 20, 2025

Uh oh!

AliSoftware May 20, 2025

Uh oh!

AliSoftware May 20, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

iangmaia left a comment

Uh oh!

Uh oh!

Uh oh!

Add buildkite_add_trigger_step action #648

Add buildkite_add_trigger_step action #648

Uh oh!

Conversation

AliSoftware commented May 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does it do?

Why is this needed?

Testing

Checklist before requesting a review

Uh oh!

dangermattic commented May 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

iangmaia May 19, 2025

Choose a reason for hiding this comment

Uh oh!

AliSoftware May 20, 2025

Choose a reason for hiding this comment

Uh oh!

iangmaia May 20, 2025

Choose a reason for hiding this comment

Uh oh!

AliSoftware May 20, 2025

Choose a reason for hiding this comment

Uh oh!

AliSoftware May 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

iangmaia left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Add `buildkite_add_trigger_step` action #648

Add `buildkite_add_trigger_step` action #648

AliSoftware commented May 15, 2025 •

edited

Loading

dangermattic commented May 15, 2025 •

edited

Loading

AliSoftware May 20, 2025 •

edited

Loading