Skip to content

Add stacks plugin proto #36931

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open

Add stacks plugin proto #36931

wants to merge 11 commits into from

Conversation

Uk1288
Copy link
Contributor

@Uk1288 Uk1288 commented Apr 28, 2025

Fixes #

  • This PR implements the subcommand terraform stacks
  • The new command terraform stacks exposes some Terraform Stack operations through the cli. The available subcommands depend on the stacks plugin implementation.

NOTE: the terraform stacks -help will be implemented in a separate PR

Target Release

1.13.x

CHANGELOG entry

  • The new command terraform stacks exposes some Terraform Stack operations through the cli. The available subcommands depend on the stacks plugin implementation. Use terraform stacks -help to see available commands.

  • This change is user-facing and I added a changelog entry.

  • This change is not user-facing.

NOTE: Test steps are written in the project slack channel

Copy link
Contributor

github-actions bot commented Apr 28, 2025

Changelog Warning

Currently this PR would target a v1.13 release. Please add a changelog entry for in the .changes/v1.13 folder, or discuss which release you'd like to target with your reviewer. If you believe this change does not need a changelog entry, please add the 'no-changelog-needed' label.

@Uk1288 Uk1288 force-pushed the TF-25178-add-stacks-plugin-proto branch from 03462d8 to 3bc1359 Compare May 5, 2025 20:08
@Uk1288 Uk1288 marked this pull request as ready for review May 5, 2025 21:37
@Uk1288 Uk1288 requested review from a team as code owners May 5, 2025 21:37
Copy link
Contributor

@Maed223 Maed223 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Smoke testing is looking good from the perspective of the Stacks plugin 🚀 ! Mostly just one point about about an additional bit of needed info in the StacksPluginConfig

Otherwise I'm realizing that we aren't fully honoring the -json flag here when we have diagnostics returned. Though it's an awkward situation since not all stacks subcommands implement a json view. Don't think it's a massive problem since we'd only return diags here if the hostname/token env vars aren't configured, or if the plugin dev override is in use. Wanted to call that out, but I don't have a strong opinion on how/if to handle it.

@Maed223
Copy link
Contributor

Maed223 commented May 12, 2025

Found out the cause (and the fixes) of the occasional nil panics we were seeing in the prototype, stemming from services here not being properly managed. One note is that in addition to what we were seeing in stacks validate, I also saw a somewhat similar issue in stacks init with the packages and dependencies server. In any case we can get those both solved here!

Accessing the service before it's fully set up

In stacks init we were seeing the rpc error: code = Unimplemented desc = unknown service terraform1.packages.Packages error from trying to access the service before it gets fully set up.

This looks to be a bit of a race condition since we start the services within asynchronous go routines:

dependenciesBrokerID := c.Broker.NextId()
go c.Broker.AcceptAndServe(dependenciesBrokerID, dependenciesServerFunc)

packagesBrokerID := c.Broker.NextId()
go c.Broker.AcceptAndServe(packagesBrokerID, packagesServerFunc)

stacksBrokerID := c.Broker.NextId()
go c.Broker.AcceptAndServe(stacksBrokerID, stacksServerFunc)

but registerBrokers immediately returns the broker IDs, so there's no guarantee that the servers are fully initialized and ready to accept connections when the Stacks Plugin tries to use them.

Solution

Implement a mechanism to ensure all broker services are fully initialized before proceeding:

func (c GRPCStacksClient) registerBrokers(stdout, stderr io.Writer) brokerIDs {
    // ... existing code ...
    
    // Create channels to signal when each service is ready
    dependenciesReady := make(chan struct{})
    packagesReady := make(chan struct{})
    stacksReady := make(chan struct{})
    
    dependenciesServerFunc := func(opts []grpc.ServerOption) *grpc.Server {
        s = grpc.NewServer(opts...)
        dependencies.RegisterDependenciesServer(s, dependenciesServer)
        dependenciesServer.ActivateRPCServer(newDependenciesServer(handles, c.Services))
        close(dependenciesReady) // Signal that this service is ready
        return s
    }
    
    // Similar modifications for other services...
    
    // Wait for all services to be ready
    <-dependenciesReady
    <-packagesReady
    <-stacksReady
    
    return brokerIDs{...}
}

The service closing before the complete execution of the command

This was the class of problem we were seeing in the stacks validate panic.

In registerBrokers, within each server function we are assigning the new server to the existing var s *grpc.Server pointer variable that's outside of each function scope. Each server function overwrites that variable, causing it to lose it's last reference making it eligible for garbage collection. So this leads us to a situation where only the last server function has a stable reference, and the earlier ones could be garbage collected once their goroutines completed setting up the services. This seems to be why we only saw the panics for the dependencies and packages services (the stacks service being the last server function created).

Solution

Get rid of the var s *grpc.Server and create independent variables in each function's scope:

dependenciesServerFunc := func(opts []grpc.ServerOption) *grpc.Server {
    s := grpc.NewServer(opts...)  //  the := creating a new variable
    // ... other code ...
    return s
}

ctx: ctx,
serviceURL: serviceURL,
httpClient: retryableClient,
pluginName: "cloudplugin",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you are scrubbing the term "cloudplugin" from this package, which I think is a good idea. But it's not clear to me if you went as far as you could go or if you just missed a few of them. This is just one example.

IMO, it's OK to remove or manipulate all cloudplugin features from terraform and shift those over to stacks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went about the rearrangement by creating a base client so that each plugin could use an instance of the base client, that's why there is still both cloudclient and stacksclient in the package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants