Skip to content

Implement retry flaky steps #684

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,10 @@ This document is formatted according to the principles of [Keep A CHANGELOG](htt

## Unreleased

### Added
- Support retry flaky steps with `ErrRetry` error - ([684](https://github.com/cucumber/godog/pull/684) - [nhv2796](https://github.com/nhv2796))
- CLI option to specify number of retry attempts - ([684](https://github.com/cucumber/godog/pull/684) - [nhv2796](https://github.com/nhv2796))

### Added
- Step text is added to "step is undefined" error - ([669](https://github.com/cucumber/godog/pull/669) - [vearutop](https://github.com/vearutop))

Expand Down
19 changes: 19 additions & 0 deletions _examples/retry-flaky/features/retry.feature
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
Feature: Godog should be able to retry flaky tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whilst the principle here is good, I'd lean more on you implementing the CCK for things like this, so that you can use the conformance element there

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i quite don't understand your comment here, could you please explain me further?

In order to help Go developers deal with flaky tests
As a test suite
I need to be able to return godog.Err to mark which steps should be retry
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more a question, but usually in feature files if it's a technical term you're referencing you'd use backticks or something else.

But maybe you don't do that here 🤷‍♂️

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right, let me add it


Scenario: Test cases that pass aren't retried
Given a step that always passes

Scenario: Test cases that fail are retried if within the limit
Given a step that passes the second time

Scenario: Test cases that fail will continue to retry up to the limit
Given a step that passes the third time

Scenario: Test cases won't retry after failing more than the limit
Given a step that always fails

Scenario: Test cases won't retry when the status is UNDEFINED
Given a non-existent step
3 changes: 3 additions & 0 deletions _examples/retry-flaky/godogs.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
package main

func main() { /* usual main func */ }
53 changes: 53 additions & 0 deletions _examples/retry-flaky/godogs_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
package main

import (
"context"
"errors"
"testing"

"github.com/cucumber/godog"
"github.com/stretchr/testify/assert"
)

func Test_RetryFlaky(t *testing.T) {
suite := godog.TestSuite{
Name: "retry flaky tests",
ScenarioInitializer: InitializeScenario,
Options: &godog.Options{
Format: "pretty",
Paths: []string{"features/retry.feature"},
MaxRetries: 2,
},
}

// expect it to fail
assert.Equal(t, 1, suite.Run())
}

func InitializeScenario(sc *godog.ScenarioContext) {
sc.Step(`^a step that always passes`, func(ctx context.Context) (context.Context, error) {
return ctx, nil
})

secondTimePass := 0
sc.Step(`^a step that passes the second time`, func(ctx context.Context) (context.Context, error) {
secondTimePass++
if secondTimePass < 2 {
return ctx, godog.ErrRetry
}
return ctx, nil
})

thirdTimePass := 0
sc.Step(`^a step that passes the third time`, func(ctx context.Context) (context.Context, error) {
thirdTimePass++
if thirdTimePass < 3 {
return ctx, godog.ErrRetry
}
return ctx, nil
})

sc.Step(`^a step that always fails`, func(ctx context.Context) (context.Context, error) {
return ctx, errors.New("must fail")
})
}
2 changes: 2 additions & 0 deletions internal/flags/flags.go
Original file line number Diff line number Diff line change
Expand Up @@ -46,4 +46,6 @@ built-in formatters are:
specify SEED to reproduce the shuffling from a previous run
--random=5738`)
flagSet.Lookup(prefix + "random").NoOptDefVal = "-1"

flagSet.IntVar(&opts.MaxRetries, prefix+"retry", opts.MaxRetries, "specify the number of times to retry failing tests (default: 0)")
}
4 changes: 4 additions & 0 deletions internal/flags/flags_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@ func Test_BindFlagsShouldRespectFlagDefaults(t *testing.T) {
assert.False(t, opts.Strict)
assert.False(t, opts.NoColors)
assert.Equal(t, int64(0), opts.Randomize)
assert.Equal(t, 0, opts.MaxRetries)
}

func Test_BindFlagsShouldRespectFlagOverrides(t *testing.T) {
Expand All @@ -37,6 +38,7 @@ func Test_BindFlagsShouldRespectFlagOverrides(t *testing.T) {
Strict: true,
NoColors: true,
Randomize: 11,
MaxRetries: 1,
}
flagSet := pflag.FlagSet{}

Expand All @@ -51,6 +53,7 @@ func Test_BindFlagsShouldRespectFlagOverrides(t *testing.T) {
"--optOverrides.strict=false",
"--optOverrides.no-colors=false",
"--optOverrides.random=2",
"--optOverrides.retry=3",
})

assert.Equal(t, "junit", opts.Format)
Expand All @@ -61,4 +64,5 @@ func Test_BindFlagsShouldRespectFlagOverrides(t *testing.T) {
assert.False(t, opts.Strict)
assert.False(t, opts.NoColors)
assert.Equal(t, int64(2), opts.Randomize)
assert.Equal(t, 3, opts.MaxRetries)
}
3 changes: 3 additions & 0 deletions internal/flags/options.go
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,9 @@ type Options struct {

// ShowHelp enables suite to show CLI flags usage help and exit.
ShowHelp bool

// MaxRetries is the number of times you can retry failing tests. Default is 0 retry.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// MaxRetries is the number of times you can retry failing tests. Default is 0 retry.
// MaxRetries is the number of times you can retry failing tests. Default is 0 retries.

MaxRetries int
}

type Feature struct {
Expand Down
1 change: 1 addition & 0 deletions internal/formatters/fmt.go
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@ var (
undefined = models.Undefined
pending = models.Pending
ambiguous = models.Ambiguous
retry = models.Retry
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whilst this is godog, and you are allowed to do your own thing, if your ultimate goal is cucumber conformance, this would need to go.

We have a concept called "flaky" which is something that passed after retries, maybe this can be renamed or adapted I don't know

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i understand, this variable should indeed called Flaky.

do you think in this example, to mark that a step needs to retry and return the error godog.ErrFlaky, is also a good choice or we should come up with a different meaningful error?

)

type sortFeaturesByName []*models.Feature
Expand Down
9 changes: 9 additions & 0 deletions internal/models/results.go
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,10 @@ type PickleStepResult struct {
Def *StepDefinition

Attachments []PickleAttachment

// RunAttempt tells of what run attempt this step result belongs to.
// Default is 1 since every run is the first run.
RunAttempt int
}

// NewStepResult ...
Expand All @@ -55,6 +59,7 @@ func NewStepResult(
PickleStepID: pickleStepID,
Def: match,
Attachments: attachments,
RunAttempt: 1,
}
}

Expand All @@ -74,6 +79,8 @@ const (
Pending
// Ambiguous ...
Ambiguous
// Retry ...
Retry
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might / should be altered too

)

// Color ...
Expand Down Expand Up @@ -105,6 +112,8 @@ func (st StepResultStatus) String() string {
return "pending"
case Ambiguous:
return "ambiguous"
case Retry:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here also

return "retry"
default:
return "unknown"
}
Expand Down
23 changes: 22 additions & 1 deletion internal/storage/storage.go
Original file line number Diff line number Diff line change
Expand Up @@ -217,7 +217,7 @@ func (s *Storage) MustGetPickleResults() (prs []models.PickleResult) {
return prs
}

// MustGetPickleStepResult will retrieve a pickle strep result by id and panic on error.
// MustGetPickleStepResult will retrieve a pickle step result by id and panic on error.
func (s *Storage) MustGetPickleStepResult(id string) models.PickleStepResult {
v := s.mustFirst(tablePickleStepResult, tablePickleStepResultIndexPickleStepID, id)
return v.(models.PickleStepResult)
Expand Down Expand Up @@ -279,6 +279,15 @@ func (s *Storage) MustGetFeatures() (fs []*models.Feature) {
return
}

// GetPickleStepResultByStepID will retrieve a pickle step result by id and panic on error.
func (s *Storage) GetPickleStepResultByStepID(id string) models.PickleStepResult {
v := s.first(tablePickleStepResult, tablePickleStepResultIndexPickleStepID, id)
if v != nil {
return v.(models.PickleStepResult)
}
return models.PickleStepResult{}
}

type stepDefinitionMatch struct {
StepID string
StepDefinition *models.StepDefinition
Expand Down Expand Up @@ -336,3 +345,15 @@ func (s *Storage) mustGet(table, index string, args ...interface{}) memdb.Result

return it
}

func (s *Storage) first(table, index string, args ...interface{}) interface{} {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am writing this comment to let you know I don't understand this bit. But I'm assuming it does something important 😅

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function is similar to the mustFirst function here, which is retrieving the first record in the database, but that record must exist or the program will panic.

so here the first function is a less strict version, in case the record does not exist, it will just return nil.

I use it for this function to query the Pickle step result by ID. I need it to return nil or empty pointer so that I know when to insert a new pickle step result and when to update the current result status.

txn := s.db.Txn(readMode)
defer txn.Abort()

v, err := txn.First(table, index, args...)
if err != nil {
panic(err)
}

return v
}
131 changes: 131 additions & 0 deletions retry_flaky_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,131 @@
package godog

import (
"bytes"
"context"
"fmt"
"io"
"testing"

"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)

func Test_RetryFlakySteps(t *testing.T) {
output := new(bytes.Buffer)

featureContents := []Feature{
{
Name: "RetryFlaky",
Contents: []byte(`
Feature: retry flaky steps
Scenario: Test cases that pass aren't retried
Given a step that always passes

Scenario: Test cases that fail are retried if within the limit
Given a step that passes the second time

Scenario: Test cases that fail will continue to retry up to the limit
Given a step that passes the third time

Scenario: Test cases won't retry after failing more than the limit
Given a step that always fails

Scenario: Test cases won't retry when the status is UNDEFINED
Given a non-existent step
`),
},
}

opts := Options{
NoColors: true,
Format: "pretty",
Output: output,
FeatureContents: featureContents,
MaxRetries: 3,
}

status := TestSuite{
Name: "retry flaky",
ScenarioInitializer: func(ctx *ScenarioContext) {
ctx.Step(`^a step that always passes`, func(ctx context.Context) (context.Context, error) {
return ctx, nil
})

secondTimePass := 0
ctx.Step(`^a step that passes the second time`, func(ctx context.Context) (context.Context, error) {
secondTimePass++
if secondTimePass < 2 {
return ctx, fmt.Errorf("unexpected network connection, %w", ErrRetry)
}
return ctx, nil
})

thirdTimePass := 0
ctx.Step(`^a step that passes the third time`, func(ctx context.Context) (context.Context, error) {
thirdTimePass++
if thirdTimePass < 3 {
return ctx, fmt.Errorf("unexpected network connection, %w", ErrRetry)
}
return ctx, nil
})

fifthTimePass := 0
ctx.Step(`^a step that always fails`, func(ctx context.Context) (context.Context, error) {
fifthTimePass++
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like duplicate info from above. Is there a way to avoid defining this twice.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I forgot, this file is unit tests for the framework itself. So this case here is to cover the case step retried all attempts but still fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So based on what I've seen earlier, this should ideally live in a separate folder called compatibility and you should look towards CCK conformance imo.

Maybe come and discuss this on discord. See what the desired avenue is in the go community

if fifthTimePass < 5 {
return ctx, fmt.Errorf("must fail, %w", ErrRetry)
}
return ctx, nil
})
},
Options: &opts,
}.Run()

const expected = `Feature: retry flaky steps

Scenario: Test cases that pass aren't retried # RetryFlaky:3
Given a step that always passes # retry_flaky_test.go:51 -> github.com/cucumber/godog.Test_RetryFlakySteps.func1.1

Scenario: Test cases that fail are retried if within the limit # RetryFlaky:6
Given a step that passes the second time # retry_flaky_test.go:56 -> github.com/cucumber/godog.Test_RetryFlakySteps.func1.2

Scenario: Test cases that fail will continue to retry up to the limit # RetryFlaky:9
Given a step that passes the third time # retry_flaky_test.go:65 -> github.com/cucumber/godog.Test_RetryFlakySteps.func1.3

Scenario: Test cases won't retry after failing more than the limit # RetryFlaky:12
Given a step that always fails # retry_flaky_test.go:74 -> github.com/cucumber/godog.Test_RetryFlakySteps.func1.4
must fail, retry step

Scenario: Test cases won't retry when the status is UNDEFINED # RetryFlaky:15
Given a non-existent step

--- Failed steps:

Scenario: Test cases won't retry after failing more than the limit # RetryFlaky:12
Given a step that always fails # RetryFlaky:13
Error: must fail, retry step


5 scenarios (3 passed, 1 failed, 1 undefined)
5 steps (3 passed, 1 failed, 1 undefined)
0s

You can implement step definitions for undefined steps with these snippets:

func aNonexistentStep() error {
return godog.ErrPending
}

func InitializeScenario(ctx *godog.ScenarioContext) {
ctx.Step(` + "`^a non-existent step$`" + `, aNonexistentStep)
}

`

actualOutput, err := io.ReadAll(output)
require.NoError(t, err)

assert.Equal(t, exitFailure, status)
assert.Equal(t, expected, string(actualOutput))
}
4 changes: 4 additions & 0 deletions run.go
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,8 @@ type runner struct {

storage *storage.Storage
fmt Formatter

maxRetries int
}

func (r *runner) concurrent(rate int) (failed bool) {
Expand All @@ -69,6 +71,7 @@ func (r *runner) concurrent(rate int) (failed bool) {
storage: r.storage,
defaultContext: r.defaultContext,
testingT: r.testingT,
maxRetries: r.maxRetries,
},
}
if r.testSuiteInitializer != nil {
Expand Down Expand Up @@ -283,6 +286,7 @@ func runWithOptions(suiteName string, runner runner, opt Options) int {
runner.strict = opt.Strict
runner.defaultContext = opt.DefaultContext
runner.testingT = opt.TestingT
runner.maxRetries = opt.MaxRetries

// store chosen seed in environment, so it could be seen in formatter summary report
os.Setenv("GODOG_SEED", strconv.FormatInt(runner.randomSeed, 10))
Expand Down
Loading