
Refactor end-to-end tests #525


Open · wants to merge 3 commits into main
Changes from 2 commits
7 changes: 3 additions & 4 deletions controllers/disruption_controller_test.go
@@ -41,12 +41,11 @@ func listChaosPods(instance *chaosv1beta1.Disruption) (corev1.PodList, error) {
ls := labels.NewSelector()

// create requirements
- targetPodRequirement, _ := labels.NewRequirement(chaostypes.TargetLabel, selection.In, []string{"foo", "bar", "minikube"})
disruptionNameRequirement, _ := labels.NewRequirement(chaostypes.DisruptionNameLabel, selection.Equals, []string{instance.Name})
disruptionNamespaceRequirement, _ := labels.NewRequirement(chaostypes.DisruptionNamespaceLabel, selection.Equals, []string{instance.Namespace})

// add requirements to label selector
- ls = ls.Add(*targetPodRequirement, *disruptionNamespaceRequirement, *disruptionNameRequirement)
+ ls = ls.Add(*disruptionNamespaceRequirement, *disruptionNameRequirement)

// get matching pods
if err := k8sClient.List(context.Background(), &l, &client.ListOptions{
@@ -134,7 +133,7 @@ var _ = Describe("Disruption Controller", func() {
disruption = &chaosv1beta1.Disruption{
ObjectMeta: metav1.ObjectMeta{
Name: "foo",
- Namespace: "default",
+ Namespace: namespace,
},
Spec: chaosv1beta1.DisruptionSpec{
DryRun: true,
@@ -224,7 +223,7 @@ var _ = Describe("Disruption Controller", func() {
disruption = &chaosv1beta1.Disruption{
ObjectMeta: metav1.ObjectMeta{
Name: "foo",
- Namespace: "default",
+ Namespace: namespace,
},
Spec: chaosv1beta1.DisruptionSpec{
DryRun: false,
31 changes: 24 additions & 7 deletions controllers/suite_test.go
@@ -45,7 +45,8 @@ import (
// http://onsi.github.io/ginkgo/ to learn more about Ginkgo.

const (
- timeout = time.Second * 45
+ timeout         = time.Second * 45
+ hostTopologyKey = "kubernetes.io/hostname"
)

var (
@@ -56,6 +57,9 @@ var (
instanceKey types.NamespacedName
targetPodA *corev1.Pod
targetPodB *corev1.Pod
+ namespace string
+ nodeLabel map[string]string
+ testPodImage string
)

func TestAPIs(t *testing.T) {
@@ -99,19 +103,25 @@ var _ = BeforeSuite(func(done Done) {
Expect(err).ToNot(HaveOccurred())
}()

- instanceKey = types.NamespacedName{Name: "foo", Namespace: "default"}
+ // wait for the cache to sync
+ time.Sleep(10 * time.Second)
Contributor:
What cache are we talking about here?

Contributor Author:
I was running these tests vs a remote cluster and I was getting an error message along the lines of:

the cache is not started, can not read objects

Here we create the Kubernetes client. This will watch for and cache objects. I'm assuming in remote clusters this takes some time.

I can drop this tbh as it is not an issue on minikube and may have to do with our clusters. But it took me a while to debug!

Contributor:
Do you recall which test you were getting this error on? I would prefer to have a check in an Eventually clause somewhere in the test setup so we do not blindly wait for 10s but start as soon as we can read objects (if that is possible, of course).
We will soon run those tests against both minikube and remote clusters, so we'll likely face the same issue.

Contributor Author:
I can't remember, sorry... It was at the very beginning though, either as part of the setup or when deploying the first CR.
I implemented this as a temporary workaround but I imagine we could use the waitForCacheSync method.
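
A minimal sketch of that idea, assuming the suite keeps a reference to the controller-runtime manager (the k8sManager and waitForCache names below are illustrative, not part of this PR) and a controller-runtime version whose WaitForCacheSync takes a context; it reuses the suite's existing context, time, Ginkgo and Gomega imports:

```go
// waitForCache blocks the suite setup until the manager cache has synced,
// instead of sleeping for a fixed 10 seconds.
func waitForCache(syncTimeout time.Duration) {
	By("waiting for the manager cache to sync")

	ctx, cancel := context.WithTimeout(context.Background(), syncTimeout)
	defer cancel()

	// WaitForCacheSync blocks until every informer has synced or the context
	// is cancelled; it returns false if the cache could not sync in time.
	Expect(k8sManager.GetCache().WaitForCacheSync(ctx)).To(BeTrue())
}
```

Called from BeforeSuite right after the manager is started, e.g. waitForCache(timeout), this fails fast instead of waiting blindly.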

Contributor Author:
Ok, I've reproduced this.

This is because we have more logic in our tests; we run tests for network Availability Zone failures. For this we need to check which Node/Availability Zone the controller gets deployed to so we do not target that one. We achieve this through Affinity rules on the test Pods.

The error message is the following:

Unexpected error:
      <*fmt.wrapError | 0xc0008ec540>: {
          msg: "can't list controller pods: the cache is not started, can not read objects",
          err: <*cache.ErrCacheNotStarted | 0x323dbb0>{},
      }
      can't list controller pods: the cache is not started, can not read objects
  occurred

I have a feeling this was happening even before adding that logic though. Ultimately, in any call where we list resources, we need to make sure the cache is ready.

I've pushed an update to the PR. I hope that makes more sense.
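
For reference, a rough sketch of the kind of affinity rule described above, assuming the controller's node name has already been looked up during setup (controllerNodeName and avoidNode are illustrative names, not from this PR); it reuses the hostTopologyKey constant added in this change:

```go
// avoidNode builds an affinity that keeps a test pod off the given node, so
// node/AZ disruptions never target the node running the controller.
func avoidNode(controllerNodeName string) *corev1.Affinity {
	return &corev1.Affinity{
		NodeAffinity: &corev1.NodeAffinity{
			RequiredDuringSchedulingIgnoredDuringExecution: &corev1.NodeSelector{
				NodeSelectorTerms: []corev1.NodeSelectorTerm{
					{
						MatchExpressions: []corev1.NodeSelectorRequirement{
							{
								// schedule anywhere except the controller's node
								Key:      hostTopologyKey,
								Operator: corev1.NodeSelectorOpNotIn,
								Values:   []string{controllerNodeName},
							},
						},
					},
				},
			},
		},
	}
}
```

The test pods would then opt in with something like targetPodA.Spec.Affinity = avoidNode(node) before being created.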


+ // set up environment-specific properties
+ setupEnvironment()

+ instanceKey = types.NamespacedName{Name: "foo", Namespace: namespace}
targetPodA = &corev1.Pod{
ObjectMeta: metav1.ObjectMeta{
Name: "foo",
- Namespace: "default",
+ Namespace: namespace,
Labels: map[string]string{
"foo": "bar",
},
},
Spec: corev1.PodSpec{
Containers: []corev1.Container{
{
- Image: "k8s.gcr.io/pause:3.4.1",
+ Image: testPodImage,
Name: "ctn1",
VolumeMounts: []corev1.VolumeMount{
{
@@ -121,7 +131,7 @@ var _ = BeforeSuite(func(done Done) {
},
},
{
- Image: "k8s.gcr.io/pause:3.4.1",
+ Image: testPodImage,
Name: "ctn2",
VolumeMounts: []corev1.VolumeMount{
{
@@ -150,15 +160,15 @@
targetPodB = &corev1.Pod{
ObjectMeta: metav1.ObjectMeta{
Name: "bar",
- Namespace: "default",
+ Namespace: namespace,
Labels: map[string]string{
"foo": "bar",
},
},
Spec: corev1.PodSpec{
Containers: []corev1.Container{
{
- Image: "k8s.gcr.io/pause:3.4.1",
+ Image: testPodImage,
Name: "ctn1",
VolumeMounts: []corev1.VolumeMount{
{
@@ -209,6 +219,13 @@ var _ = AfterSuite(func() {
Expect(testEnv.Stop()).To(BeNil())
})

+ // setupEnvironment sets up environment-specific properties
+ func setupEnvironment() {
+ namespace = "chaos-engineering"
+ testPodImage = "k8s.gcr.io/pause:3.4.1"
+ nodeLabel = map[string]string{hostTopologyKey: "minikube"}
+ }

// podsAreRunning returns true when all the given pods have all their containers running
func podsAreRunning(pods ...*corev1.Pod) (bool, error) {
for _, pod := range pods {