Describe the Bug:
I am trying to create a static pod, along with its mirror pod, on a node that runs AL2 and connects to an EKS control plane. When I point the kubelet at a staticPodPath, the kubelet logs the following error on startup:
Dec 30 02:44:48 ip-192-168-81-58.us-west-2.compute.internal kubelet[1495]: E1230 02:44:48.535524 1495 kubelet.go:1899] "Failed creating a mirror pod for" err="admission webhook \"mpod.vpc.k8s.aws\" denied the request: Failed to get Matching SGP for Pods, rejecting event" pod="default/static-web-ip-192-168-81-58.us-west-2.compute.internal"
Digging deeper into why this happens, I see that the error is logged here: https://github.com/aws/amazon-vpc-resource-controller-k8s/blob/master/webhooks/core/pod_webhook.go#L188. Looking at the GetMatchingSecurityGroupForPods() function, I can see that it returns an error, and the webhook denies the request, when it cannot find the service account for the pod. Since no service account exists for a static pod, I suspect that the failed lookup of the unspecified service account is what causes pod creation to fail.
From reading through this issue, static pods implicitly don't rely on any API objects, since they can't assume that the apiserver even exists when they come up. The webhook appears to assume that every pod names a service account that exists, which is true almost all of the time, except in the case of static pods.
Expected Behavior:
Static pods should be able to create an apiserver representation of themselves without any failure.
How to reproduce it (as minimally and precisely as possible):
- Create an EC2 instance running the EKS-optimized AMI on AL2
- Use the following userData (or similar) when creating the instance
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="//"

--//
Content-Type: text/x-shellscript; charset="us-ascii"

mkdir -p /etc/kubernetes/manifests/
echo "$(jq '.staticPodPath="/etc/kubernetes/manifests/"' /etc/kubernetes/kubelet/kubelet-config.json)" > /etc/kubernetes/kubelet/kubelet-config.json
cat <<EOF >/etc/kubernetes/manifests/static-web.yaml
apiVersion: v1
kind: Pod
metadata:
  name: static-web
  namespace: default
spec:
  containers:
  - name: web
    image: nginx
EOF
--//
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash -xe
exec > >(tee /var/log/user-data.log|logger -t user-data -s 2>/dev/console) 2>&1
/etc/eks/bootstrap.sh <cluster-name> --apiserver-endpoint <apiserver-endpoint> --b64-cluster-ca <cluster-ca> \
--dns-cluster-ip '10.100.0.10' \
--use-max-pods false
--//--
- Wait for the instance to start and the node to join. Then, after SSM-ing into the node, run journalctl -u kubelet to see the failures creating the static pods.
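To pick the relevant entries out of a long kubelet log, a small filter along these lines may help. This is only a sketch: the function name failed_mirror_pods is made up for illustration, and it assumes the log lines have the same shape as the error quoted above (a "Failed creating a mirror pod for" message followed by a pod="..." field).

```shell
#!/bin/bash
# failed_mirror_pods (hypothetical helper): reads kubelet log lines on stdin
# and prints the unique namespace/name of every pod whose mirror pod
# creation failed. Lines without the message are ignored.
failed_mirror_pods() {
  sed -n 's/.*"Failed creating a mirror pod for".*pod="\([^"]*\)".*/\1/p' | sort -u
}

# Usage on the node (not executed here):
#   journalctl -u kubelet --no-pager | failed_mirror_pods
```

For the log line quoted in this report, the filter would print the mirror pod name, default/static-web-ip-192-168-81-58.us-west-2.compute.internal.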
Additional Context:
As a workaround for now, I am disabling the mutating webhook with kubectl delete mutatingwebhookconfiguration vpc-resource-mutating-webhook so that I can create static pods.
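Since deleting the webhook configuration is destructive, it may be worth saving a copy of it first. The sketch below wraps the workaround in a function that exports the object before deleting it. The function name, the KUBECTL override, and the backup file name are assumptions for illustration; only the webhook name and the kubectl delete command come from this report.

```shell
#!/bin/bash
# Hypothetical wrapper around the workaround: export the webhook
# configuration to a file, then delete it. KUBECTL and BACKUP can be
# overridden; the defaults below are assumptions, not part of the report.
WEBHOOK="vpc-resource-mutating-webhook"
KUBECTL="${KUBECTL:-kubectl}"
BACKUP="${BACKUP:-${WEBHOOK}.backup.yaml}"

backup_and_delete_webhook() {
  # Save the current object first; stop if the export fails.
  "$KUBECTL" get mutatingwebhookconfiguration "$WEBHOOK" -o yaml > "$BACKUP" || return 1
  # Same command as the workaround described above.
  "$KUBECTL" delete mutatingwebhookconfiguration "$WEBHOOK"
}

# Usage (not executed here):
#   backup_and_delete_webhook
```

The saved YAML includes server-populated fields such as resourceVersion and uid, so it is best treated as a reference copy; those fields would likely need to be removed before the object could be re-created from it.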
Environment:
- Kubernetes version (use kubectl version): v1.28.4-eks-8cb36c9
- CNI Version: v1.12.5-eksbuild.2
- OS (Linux/Windows): Linux