mountpoint pods are evicted and no new mountpoint pods are created #575

@jblee-muhayu

Description

/kind bug

NOTE: If this is a filesystem-related bug, please take a look at the Mountpoint repo to submit a bug report

What happened?
After a Mountpoint Pod is evicted for exceeding its local-cache size limit, no new Mountpoint Pod is created, and the workload pod can no longer access the mounted S3 bucket.

(Note: this is not a shortage of Kubernetes resources.)

The replacement Mountpoint Pod itself is never deployed.
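
For reference, the volume configuration can be reconstructed from the volume_context in the csi-driver log below. A rough sketch of what the PersistentVolume presumably looks like (bucket name, cache settings, mount options, and resource values taken from that log; everything else is assumed):

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: s3-pv
spec:
  capacity:
    storage: 1200Gi                      # arbitrary; ignored by the S3 CSI driver
  accessModes:
    - ReadWriteMany                      # matches MULTI_NODE_MULTI_WRITER in the log
  storageClassName: ""                   # static provisioning (assumed)
  claimRef:
    namespace: default
    name: s3-pvc
  mountOptions:
    - region ap-northeast-2
    - allow-delete
    - allow-overwrite
  csi:
    driver: s3.csi.aws.com
    volumeHandle: s3-csi-driver-volume
    volumeAttributes:
      bucketName: prismd-s3-csi-test
      authenticationSource: pod
      cache: emptyDir
      cacheEmptyDirSizeLimit: 1Gi        # the limit the cache apparently exceeded
      mountpointContainerResourcesRequestsCpu: 500m
      mountpointContainerResourcesRequestsMemory: 1Gi
      mountpointContainerResourcesLimitsCpu: 500m
      mountpointContainerResourcesLimitsMemory: 1Gi
EOF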

csi-controller log

{"level":"info","ts":"2025-09-04T05:40:25Z","logger":"controller-runtime.metrics","msg":"Starting metrics server"}
{"level":"info","ts":"2025-09-04T05:40:25Z","logger":"controller-runtime.metrics","msg":"Serving metrics server","bindAddress":":8080","secure":false}
{"level":"info","ts":"2025-09-04T05:40:25Z","msg":"Starting EventSource","controller":"aws-s3-csi-controller","controllerGroup":"","controllerKind":"Pod","source":"kind source: *v1.Pod"}
{"level":"info","ts":"2025-09-04T05:40:25Z","msg":"Starting Controller","controller":"aws-s3-csi-controller","controllerGroup":"","controllerKind":"Pod"}
{"level":"info","ts":"2025-09-04T05:40:25Z","msg":"Starting workers","controller":"aws-s3-csi-controller","controllerGroup":"","controllerKind":"Pod","worker count":1}
{"level":"info","ts":"2025-09-04T05:40:25Z","msg":"Pod not found - ignoring","controller":"aws-s3-csi-controller","controllerGroup":"","controllerKind":"Pod","Pod":{"name":"s3-csi-node-24slm","namespace":"kube-system"},"namespace":"kube-system","name":"s3-csi-node-24slm","reconcileID":"45d3bb08-25e2-499e-b5d8-5725557e66e7","pod":{"name":"s3-csi-node-24slm","namespace":"kube-system"}}
{"level":"info","ts":"2025-09-04T05:40:25Z","msg":"Pod not found - ignoring","controller":"aws-s3-csi-controller","controllerGroup":"","controllerKind":"Pod","Pod":{"name":"s3-csi-controller-84d897fd95-2btxf","namespace":"kube-system"},"namespace":"kube-system","name":"s3-csi-controller-84d897fd95-2btxf","reconcileID":"d5e451ef-6527-4501-9801-64a34bfb892c","pod":{"name":"s3-csi-controller-84d897fd95-2btxf","namespace":"kube-system"}}
{"level":"info","ts":"2025-09-04T05:40:25Z","msg":"MountpointS3PodAttachment already has this workload UID","controller":"aws-s3-csi-controller","controllerGroup":"","controllerKind":"Pod","Pod":{"name":"s3-test-deployment-6b4f4dcfc5-qklx4","namespace":"default"},"namespace":"default","name":"s3-test-deployment-6b4f4dcfc5-qklx4","reconcileID":"e40ab8b7-c9e7-472c-aa41-77f93de65fd3","workloadPod":{"name":"s3-test-deployment-6b4f4dcfc5-qklx4","namespace":"default"},"pvc":"s3-pvc","workloadUID":"dc96f449-7091-4b8b-b955-7496cebdaa98","s3pa":"s3pa-2p4hz","spec.persistentVolumeName":"s3-pv","spec.workloadFSGroup":"","spec.authenticationSource":"pod","spec.workloadServiceAccountName":"s3-test-sa","spec.volumeID":"s3-csi-driver-volume","spec.mountOptions":"region ap-northeast-2,allow-delete,allow-overwrite","spec.workloadNamespace":"default","spec.workloadServiceAccountIAMRoleARN":"","spec.nodeName":"ip-10-52-100-94.ap-northeast-2.compute.internal"}
{"level":"info","ts":"2025-09-04T05:43:58Z","msg":"Pod failed","controller":"aws-s3-csi-controller","controllerGroup":"","controllerKind":"Pod","Pod":{"name":"mp-pgmm7","namespace":"mount-s3"},"namespace":"mount-s3","name":"mp-pgmm7","reconcileID":"490a13be-0dfe-48d0-8df7-4a4fb2f1e16e","mountpointPod":"mp-pgmm7","reason":"Evicted"}
{"level":"info","ts":"2025-09-04T05:43:59Z","msg":"Pod failed","controller":"aws-s3-csi-controller","controllerGroup":"","controllerKind":"Pod","Pod":{"name":"mp-pgmm7","namespace":"mount-s3"},"namespace":"mount-s3","name":"mp-pgmm7","reconcileID":"779759b7-1b46-482e-959a-ba7652fc8468","mountpointPod":"mp-pgmm7","reason":"Evicted"}

csi-driver log

I0904 05:46:56.791275       1 node.go:209] NodeGetCapabilities: called with args 
I0904 05:46:56.793113       1 node.go:82] NodePublishVolume: new request: volume_id:"s3-csi-driver-volume" target_path:"/var/lib/kubelet/pods/dc96f449-7091-4b8b-b955-7496cebdaa98/volumes/kubernetes.io~csi/s3-pv/mount" volume_capability:<mount:<mount_flags:"region ap-northeast-2" mount_flags:"allow-delete" mount_flags:"allow-overwrite" > access_mode:<mode:MULTI_NODE_MULTI_WRITER > > volume_context:<key:"authenticationSource" value:"pod" > volume_context:<key:"bucketName" value:"prismd-s3-csi-test" > volume_context:<key:"cache" value:"emptyDir" > volume_context:<key:"cacheEmptyDirMedium" value:"" > volume_context:<key:"cacheEmptyDirSizeLimit" value:"1Gi" > volume_context:<key:"csi.storage.k8s.io/ephemeral" value:"false" > volume_context:<key:"csi.storage.k8s.io/pod.name" value:"s3-test-deployment-6b4f4dcfc5-qklx4" > volume_context:<key:"csi.storage.k8s.io/pod.namespace" value:"default" > volume_context:<key:"csi.storage.k8s.io/pod.uid" value:"dc96f449-7091-4b8b-b955-7496cebdaa98" > volume_context:<key:"csi.storage.k8s.io/serviceAccount.name" value:"s3-test-sa" > volume_context:<key:"mountpointContainerResourcesLimitsCpu" value:"500m" > volume_context:<key:"mountpointContainerResourcesLimitsMemory" value:"1Gi" > volume_context:<key:"mountpointContainerResourcesRequestsCpu" value:"500m" > volume_context:<key:"mountpointContainerResourcesRequestsMemory" value:"1Gi" > volume_context:<key:"stsRegion" value:"ap-northeast-2" > 
I0904 05:46:56.793225       1 node.go:148] NodePublishVolume: mounting prismd-s3-csi-test at /var/lib/kubelet/pods/dc96f449-7091-4b8b-b955-7496cebdaa98/volumes/kubernetes.io~csi/s3-pv/mount with options [--allow-delete --allow-overwrite --allow-root --region=ap-northeast-2]
E0904 05:47:11.794046       1 pod_mounter.go:138] Failed to wait for Mountpoint Pod "mp-pgmm7" to be ready for "/var/lib/kubelet/pods/dc96f449-7091-4b8b-b955-7496cebdaa98/volumes/kubernetes.io~csi/s3-pv/mount": mppod/watcher: mountpoint pod not ready. Seems like Mountpoint Pod is not in 'Running' status. You can see it's status and any potential failures by running: `kubectl describe pods -n mount-s3 mp-pgmm7`
E0904 05:47:11.794195       1 driver.go:170] GRPC error: rpc error: code = Internal desc = Could not mount "prismd-s3-csi-test" at "/var/lib/kubelet/pods/dc96f449-7091-4b8b-b955-7496cebdaa98/volumes/kubernetes.io~csi/s3-pv/mount": Failed to wait for Mountpoint Pod "mp-pgmm7" to be ready: mppod/watcher: mountpoint pod not ready. Seems like Mountpoint Pod is not in 'Running' status. You can see it's status and any potential failures by running: `kubectl describe pods -n mount-s3 mp-pgmm7`
I0904 05:47:16.194803       1 node.go:209] NodeGetCapabilities: called with args
I0904 05:47:25.866190       1 reflector.go:389] pkg/mod/k8s.io/client-go@v0.31.3/tools/cache/reflector.go:243: forcing resync

node mount status

# mount | grep mountpoint-s3
mountpoint-s3 on /var/lib/kubelet/plugins/s3.csi.aws.com/mnt/mp-pgmm7 type fuse (rw,nosuid,nodev,noatime,user_id=0,group_id=0,default_permissions,allow_other)
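
The FUSE mount for the evicted Mountpoint Pod mp-pgmm7 is still present on the node even though the pod itself is gone. A quick way to confirm the mismatch (the stat is expected to fail if the mount-s3 process died with the pod; I have not captured that output here):

# probe whether the FUSE endpoint behind the stale mount is still alive
stat /var/lib/kubelet/plugins/s3.csi.aws.com/mnt/mp-pgmm7
# list the Mountpoint Pods that actually exist
kubectl get pods -n mount-s3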

What you expected to happen?
When a Mountpoint Pod is evicted, the CSI controller should create a new Mountpoint Pod, and the workload pod should be able to access the mounted S3 bucket through it.

How to reproduce it (as minimally and precisely as possible)?
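Exact steps were not recorded; a plausible minimal trigger, assuming the eviction is driven by the 1Gi cacheEmptyDirSizeLimit shown in the driver log, is to read more data through the mount than the cache limit allows, so kubelet evicts the Mountpoint Pod for exceeding its emptyDir size:

# hypothetical object name and mount path; read a few GiB through the mount to fill the 1Gi emptyDir cache
kubectl exec -n default deploy/s3-test-deployment -- \
  dd if=/mnt/s3/some-large-object of=/dev/null bs=1M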

Anything else we need to know?:
I'm not sure what information is needed to track this issue down.
If you let me know, I will provide whatever is necessary. I'd appreciate your help.

Environment

  • Kubernetes version (use kubectl version): EKS v1.30
  • Driver version: v2.0.0
