Skip to content

Upload is broken (9: Bad file descriptor) #12875

@sepich

Description

@sepich

What happened:

POST with 10mb body leads to dropped request with error:

2025/02/20 15:24:11 [crit] 517#517: *74177 pread() "/tmp/nginx/client-body/0000000041" failed (9: Bad file descriptor) while sending request to upstream

What you expected to happen:

Http request succeeds.

NGINX Ingress controller version (exec into the pod and run /nginx-ingress-controller --version):
v1.12.0

Kubernetes version (use kubectl version):
v1.30.8

Environment:

  • Cloud provider or hardware configuration: EKS/GKE/on-prem

How to reproduce this issue:

Could be reproduced in kind:

kind create cluster
kubectl apply -f https://kind.sigs.k8s.io/examples/ingress/deploy-ingress-nginx.yaml
kubectl -n ingress-nginx port-forward ingress-nginx-controller-696d4c4c5-x7kl6 8080:80

Create test Ingress accepting uploads:

yaml spec
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo
spec:
  selector:
    matchLabels:
      app: echo
  template:
    metadata:
      labels:
        app: echo
    spec:
      containers:
        - image: inanimate/echo-server
          name: echo
      enableServiceLinks: false
---
apiVersion: v1
kind: Service
metadata:
  name: echo
spec:
  selector:
    app: echo
  ports:
    - port: 8080
      targetPort: 8080
      protocol: TCP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
  #    nginx.ingress.kubernetes.io/proxy-request-buffering: "off"
  name: echo
spec:
  ingressClassName: nginx
  rules:
    - http:
        paths:
          - backend:
              service:
                name: echo
                port:
                  number: 8080
            path: /
            pathType: ImplementationSpecific

Upload file, see the connection dropped:

dd if=/dev/urandom of=10M bs=1M count=10
curl -d @10M http://localhost:8080 > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed to connect to localhost port 8080 after 0 ms: Couldn't connect to server

In logs:

$ k -n ingress-nginx logs --tail=5 ingress-nginx-controller-696d4c4c5-x7kl6

I0220 16:14:51.539815      11 event.go:377] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"kube-system", Name:"echo", UID:"eb48935e-1c17-40d5-9563-599b29c8d844", APIVersion:"networking.k8s.io/v1", ResourceVersion:"1177", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
127.0.0.1 - - [20/Feb/2025:16:57:25 +0000] "GET / HTTP/1.1" 200 1779 "-" "curl/8.7.1" 77 0.029 [kube-system-echo-8080] [] 10.244.0.8:8080 1779 0.029 200 3d0c1d9f60cec40e163e0f2713eb9821
2025/02/20 16:57:38 [warn] 573#573: *32687 a client request body is buffered to a temporary file /tmp/nginx/client-body/0000000001, client: 127.0.0.1, server: _, request: "POST / HTTP/1.1", host: "localhost:8080"
2025/02/20 16:57:38 [crit] 573#573: *32687 pread() "/tmp/nginx/client-body/0000000001" failed (9: Bad file descriptor) while sending request to upstream, client: 127.0.0.1, server: _, request: "POST / HTTP/1.1", upstream: "http://10.244.0.8:8080/", host: "localhost:8080"
127.0.0.1 - - [20/Feb/2025:16:57:39 +0000] "POST / HTTP/1.1" 200 3959 "-" "curl/8.7.1" 10363454 1.035 [kube-system-echo-8080] [] 10.244.0.8:8080 3939 0.003 502 73dfb8ab83944211aadfc70382080cd8

Per debug logs it happens after upstream already answered 200 OK, and response is forwarded back to client:

2025/02/20 17:21:31 [debug] 1114#1114: *101569 http body new buf t:1 f:0 0000FFFF802C1690, pos 0000FFFF802C1690, size: 8192 file: 0, size: 0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http write client request body, bufs 0000FFFF80287290
2025/02/20 17:21:31 [debug] 1114#1114: *101569 add cleanup: 0000FFFF802872A0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 hashed path: /tmp/nginx/client-body/0000000002
2025/02/20 17:21:31 [debug] 1114#1114: *101569 temp fd:12
2025/02/20 17:21:31 [warn] 1114#1114: *101569 a client request body is buffered to a temporary file /tmp/nginx/client-body/0000000002, client: 127.0.0.1, server: _, request: "POST / HTTP/1.1", host: "localhost:8080"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 write: 12, 0000FFFF802C1690, 8192, 0
...
2025/02/20 17:21:31 [debug] 1114#1114: *101569 read: 12, 0000FFFF802C1690, 8192, 360448
...
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy status 200 "200 OK"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy header: "Content-Type: text/plain"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy header: "X-Real-Server: echo-server"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy header: "Date: Thu, 20 Feb 2025 17:21:31 GMT"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy header: "Connection: close"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy header: "Transfer-Encoding: chunked"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy header done
2025/02/20 17:21:31 [debug] 1114#1114: *101569 lua header filter for user lua code, uri "/"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 code cache lookup (key='nhlf_9f4fbd7160880a0dbc1df5df3dfe0b59', ref=-1)
2025/02/20 17:21:31 [debug] 1114#1114: *101569 code cache miss (key='nhlf_9f4fbd7160880a0dbc1df5df3dfe0b59', ref=-1)
2025/02/20 17:21:31 [debug] 1114#1114: *101569 lua capture header filter, uri "/"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 headers more header filter, uri "/"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 HTTP/1.1 200 OK
Date: Thu, 20 Feb 2025 17:21:31 GMT
Content-Type: text/plain
Transfer-Encoding: chunked
Connection: keep-alive
X-Real-Server: echo-server

2025/02/20 17:21:31 [debug] 1114#1114: *101569 write new buf t:1 f:0 0000FFFF80287F70, pos 0000FFFF80287F70, size: 162 file: 0, size: 0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http write filter: l:0 f:0 s:162
2025/02/20 17:21:31 [debug] 1114#1114: *101569 file cleanup: fd:12
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy filter init s:200 h:0 c:1 l:-1
2025/02/20 17:21:31 [debug] 1114#1114: *101569 tcp_nodelay
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 38 s:0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 30 s:1
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 30 s:1
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 0D s:1
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 0A s:3
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 57 s:4
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy out buf 0000FFFF80199592 2048
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 0D s:5
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 0A s:6
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 64 s:0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 63 s:1
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 62 s:1
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 0D s:1
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 0A s:3
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 2C s:4
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy out buf 0000FFFF80199D99 1879
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http upstream process non buffered downstream
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http output filter "/?"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http copy filter: "/?"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 lua capture body filter, uri "/"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http postpone filter "/?" 0000FFFF80288150
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunk: 2048
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunk: 1879
2025/02/20 17:21:31 [debug] 1114#1114: *101569 write old buf t:1 f:0 0000FFFF80287F70, pos 0000FFFF80287F70, size: 162 file: 0, size: 0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 write new buf t:1 f:0 0000FFFF802881F0, pos 0000FFFF802881F0, size: 5 file: 0, size: 0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 write new buf t:0 f:0 0000000000000000, pos 0000FFFF80199592, size: 2048 file: 0, size: 0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 write new buf t:0 f:0 0000000000000000, pos 0000FFFF80199D99, size: 1879 file: 0, size: 0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 write new buf t:0 f:0 0000000000000000, pos 0000AAAABA2B2288, size: 2 file: 0, size: 0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http write filter: l:0 f:1 s:4096
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http write filter limit 2097152
2025/02/20 17:21:31 [debug] 1114#1114: *101569 writev: 4096 of 4096
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http write filter 0000000000000000
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http copy filter: 0 "/?"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 recv: eof:0, avail:0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 event timer add: 15: 60000:17774858
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http upstream request: "/?"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http upstream send request handler
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http upstream send request
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http upstream send request body
2025/02/20 17:21:31 [debug] 1114#1114: *101569 chain writer in: 0000FFFF80287B90
2025/02/20 17:21:31 [debug] 1114#1114: *101569 writev: 1920 of 1920
2025/02/20 17:21:31 [debug] 1114#1114: *101569 chain writer out: 0000000000000000
2025/02/20 17:21:31 [debug] 1114#1114: *101569 read: -1, 0000FFFF802C1690, 8192, 368640
2025/02/20 17:21:31 [crit] 1114#1114: *101569 pread() "/tmp/nginx/client-body/0000000002" failed (9: Bad file descriptor) while sending request to upstream, client: 127.0.0.1, server: _, request: "POST / HTTP/1.1", upstream: "http://10.244.0.8:8080/", host: "localhost:8080"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http next upstream, 2
2025/02/20 17:21:31 [debug] 1114#1114: *101569 free keepalive peer
2025/02/20 17:21:31 [debug] 1114#1114: *101569 lua balancer free peer, tries: 2
2025/02/20 17:21:31 [debug] 1114#1114: *101569 finalize http upstream request: 502
2025/02/20 17:21:31 [debug] 1114#1114: *101569 finalize http proxy request

So, 0000FFFF802C1690 - is the buffer for /tmp/nginx/client-body/0000000002
Last read was at pos 360448, and then it tries to continue reads from pos 360448+8192=368640 to same buffer but from fh=-1?

Anything else we need to know:

We noticed the issue in both prod GKE and EKS clusters with ingress-nginx v1.11.2.
Tested that v1.12.0 and latest image with nginx 1.27 still have the issue.
The only found current workaround is to add:
nginx.ingress.kubernetes.io/proxy-request-buffering: "off"
But that slows down all uploads to 2Mbit/s comparing to usual >300Mbit/s with buffering enabled, and is not acceptable.

Possibly related:
#4474
openresty/lua-nginx-module#2374

Also tried, without success:

  • reduce nginx workers to 1
  • bumping client_body_buffer_size can make 10Mb file pass, but then 11Mb fails
  • client_body_in_file_only on;
  • all variants of chunked_transfer_encoding, proxy_buffering

How can I help fixing this issue? Do you have more ideas to try?

Metadata

Metadata

Assignees

No one assigned

    Labels

    kind/supportCategorizes issue or PR as a support question.lifecycle/frozenIndicates that an issue or PR should not be auto-closed due to staleness.needs-prioritytriage/needs-informationIndicates an issue needs more information in order to work on it.

    Type

    No type

    Projects

    Status

    No status

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions