Skip to content

spop: On load some connections (?) dont get accepted #15

@MarcWort

Description

@MarcWort

Hi,
I am working on our own implementation of coraza-spoa and wanted to use your library as it seems promising in your microbenchmarks. Unfortunately while testing against our prod load I noticed that about 15% of the requests on one frontend throws spoe errors like this:

Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15496 st=1 99 483/1294                                                                                                                        
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15497 st=1 99 484/1295                                                                                                                        
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15498 st=1 99 485/1296                                                                                                                        
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15499 st=1 99 486/1297                                                                                                                        
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15539 st=257 0 487/1298                                                                                                                                                                                                                           
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15573 st=257 0 492/1303                                                                                                                       
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15574 st=257 0 493/1304                                                                                                                       
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15649 st=1 101 494/1316                                                                                                                       
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15650 st=1 101 495/1317                                                                                                                       
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15651 st=1 101 496/1318                                                                                                                       
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15653 st=1 101 497/1319                                                                                                                       
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15654 st=1 101 498/1320                                                                                                                       
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15655 st=1 101 499/1321                                                                                                                       
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15656 st=1 101 500/1322                                                                                                                       
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15657 st=1 101 501/1323                                                                                                                       
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15658 st=1 101 502/1324                                                                                                                       
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15659 st=1 101 503/1325                                                                                                                       
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15789 st=257 0 504/1326                                                                                                                       
Apr 30 14:09:08.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15818 st=257 0 505/1327                                                                                                                       
Apr 30 14:09:09.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15901 st=257 0 506/1328                                                                                                                       
Apr 30 14:09:09.000 n095182 haproxy[318703]: SPOE: [coraza] <GROUP:check-request> sid=15927 st=257 0 507/1329                                                                                                                       

The pattern is first some long-running checks with st=1 and then st=257.
No matter how long I configure timeout processing 5000ms it always hits the timeout.

It is reproducible with your https://github.com/DropMorePackets/haproxy-go/tree/master/spop/example/header-to-body example with a minor change to make errors more visible:

     filter spoe engine engine config engine.cfg
 
+    http-request deny deny_status 503 if { var(txn.engine.error) -m int gt 0 }
     default_backend test_backend
➜  ~ oha --insecure -n 50 -c 50 -p 1 "http://localhost:8123/?SCAN=AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"

Summary:
  Success rate: 100.00%
  Total:        0.5028 secs
  Slowest:      0.5025 secs
  Fastest:      0.0007 secs
  Average:      0.0311 secs
  Requests/sec: 99.4417

  Total data:   5.18 KiB
  Size/request: 106 B
  Size/sec:     10.30 KiB

Response time histogram:
  0.001 [1]  |
  0.051 [46] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.101 [0]  |
  0.151 [0]  |
  0.201 [0]  |
  0.252 [0]  |
  0.302 [0]  |
  0.352 [0]  |
  0.402 [0]  |
  0.452 [0]  |
  0.503 [3]  |■■

Response time distribution:
  10.00% in 0.0008 secs
  25.00% in 0.0009 secs
  50.00% in 0.0010 secs
  75.00% in 0.0012 secs
  90.00% in 0.0015 secs
  95.00% in 0.5023 secs
  99.00% in 0.5025 secs
  99.90% in 0.5025 secs
  99.99% in 0.5025 secs


Details (average, fastest, slowest):
  DNS+dialup:   0.0003 secs, 0.0001 secs, 0.0011 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0001 secs

Status code distribution:
  [200] 47 responses
  [503] 3 responses

With oha it varies from few 5% up to 70% spoe errors.

header-to-body meanwhile logs:

2025/04/30 16:28:45 scheduler.go:95: write tcp 127.0.0.1:9000->127.0.0.1:42360: write: broken pipe
2025/04/30 16:28:45 scheduler.go:95: write tcp 127.0.0.1:9000->127.0.0.1:42692: write: connection reset by peer
2025/04/30 16:28:45 scheduler.go:95: write tcp 127.0.0.1:9000->127.0.0.1:42692: write: broken pipe
2025/04/30 16:28:45 scheduler.go:95: disconnect frame with code 3: frame is too big
2025/04/30 16:28:45 scheduler.go:95: write tcp 127.0.0.1:9000->127.0.0.1:42692: write: broken pipe
2025/04/30 16:28:45 scheduler.go:95: disconnect frame with code 3: frame is too big
2025/04/30 16:28:45 scheduler.go:95: disconnect frame with code 3: frame is too big
2025/04/30 16:28:45 agent.go:67: disconnect frame with code 3: frame is too big
2025/04/30 16:28:45 scheduler.go:95: write tcp 127.0.0.1:9000->127.0.0.1:42394: write: broken pipe
2025/04/30 16:28:45 scheduler.go:95: write tcp 127.0.0.1:9000->127.0.0.1:42394: write: broken pipe
2025/04/30 16:28:45 scheduler.go:95: write tcp 127.0.0.1:9000->127.0.0.1:42394: write: broken pipe
2025/04/30 16:28:45 scheduler.go:95: write tcp 127.0.0.1:9000->127.0.0.1:42394: write: broken pipe
2025/04/30 16:28:45 scheduler.go:95: write tcp 127.0.0.1:9000->127.0.0.1:42394: write: broken pipe
2025/04/30 16:28:45 scheduler.go:95: disconnect frame with code 3: frame is too big
2025/04/30 16:28:45 scheduler.go:95: write tcp 127.0.0.1:9000->127.0.0.1:42394: write: broken pipe
2025/04/30 16:28:45 scheduler.go:95: write tcp 127.0.0.1:9000->127.0.0.1:42394: use of closed network connection
2025/04/30 16:28:45 scheduler.go:95: write tcp 127.0.0.1:9000->127.0.0.1:42394: write: broken pipe
2025/04/30 16:28:45 scheduler.go:95: disconnect frame with code 3: frame is too big
2025/04/30 16:28:45 agent.go:67: disconnect frame with code 3: frame is too big
2025/04/30 16:28:45 scheduler.go:95: disconnect frame with code 3: frame is too big
2025/04/30 16:28:45 agent.go:67: disconnect frame with code 3: frame is too big
2025/04/30 16:28:45 scheduler.go:95: disconnect frame with code 3: frame is too big
2025/04/30 16:28:45 agent.go:67: disconnect frame with code 3: frame is too big

While playing around it seems that some messages never get accepted by the offload engine.
It is like that at least from HAProxy 2.9.7 to newest 3.1.7.

I also noticed this behavior when using https://github.com/corazawaf/coraza-spoa with even more errors than header-to-body, it may be load related?

Note: I am not that much familiar with the protocol and may use some terms incorrectly.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions