-
Notifications
You must be signed in to change notification settings - Fork 66
Description
Hello!
We're using souin over at pico (repo) for our caching system. Lately, we've been noticing goroutine leaks from souin over time. It doesn't look like it's terribly bad, but we're worried about implications of this long term.
For example, here's a graph from two different instances of pgs w/ souin middleware:

It does look like this is associated with some memory leak, but it's hard to tell by how much and I'd prefer to not throw in more confounders. Here's how memory looks though (in MBs). First resident memory:

And next virtual:

I went ahead and grabbed a dump of the goroutines stacks via a SIGABRT
on the process. Those logs can be found here
I've traced it back to the middleware, specifically on the error channel here:
souin/pkg/middleware/middleware.go
Lines 942 to 971 in 842d55d
go func(vr *http.Request, cw *CustomWriter) { | |
prometheus.Increment(prometheus.NoCachedResponseCounter) | |
errorCacheCh <- s.Upstream(cw, vr, next, requestCc, cachedKey, uri) | |
}(req, customWriter) | |
select { | |
case <-req.Context().Done(): | |
switch req.Context().Err() { | |
case baseCtx.DeadlineExceeded: | |
rw.Header().Set("Cache-Status", cacheName+"; fwd=bypass; detail=DEADLINE-EXCEEDED") | |
customWriter.Rw.WriteHeader(http.StatusGatewayTimeout) | |
_, _ = customWriter.Rw.Write([]byte("Internal server error")) | |
s.Configuration.GetLogger().Infof("Internal server error on endpoint %s: %v", req.URL, s.Storers) | |
return baseCtx.DeadlineExceeded | |
case baseCtx.Canceled: | |
return baseCtx.Canceled | |
default: | |
return nil | |
} | |
case v := <-errorCacheCh: | |
switch v { | |
case nil: | |
_, _ = customWriter.Send() | |
case Upstream50xError: | |
_, _ = customWriter.Send() | |
return nil | |
} | |
return v | |
} | |
} |
My best guess is the request context is completed prior to the upstream request finishing. This results in the error channel never being read and the request waiting. We probably want a way to cancel the upstream request when the context resolves and a defer to close that channel. We need to pay special attention to what is sent if it is closed though.
I'd provide a PR but I'm not super familiar with the souin codebase and if there are any issues posed with a change like this. Let me know if there's anything I can help you with though!