-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Draft rate limiter with example usage #4121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
bc943ba
7a3dd9b
f21a6bf
5e7e557
439b5d3
11a9719
af5fd71
0e16c4d
bd34300
f275298
fa086eb
d193ac6
2b85ae2
c8cc442
7754b78
915d8b3
9c2d55b
a4a1cdc
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,122 @@ | ||
package common | ||
|
||
import ( | ||
"errors" | ||
"fmt" | ||
"net/http" | ||
"sync" | ||
"time" | ||
|
||
"github.com/trufflesecurity/trufflehog/v3/pkg/context" | ||
) | ||
|
||
// RateLimit represents a rate limiting implementation. A rate limiter | ||
// comprises 0 or more limits. Policies should be goroutine safe. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As a dumb guy, I benefit from extremely clear vocabulary. This comment has four different domain nouns:
But I can't tell how they relate to each other :( Is a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh no this is perfect (also I got rid of "policy" and that was an imperfect editing step). Let me tighten it up a bit 👍🏻 |
||
// | ||
// Importantly, limits can assume they're only ever used on a single API, and | ||
// thus can be used in more than one rate limiter. For example, if an API has 2 | ||
// endpoints, one accepts 5r/s and another accepts 1r/s, but both have a limit | ||
// of total 500r/month, the policy implementing the 500r/month limit should be | ||
// able to be used in both of the 2 rate limiters for the 5r/s and 1r/s limits. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nice There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm also too dumb for this comment. What is the difference between an endpoint "accepting" 5r/s and having a "limit" of 500r/mo? And why does a limit being used for a single API mean that it can be used in more than one rate limiter? (Why would a limit used for multiple APIs not be usable in more than one rate limiter?) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Haha ok let me 🔨 on it a little |
||
type RateLimit interface { | ||
// Execute and update execute and update a policy, respectively. These | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What do "execute" and "update" mean in the domain? I'm inferring that "execute" means "perform an API operation in a way that respects all configured rate limits, accounting for any known relevant state" and "update" means "update relevant rate limit state," but I had to think about it, and I don't know if I'm correct. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It might clarify things to write a separate doc comment for each. (Maybe not, though.) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good idea yeah; gave this a shot 👍🏻 |
||
// should: | ||
// - Be goroutine safe | ||
// - Check if ctx has been canceled | ||
// - Not modify req/res | ||
// If they return an error, it's combined with errors from the | ||
// execution/updating of the other limits. Other limits will still be | ||
// executed/updated. | ||
// | ||
// If waiting/sleeping is required, Execute should do it. Keep in mind, | ||
// however, that each policy's Execute method is called serially, so Execute | ||
// should *NEVER* sleep for a duration--it should only sleep until a time. | ||
// This also means that if an API only returns durations, Update must | ||
// immediately convert them into times, and it's recommended to pad these | ||
// somewhat. | ||
Execute(ctx context.Context, req *http.Request, now time.Time) error | ||
Update(ctx context.Context, res *http.Response) error | ||
} | ||
|
||
// RateLimiter provides a facility for rate limiting HTTP requests. To use it: | ||
// - Create a RateLimiter with its limits | ||
// - Call .Do instead of what you would normally call to make a request | ||
// - Process the response (returned from .Do) as normal | ||
type RateLimiter struct { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm going to suggest this be There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Some side conversation happened on this topic in slack. I leave it up to your excellent judgement. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (cc: @rosecodym) I started to |
||
limits []RateLimit | ||
} | ||
|
||
// Returns a new rate limiter with the given limits. | ||
func NewRateLimiter(limits ...RateLimit) *RateLimiter { | ||
return &RateLimiter{limits: limits} | ||
} | ||
|
||
// Makes an HTTP request subject to the rate limiter's limits. | ||
func (rl *RateLimiter) Do( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there a way to make sure that we don't repeatedly rate limit in a cycle forever? Or is that something that we have to force onto clients of this library? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah the idea is the call to I also ran down how Go's HTTP client handles contexts and basically you build it into the request if you want it. This isn't ideal; I didn't do a survey but I'd be (very) surprised if our Sources did this consistently. I'm pretty sure there's a default timeout somewhere, but dunno what it is (almost certainly very long). Maybe tossing in another
One thing we could do is have |
||
ctx context.Context, | ||
req *http.Request, | ||
makeRequest func() (*http.Response, error), | ||
) (*http.Response, error) { | ||
if len(rl.limits) == 0 { | ||
return makeRequest() | ||
} | ||
|
||
now := time.Now() | ||
|
||
for i, lim := range rl.limits { | ||
if err := ctx.Err(); err != nil { | ||
return nil, err | ||
} | ||
|
||
// [NOTE] It's maybe better to do this asynchronously, in case an errant | ||
// limit sleeps for a duration instead of until a specific time, but | ||
// I haven't thought through that. | ||
if err := lim.Execute(ctx, req, now); err != nil { | ||
return nil, fmt.Errorf( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure how I'd mechanize this, but I think I'd really want some non-fatal observability. What's getting rate limited? How often? How much time are we spending sleeping? Plugging this into whatever it is we use for metrics and then setting some sanity check alerting would be really nice. I'm worried that things will start to get rate limited, there won't be anything visible, and things will just slow down or completely block without us knowing. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh definitely 💯 . One of the things I'm hoping this gives us is unified metrics, etc. on rate limiting. I think we wouldn't merge something like this w/o that, at least I wouldn't want to. Something I was thinking about is how useful it is for limits to have names. Like, I had that thought and I was waiting for validation that it was correct, and I think this is it. The options I can think of for this are:
If it's possible to have the same OK thinking through it, I'm leaning towards
Yeah actually that last point pushes me over the edge, OK |
||
"error executing rate limit policy %d: %w", | ||
i, | ||
err, | ||
) | ||
} | ||
} | ||
|
||
res, err := makeRequest() | ||
if err != nil { | ||
return nil, fmt.Errorf("error making HTTP request: %w", err) | ||
} | ||
|
||
// [NOTE] errgroup.Group oddly isn't what we want here. It presumes you want | ||
// to stop all other processing if a single task fails (we don't), and | ||
// that functionality is the only reason to use it instead of a | ||
// WaitGroup. | ||
wg := &sync.WaitGroup{} | ||
updateErrorLock := &sync.Mutex{} | ||
var updateError error = nil | ||
|
||
for i, lim := range rl.limits { | ||
wg.Add(1) | ||
go func(i int, lim RateLimit) { | ||
defer wg.Done() | ||
|
||
if err := lim.Update(ctx, res); err != nil { | ||
err = fmt.Errorf("error updating rate limit policy %d: %w", i, err) | ||
|
||
updateErrorLock.Lock() | ||
if updateError == nil { | ||
updateError = err | ||
} else { | ||
updateError = errors.Join(updateError, err) | ||
} | ||
updateErrorLock.Unlock() | ||
} | ||
}(i, lim) | ||
} | ||
|
||
wg.Wait() | ||
|
||
if updateError != nil { | ||
return nil, fmt.Errorf("error updating rate limits: %w", updateError) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What kinds of errors might we see here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Interesting question... hmmm, off the top of my head:
|
||
} | ||
|
||
return res, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have to be this generic? Can we give this its own package? (serious question, I'm not very good at Go)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think giving it its own package is probably a good idea--in particular we can then add useful
RateLimit
s.