Skip to content

Draft rate limiter with example usage #4121

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 18 commits into from
Closed

Conversation

camgunz
Copy link
Contributor

@camgunz camgunz commented May 6, 2025

Martin's comments have made me realize I should say this is just to elicit feedback; we'd at least add metrics and tests before actually merging this, and porting Postman's rate limiting would be done separately.

Description:

Adds a base rate limiter implementation and a usage example for demonstration.

Checklist:

  • Tests passing (make test-community)?
  • Lint passing (make lint this requires golangci-lint)?

Copy link
Contributor

@martinlocklear martinlocklear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! I'm in love with anything that supports us being a better neighbor - thanks for this!

I feel quite strongly that anything that ends up anywhere with a name like common should be well covered with unit tests and (hopefully) examples for how clients of this library would write integration tests using this library. Is there anything blocking us from doing that here?

Really appreciated the documentation along the way - super helpful.

// thus can be used in more than one rate limiter. For example, if an API has 2
// endpoints, one accepts 5r/s and another accepts 1r/s, but both have a limit
// of total 500r/month, the policy implementing the 500r/month limit should be
// able to be used in both of the 2 rate limiters for the 5r/s and 1r/s limits.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also too dumb for this comment. What is the difference between an endpoint "accepting" 5r/s and having a "limit" of 500r/mo? And why does a limit being used for a single API mean that it can be used in more than one rate limiter? (Why would a limit used for multiple APIs not be usable in more than one rate limiter?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha ok let me 🔨 on it a little

// - Create a RateLimiter with its limits
// - Call .Do instead of what you would normally call to make a request
// - Process the response (returned from .Do) as normal
type RateLimiter struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to suggest this be ClientSideRateLimiter just to be very clear what this is intended to do, but it's not a hill to die on. I struggled without context to get out of the server-side mindset when reviewing this, and I imagine future devs might make the same mistake when they reach for a server side rate limiting library and see "rate limiting".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some side conversation happened on this topic in slack. I leave it up to your excellent judgement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(cc: @rosecodym) I started to %s/RateLimit/SourceIntegrationRateLimit/g but then I thought what if instead of common we put this in sources?

// limit sleeps for a duration instead of until a specific time, but
// I haven't thought through that.
if err := lim.Execute(ctx, req, now); err != nil {
return nil, fmt.Errorf(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how I'd mechanize this, but I think I'd really want some non-fatal observability. What's getting rate limited? How often? How much time are we spending sleeping? Plugging this into whatever it is we use for metrics and then setting some sanity check alerting would be really nice.

I'm worried that things will start to get rate limited, there won't be anything visible, and things will just slow down or completely block without us knowing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh definitely 💯 . One of the things I'm hoping this gives us is unified metrics, etc. on rate limiting. I think we wouldn't merge something like this w/o that, at least I wouldn't want to.


Something I was thinking about is how useful it is for limits to have names. Like, I had that thought and I was waiting for validation that it was correct, and I think this is it. The options I can think of for this are:

  • instead of accepting (effectively) an array of limits in NewRateLimiter, use a map[string]RateLimit parameter instead
  • require RateLimit to have a .Name method

If it's possible to have the same RateLimit in multiple RateLimiters, I don't really know what's useful here. It'd be annoying if two RateLimits had the same name (.Name would allow this unless we explicitly check for it), but maybe it's also annoying for the same RateLimit to have different names in different RateLimiters.

OK thinking through it, I'm leaning towards map[string]RateLimit:

  • it's easy to trace what name is assigned to a RateLimit in calls to NewRateLimiter
  • naming uniqueness is strongly signaled and subsequently enforced
  • names may be better, i.e. you're more likely to get "max 500r/month" as a name than "TokenBucketRateLimiter"

Yeah actually that last point pushes me over the edge, OK

}

// Makes an HTTP request subject to the rate limiter's limits.
func (rl *RateLimiter) Do(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to make sure that we don't repeatedly rate limit in a cycle forever? Or is that something that we have to force onto clients of this library?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the idea is the call to ctx.Err checks if the context is canceled (i.e. hit a timeout or a deadline) before we execute a RateLimit.

I also ran down how Go's HTTP client handles contexts and basically you build it into the request if you want it. This isn't ideal; I didn't do a survey but I'd be (very) surprised if our Sources did this consistently. I'm pretty sure there's a default timeout somewhere, but dunno what it is (almost certainly very long). Maybe tossing in another ctx.Err check before the makeRequest() call mitigates that? Feels prudent to add.

Or is that something that we have to force onto clients of this library?

One thing we could do is have RateLimiter enforce its own timeouts, but I think that's a little too risky or unexpected. Re: your other comment, hopefully metrics/alerts cover our needs here (e.g. we can see a limit/source/customer is acting up and do something about it)

@camgunz
Copy link
Contributor Author

camgunz commented May 6, 2025

I feel quite strongly that anything that ends up anywhere with a name like common should be well covered with unit tests and (hopefully) examples for how clients of this library would write integration tests using this library. Is there anything blocking us from doing that here?

Oh this would be lousy with tests for sure. I'm currently wondering if it should be its own package so we can put common RateLimits here and test them thoroughly as well.

Comment on lines 13 to 14
// RateLimit represents a rate limiting implementation. A rate limiter
// comprises 0 or more limits. Policies should be goroutine safe.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a dumb guy, I benefit from extremely clear vocabulary. This comment has four different domain nouns:

  • "rate limiting implementation" [aka RateLimit]
  • "rate limiter"
  • "limit"
  • "policy"

But I can't tell how they relate to each other :( Is a RateLimit a limit? Is a limit a policy? If so, why are multiple terms being used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh no this is perfect (also I got rid of "policy" and that was an imperfect editing step). Let me tighten it up a bit 👍🏻

// thus can be used in more than one rate limiter. For example, if an API has 2
// endpoints, one accepts 5r/s and another accepts 1r/s, but both have a limit
// of total 500r/month, the policy implementing the 500r/month limit should be
// able to be used in both of the 2 rate limiters for the 5r/s and 1r/s limits.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also too dumb for this comment. What is the difference between an endpoint "accepting" 5r/s and having a "limit" of 500r/mo? And why does a limit being used for a single API mean that it can be used in more than one rate limiter? (Why would a limit used for multiple APIs not be usable in more than one rate limiter?)

// of total 500r/month, the policy implementing the 500r/month limit should be
// able to be used in both of the 2 rate limiters for the 5r/s and 1r/s limits.
type RateLimit interface {
// Execute and update execute and update a policy, respectively. These
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do "execute" and "update" mean in the domain? I'm inferring that "execute" means "perform an API operation in a way that respects all configured rate limits, accounting for any known relevant state" and "update" means "update relevant rate limit state," but I had to think about it, and I don't know if I'm correct.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might clarify things to write a separate doc comment for each. (Maybe not, though.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea yeah; gave this a shot 👍🏻

@@ -0,0 +1,122 @@
package common
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to be this generic? Can we give this its own package? (serious question, I'm not very good at Go)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think giving it its own package is probably a good idea--in particular we can then add useful RateLimits.

wg.Wait()

if updateError != nil {
return nil, fmt.Errorf("error updating rate limits: %w", updateError)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kinds of errors might we see here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting question... hmmm, off the top of my head:

  • unexpected HTTP status
  • looked for a header/headers but didn't get one/them
  • found a header(s) but value was bad
    • this could be anything from "'90 seconds' is not an integer" to "the API asked me to wait six years... I'm not doing that"
  • updating the state (e.g. if we're persisting it somewhere) failed
  • context timed out

@camgunz
Copy link
Contributor Author

camgunz commented May 8, 2025

And why does a limit being used for a single API mean that it can be used in more than one rate limiter? (Why would a limit used for multiple APIs not be usable in more than one rate limiter?)

I renamed some stuff and changed some comments that hopefully make this clearer. But, yeah overall I'm struggling to demonstrate how I want someone to use/implement this. The Postman example here (i.e. ratelimiter-per-source) is what I'm thinking, but someone could do a centralized one a la:

// in some far away file
var CloudRateLimiter *RateLimiter

type CloudRateLimits struct {
// ...
}

func (crl *CloudRateLimits) MaybeWait(
  ctx *context.Context,
  req *http.Request,
  now time.Time,
) error {
  switch req.URL.Host {
    case "api.postman.com":
      return crl.MaybeWaitPostman(ctx, req, now)
    case api.jira.com":
     return crl.MaybeWaitJira(ctx, req, now)
    // ...
  }
}  

func init() {
  CloudRateLimiter = NewRateLimiter(NewCloudRateLimits())
}

Maybe someone would argue the centralized version is better? I think these are the downsides:

  • Modifying a rate limit for a Source is a lot more likely to break rate limiting for another/all other Sources
  • Locality is by definition a lot worse
  • An errant RateLimit for one Source is a lot more likely to break rate limiting for another/all Sources
  • It entwines three separate things, making most of the code in the module/file irrelevant to you (except it is relevant, because any RateLimit can break your rate limiting):
    • Common rate limiting functionality (the loop in RateLimiter)
    • The limits for the API/Source you're interested in
    • The limits for all the other APIs/Sources you're not interested in

One way to more or less prohibit (at least strongly signal against) this is to accept a hostname parameter when building RateLimiter and RateLimit. Its purpose wouldn't be functional, rather it'd be strictly for safety: we'd simply check that all the RateLimits have the same hostname, and that all requests sent through .Do have the same hostname. People might feel like it's bowling with bumpers, but the security term is belt and suspenders, which I'm pro.

Let me play around w/ that and work on the naming a little more.

@camgunz
Copy link
Contributor Author

camgunz commented May 8, 2025

OK I did a bunch of stuff; happy to undo it or do different things. This is fun!

@camgunz
Copy link
Contributor Author

camgunz commented May 9, 2025

Mothballing for now 🦋

@camgunz camgunz closed this May 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants