Support a separate timeout for listing pending requests #149

kenfreeman · 2025-05-06T14:56:54Z

Listing pending requests can block the proxy agent. Support a shorter timeout for that method (/pending) than for /request and /response, which are very sensitive to the use case.

Also did a little cleanup

This reverts commit 043a010.

ojarjur · 2025-05-06T22:17:22Z

agent/agent.go

 var (
 	proxy                     = flag.String("proxy", "", "URL (including scheme) of the inverting proxy")
 	proxyTimeout              = flag.Duration("proxy-timeout", 60*time.Second, "Client timeout when sending requests to the inverting proxy")
+	listRequestsTimeout       = flag.Duration("list-requests-timeout", 0*time.Second, "Client timeout for listing requests from the inverting proxy. Defaults to proxy-timeout value")


Implementations of the proxy have to coordinate with the existing proxy-timeout flag specifically for the purposes of the calls to list pending requests (the proxy has to timeout those calls before the agent times them out or else there is a risk of requests getting dropped).

As such, I'd rather that we keep using the proxy-timeout flag value for the calls to list pending requests, and make the new flag be used for all the other calls.

E.G. the new flag could be something like proxy-client-timeout and be used where the proxyTimeout variable was previously used.

That being said, we would want to update the doc string for the existing proxy-timeout flag to indicate its update scope.

The reason I went in the direction I did is because users will set their timeouts based on the expected load and latency of their backends. So it made sense to me to keep the existing flag for that purpose.

That said, I'm happy to make the change you suggest. I just want to be sure that any users who don't set the new value get the current behavior. So we could create a proxy-client-timeout but use proxy-timeout if proxy-client-timeout isn't set.

I am a bit confused by the terminology. What is a "proxy client"?

I see the proxy-agent doing 3 things: (1) getting a list of pending requests from the IVP; (2) Pulling an individual request from the IVP; (3) Forwarding a request to a backend and waiting for a result. I think #1 and #2 are both proxy clients, since they call the IVP URL.

(I understand #3 goes through a host proxy, but I think of that as an implementation detail. When I hear "proxy" in this context, I think of just the IVP.)

Would it make sense to support 3 separate timeouts?

I am a bit confused by the terminology. What is a "proxy client"?

This would be used as the timeout set on the HTTP client used to communicate with the proxy.

I just want to be sure that any users who don't set the new value get the current behavior.

Agreed. I think the best way to do that is to require the value of the new flag to be larger than the value of the existing flag, explicitly raising it if it is not.

Would it make sense to support 3 separate timeouts?

I don't think so. The agent makes three types of calls to the Inverting Proxy:

List the IDs of pending requests

Fetch the contents of a single request

Post the response for a single request

You could think of this as two broad actions:

Poll for new requests

Process a single request (read it and then post back the response)

So, having two separate timeouts makes sense, but having three would be overkill because the two other actions are both just part of processing a single request.

So, an alternative might be to name the new flag something like request-timeout or request-forwarding-timeout.

Thinking about it more, I think I like that better because it makes the existing proxy-timeout name make more sense in comparison, as in one case we are waiting for an individual request/response, but in the other we are only waiting on the Inverting Proxy.

Makes sense. In short, the change would:

Define a new flag called request-forwarding-timeout with a default value of 0

Set its value to the proxy timeout if no value was provided, or if a value less than the proxy timeout was provided

Use proxy timeout for getting the list of pending requests. Implement this with a context with a timeout.

Use the new request-forwarding-timeout for reading requests and posting the response. This can be implemented by setting a timeout on the HTTP connection.

Change the help message on proxy-timeout from "Client timeout when sending requests to the inverting proxy" to "Timeout for polling the inverting proxy for new requests" or similar

Add a help message on request-forwarding-timeout to something like "Timeout for forwarding individual requests to the backend and returning a response"

WDYT?

kenfreeman · 2025-05-08T16:27:41Z

Updated the PR. PTAL

ojarjur

Thank you so much for putting in the effort for this fix.

Overall, this looks good to me, but I had a few minor suggestions for improvements below.

ojarjur · 2025-05-09T15:48:20Z

agent/agent.go

+	requestForwardingTimeoutToUse := *proxyTimeout
+	if *requestForwardingTimeout > *proxyTimeout {
+		requestForwardingTimeoutToUse = *requestForwardingTimeout
+	}
+
+	client.Timeout = requestForwardingTimeoutToUse


client.Timeout = max(*proxyTimeout, *requestForwardingTimeout)

Done. Using a variable so we can also log the value.

ojarjur · 2025-05-09T16:02:42Z

agent/agent_test.go

+	ctx, cancel := context.WithCancel(context.Background())
+	defer cancel()
+
+	backendHomeDir, err := ioutil.TempDir("", "backend-home")


We can use t.TempDir() now since we've increased the minimum Go version to be greater than 1.15: https://pkg.go.dev/testing#T.TempDir

ojarjur · 2025-05-09T16:08:58Z

agent/agent_test.go

+		[]string{"${GOPATH}/bin/proxy-forwarding-agent"},
+		"--backend=testBackend",
+		"--proxy", proxyURL+"/",
+		"--proxy-timeout=10ms", // Use a very short timeout to force a timeout error


This is a race condition which increases the risk of test flakiness.

Have you run the test repeatedly to see how reliable it is?

If you've run it more than 100 times without failure, then I think that's good enough, but otherwise let's see if we can add something to force the timeout to always occur (such as adding a sleep before the first request is sent).

I moved the test logic into a function which is run 20 times with a short timeout (timeouts are expected) and 20 with a long timeout (timeouts should not occur). I manually ran 100 times and the test was not flaky. I decreased the count because I don't want to overly slow down test execution.

ojarjur · 2025-05-13T19:11:16Z

agent/agent_test.go

+	requestForwardingTimeout := "60s"
+	wantTimeout := true
+
+	// Run the test many times to guard against flakiness


I don't think we should be doing this within the test itself.

Running a test multiple times is better done just using the --count flag to go test.

Moreover, I don't think running in a loop would reduce flakiness; it would make any failure in any iteration cause the entire test to fail.

That seems like it would exacerbate any potential sources of flakiness rather than minimizing them.

Removed the loop

ojarjur · 2025-05-13T19:13:24Z

agent/agent_test.go

+		timeoutTest(t, proxyTimeout, requestForwardingTimeout, wantTimeout)
+	}
+
+	// Now test with a long-ish timeout to ensure we don't get timeouts


This should be in a separate test so that if one passes and the other fails the test report makes it immediately clear where the failure was.

kenfreeman added 3 commits May 5, 2025 13:24

Support a custom list-pending-requests timeout

043a010

Also did a little cleanup

Revert "Support a custom list-pending-requests timeout"

5479066

This reverts commit 043a010.

Support a custom list-pending-requests timeout

a129f4e

ojarjur reviewed May 6, 2025

View reviewed changes

kenfreeman added 2 commits May 8, 2025 16:20

Support a custom list-pending-requests timeout

2dc4b43

Support a custom list-pending-requests timeout

8bcfe0d

ojarjur reviewed May 9, 2025

View reviewed changes

Support a custom list-pending-requests timeout

ca84ddf

ojarjur reviewed May 13, 2025

View reviewed changes

kenfreeman added 2 commits May 13, 2025 19:34

Support a custom list-pending-requests timeout

e402686

Support a custom list-pending-requests timeout

da428e5

ojarjur approved these changes May 13, 2025

View reviewed changes

ojarjur merged commit 4c6aeb4 into google:master May 13, 2025
2 checks passed

Support a separate timeout for listing pending requests #149

Support a separate timeout for listing pending requests #149

Uh oh!

Conversation

kenfreeman commented May 6, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kenfreeman commented May 8, 2025

Uh oh!

ojarjur left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants