Nearly all of my experience with tonic is via http2. For reference, at my day job we have a Go SDK that customers use (on their own servers) to connect via normal TLS http2 to remote servers (that I own) running tonic. P999 latency for payloads of this size is on the order of 1-3 milliseconds, assuming they're in the same cloud and depending on whether or not they're in the same AZ. I'm not saying this to brag, but to reassure you that multilingual grpc with < 1s latency is a very reasonable ask 😆 ❤️

First, and easiest, a bit of tokio trivia: your

```rust
let result = tokio::spawn(server).await;
```

makes your server run on a runtime thread, so any task it spawns is spawned from within the runtime, and that makes those tasks start much more quickly. I don't think this is costing you a second, though.

Second, and probably a wild goose chase: is your Go client making only 1 connection for all the concurrency? I've had to go to great lengths to make high concurrency perform well over a single connection.

Third, and probably the root cause: 200 milliseconds is too long to hold a cooperative multitasking thread. Tokio doesn't know which task you want to execute first, so in the abstract they run in unordered fashion, and your work items, which run to completion once started, delay other work items. In Rust, tasks don't have any way to implicitly, preemptively yield in the middle of their work. They will occupy their thread unconditionally until they reach an `.await` point. Two ways to deal with that:

**Make your cpu work async:** you make that 200-300ms "validation" code asynchronous, and assuming you're doing a bunch of computation in some loop, you would put an await in there every thousand iterations or something to that effect, like this (a fuller sketch is below):

```rust
tokio::task::consume_budget().await;
```

**Run your cpu work off the runtime:** not all thread-bound work is created equal. Consider how easily a CPU accomplishes `thread::sleep()` or reading bytes from a drive, versus a hard loop computing the nth digit of pi. I am not sure what kind of work your validation is, so if I had to give you a silver-bullet recommendation, I'd ask you to move that 200-300ms to a different thread pool. You can do that by:

```rust
// spawn_blocking takes a synchronous closure and runs it on tokio's dedicated
// blocking thread pool; awaiting the handle doesn't occupy a runtime worker.
let validation_result = tokio::task::spawn_blocking(move || {
    run_200ms_validation(the_request)
})
.await;
```

or by creating a dedicated thread pool of your own for the CPU-bound work (a sketch of that is below as well).
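To make the first option concrete, here is a rough sketch of the `consume_budget` approach, assuming the validation is some loop over items and a reasonably recent tokio with the `rt` feature; `Item`, `ValidationError`, and `check_item` are just placeholders for whatever your real validation does:

```rust
// Placeholder types standing in for the real validation input and error.
struct Item(u64);
#[derive(Debug)]
struct ValidationError;

// Placeholder for the real per-item CPU work.
fn check_item(item: &Item) -> Result<(), ValidationError> {
    if item.0 == u64::MAX { Err(ValidationError) } else { Ok(()) }
}

// CPU-bound validation that cooperates with the tokio scheduler: every
// thousand items it consumes a unit of coop budget, and when that budget is
// exhausted the task yields so other requests can make progress.
async fn validate(items: &[Item]) -> Result<(), ValidationError> {
    for (i, item) in items.iter().enumerate() {
        check_item(item)?;
        if i % 1_000 == 0 {
            tokio::task::consume_budget().await;
        }
    }
    Ok(())
}
```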
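And here is a rough sketch of the "own thread pool" variant, using rayon as the example pool and a tokio oneshot channel to hand the result back to the async side; `TheRequest`, `ValidationOutcome`, and `run_200ms_validation` are again stand-ins for your real types:

```rust
use tokio::sync::oneshot;

// Stand-ins for the real request type and validation result.
struct TheRequest;
struct ValidationOutcome(bool);

// Stand-in for the real 200-300ms CPU-bound validation.
fn run_200ms_validation(_req: TheRequest) -> ValidationOutcome {
    ValidationOutcome(true)
}

// Run the CPU-bound work on rayon's thread pool so it never occupies a tokio
// worker thread, and await the result without blocking.
async fn validate_off_runtime(req: TheRequest) -> ValidationOutcome {
    let (tx, rx) = oneshot::channel();
    rayon::spawn(move || {
        // This closure runs on a rayon worker, not a tokio worker.
        let _ = tx.send(run_200ms_validation(req));
    });
    // Awaiting the oneshot parks this task, not the thread.
    rx.await.expect("validation task dropped the sender")
}
```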
Lastly, some obligatory engineering theory, because I can't help myself: if you have 100 computations that each take 0.2 seconds, that's 20 seconds of CPU time that has to come from somewhere. By Little's Law, a 16-core CPU can only retire a CPU-bound 200ms task at a rate of 16 cores ÷ 0.2 s per task = 80 requests per second. If you make requests at a higher rate than that, you are queueing and your latency will increase, irrespective of libraries, tech stacks, architectures, or anything else.
---
Thank you very much for your excellent work on this library.
Question
My guess is that there is a limit on concurrent requests somehow, and thus the client is blocking on either send or receive, but I am unclear where / why.
Background
- My Tonic Server Code
Timing Background
Thank you very much