If geth is on the same system as the Node.js script, you may explore IpcProvider to check whether the bottleneck is due to communication over a network socket. IPC simply uses the OS API for communication between two processes, so any bottleneck caused by network communication would be avoided. If the bottleneck is due to disk reads, you'd have to explore a faster SSD option (e.g. NVMe).
Since your system tops out around 80% CPU and 50% memory rather than saturating either, the bottleneck could be due to disk. But to be really sure you could briefly experiment with a higher-capacity instance.
I can't think of a faster way, but I'm curious: if your system has 8 cores, does sending 10K concurrent requests vs 5K concurrent requests make any difference? At most they would be resolved in parallel 8 at a time, so you may not really need to fire that many requests at once, since most of them just sit in waiting mode anyway. Maybe you can push 1K requests and, once only 500 of them are left, push 500 more, something like that. Though I'm not very sure about this.
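The "push 1K and refill" idea above amounts to a bounded concurrency pool. A minimal sketch — the `mapWithLimit` helper is made up for illustration — keeps at most `limit` requests in flight and starts a new one as soon as any finishes, which is a bit smoother than refilling in blocks:

```javascript
// Bounded concurrency: run fn over items with at most `limit` calls
// in flight at any moment. Results come back in input order.
async function mapWithLimit(items, limit, fn) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to claim (safe: JS is single-threaded)
  async function worker() {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  // Spawn up to `limit` workers that drain the shared index.
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker)
  );
  return results;
}
```

Usage would be something like `mapWithLimit(hashes, 1000, (h) => provider.getTransactionReceipt(h))` — one place to tune how hard geth gets hit.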
Hi there! Any help with the following would be much appreciated.
Context
I've got a large txt file (30GB+) containing transaction hashes. I have a Node.js script that reads this file, queries a full geth node for transaction receipts, and saves these receipts in a CSV file on disk.
Problem
Geth can handle 10K `getTransactionReceipt` requests in a `Promise.all` fine. But geth hangs/crashes when I increase this amount to anything higher. At 20K it can process for a bit before it crashes. Anything higher than 20K, it just hangs and crashes immediately. Geth returns an error saying `SERVER_ERROR missing response` or some sort of `socket hang up` error. The error log also contains a massive array of RPC requests containing the individual tx hashes.
Configuration
I'm running geth on an `m5.2xlarge` (8 cores and 32GB memory) instance. The CPU utilisation hovers in the 70-80% range and memory hovers in the 30-50% range.
Pseudocode
My script is basically like this
What I've tried
- `JsonBatchRpcProvider`, but it doesn't seem to make a difference. I also read that it's not actually faster because the RPC calls have to be serialised. The documentation is not clear as to whether this only applies to state-changing transaction calls or also to simple queries that just fetch data.
- An alternative to `Promise.all`. With this approach I send the `getTransactionReceipt` calls serially without waiting for confirmation. Prior to making each call I increment a buffer count, and on a successful response I decrement the count. I add an arbitrary wait of a couple of seconds once the buffer count exceeds 10K. I didn't find this approach to be any faster, and geth would most likely still crash when I increased the count beyond 10K. FYI I wasn't too rigorous in this implementation.
Questions
Thank you