Replies: 1 comment
-
I'm thinking about a similar architecture. Thank you for pointing out the size of the hidden state. I believe that a low-latency interconnect is more suitable for this application. InfiniBand latency (40G QDR, 56G FDR, 100G EDR) is roughly 600 ns within a single subnet (1 hop).
-
Would it be possible to have a setup with two separate machines, each with a single GPU, connected to the same local network, in which:
My thinking is that, because only a small amount of data is transferred (about 16 KB for each hidden state), network latency may not be enough of a bottleneck, especially over local Ethernet with full-duplex communication.
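To put rough numbers on this, here is a back-of-envelope sketch, not a measurement. It assumes "16 KB" means 16 × 1024 bytes per hidden state, a 1 Gb/s Ethernet link with a hypothetical 100 µs one-way latency, and a 100 Gb/s InfiniBand EDR link with the 600 ns latency quoted in the reply. Protocol overhead, serialization on the host, and GPU-to-NIC copies are all ignored, and the function and variable names are made up for illustration.

```python
def transfer_time_s(payload_bytes: int, bandwidth_bits_per_s: float, latency_s: float) -> float:
    """Estimate one-way transfer time as fixed latency plus time on the wire."""
    return latency_s + (payload_bytes * 8) / bandwidth_bits_per_s


# Assumption: one hidden state is 16 KiB.
HIDDEN_STATE_BYTES = 16 * 1024

# Assumption: 1 GbE with ~100 microseconds of one-way latency (varies a lot in practice).
ethernet_s = transfer_time_s(HIDDEN_STATE_BYTES, 1e9, 100e-6)

# 100 Gb/s InfiniBand EDR with the 600 ns single-hop latency mentioned in the reply.
infiniband_s = transfer_time_s(HIDDEN_STATE_BYTES, 100e9, 600e-9)

print(f"1 GbE:          {ethernet_s * 1e6:.1f} us per hidden state")
print(f"InfiniBand EDR: {infiniband_s * 1e6:.2f} us per hidden state")
```

Under these assumptions the Ethernet hop costs a few hundred microseconds per token per hop, so whether it matters depends on how that compares with the per-token compute time on each GPU.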