Commit ac5e44d

feat(connlib): request larger buffers for UDP sockets (#8731)
Sufficiently large receive buffers are important to sustain high throughput as latency increases. If the receive buffer in the kernel is too small, packets are dropped on arrival.

Firefox uses 1 MB in its QUIC stack [0]. `quic-go` recommends setting send and receive buffers to 7.5 MB [1]. Power users of Firezone likely receive a lot more traffic than the average Firefox user (especially with the Internet Resource activated), so setting it to 10 MB seems reasonable.

Sending packets is likely not as critical because we have back-pressure through our system: we stop reading IP packets when we cannot write to our UDP socket. The UDP socket sits in a separate thread, and those threads are connected with dedicated queues which act as another buffer. However, as the data below shows, some systems have really small send buffers, which are currently likely a speed bottleneck because we need to suspend writing so frequently.

Assuming a 50 ms latency, the bandwidth-delay product tells us that a 10 MB (80 Mb) receive buffer can, in theory, saturate a 1.6 Gbps link (assuming the OS also has large enough buffer sizes in its TCP or QUIC stack):

```
80 Mb / 0.05 s = 1600 Mbps
```

Experiments and research [2] show the following:

|OS|Receive buffer (default)|Receive buffer (this PR)|Send buffer (default)|Send buffer (this PR)|
|---|---|---|---|---|
|Windows|65 KB|10 MB|65 KB|1 MB|
|macOS|786 KB|8 MB|9 KB|1 MB|
|Linux|212 KB|212 KB|212 KB|212 KB|

With the exception of Linux, the OSes appear to be quite generous with how big they allow receive buffers to be. On Linux, these limits can be changed by setting the `net.core.rmem_max` and `net.core.wmem_max` parameters using `sysctl`.

Most of our users are on Windows and macOS, meaning they immediately benefit from this without having to change any system settings. Larger client-side UDP receive buffers are critical for any "download" scenario, which likely covers the majority of use cases that Firezone is used for.

On Windows, increasing this receive buffer almost doubles the throughput in an iperf3 download test.

[0]: mozilla/neqo#2470
[1]: https://github.com/quic-go/quic-go/wiki/UDP-Buffer-Sizes
[2]: https://unix.stackexchange.com/a/424381

---------

Signed-off-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: Jamil <jamilbk@users.noreply.github.com>
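
The bandwidth-delay estimate from the commit message can be reproduced with a short Rust sketch. The function names `max_throughput_bps` and `required_buffer_bytes` are hypothetical and not part of connlib; the calculation uses decimal megabytes (10 MB = 10,000,000 bytes) as the commit message does, and ignores protocol overhead.

```rust
use std::time::Duration;

/// Upper bound on throughput (bits per second) that a receive buffer of
/// `buffer_bytes` can sustain at the given latency. This is the
/// bandwidth-delay product solved for bandwidth.
fn max_throughput_bps(buffer_bytes: u64, latency: Duration) -> f64 {
    let buffer_bits = (buffer_bytes * 8) as f64;
    buffer_bits / latency.as_secs_f64()
}

/// Buffer size (bytes) needed to keep a link of `link_bps` bits per second
/// busy at the given latency.
fn required_buffer_bytes(link_bps: f64, latency: Duration) -> u64 {
    (link_bps * latency.as_secs_f64() / 8.0).round() as u64
}
```

Calling `max_throughput_bps(10_000_000, Duration::from_millis(50))` yields roughly 1.6e9 bits per second, matching the 1600 Mbps figure. With the binary constant used in the code (`10 * 1024 * 1024` bytes), the same formula gives about 1.68 Gbps.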
1 parent 9303673 commit ac5e44d

7 files changed: +89 -3 lines changed

rust/connlib/tunnel/src/sockets.rs

Lines changed: 8 additions & 1 deletion
@@ -173,7 +173,7 @@ impl ThreadedUdpSocket {
                 .build()
                 .expect("Failed to spawn tokio runtime on UDP thread")
                 .block_on(async move {
-                    let socket = match sf(&addr) {
+                    let mut socket = match sf(&addr) {
                         Ok(s) => {
                             let _ = error_tx.send(Ok(()));
 
@@ -185,6 +185,13 @@ impl ThreadedUdpSocket {
                         }
                     };
 
+                    match socket.set_buffer_sizes(socket_factory::SEND_BUFFER_SIZE, socket_factory::RECV_BUFFER_SIZE) {
+                        Ok(()) => {},
+                        Err(e) => {
+                            let _ = error_tx.send(Err(e));
+                            return;
+                        },
+                    }
 
                     let send = pin!(async {
                         while let Ok(datagram) = outbound_rx.recv_async().await {
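
The send loop above drains `outbound_rx`, the queue that the commit message describes as an additional buffer providing back-pressure. As a minimal sketch of that idea, the following uses a bounded `std::sync::mpsc::sync_channel` as a stand-in for the real queue (the actual code uses an async channel, and the function name `enqueue_without_consumer` is hypothetical). Once the queue is full, producers are told to stop instead of packets piling up without limit.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Tries to enqueue `packets` datagrams into a bounded queue of `capacity`
/// while nothing drains it. Returns `(accepted, rejected)`.
///
/// A rejected `try_send` is the back-pressure signal: the producer should
/// stop reading new packets until the consumer has caught up.
fn enqueue_without_consumer(capacity: usize, packets: usize) -> (usize, usize) {
    // `_rx` is kept alive so the channel stays connected but is never read.
    let (tx, _rx) = sync_channel::<Vec<u8>>(capacity);

    let mut accepted = 0;
    let mut rejected = 0;

    for i in 0..packets {
        match tx.try_send(vec![i as u8]) {
            Ok(()) => accepted += 1,
            Err(TrySendError::Full(_)) => rejected += 1,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }

    (accepted, rejected)
}
```

With a capacity of 4 and 10 packets, 4 are accepted and 6 are rejected. A small kernel send buffer has a similar effect one layer down: the socket stops accepting writes sooner, so the writer is suspended more often.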

rust/socket-factory/src/lib.rs

Lines changed: 26 additions & 0 deletions
@@ -23,6 +23,10 @@ use tokio::io::Interest;
 
 pub trait SocketFactory<S>: Fn(&SocketAddr) -> io::Result<S> + Send + Sync + 'static {}
 
+pub const SEND_BUFFER_SIZE: usize = ONE_MB;
+pub const RECV_BUFFER_SIZE: usize = 10 * ONE_MB;
+const ONE_MB: usize = 1024 * 1024;
+
 impl<F, S> SocketFactory<S> for F where F: Fn(&SocketAddr) -> io::Result<S> + Send + Sync + 'static {}
 
 pub fn tcp(addr: &SocketAddr) -> io::Result<TcpSocket> {
@@ -183,6 +187,28 @@ impl UdpSocket {
         })
     }
 
+    pub fn set_buffer_sizes(
+        &mut self,
+        requested_send_buffer_size: usize,
+        requested_recv_buffer_size: usize,
+    ) -> io::Result<()> {
+        let socket = socket2::SockRef::from(&self.inner);
+
+        socket.set_send_buffer_size(requested_send_buffer_size)?;
+        socket.set_recv_buffer_size(requested_recv_buffer_size)?;
+
+        let send_buffer_size = socket.send_buffer_size()?;
+        let recv_buffer_size = socket.recv_buffer_size()?;
+
+        tracing::info!(%requested_send_buffer_size, %send_buffer_size, %requested_recv_buffer_size, %recv_buffer_size, port = %self.port, "Set UDP socket buffer sizes");
+
+        Ok(())
+    }
+
+    pub fn port(&self) -> u16 {
+        self.port
+    }
+
     /// Configures a new source IP resolver for this UDP socket.
     ///
     /// In case [`DatagramOut::src`] is [`None`], this function will be used to set a source IP given the destination IP of the datagram.

website/src/app/kb/deploy/gateways/readme.mdx

Lines changed: 11 additions & 0 deletions
@@ -125,6 +125,17 @@ Firezone's
 [automatic load balancing](/kb/architecture/critical-sequences#high-availability)
 to distribute Client connections across them.
 
+### Performance tuning
+
+The default receive buffer size on Linux is quite small which can limit the
+maximum throughput that users perceive in "upload scenarios" (i.e. where the
+Gateway needs to receive large volumes of traffic).
+
+On startup, the Gateway attempts to increase the size of the UDP receive buffers
+to 10 MB. However, the actual size of the receive buffer is limited by the
+`net.core.rmem_max` kernel parameter. For the increased buffer size to take
+effect, you may need to increase the `net.core.rmem_max` parameter on the
+Gateway's host system.
 ## Deploy a single Gateway
 
 Deploying a single Gateway can be accomplished in the admin portal.
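
The tuning note added above says the requested 10 MB is capped by `net.core.rmem_max`. One way to check this on a Linux host is sketched below. It assumes the limit is readable from `/proc/sys/net/core/rmem_max`; the helper names (`parse_limit`, `tuning_hint`, `read_rmem_max`) are hypothetical and not part of the Gateway, and the local `RECV_BUFFER_SIZE` constant only mirrors the value from this commit.

```rust
use std::fs;

/// Receive buffer size requested by this commit (10 * 1024 * 1024 bytes).
const RECV_BUFFER_SIZE: usize = 10 * 1024 * 1024;

/// Parses the contents of a sysctl file such as
/// `/proc/sys/net/core/rmem_max`, which holds a single decimal number.
fn parse_limit(contents: &str) -> Option<usize> {
    contents.trim().parse().ok()
}

/// Returns a suggested `sysctl` command if `limit` is below `requested`,
/// or `None` if the kernel limit already allows the requested size.
fn tuning_hint(limit: usize, requested: usize) -> Option<String> {
    if limit >= requested {
        return None;
    }
    Some(format!("sysctl -w net.core.rmem_max={requested}"))
}

/// Reads the current limit from procfs. Returns `None` on systems where
/// the file does not exist (for example non-Linux hosts).
fn read_rmem_max() -> Option<usize> {
    let contents = fs::read_to_string("/proc/sys/net/core/rmem_max").ok()?;
    parse_limit(&contents)
}
```

On a host whose limit is 212992 bytes, `tuning_hint(212992, RECV_BUFFER_SIZE)` suggests `sysctl -w net.core.rmem_max=10485760`. A value set with `sysctl -w` does not survive a reboot unless it is also written to the system's sysctl configuration.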

website/src/components/Changelog/Apple.tsx

Lines changed: 8 additions & 1 deletion
@@ -23,7 +23,14 @@ export default function Apple() {
   return (
     <Entries downloadLinks={downloadLinks} title="macOS / iOS">
       {/* When you cut a release, remove any solved issues from the "known issues" lists over in `client-apps`. This must not be done when the issue's PR merges. */}
-      <Unreleased></Unreleased>
+      <Unreleased>
+        <ChangeItem pull="8731">
+          Improves throughput performance by requesting socket receive buffers
+          of 10MB. The actual size of the buffers is capped by the operating
+          system. You may need to adjust <code>kern.ipc.maxsockbuf</code> for this to take
+          full effect.
+        </ChangeItem>
+      </Unreleased>
       <Entry version="1.4.12" date={new Date("2025-04-21")}>
         <ChangeItem pull="8798">
           Improves performance of relayed connections on IPv4-only systems.

website/src/components/Changelog/GUI.tsx

Lines changed: 16 additions & 1 deletion
@@ -8,7 +8,22 @@ export default function GUI({ os }: { os: OS }) {
   return (
     <Entries downloadLinks={downloadLinks(os)} title={title(os)}>
       {/* When you cut a release, remove any solved issues from the "known issues" lists over in `client-apps`. This must not be done when the issue's PR merges. */}
-      <Unreleased></Unreleased>
+      <Unreleased>
+        {os === OS.Linux && (
+          <ChangeItem pull="8731">
+            Improves throughput performance by requesting socket receive buffers
+            of 10MB. The actual size of the buffers is capped by the operating
+            system. You may need to adjust <code>net.core.rmem_max</code> for this to take
+            full effect.
+          </ChangeItem>
+        )}
+        {os === OS.Windows && (
+          <ChangeItem pull="8731">
+            Improves throughput performance by requesting socket receive buffers
+            of 10MB.
+          </ChangeItem>
+        )}
+      </Unreleased>
       <Entry version="1.4.11" date={new Date("2025-04-21")}>
         <ChangeItem pull="8798">
           Improves performance of relayed connections on IPv4-only systems.

website/src/components/Changelog/Gateway.tsx

Lines changed: 6 additions & 0 deletions
@@ -26,6 +26,12 @@ export default function Gateway() {
         <ChangeItem pull="8798">
           Improves performance of relayed connections on IPv4-only systems.
         </ChangeItem>
+        <ChangeItem pull="8731">
+          Improves throughput performance by requesting socket receive buffers
+          of 10MB. The actual size of the buffers is capped by the operating
+          system. You may need to adjust <code>net.core.rmem_max</code> for this to take
+          full effect.
+        </ChangeItem>
       </Unreleased>
       <Entry version="1.4.6" date={new Date("2025-04-15")}>
         <ChangeItem pull="8383">

website/src/components/Changelog/Headless.tsx

Lines changed: 14 additions & 0 deletions
@@ -13,6 +13,20 @@ export default function Headless({ os }: { os: OS }) {
         <ChangeItem pull="8798">
           Improves performance of relayed connections on IPv4-only systems.
         </ChangeItem>
+        {os === OS.Linux && (
+          <ChangeItem pull="8731">
+            Improves throughput performance by requesting socket receive buffers
+            of 10MB. The actual size of the buffers is capped by the operating
+            system. You may need to adjust <code>net.core.rmem_max</code> for this to take
+            full effect.
+          </ChangeItem>
+        )}
+        {os === OS.Windows && (
+          <ChangeItem pull="8731">
+            Improves throughput performance by requesting socket receive buffers
+            of 10MB.
+          </ChangeItem>
+        )}
       </Unreleased>
       <Entry version="1.4.6" date={new Date("2025-04-15")}>
         {os == OS.Linux && (
