Speed up Erlang 500x by using nodelay and system_time #18
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
gen_tcp
by default uses heavy TCP socket buffering in user space and kernel space and sends packets with a delay.Disable that and send available buffer immediately, by setting
{nodelay, true}
that sets kernel socket optionTCP_NODELAY
. A lot of other Erlang web servers and non-Erlang ones enables this by default on tcp sockets or on HTTP server sockets. It is not enable by default ingen_tcp
in Erlang, because it is not enabled by default in OS sockets like Linux, and it is just following bad practice of not changing OS defaults.Additional use modern replacement for
os:timestamp
, and useerlang:system_time
. It shouldn't have any significant impact on performance.And do minor style changes in the code to follow Erlang standard coding style.
I am able to process 170k-200k requests per seconds on my machine. YES. 200000 requests per second.
With 1 thread and 1 connection from
wrk
, I am getting 13000-15000 requests/s and 66.5us average latency and 500us max latency.Example benchmark run on my machine. I added one more tests with 500 concurrent connections at the end to show scalability.
Requests per second
500x throughput improvement.
Average latency
500x latency improvement.
I do not want to brag, but Erlang is the fastest in the entire table, maybe with exception of Netty. Obviously I have no exact knowledge what machine was used for the tests. And the initial bad result should rise more suspicion and not be published in the first place.
No compiler or Erlang runtime changes or emulator flags/options.
Quick way to run it:
erlc erlanghttp.erl && erl -noshell -s erlanghttp start
.It is also important that the nodelay option has basically no effect on big responses, as big responses (either chunked / streamed response, or data via sendfile, etc), will fill up buffers that no delay will be created in the first place. Only on small responses are severely affected.