
Twister2 Message Schemas for Headerless Messaging

Chathura Widanage edited this page Aug 19, 2019 · 1 revision

Twister2's communication layer uses a fixed set of send and receive buffers to control flow and bound memory utilization. When sending messages from a source to a destination, Twister2 transfers data from a send buffer on the sender to a receive buffer on the receiver. A buffer can therefore hold one or more messages at a given time, so identifying where each message begins and ends is crucial. To mark these bounds, Twister2 prepends the message size as a 4-byte integer header when it writes a message to the outgoing buffer. If the underlying communication operation is keyed and the key type is non-primitive, a second 4-byte integer is added to the header to indicate the key size. For keyed messages, Twister2 therefore sends up to 8 additional bytes per message.
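The framing described above can be illustrated with a small, self-contained sketch. It is not Twister2's serializer: the class `LengthPrefixedFraming`, its methods, and the exact header layout (key size first, then value size) are assumptions chosen only to show how two 4-byte size fields let a receiver split a shared buffer back into individual keyed messages.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of length-prefixed framing; not Twister2 code.
class LengthPrefixedFraming {

  // Assumed layout per message: [keySize:int][valueSize:int][key bytes][value bytes]
  static void writeKeyed(ByteBuffer out, byte[] key, byte[] value) {
    out.putInt(key.length);    // 4-byte key size header
    out.putInt(value.length);  // 4-byte value size header
    out.put(key);
    out.put(value);
  }

  // Reads every complete message from a buffer that has been flipped for reading.
  static List<byte[][]> readAll(ByteBuffer in) {
    List<byte[][]> messages = new ArrayList<>();
    while (in.remaining() >= 2 * Integer.BYTES) {
      byte[] key = new byte[in.getInt()];
      byte[] value = new byte[in.getInt()];
      in.get(key);
      in.get(value);
      messages.add(new byte[][] {key, value});
    }
    return messages;
  }

  public static void main(String[] args) {
    ByteBuffer buffer = ByteBuffer.allocate(64);
    writeKeyed(buffer, "k1".getBytes(StandardCharsets.UTF_8),
        "hello".getBytes(StandardCharsets.UTF_8));
    writeKeyed(buffer, "key2".getBytes(StandardCharsets.UTF_8),
        "hi".getBytes(StandardCharsets.UTF_8));
    int bytesWritten = buffer.position();

    buffer.flip();
    List<byte[][]> messages = readAll(buffer);
    System.out.println(messages.size() + " messages, " + bytesWritten + " bytes written");
  }
}
```

In this run the two messages carry 13 bytes of key and value data but occupy 29 bytes in the buffer, because each message pays the 8-byte header.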

While sending a size header is unavoidable for applications with variable message sizes, it can be eliminated for applications whose message schema is deterministic or fixed. Twister2 lets you define a message schema when configuring its built-in parallel communication operations. When a schema is defined, the Twister2 serializers no longer have to prepend integers to indicate message sizes, which reduces the amount of data transferred over the network. The effect of headerless messaging is most significant when the message content is small relative to the header. For instance, the popular TeraSort benchmark uses 100-byte records (a 10-byte key and a 90-byte value). An 8-byte size header adds 8% on top of each record, so transferring 1 TB of records would add about 80 GB of size headers. By using headerless messaging with a predefined schema, Twister2 avoids sending this 80 GB of redundant data over the network.
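A minimal sketch of the headerless idea, together with the TeraSort arithmetic, might look like the following. The class `FixedSchemaFraming` and its constants are hypothetical and do not reflect Twister2's schema API. They only show that when both sides agree on a fixed 10-byte key and 90-byte value, the reader can split the buffer by offset alone, and they show what the 8-byte header would have cost.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of fixed-schema (headerless) framing; not Twister2 code.
class FixedSchemaFraming {
  static final int KEY_SIZE = 10;                  // TeraSort key size in bytes
  static final int VALUE_SIZE = 90;                // TeraSort value size in bytes
  static final int RECORD_SIZE = KEY_SIZE + VALUE_SIZE;
  static final int HEADER_SIZE = 2 * Integer.BYTES; // header a schema-less sender would add

  // Writes only the payload; sizes are implied by the agreed schema.
  static void write(ByteBuffer out, byte[] key, byte[] value) {
    if (key.length != KEY_SIZE || value.length != VALUE_SIZE) {
      throw new IllegalArgumentException("record does not match the fixed schema");
    }
    out.put(key);
    out.put(value);
  }

  // Splits a flipped buffer into records using the schema sizes alone.
  static List<byte[][]> readAll(ByteBuffer in) {
    List<byte[][]> records = new ArrayList<>();
    while (in.remaining() >= RECORD_SIZE) {
      byte[] key = new byte[KEY_SIZE];
      byte[] value = new byte[VALUE_SIZE];
      in.get(key);
      in.get(value);
      records.add(new byte[][] {key, value});
    }
    return records;
  }

  // Header bytes that would be added on top of the given payload volume.
  static long headerBytes(long payloadBytes) {
    return (payloadBytes / RECORD_SIZE) * HEADER_SIZE;
  }

  public static void main(String[] args) {
    ByteBuffer buffer = ByteBuffer.allocate(3 * RECORD_SIZE);
    for (int i = 0; i < 3; i++) {
      write(buffer, new byte[KEY_SIZE], new byte[VALUE_SIZE]);
    }
    buffer.flip();
    System.out.println(readAll(buffer).size() + " records read without headers");

    long oneTerabyte = 1_000_000_000_000L; // decimal terabyte
    System.out.println(headerBytes(oneTerabyte) + " header bytes avoided per TB of records");
  }
}
```

With a decimal terabyte of 100-byte records there are 10^10 records, so the avoided headers total 8 × 10^10 bytes, which is the 80 GB figure quoted above.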
