Improve BinaryWriter string serialization #347

@fabianoliver

At the moment, when BinaryWriter serializes a string, it allocates a new array for the string's UTF-8 bytes and then copies the contents of that array to the destination stream.

It might be possible to optimize this a little, both by avoiding the array allocation and potentially by saving one copy of the data.

With the upcoming 11.x milestone, I thought now might be a good time to float this, especially as saving the extra copy might need a slightly bigger change. Anyway, here are two possible approaches.

If it's not considered too big of a change, I think idea 1 might be a nice way to go here. It would arguably be a slightly more modern approach to writing data, and would likely be the more performant solution, but I'm curious to hear your thoughts!

Idea 1: BinaryWriter should write into an IBufferWriter<byte> rather than into a Stream

Stream is a bit of a dated concept at this point; buffer writers are a great alternative as a sink for serialization.

So for example, when writing a string, we could acquire a buffer directly from the target buffer writer and serialize the string right into it. That would entirely eliminate any intermediate allocations, as well as extra copies of the data.
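A minimal sketch of what that could look like, assuming a 4-byte little-endian length prefix (the library's actual integer encoding may differ). `BufferWriterSketch` and its `WriteString` helper are hypothetical names for illustration, not existing API:

```csharp
using System;
using System.Buffers;
using System.Buffers.Binary;
using System.Text;

public static class BufferWriterSketch
{
    // Hypothetical helper: writes a length prefix followed by the UTF-8 bytes,
    // encoding straight into memory handed out by the buffer writer.
    public static void WriteString(IBufferWriter<byte> writer, string value)
    {
        int byteCount = Encoding.UTF8.GetByteCount(value);

        // Assumed wire format: 4-byte little-endian byte count.
        Span<byte> prefix = writer.GetSpan(sizeof(int));
        BinaryPrimitives.WriteInt32LittleEndian(prefix, byteCount);
        writer.Advance(sizeof(int));

        // Ask for enough room for the whole string and encode in place:
        // no intermediate array, no extra copy.
        Span<byte> target = writer.GetSpan(byteCount);
        int written = Encoding.UTF8.GetBytes(value.AsSpan(), target);
        writer.Advance(written);
    }
}
```

For a quick check, an `ArrayBufferWriter<byte>` can stand in for the real sink: after `BufferWriterSketch.WriteString(writer, "hé")`, `writer.WrittenCount` covers the prefix plus the encoded bytes. Requesting the full byte count in one `GetSpan` call keeps the sketch short; very large strings might be better served by chunked encoding.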

That would also benefit serialization of all other types that currently go through some intermediate buffer (e.g. floats), though I'd expect the benefits to be a little less noticeable there.
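The same pattern for a float might look roughly like this. It is a sketch only: little-endian byte order is an assumption, `FloatWriterSketch` is a hypothetical name, and `BinaryPrimitives.WriteSingleLittleEndian` requires .NET 5 or later:

```csharp
using System;
using System.Buffers;
using System.Buffers.Binary;

public static class FloatWriterSketch
{
    // Hypothetical helper: writes the 4 bytes of a float directly into the
    // writer's memory instead of going through a temporary byte array.
    public static void WriteSingle(IBufferWriter<byte> writer, float value)
    {
        Span<byte> target = writer.GetSpan(sizeof(float));
        BinaryPrimitives.WriteSingleLittleEndian(target, value); // assumed byte order
        writer.Advance(sizeof(float));
    }
}
```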

Idea 2: Keep Stream, serialize using stackalloc'ed segments

Untested code, so more of a sketch (e.g. this might not even work with multi-char runes, I think). Heap allocation of an intermediate array could still be avoided, I think, though we would still incur the usual penalty of having to copy from the buffer to the stream. It might need a bit of benchmarking, though, to make sure it's actually a net improvement:

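public void WriteString(string value)
{
    const int ChunkChars = 128;
    var chars = value.AsSpan();

    // Length prefix: total UTF-8 byte count of the whole string.
    WriteInteger(Encoding.UTF8.GetByteCount(chars));

    // A stateful encoder carries a trailing high surrogate over to the next
    // chunk, so pairs split across a chunk boundary still encode correctly.
    var encoder = Encoding.UTF8.GetEncoder();
    // Worst case is 3 bytes per char, so 256 bytes is too small for 128 chars.
    Span<byte> buffer = stackalloc byte[Encoding.UTF8.GetMaxByteCount(ChunkChars)];

    for (var i = 0; i < chars.Length; i += ChunkChars)
    {
        // Clamp the final chunk; Slice(i, 128) throws past the end of the span.
        var chunk = chars.Slice(i, Math.Min(ChunkChars, chars.Length - i));
        var isLast = i + chunk.Length == chars.Length;
        var written = encoder.GetBytes(chunk, buffer, flush: isLast);
        stream.Write(buffer[..written]);
    }
}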
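The multi-char rune concern can be checked with a small experiment. The sketch below (hypothetical `SurrogateSplitDemo` class, illustration only) compares encoding each chunk independently with `Encoding.UTF8.GetBytes` against encoding through a single stateful `Encoder`, and counts the bytes produced:

```csharp
using System;
using System.Text;

public static class SurrogateSplitDemo
{
    // Encodes every chunk on its own. A surrogate pair cut in half by a chunk
    // boundary is seen as two invalid chars and replaced with U+FFFD.
    public static int StatelessByteCount(string value, int chunkChars)
    {
        var chars = value.AsSpan();
        Span<byte> buffer = stackalloc byte[Encoding.UTF8.GetMaxByteCount(chunkChars)];
        var total = 0;
        for (var i = 0; i < chars.Length; i += chunkChars)
        {
            var chunk = chars.Slice(i, Math.Min(chunkChars, chars.Length - i));
            total += Encoding.UTF8.GetBytes(chunk, buffer);
        }
        return total;
    }

    // Encodes through one Encoder instance, which remembers a dangling high
    // surrogate between calls until flush is requested on the last chunk.
    public static int StatefulByteCount(string value, int chunkChars)
    {
        var chars = value.AsSpan();
        var encoder = Encoding.UTF8.GetEncoder();
        Span<byte> buffer = stackalloc byte[Encoding.UTF8.GetMaxByteCount(chunkChars)];
        var total = 0;
        for (var i = 0; i < chars.Length; i += chunkChars)
        {
            var chunk = chars.Slice(i, Math.Min(chunkChars, chars.Length - i));
            var isLast = i + chunk.Length == chars.Length;
            total += encoder.GetBytes(chunk, buffer, flush: isLast);
        }
        return total;
    }
}
```

With the input `"a\U0001F600"` and a chunk size of 2, the emoji's surrogate pair straddles the chunk boundary. The stateful variant produces the same number of bytes as `Encoding.UTF8.GetByteCount` on the whole string, while the stateless variant does not, which means the length prefix and the payload would disagree.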
