Skip to content

Segment padding uninitialized memory problems #1032

@divergentdave

Description

@divergentdave

This is a follow-up from #937 (comment). To recapitulate, IoBufs::write_to_log() does no padding if there is less than 32 bytes available at the end of a segment, and then uses the full length of the segment when writing, which can result in a read from uninitialized memory.

In trying to write a patch for this, I ran into further conundrums. While I haven't tried to reproduce this yet, it seems that IoBufs::write_to_log() might not pad the entirety of the segment when it does pad. First, we have:

let pad_len = capacity - bytes_to_write - MAX_MSG_HEADER_LEN;
...
let padding_bytes = vec![MessageKind::Corrupted.into(); pad_len];

The Cap message header is written from offset bytes_to_write to bytes_to_write + header_bytes.len(). Then, the zeros are written from offset bytes_to_write + header_bytes.len() to bytes_to_write + header_bytes.len() + pad_len. Cancelling terms, the range of bytes that is written ends at offset capacity + header_bytes.len() - MAX_MSG_HEADER_LEN. We know that header_bytes.len() will be less than MAX_MSG_HEADER_LEN, because the len field will have a zero in its top byte, thus there will be uninitialized bytes at the end of the buffer. Moving through the function, if should_pad is true, then maxed was true, and thus total_len is set to capacity, and the uninitialized bytes will be read from and written to disk.

Correcting this issue looks like it would require solving a Catch-22, because making the Cap message extend to the end of the segment would require knowing the length of the Cap message's header, but that would in turn require knowing how long we wanted the message to be, etc. Perhaps this could be solved by using capacity + header_bytes.len() - MAX_MSG_HEADER_LEN as the message's length in its header, checksumming that many bytes in the message, but then zeroing out more bytes until the end of the segment.

A parallel concern is whether the read paths properly handle the red zone of each segment. I noticed that LogIter checks if cur_lsn is within MAX_MSG_HEADER_LEN of the end of a segment, and moves to the next segment. It appears that IoBufs::start(), one of the few other callers of read_message(), does not check for this case. Is such logic required here?

At this point, I'm feeling a little lost in the weeds, would appreciate input on how to tighten up segment writing/reading here.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions