Replies: 1 comment
-
If records are written before a flush, how do we ensure that they are not duplicated in the case of a failure before the flush? A restart would then, presumably, write the same records again, resulting in duplicate records in storage.
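One common way sinks of this kind sidestep the redelivery problem is deterministic, offset-derived object names, so a rewritten batch overwrites the first attempt instead of accumulating. The sketch below is a hypothetical illustration of that idea (`Redelivery`, `writeBatch`, and the `part-0/offset-N` naming are invented for this example, not the connector's actual API):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the failure window described above: records reach storage before
// the flush that commits offsets. After a crash, delivery restarts from the
// last committed offset and the same records are written again. Deterministic
// object names make the rewrite an overwrite, not a duplicate.
public class Redelivery {
    // Stand-in "storage" keyed by object name; a real sink would use S3/GCS.
    static final Map<String, String> storage = new HashMap<>();

    static void writeBatch(List<String> records, long startOffset) {
        for (int i = 0; i < records.size(); i++) {
            // Object name derived purely from the record's offset.
            storage.put("part-0/offset-" + (startOffset + i), records.get(i));
        }
    }

    public static void main(String[] args) {
        List<String> batch = List.of("a", "b");
        writeBatch(batch, 0); // first attempt; crash before offsets commit
        writeBatch(batch, 0); // redelivery after restart from offset 0
        System.out.println(storage.size()); // still 2 objects, not 4
    }
}
```

With content-dependent or timestamped names, the same redelivery would instead leave two copies of each record in storage.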
-
The current implementation buffers records and writes them based on a flush. A more appropriate approach is, IMHO, to stream the records to file. In this model the `RecordGrouper` would not collect the records in a buffer, but would instead indicate the logical stream to which the records should be written; the caller (the task) manages how the records are handled, i.e. appended to a "logical stream". This opens up several possibilities and improvements.

Other possibilities:
- We can use `requestFlush()` when files have been written to update the metrics, rather than relying on timeouts and flushing everything.

Issues:
- `preCommit` action

A simple implementation of this is in #319, but it also has some other changes.
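The streaming model sketched in this comment could look roughly like the following. This is a hypothetical illustration, not the code from #319: `SinkRecord`, `StreamingRecordGrouper`, and `StreamingTask` are stand-ins, and the grouper's only job is to name the logical stream while the task owns the writers.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for Kafka Connect's record type, trimmed to what the sketch needs.
final class SinkRecord {
    final String topic;
    final int partition;
    final String value;
    SinkRecord(String topic, int partition, String value) {
        this.topic = topic;
        this.partition = partition;
        this.value = value;
    }
}

// Instead of buffering, the grouper only indicates the logical stream a
// record belongs to; it never holds records itself.
interface StreamingRecordGrouper {
    String streamFor(SinkRecord record);
}

// The task manages the per-stream writers and appends records as they arrive.
final class StreamingTask {
    private final StreamingRecordGrouper grouper;
    final Map<String, StringBuilder> openStreams = new HashMap<>();

    StreamingTask(StreamingRecordGrouper grouper) { this.grouper = grouper; }

    void put(SinkRecord record) {
        String stream = grouper.streamFor(record);
        openStreams.computeIfAbsent(stream, s -> new StringBuilder())
                   .append(record.value).append('\n');
    }

    // On preCommit, finalize all open streams and report what was written,
    // so offsets are committed only for data that reached storage.
    Map<String, String> preCommit() {
        Map<String, String> written = new HashMap<>();
        openStreams.forEach((stream, buf) -> written.put(stream, buf.toString()));
        openStreams.clear();
        return written;
    }
}

public class Demo {
    public static void main(String[] args) {
        StreamingTask task = new StreamingTask(r -> r.topic + "-" + r.partition);
        task.put(new SinkRecord("t", 0, "a"));
        task.put(new SinkRecord("t", 0, "b"));
        task.put(new SinkRecord("t", 1, "c"));
        Map<String, String> written = task.preCommit();
        System.out.println(written.size());            // 2 streams finalized
        System.out.println(task.openStreams.isEmpty()); // true
    }
}
```

The key design point is that no component other than the task holds records, so per-file completion can trigger a targeted `requestFlush()`-style metrics update instead of a timeout-driven flush of everything.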