Optimize generating many bitsets for serialization #365

@richardartoul

Description

One of our heaviest workloads uses this library to index a lot of ingested data in real time. In one part of the workload we basically have dozens of goroutines doing something like the following in a loop:

postings := roaring.NewBitmap()
for _, thing := range things {
    postings.Clear()
    for _, row := range $MATCHING_ROWS {
        postings.Add(row)
    }
    postings.WriteTo(w) // w is some io.Writer, e.g. a file
}

We've already heavily optimized the code that generates things so that it's efficient and avoids allocations/pointers. However, when it's time to actually serialize the index we're building, we have to go through this library, and it allocates like crazy even though we're trying to reuse a single data structure.

I've provided a benchmark that demonstrates this, along with some improvements, in this PR: #364
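
For reference, here's a minimal sketch of the kind of allocation benchmark involved. This is not the exact benchmark from the PR; the row count and the use of io.Discard as the sink are assumptions made for illustration:

package roaring_test

import (
    "io"
    "testing"

    "github.com/RoaringBitmap/roaring"
)

// BenchmarkReusedBitmapSerialization mimics the workload above: reuse one
// bitmap across iterations, repopulate it, and serialize it each time.
// Run with `go test -bench . -benchmem` to see allocations per op.
func BenchmarkReusedBitmapSerialization(b *testing.B) {
    postings := roaring.NewBitmap()
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        postings.Clear()
        for row := uint32(0); row < 1024; row++ {
            postings.Add(row)
        }
        if _, err := postings.WriteTo(io.Discard); err != nil {
            b.Fatal(err)
        }
    }
}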

I realize that adding pooling of internal data structures may be a big change, but I tried to structure the PR so that it's completely "opt in" via the new ClearRetainDatastructures() API.
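
With that API, the loop above changes only at the clear step. A sketch, assuming the semantics proposed in the PR (contents are cleared, but internal containers are retained for reuse on the next iteration):

postings := roaring.NewBitmap()
for _, thing := range things {
    // Opt-in: clear the bitmap's contents but keep its internal
    // datastructures allocated so the next iteration can reuse them
    // instead of re-allocating.
    postings.ClearRetainDatastructures()
    for _, row := range $MATCHING_ROWS {
        postings.Add(row)
    }
    postings.WriteTo(w)
}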

EDIT: I can't share profiling screenshots for obvious reasons, but I just went back and checked: the allocations removed by this PR represent ~56% of all allocations in our workload, so this will have a very substantial performance impact for us (and, I expect, for others who use the new API).
