You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[Variant] Reserve capacity beforehand during large object building (#7922)
# Which issue does this PR close?
- Part of #7896
# Rationale for this change
In #7896, we saw that inserting a
large amount of field names takes a long time -- in this case ~45s to
insert 2**24 field names. The bulk of this time is spent just allocating
the strings, but we also see quite a bit of time spent reallocating the
`IndexSet` that we're inserting into.
`with_field_names` is an optimization to declare the field names upfront
which avoids having to reallocate and rehash the entire `IndexSet`
during field name insertion. Using this method requires at least 2
string allocations for each field name -- 1 to declare field names
upfront and 1 to insert the actual field name during object building.
This PR adds a new method `with_field_name_capacity` which allows you to
reserve space to the metadata builder, without needing to allocate the
field names themselves upfront. In this case, we see a modest
performance improvement when inserting the field names during object
building
Before:
<img width="1512" height="829" alt="Screenshot 2025-07-13 at 12 08
43 PM"
src="https://github.com/user-attachments/assets/6ef0d9fe-1e08-4d3a-8f6b-703de550865c"
/>
After:
<img width="1512" height="805" alt="Screenshot 2025-07-13 at 12 08
55 PM"
src="https://github.com/user-attachments/assets/2faca4cb-0a51-441b-ab6c-5baa1dae84b3"
/>
0 commit comments