-
Notifications
You must be signed in to change notification settings - Fork 33
perf: speed up iter, range, scan methods with Rc<Node> #356
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
|
|
|
FYI @frankdavid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this PR supposed to improve performance? It doesn't seem like we are cloning node
(in which case using Rc<Node<K>>
could improve performance).
you're right -- at first glance using but my guess is the improvement comes from avoiding repeated moves of large value. as benchmarks show this change provides a measurable impact:
updated github PR description with this explanation. does it make sense? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR improves iteration performance for BTreeMap (and BTreeSet benchmarks) by switching from moving entire Node structs to storing nodes behind reference‐counted pointers (Rc<Node>). Key changes include modifying the Cursor enum in src/btreemap/iter.rs to wrap nodes in Rc, and updating numerous iteration methods to create Rc instances, along with adjusting benchmark values to reflect performance improvements.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
src/btreemap/iter.rs | Updated Cursor and related iteration logic to wrap nodes in Rc. |
benchmarks/btreeset/canbench_results.yml | Benchmark instruction counts adjusted to reflect performance changes. |
benchmarks/btreemap/canbench_results.yml | Benchmark instruction counts adjusted to reflect performance changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation!
This PR replaces the use of reference-counted `Rc<Node>` pointers with unique `Box<Node>` allocations in the `BTreeMap` iterator to reduce overhead. btreemap ``` instructions: status: Improvements detected 🟢 counts: [total 285 | regressed 0 | improved 9 | new 0 | unchanged 276] change: [max 0 | p75 -97.33K | median -147.22K | p25 -481.26K | min -5.34M] change %: [max 0.00% | p75 -0.02% | median -0.03% | p25 -0.06% | min -3.25%] ``` btreeset ``` instructions: status: Improvements detected 🟢 counts: [total 100 | regressed 0 | improved 2 | new 0 | unchanged 98] change: [max +625.36K | p75 -288 | median -27.44K | p25 -80.94K | min -377.91K] change %: [max +0.05% | p75 -0.07% | median -0.30% | p25 -1.17% | min -2.35%] ``` Previous change #356 was adding `Rc<Node>` (vs simple moving) which added some improvements as well as regressions.
This PR improves iteration performance by storing nodes as Rc<Node>.
The change is inspired by the Entry API PoC in #251, but applied separately to avoid breaking changes.
btreeset:
btreemap:
Most of the improved benchmarks related to
iter
,range
,scan
methods.Some regressions are small and limited to benchmarks not involving traversal.
UPD(from comment):
At first glance using
Rc<Node<K>>
should not help since we don't explicitly cloneNode
anywhere.The improvement comes from avoiding repeated moves of large value.
Node
is fairly big (multiple fields and vectors) comparing toRc
, so moving it around during iteration adds overhead.Switching to
Rc
makes those moves much cheaper, it's just a pointer copy.