BloodStainedCrow

Improves #1189.

As discussed there, this adds a new function to ParallelIterator, const_length, which returns a value when the length of a ParallelIterator is known statically.

This function has a default implementation that returns None, so this is not a breaking change.
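A minimal sketch of why a defaulted method is non-breaking. This is not rayon's actual code: the trait `StaticLen`, the types `ArraySource` and `OpaqueSource`, and the helper `flattened_len` are hypothetical stand-ins, and the `Option<usize>` return type is an assumption (the description only says the default returns None). The helper shows one plausible use of a statically known inner length, namely computing an exact flattened size up front.

```rust
/// Hypothetical stand-in for a parallel iterator trait; not rayon's real trait.
trait StaticLen {
    /// Returns the length if it is known without running the iterator.
    /// The default body means existing implementors compile unchanged.
    /// (The `Option<usize>` return type is an assumption.)
    fn const_length(&self) -> Option<usize> {
        None
    }
}

/// A source whose length is fixed by its type, like an array.
struct ArraySource<const N: usize>;

impl<const N: usize> StaticLen for ArraySource<N> {
    fn const_length(&self) -> Option<usize> {
        Some(N)
    }
}

/// A source with no statically known length; it relies on the default.
struct OpaqueSource;

impl StaticLen for OpaqueSource {}

/// Exact number of items produced by flattening `outer_len` inner sources,
/// or `None` if the inner length is unknown or the product overflows.
fn flattened_len<T: StaticLen>(outer_len: usize, inner: &T) -> Option<usize> {
    inner.const_length().and_then(|n| outer_len.checked_mul(n))
}
```

Here `OpaqueSource` needs no changes to keep compiling, while `ArraySource` opts in by overriding the method; a caller can fall back to its existing path whenever it gets `None`.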

This change speeds up the algorithm in #1189 by a factor of ~2.5 and reduces peak memory usage by a factor of ~15 (see benchmark).

However, this PR does not fix the footgun of splitting unconditionally, which is the main reason the speedup is not higher.

BloodStainedCrow marked this pull request as ready for review on September 13, 2024, 16:48
@BloodStainedCrow
Author

Not part of this PR, but I also have an implementation that stops flat_map from splitting the inner iterator unconditionally, which would remove this footgun almost completely: it is about 400x faster on arrays (the same speed as flat_map_iter) and ~100x faster on iterators. However, it currently has a problem: if the outer iterator has fewer items than there are cores, not all threads are woken, so they never start work stealing.

@BloodStainedCrow
Author

@cuviper Sorry for the ping, but does this seem reasonable?
