Skip to content

Conversation

@dwforbes
Copy link

@dwforbes dwforbes commented Jun 6, 2017

kh_put_##name has an early check for an initial empty bucket that negatively impacts significant use, while only benefiting unrealistically simplistic benchmarks (e.g. inserting a single item).

As an example, for k-nucleotide, with this revision on an arbitrary test machine the entire process completes in ~3.3s real time, with ~9.5s of user time. Prior to this revision it completes in ~3.9s, with ~11.0s of user time.

dwforbes added 2 commits June 6, 2017 10:34
Increases flag memory usage by 300%, while significantly streamlining
logic and improving performance
@trapexit
Copy link

trapexit commented Jun 6, 2017

Do you mean "decreases flag memory usage"?

@dwforbes
Copy link
Author

dwforbes commented Jun 6, 2017

The first commit -- 9b12aa4 -- was the only one I intended to get captured in the pull request, in that case removing an unnecessary extra conditional that adds overhead (but was seemingly added as an optimization), discovered during some quick analysis regarding the performance of C in the nucleotide benchmark game.

It is a change that can only be positive, even for projects that might have poorly considered dependencies on internals.

The second commit -- 348601f -- is a potentially breaking change that reduces the amount of bit packing, increasing the memory footprint for flags (albeit still a tiny overhead for all but the most enormous of hash tables), however it has a dramatic benefit to performance on all architectures because it removes three expensive bit shifts per loop evaluation. It can break uses that might have a direct hard-fixed dependencies on the structure of flags, for instance (though any that used the macros provided would be fine), so it was more an example of cost-benefit analysis of memory versus performance.

@larseggert
Copy link

Both of these patches are great additions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants