
Frequent Hash collision and redundant lookups in StorageService.updateCachedStacks() #8539

@MineCake147E

Description

How to reproduce the performance issue

  1. Install AE2 19.2.11 and Spark 1.10.124-neoforge-1.21.1 via Modrinth.
  2. Create a new Superflat world with The Void preset in creative mode.
  3. Set up an ME network with more than 32 ME Drives, all full of 1k ME Storage Components.
  4. Connect an ME Level Emitter with the filter set to Bone Meal and the threshold set to 1.
  5. Populate the ME network with many Netherite Hoes with random damage and random repair_cost values.
  6. Set up an apparatus that constantly exports and imports the Bone Meal, and put a Bone Meal into the ME network.
  7. Start the Spark profiler with /spark profile.
  8. See what happens.

Spark profile link

https://spark.lucko.me/pJo1dIk6o7

Which minecraft version are you using?

1.21

On which mod loaders does it happen?

NeoForge

Additional details

When I was profiling and debugging AE2, I noticed that some Netherite hoes share the same hash value.

this.hashCode = ItemStack.hashItemAndComponents(stack);

I dug into this ItemStack.hashItemAndComponents function and found that the vast majority of Reference2ObjectFunction<K, V>.hashCode() implementations only perform a simple hash calculation over each entry.
It is so simple that it causes many hash collisions among Netherite hoes that differ only slightly in damage and repair_cost.
Such a high collision probability causes many slow hash-map lookups, which then fall back to even slower, detailed ItemStack comparisons. In the Spark profile, the appeng.api.stacks.AEItemKey.equals() calls inside it.unimi.dsi.fastutil.objects.Object2LongOpenHashMap.getLong() show how frequent the hash collisions are.
Even without the hash collision issue, hashing the whole component map would be significantly slower than hashing only the component patches.
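To illustrate the collision mechanism, here is a minimal, standalone sketch. It uses java.util.Map's spec-defined hashCode() (the sum of per-entry hashes, where each entry hashes to key.hashCode() ^ value.hashCode()), which collides the same way a simple per-entry sum does; this is an analogy, not AE2's or fastutil's actual code.

```java
import java.util.Map;

public class HashCollisionDemo {
    public static void main(String[] args) {
        // Hypothetical stand-ins for two component patches: e.g. one hoe with
        // damage=4/repair_cost=8 and another with damage=8/repair_cost=4.
        Map<Integer, Integer> a = Map.of(1, 4, 2, 8);
        Map<Integer, Integer> b = Map.of(1, 8, 2, 4);
        // Entry hashes: (1^4)+(2^8) = 5+10 = 15 and (1^8)+(2^4) = 9+6 = 15.
        // When key and value bits don't overlap, XOR degrades to addition, so
        // any permutation of the same numbers collides.
        System.out.println(a.hashCode() == b.hashCode()); // true: both are 15
        System.out.println(a.equals(b));                  // false: distinct maps
    }
}
```

Every such collision forces the hash map to fall through to a full equals() comparison, which is exactly what the profile shows.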

I also noticed that appeng.me.service.StorageService.updateCachedStacks() has three loops: one notifying watchers of updates to currently available stacks, one notifying watchers of removed stacks, and a last one updating everything in cachedAvailableAmounts.

private void updateCachedStacks() {
    var time = System.nanoTime();
    try {
        cachedStacksNeedUpdate = false;
        cachedAvailableStacks.clear();
        storage.getAvailableStacks(cachedAvailableStacks);
        // clear() only clears the inner maps,
        // so ensure that the outer map gets cleaned up too
        cachedAvailableStacks.removeEmptySubmaps();
        // Post watcher update for currently available stacks
        for (var entry : cachedAvailableStacks) {
            var what = entry.getKey();
            var newAmount = entry.getLongValue();
            if (newAmount != cachedAvailableAmounts.getLong(what)) {
                postWatcherUpdate(what, newAmount);
            }
        }
        // Post watcher update for removed stacks
        for (var what : cachedAvailableAmounts.keySet()) {
            var newAmount = cachedAvailableStacks.get(what);
            if (newAmount == 0) {
                postWatcherUpdate(what, newAmount);
            }
        }
        // Update private amounts
        cachedAvailableAmounts.clear();
        for (var entry : cachedAvailableStacks) {
            cachedAvailableAmounts.put(entry.getKey(), entry.getLongValue());
        }
    } finally {
        inventoryRefreshStats.add(System.nanoTime() - time);
    }
}

cachedAvailableAmounts.put(entry.getKey(), entry.getLongValue()) returns the old value, so we don't need to look up cachedAvailableAmounts more than once.
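For instance, a standalone sketch of that pattern with java.util.HashMap and made-up names (not AE2's code):

```java
import java.util.HashMap;
import java.util.Map;

public class PutReturnDemo {
    public static void main(String[] args) {
        Map<String, Long> cachedAmounts = new HashMap<>();
        cachedAmounts.put("bone_meal", 3L);

        // put() returns the previous mapping, so one map operation both
        // refreshes the cached amount and tells us whether it changed --
        // no separate get()/getLong() lookup needed.
        long newAmount = 5L;
        Long previous = cachedAmounts.put("bone_meal", newAmount);
        long oldAmount = previous == null ? 0L : previous;
        if (newAmount != oldAmount) {
            System.out.println("changed: " + oldAmount + " -> " + newAmount); // 3 -> 5
        }
    }
}
```

With fastutil's primitive maps the null check disappears entirely: Object2LongMap.put(K, long) returns the map's default return value (typically 0) for absent keys, avoiding boxing as well.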

The second loop calls cachedAvailableStacks.get(what) for every single item that the StorageService observed in the ME storage in the previous tick. I would rather create an ObjectSet<AEKey> for this purpose and iterate only over the items that cachedAvailableStacks doesn't contain.
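A standalone sketch of that idea with java.util.HashSet and hypothetical names (fastutil's ObjectOpenHashSet would play the same role in AE2): collect the previous tick's key set, subtract the current keys, and only what disappeared remains.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class RemovedKeysDemo {
    public static void main(String[] args) {
        // Amounts observed last tick vs. this tick (made-up data).
        Map<String, Long> previous = Map.of("bone_meal", 5L, "netherite_hoe", 2L);
        Map<String, Long> current = Map.of("bone_meal", 4L);

        // Set difference: only keys that are no longer available survive,
        // so we never probe the current map for keys that are still present.
        Set<String> removed = new HashSet<>(previous.keySet());
        removed.removeAll(current.keySet());
        for (String what : removed) {
            System.out.println("removed: " + what); // notify watchers with amount 0
        }
    }
}
```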

So I forked AE2 and created an almost-PR-ready optimization prototype here.

  • Introduces a custom ItemStack hash calculation algorithm that
    • Outputs a 64-bit hash instead of a 32-bit one.
    • Satisfies the existing requirements, such as being oblivious to the registration order of the component patches' elements.
  • Separates the watcher notification logic entirely from StorageService.updateCachedStacks(), since it is never needed while interestManager.isEmpty() is true, and cachedStacksNeedUpdate only becomes true when interestManager.isEmpty() is true.
  • Optimizes the now-separate watcher notification logic (StorageService.notifyWatchers()) by leveraging an ObjectOpenHashSet<AEKey> cached in the StorageService instance.
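As an illustration of the first point, here is one hypothetical way to build a 64-bit, registration-order-oblivious hash (the fork's actual algorithm may well differ): mix each entry into 64 bits with a strong finalizer, then combine with addition, which is commutative and therefore order-insensitive.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OrderObliviousHash64 {
    // SplitMix64 finalizer, a widely used 64-bit bit mixer.
    static long mix64(long z) {
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    static long hashEntries(Map<String, Integer> components) {
        long h = 0;
        for (var e : components.entrySet()) {
            // Mix key and value together BEFORE combining, so swapped
            // key/value pairs no longer collide the way plain XOR sums do;
            // summing the mixed values keeps the result order-oblivious.
            h += mix64(((long) e.getKey().hashCode() << 32) ^ e.getValue().hashCode());
        }
        return h;
    }

    public static void main(String[] args) {
        var m1 = new LinkedHashMap<String, Integer>();
        m1.put("damage", 4);
        m1.put("repair_cost", 8);
        var m2 = new LinkedHashMap<String, Integer>();
        m2.put("repair_cost", 8);
        m2.put("damage", 4);
        System.out.println(hashEntries(m1) == hashEntries(m2)); // true: order-oblivious
        // The earlier collision pair now hashes differently (with overwhelming
        // probability), because key and value are mixed before combining.
        System.out.println(hashEntries(Map.of("damage", 4, "repair_cost", 8))
                != hashEntries(Map.of("damage", 8, "repair_cost", 4)));
    }
}
```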

The Spark profile after optimization: https://spark.lucko.me/90I4MugraO
