triedb/pathdb: introduce lookup structure to optimize state access #30971

rjl493456442 · 2024-12-30T07:51:36Z

This pull request introduces a mechanism to improve state lookup efficiency in pathdb
by maintaining a lookup structure that eliminates unnecessary iteration over diff layers.

The core idea is to track a mutation history for each dirty state entry residing in the
diff layers. This history records the state roots of all layers in which the entry was modified,
sorted from oldest to newest.

During state lookup, this mutation history is scanned from newest to oldest to find the most
recent layer whose state root either matches the target root or is an ancestor of it (that is,
the target state is a descendant of that layer). This allows us to quickly identify the layer
containing the relevant data, avoiding the need to iterate through all diff layers from top to
bottom.
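
As a rough illustration of the lookup described above, here is a minimal sketch. It is not the code from this PR: `hash` stands in for `common.Hash`, and the `lookup` type, its fields, and the `tip` method are hypothetical names chosen for the example. It assumes a precomputed map from each layer root to the set of roots stacked on top of it.

```go
package main

// hash is a simplified stand-in for common.Hash (assumption for this sketch).
type hash string

// lookup is a hypothetical index over the diff layers.
type lookup struct {
	// history maps a state key to the roots of the layers that modified it,
	// ordered from oldest to newest.
	history map[string][]hash
	// descendants maps a layer root to the set of roots stacked on top of it.
	descendants map[hash]map[hash]struct{}
}

// isDescendant reports whether child is stacked (directly or not) on ancestor.
func (l *lookup) isDescendant(child, ancestor hash) bool {
	_, ok := l.descendants[ancestor][child]
	return ok
}

// tip returns the root of the most recent layer that modified key and is
// visible from target: either the target layer itself or one of its ancestors.
// It returns false if no diff layer on that branch touched the key.
func (l *lookup) tip(key string, target hash) (hash, bool) {
	list := l.history[key]
	for i := len(list) - 1; i >= 0; i-- {
		if list[i] == target || l.isDescendant(target, list[i]) {
			return list[i], true
		}
	}
	return "", false
}
```

When `tip` reports false, a real implementation would presumably fall back to the disk layer; that part is left out of the sketch.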

In addition, the overhead of a state lookup is constant regardless of how many diff layers are
retained in pathdb, which unlocks the potential to hold more diff layers.

Of course, maintaining this lookup structure introduces some overhead. For each state transition,
we need to:
(a) update the mutation records for the modified state entries, and
(b) remove stale mutation records associated with outdated layers.
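
The two maintenance steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the PR's implementation: `hash` stands in for `common.Hash`, and `layer`, `index`, `addLayer`, and `removeLayer` are hypothetical names.

```go
package main

// hash is a simplified stand-in for common.Hash (assumption for this sketch).
type hash string

// layer is a hypothetical diff layer: its state root plus the keys it modified.
type layer struct {
	root hash
	keys []string
}

// index holds the mutation history of every dirty state entry.
type index struct {
	history map[string][]hash // key -> layer roots, oldest to newest
}

// addLayer performs step (a): record the new layer's root for every modified key.
func (x *index) addLayer(l layer) {
	for _, k := range l.keys {
		x.history[k] = append(x.history[k], l.root)
	}
}

// removeLayer performs step (b): drop the records that point at a stale layer.
func (x *index) removeLayer(l layer) {
	for _, k := range l.keys {
		list := x.history[k]
		for i, r := range list {
			if r == l.root {
				list = append(list[:i], list[i+1:]...)
				break
			}
		}
		if len(list) == 0 {
			delete(x.history, k) // no diff layer references this key anymore
		} else {
			x.history[k] = list
		}
	}
}
```

Both operations touch only the keys modified by the layer in question, so their cost scales with the size of one state transition rather than with the number of retained layers.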

On our benchmark machine, this adds around 1ms of overhead, which is acceptable.

rjl493456442 · 2024-12-30T08:04:18Z

With this PR (on top of #30661), the block execution performance matches the current master branch.

The state access (storage, account) is slightly faster. The triedb commit is about 2ms slower due to the lookup overhead.

Bench07: PR
Bench08: Master

Without this PR, comparing #30661 against master, state access is significantly slower.

MariusVanDerWijden · 2025-05-19T09:46:54Z

triedb/pathdb/lookup.go

+						if i == 0 {
+							list = list[1:]
+							if cap(list) > 1024 {
+								list = append(make([]common.Hash, 0, len(list)), list...)


If we have 1024 non-finalized diff layers floating around, we would hit this every time, right?
I get the idea of pruning away some unnecessary memory if we ever hit a lot of parallel forks. I guess it's not really a big deal either way.

> If we have 1024 non-finalized diff layers floating around

The number of diff layers is defined as 128, regardless of whether the associated state is finalized.
That said, this PR unlocks the potential to pile up an arbitrary number of diff layers
on top (if there is enough memory to host that many diff layers). In such cases, we need to adjust
this constant.

The main motivation for this additional check is that a frequently updated storage slot
might be modified in every block, so its underlying slice might be held forever. This mechanism just
tries to make sure the dangling slices won't be referenced forever.
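
To make the point about the retained backing array concrete, here is a small sketch modeled on the quoted diff. It assumes a simplified `hash` type in place of `common.Hash`; the function name `dropOldest` and the constant name `maxRetainedCap` are hypothetical.

```go
package main

// hash is a simplified stand-in for common.Hash (assumption for this sketch).
type hash string

// maxRetainedCap mirrors the 1024 threshold in the quoted diff.
const maxRetainedCap = 1024

// dropOldest removes the first element of list. Re-slicing with list[1:]
// keeps the original backing array alive, so when the remaining capacity is
// still large the live elements are copied into a right-sized slice, which
// lets the old array be garbage collected.
func dropOldest(list []hash) []hash {
	list = list[1:]
	if cap(list) > maxRetainedCap {
		list = append(make([]hash, 0, len(list)), list...)
	}
	return list
}
```

The copy is only paid when the capacity exceeds the threshold, so short lists keep the cheap re-slice path.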

MariusVanDerWijden · 2025-05-19T10:55:09Z

Generally looks good, the only thing I stumbled upon was that we are storing all descendants in the map instead of just pointers to the next descendant. I think the map approach is probably significantly faster for the lookup compared to the tree/pointer based approach

rjl493456442 · 2025-05-27T05:27:43Z

The advantage of maintaining the ancestor-descendant relationship is that it allows O(1) queries to determine whether a is a descendant of b. This type of query is extremely frequent during state retrieval.

The downside is that for each new layer, it must recursively traverse all ancestors up to the disk layer (up to 128 layers). However, this cost occurs once per block and is entirely acceptable.


rjl493456442 requested a review from holiman as a code owner December 30, 2024 07:51

rjl493456442 added the pbss-archive label Dec 30, 2024

rjl493456442 force-pushed the lookup branch from 267cf19 to 740f4ce Compare December 30, 2024 08:07

rjl493456442 mentioned this pull request Jan 2, 2025

triedb/pathdb: introduce lookup structure to optimize node query #30557

Closed

rjl493456442 force-pushed the lookup branch from 740f4ce to fe6c3a3 Compare January 6, 2025 07:07

rjl493456442 force-pushed the lookup branch from fe6c3a3 to abe4cb5 Compare January 14, 2025 02:41

fjl assigned zsfelfoldi and fjl Jan 14, 2025

rjl493456442 added this to the 1.15.1 milestone Feb 5, 2025

rjl493456442 force-pushed the lookup branch 2 times, most recently from deb27b6 to 554727e Compare February 10, 2025 03:18

fjl modified the milestones: 1.15.1, 1.15.2, 1.15.3 Feb 13, 2025

rjl493456442 force-pushed the lookup branch 2 times, most recently from 554727e to 07aadaf Compare February 19, 2025 07:41

fjl modified the milestones: 1.15.3, 1.15.4 Feb 25, 2025

rjl493456442 added the post-prague label Feb 27, 2025

fjl modified the milestones: 1.15.4, 1.15.5, 1.15.6 Mar 1, 2025

rjl493456442 force-pushed the lookup branch from 895a918 to 3caf660 Compare April 6, 2025 07:16

rjl493456442 force-pushed the lookup branch 3 times, most recently from d88a197 to d7b5411 Compare May 15, 2025 12:31

triedb/pathdb: introduce lookup structure to optimize state access

350a82c

rjl493456442 force-pushed the lookup branch from d7b5411 to 90218fd Compare May 16, 2025 10:35

triedb/pathdb: remove sync pool

c02137b

rjl493456442 force-pushed the lookup branch from 90218fd to c02137b Compare May 16, 2025 11:33

rjl493456442 assigned MariusVanDerWijden May 16, 2025

triedb/pathdb: update func description

602890b

MariusVanDerWijden reviewed May 19, 2025

triedb/pathdb: update

8f9c277

triedb/pathdb: update comment

79617ee

MariusVanDerWijden reviewed May 27, 2025

triedb/pathdb/layertree.go (resolved)

rjl493456442 commented May 27, 2025

triedb/pathdb/lookup.go (outdated, resolved)

rjl493456442 commented May 27, 2025

triedb/pathdb/lookup.go (resolved)

rjl493456442 added 2 commits May 27, 2025 21:24

triedb/pathdb: define removeFromList function

285f789

triedb/pathdb: use merged storage key

81ed563

fjl approved these changes May 28, 2025

fjl merged commit 8b9f2d4 into ethereum:master May 28, 2025
3 of 4 checks passed

fjl added this to the 1.15.12 milestone May 28, 2025

rjl493456442 mentioned this pull request Jun 18, 2025

Expose a config option that allows users to set the number of memory layers to retain #32059

Open

2 tasks

BrewTestBot mentioned this pull request Jun 26, 2025

ethereum 1.16.0 Homebrew/homebrew-core#228279

Closed
