Move processing cache out of DA #5420

dapplion · 2024-03-17T14:38:25Z

Current unstable has a processing cache that tracks both blocks and blobs while they are being processed. To be specific, in the case of blobs "processing" means the time to KZG verify the proofs and insert them in the data availability overflow cache:

self.data_availability_checker
    .notify_rpc_blobs(slot, block_root, &blobs);
// < processing starts here
check_slashable_header(blobs);
let availability = self.data_availability_checker.put_rpc_blobs(block_root, blobs)?;
// < processing ends here
let r = self.process_availability(slot, availability).await;
self.remove_notified(&block_root, r)

The purpose of the cache is to:

Satisfy recent requirement (late '23) to serve blocks over RPC that are gossip verified but not fully verified / imported
Prevent duplicate requests on (now deleted) delayed lookup Remove delayed lookups #4992

Let's address each one for blocks and blobs

1 / blocks

Block processing can take a few hundred milliseconds due to execution validation. There must exist some cache but does not need to be tied to DA.

1 / blobs

This cache will be useful if we get a blobs by root request for a blob that we have just received which happens to be in the few milliseconds KZG proof validation takes. I can be convinced otherwise, but this use-case does not justify the complexity.

2

After deleting delayed lookup logic this argument is mostly void. In rare conditions we may end up

Benefits

The AvailabilityView abstraction spills complexity that only affects the accumulators of the availability cache and block lookups also into the processing cache.

By making the processing cache not concerned with DA Lighthouse is a bit more maintainable and easy to reason about.

beacon_node/network/src/sync/block_lookups/single_block_lookup.rs

dapplion · 2024-03-17T14:42:14Z

beacon_node/beacon_chain/src/data_availability_checker/overflow_lru_cache.rs


+    /// Returns true if the block root is known, without altering the LRU ordering
+    pub fn has_block(&self, block_root: &Hash256) -> bool {
+        self.in_memory.peek(block_root).is_some() || self.store_keys.get(block_root).is_some()


@ethDreamer is this line right, i.e. should I also check the store keys?

I think so, the store keys are referencing the entries in the data availability cache that over overflowed to disk. So it's an extension of the availability cache really.

Side note - but once we've merged tree states we should consider getting rid of the overflow logic. It would be a nice complexity reduction.

realbigsean · 2024-03-17T15:53:55Z

overall I agree with your reasoning and support the change! This probably could/should have happened along with the removal of delayed lookup logic

dapplion · 2024-04-06T06:05:54Z

Sharing some notes justifying the change as is. We can address the de-duplication gadget on another PR

Lighthouse processing cache

Processing cache is tied to the AvailabilityView trait, complicating extension like Data Availability Sampling.

Do processing cache items need to implement AvailabilityView?
Is the processing cache really necessary?

The processing cache purposes are:

Prevent re-downloads of blocks and blobs (e.g. while block from gossip is doing EL verification, receive an attestation that triggers an unknown head lookup)
Make blocks and blobs that pass gossip validation available to ReqResp consumers

Gossip block journey

Receive gossip block
Spawn worker for process_gossip_block
- Run process_gossip_unverified_block
- Insert + check into duplicate_cache.check_and_insert(block_root)
- Run process_gossip_verified_block
  - Insert into processing_cache
  - Run process_block
    - Perform execution validation (SLOW)
    - Insert into data_availability_checker
      IF AVAILABLE
      - Evict from data_availability_checker
      - Import block into fork-choice
      - Add block + blobs to early_attester_cache
      - Evict from processing_cache
- Evict from duplicate_cache

Gossip blob journey

Receive gossip blob
Spawn worker for process_gossip_blob
- Run chain.verify_blob_sidecar_for_gossip
- Run process_gossip_verified_blob
  - Check if known to fork_choice_read_lock, if true ignore
  - Insert into processing_cache
  - Run KZG verification (~1ms)
  - Insert into data_availability_checker
    IF AVAILABLE
    - Evict from data_availability_checker
    - Import block into fork-choice (~4ms, *see below)
    - Add block + blobs to early_attester_cache
    - Evict from processing_cache

Unknown head attestation journey

Receive gossip attestation
Spawn worker for process_gossip_attestation
- Early, run verify_head_block_is_known to check against

        chain.canonical_head.fork_choice_read_lock()
        .get_block(&attestation.data.beacon_block_root)

IF UNKNOWN

.
- Inform sync send(SyncMessage::UnknownBlockHashFromAttestation())
- Schedule for re-process

ON SYNC

Receive UnknownBlockHashFromAttestation
Create a new_current_lookup for block_root
Check block_already_downloaded
- Checks self.da_checker.has_block(block_root)
  - Checks self.processing_cache.has_block(block_root)
Check blobs_already_downloaded
- Checks processing_cache with

let Some(processing_components) = da_checker.processing_cache.get(block_root)
else { return MissingBlobs::fetch_all_for_block(block_root) };
da_checker.get_missing_blob_ids(block_root, processing_components)

Is the processing cache necessary for blobs?

For early ReqResp serving

No

The processing cache is not checked for ReqResp blob serving. And that's okay, blobs are inserted to the data_availability_checker almost immediately after being inserted into the processing cache

For de-duplication

No

If we rely only on data_availability_checker + fork-choice we lose knowledge of the blob during two instances:

During KZG verification (~1 ms)
During block import into fork-choice (~4 ms, *see below)

beacon_fork_choice_process_block_seconds metrics, most runs are instant with 1 run per epoch lasting 100ms

Which matches the long term average of 4ms seen below = 100ms/32

In the unlikely case that sync attempts to download a blob during a slow run of fork-choice block import, the worst case is to download a set of blobs.

Is the processing cache necessary for blocks?

For early ReqResp serving

Yes

duplicate_cache does not hold the block itself but only the root.

For de-duplication

No*

Blocks enjoy another cache, the duplicate_cache with the properties:

Blocks are inserted into duplicate_cache slightly before than the processing_cache
When evicted from the duplicate_cache there three outcomes:
- Block was available and got imported into fork_choice
- Block was not available and got inserted into data_availability_checker
- There was an error

So duplicate_cache + data_availability_checker cover all paths of processing_cache with the exception of block import into fork-choice (same as blobs)

…n-da-processing-cache

…ng blob id calculations

realbigsean · 2024-04-09T19:35:53Z

removed a TODO that was outdated, removed the processing cache file, and updated the missing blob ids calculation to consider if we're in deneb (keeps us from requesting blobs pre-deneb)

a6cd314

realbigsean · 2024-04-09T19:36:55Z

@Mergifyio queue

mergify · 2024-04-09T19:37:19Z

queue

🛑 The pull request has been removed from the queue `default`

Pull request #5420 has been dequeued by a dequeue command.

You can take a look at Queue: Embarked in merge queue check runs for more details.

In case of a failure due to a flaky test, you should first retrigger the CI.
Then, re-embark the pull request into the merge queue by posting the comment
@mergifyio refresh on the pull request.

realbigsean · 2024-04-09T20:38:02Z

@Mergifyio requeue

mergify · 2024-04-09T20:38:30Z

requeue

❌ This pull request head commit has not been previously disembarked from queue.

realbigsean · 2024-04-09T20:38:44Z

@Mergifyio dequeue

mergify · 2024-04-09T20:38:51Z

dequeue

✅ The pull request has been removed from the queue `default`

realbigsean · 2024-04-09T20:39:05Z

@Mergifyio requeue

mergify · 2024-04-09T20:39:11Z

requeue

✅ This pull request will be re-embarked automatically

The followup queue command will be automatically executed to re-embark the pull request

mergify · 2024-04-09T20:39:12Z

queue

🛑 The pull request has been removed from the queue `default`

The queue conditions cannot be satisfied due to failing checks.

You can take a look at Queue: Embarked in merge queue check runs for more details.

In case of a failure due to a flaky test, you should first retrigger the CI.
Then, re-embark the pull request into the merge queue by posting the comment
@mergifyio refresh on the pull request.

jimmygchen · 2024-04-10T09:22:00Z

@mergify requeue

mergify · 2024-04-10T09:22:07Z

requeue

✅ This pull request will be re-embarked automatically

The followup queue command will be automatically executed to re-embark the pull request

mergify · 2024-04-10T09:22:08Z

queue

🛑 The pull request has been removed from the queue `default`

The queue conditions cannot be satisfied due to failing checks.

You can take a look at Queue: Embarked in merge queue check runs for more details.

In case of a failure due to a flaky test, you should first retrigger the CI.
Then, re-embark the pull request into the merge queue by posting the comment
@mergifyio refresh on the pull request.

…n-da-processing-cache

realbigsean · 2024-04-10T13:40:21Z

@Mergifyio queue

mergify · 2024-04-10T13:40:26Z

queue

🛑 The pull request has been removed from the queue `default`

The queue conditions cannot be satisfied due to failing checks.

You can take a look at Queue: Embarked in merge queue check runs for more details.

In case of a failure due to a flaky test, you should first retrigger the CI.
Then, re-embark the pull request into the merge queue by posting the comment
@mergifyio refresh on the pull request.

realbigsean · 2024-04-10T17:17:39Z

@Mergifyio requeue

mergify · 2024-04-10T17:17:44Z

requeue

✅ This pull request will be re-embarked automatically

The followup queue command will be automatically executed to re-embark the pull request

mergify · 2024-04-10T17:17:45Z

queue

✅ The pull request has been merged automatically

The pull request has been merged automatically at 30dc260

Move processing cache out of DA

0c86593

dapplion force-pushed the non-da-processing-cache branch from 3c5af19 to 0c86593 Compare March 17, 2024 14:39

dapplion commented Mar 17, 2024

View reviewed changes

beacon_node/network/src/sync/block_lookups/single_block_lookup.rs Show resolved Hide resolved

dapplion commented Mar 17, 2024

View reviewed changes

jimmygchen mentioned this pull request Mar 21, 2024

Remove DataAvailabilityView trait from ChildComponents #5421

Merged

jimmygchen added the work-in-progress PR is a work-in-progress label Mar 21, 2024

Merge branch 'sigp/unstable' into non-da-processing-cach

75c807f

dapplion requested review from jimmygchen and realbigsean April 6, 2024 06:06

realbigsean added 2 commits April 9, 2024 15:07

Merge branch 'unstable' of https://github.com/sigp/lighthouse into no…

0304abd

…n-da-processing-cache

remove unused file, remove outdated TODO, add is_deneb check to missi…

a6cd314

…ng blob id calculations

realbigsean approved these changes Apr 9, 2024

View reviewed changes

realbigsean added ready-for-merge This PR is ready to merge. and removed work-in-progress PR is a work-in-progress labels Apr 9, 2024

realbigsean mentioned this pull request Apr 9, 2024

Remove availability view trait #5544

Merged

mergify bot added a commit that referenced this pull request Apr 10, 2024

Merge of #5420

d9ba41f

This was referenced Apr 10, 2024

merge queue: embarking unstable (b74da14) and [#5543 + #5420 + #5531] together #5547

Closed

Beta compiler fixes #5543

Merged

mergify bot mentioned this pull request Apr 10, 2024

Impl Ord on ForkName for ChainSpec usage #5531

Merged

mergify bot added a commit that referenced this pull request Apr 10, 2024

Merge of #5420

62d52c5

mergify bot mentioned this pull request Apr 10, 2024

merge queue: embarking unstable (b74da14), #5543 and #5420 together #5548

Closed

5 tasks

mergify bot added a commit that referenced this pull request Apr 10, 2024

Merge of #5420

f704614

This was referenced Apr 10, 2024

merge queue: embarking unstable (ced6538) and [#5420 + #5352 + #5534] together #5551

Closed

Skip calculation of boosted_relay_value when builder_boost_factor pro… #5352

Merged

Event-based block lookup tests #5534

Merged

mergify bot added a commit that referenced this pull request Apr 10, 2024

Merge of #5420

2d91e41

mergify bot mentioned this pull request Apr 10, 2024

merge queue: embarking unstable (ced6538) and #5420 together #5552

Closed

5 tasks

realbigsean added 2 commits April 10, 2024 09:36

Merge branch 'unstable' of https://github.com/sigp/lighthouse into no…

fcc95f8

…n-da-processing-cache

fix lints

321bd85

mergify bot merged commit 30dc260 into sigp:unstable Apr 10, 2024

dapplion deleted the non-da-processing-cache branch January 24, 2025 17:57

Move processing cache out of DA #5420

Move processing cache out of DA #5420

Uh oh!

Conversation

dapplion commented Mar 17, 2024

Benefits

Uh oh!

Uh oh!

dapplion Mar 17, 2024

Choose a reason for hiding this comment

Uh oh!

realbigsean Mar 17, 2024

Choose a reason for hiding this comment

Uh oh!

realbigsean commented Mar 17, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dapplion commented Apr 6, 2024

Lighthouse processing cache

Gossip block journey

Gossip blob journey

Unknown head attestation journey

Is the processing cache necessary for blobs?

Is the processing cache necessary for blocks?

Uh oh!

realbigsean commented Apr 9, 2024

Uh oh!

realbigsean commented Apr 9, 2024

Uh oh!

mergify bot commented Apr 9, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🛑 The pull request has been removed from the queue default

Uh oh!

realbigsean commented Apr 9, 2024

Uh oh!

mergify bot commented Apr 9, 2024

❌ This pull request head commit has not been previously disembarked from queue.

Uh oh!

realbigsean commented Apr 9, 2024

Uh oh!

mergify bot commented Apr 9, 2024

✅ The pull request has been removed from the queue default

Uh oh!

realbigsean commented Apr 9, 2024

Uh oh!

mergify bot commented Apr 9, 2024

✅ This pull request will be re-embarked automatically

Uh oh!

mergify bot commented Apr 9, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🛑 The pull request has been removed from the queue default

Uh oh!

jimmygchen commented Apr 10, 2024

Uh oh!

mergify bot commented Apr 10, 2024

✅ This pull request will be re-embarked automatically

Uh oh!

mergify bot commented Apr 10, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🛑 The pull request has been removed from the queue default

Uh oh!

realbigsean commented Apr 10, 2024

Uh oh!

mergify bot commented Apr 10, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🛑 The pull request has been removed from the queue default

Uh oh!

realbigsean commented Apr 10, 2024

Uh oh!

mergify bot commented Apr 10, 2024

✅ This pull request will be re-embarked automatically

Uh oh!

mergify bot commented Apr 10, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ The pull request has been merged automatically

Uh oh!

Reviewers

Assignees

Labels

Projects

realbigsean commented Mar 17, 2024 •

edited

Loading

mergify bot commented Apr 9, 2024 •

edited

Loading

🛑 The pull request has been removed from the queue `default`

✅ The pull request has been removed from the queue `default`

mergify bot commented Apr 9, 2024 •

edited

Loading

🛑 The pull request has been removed from the queue `default`

mergify bot commented Apr 10, 2024 •

edited

Loading

🛑 The pull request has been removed from the queue `default`

mergify bot commented Apr 10, 2024 •

edited

Loading

🛑 The pull request has been removed from the queue `default`

mergify bot commented Apr 10, 2024 •

edited

Loading