Description
Background
The ValidityWindow component is a critical part of our system, with ChainIndex being one of its dependencies that populates its state. Inconsistencies in ValidityWindow state have led to subtle bugs that are difficult to diagnose.
Several months ago, we implemented a Syncer that helps transition to normal operation faster by fetching missing (historical) blocks from peers to build the ValidityWindow. However, these blocks weren't saved to disk, which resulted in inconsistent state after node restart. Had we saved the historical blocks, this issue would've been avoided.
Current Problem
Currently, there's a disconnect between accepting blocks into ValidityWindow and persisting them to disk:
Block accepted -> accept into ValidityWindow -> save to disk
Because these are separate steps, the API is easy to misuse: one step can be implemented without the other, leading to state inconsistencies between memory and disk.
Proposed Solution
Make ValidityWindow handle the entire process: when a block is accepted into ValidityWindow, it should automatically save to disk without requiring additional calls from the client code. This approach would:
- Simplify the API
- Eliminate differences between memory state and disk state
- Make it impossible to forget saving to disk
- Improve ease of use for ValidityWindow users
The key concept is that having a block in ValidityWindow and saving it to disk should be a single atomic operation: you can't do one without the other.
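A minimal sketch of what a single entry point could look like, written in Go. Every name here (`Block`, `BlockStore`, `MemStore`, `Accept`, `Contains`) is hypothetical and chosen for illustration, not taken from the existing code. The sketch persists first and updates memory only on success, so the in-memory state never gets ahead of the disk; it does not attempt crash-safe atomicity.

```go
package main

import (
	"errors"
	"fmt"
)

// Block is a hypothetical minimal block; the real type carries far more.
type Block struct {
	Height uint64
	ID     string
}

// BlockStore is an assumed persistence interface, not the project's actual API.
type BlockStore interface {
	SaveBlock(b Block) error
}

// ErrDisk simulates a write failure in this sketch.
var ErrDisk = errors.New("simulated disk failure")

// MemStore stands in for on-disk storage so the example is self-contained.
type MemStore struct {
	Blocks map[uint64]Block
	Fail   bool // when true, SaveBlock returns ErrDisk
}

func NewMemStore() *MemStore {
	return &MemStore{Blocks: make(map[uint64]Block)}
}

func (s *MemStore) SaveBlock(b Block) error {
	if s.Fail {
		return ErrDisk
	}
	s.Blocks[b.Height] = b
	return nil
}

// ValidityWindow owns both the in-memory window and the persistence step.
type ValidityWindow struct {
	store BlockStore
	seen  map[uint64]Block
}

func NewValidityWindow(store BlockStore) *ValidityWindow {
	return &ValidityWindow{store: store, seen: make(map[uint64]Block)}
}

// Accept persists the block and only then records it in memory.
// If persistence fails, the in-memory window is left untouched.
func (w *ValidityWindow) Accept(b Block) error {
	if err := w.store.SaveBlock(b); err != nil {
		return fmt.Errorf("persist block %d: %w", b.Height, err)
	}
	w.seen[b.Height] = b
	return nil
}

// Contains reports whether a block at the given height is in the window.
func (w *ValidityWindow) Contains(height uint64) bool {
	_, ok := w.seen[height]
	return ok
}
```

With this shape, client code calls `Accept` once and has no separate save call to forget. Saving before updating memory also matches the current consensus ordering, where the block is saved before the validity window is notified.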
Implementation Considerations
- During consensus, we currently need to save the block prior to notifying the validity window. We need to assess if the proposed change would affect this requirement.
- We should ensure ValidityWindow can be responsible for saving historical blocks.
- The relationship between ValidityWindow and block persistence needs to be made explicit to prevent future implementation errors.
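To illustrate the historical-block consideration, here is a rough Go sketch of a window that is rebuilt from disk on startup. It assumes that blocks fetched by the Syncer and blocks accepted live both go through the same `Accept` path. The types `DiskStore` and `Window` and the function `OpenWindow` are hypothetical, and the map-backed store only imitates a disk that survives a restart.

```go
package main

// Block is a hypothetical minimal block used only in this sketch.
type Block struct {
	Height uint64
}

// DiskStore imitates persistent storage: it outlives any single Window.
type DiskStore struct {
	blocks map[uint64]Block
}

func NewDiskStore() *DiskStore {
	return &DiskStore{blocks: make(map[uint64]Block)}
}

// Save writes a block to the simulated disk.
func (d *DiskStore) Save(b Block) {
	d.blocks[b.Height] = b
}

// Window keeps an in-memory view that mirrors what is on disk.
type Window struct {
	disk     *DiskStore
	inMemory map[uint64]Block
}

// OpenWindow rebuilds the in-memory state from whatever was persisted,
// which is what a node would do after a restart in this sketch.
func OpenWindow(d *DiskStore) *Window {
	w := &Window{disk: d, inMemory: make(map[uint64]Block)}
	for h, b := range d.blocks {
		w.inMemory[h] = b
	}
	return w
}

// Accept is the single entry point for live and historical (synced) blocks:
// both are saved to disk and recorded in memory together.
func (w *Window) Accept(b Block) {
	w.disk.Save(b)
	w.inMemory[b.Height] = b
}

// Has reports whether the window holds a block at the given height.
func (w *Window) Has(height uint64) bool {
	_, ok := w.inMemory[height]
	return ok
}

// Len returns the number of blocks currently in the window.
func (w *Window) Len() int {
	return len(w.inMemory)
}
```

Because historical blocks are persisted through the same call, a window opened after a restart sees them again, which is the gap described in the Background section.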
Next Steps
- Audit the execution path to understand potential side effects before starting this work
- Evaluate the feasibility of making ValidityWindow responsible for both accepting and persisting blocks
- Determine if any existing components would need modification to support this change
- Create a proof-of-concept implementation to validate the approach
Related PR
This builds on our ongoing efforts to reduce the surface area of ValidityWindow after [PR #2012](#2012), which aimed to ensure ValidityWindow is complete before starting normal operation.