-
Notifications
You must be signed in to change notification settings - Fork 754
Implement serialize/deserialize for obsolete accounts #8414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Implement serialize/deserialize for obsolete accounts #8414
Conversation
accounts-db/src/accounts_db.rs
Outdated
/// Returns the accounts that were marked obsolete as of the passed in slot | ||
/// or earlier. Returned data includes the slots that the accounts were marked | ||
/// obsolete at | ||
pub fn obsolete_accounts_at_slot(&self, slot: Slot) -> ObsoleteAccounts { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had to create a new API for serialization that also returns the slots that accounts are marked obsolete at (other use cases don't include the slot).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not saying you have to do this here, but when creating one-off pub functions like this, I've found that naming who is calling the fn, to be helpful.
For here, that'd be something like obsolete_accounts_for_snapshots()
.
The main purpose being to inform others they should not call this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update
bank_snapshots.into_iter().next_back() | ||
} | ||
|
||
pub fn write_obsolete_accounts_to_snapshot( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might look a little bit strange at first: The accounts are serialized with write_obsolete_accounts_to_snapshot and deserialized with deserialize_obsolete_accounts.
This is because the translation they are handling is as follows:
AccountStorageEntry -> SerdesObsoleteAccounts -> file -> SerdesObsoleteAccounts
serialize_obsolete_accounts/deserialize_obsolete_accounts handles SerdesObsoleteAccounts -> file -> SerdesObsoleteAccounts, which are matching functions
Another approach is separating collect_obsolete_accounts and serialize at the caller, but that bled the internals of SerdesObsoleteAccounts into core which I want to avoid.
Combining write_obsolete_accounts_to_snapshot and serialize_obsolete_accounts and just calling it serialize_obsolete_accounts made for difficult unit test writing.
Ok(consumed_size) | ||
} | ||
|
||
#[allow(dead_code)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function will be used in the next PR.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #8414 +/- ##
========================================
Coverage 83.2% 83.2%
========================================
Files 846 847 +1
Lines 369251 369533 +282
========================================
+ Hits 307229 307621 +392
+ Misses 62022 61912 -110 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just over the serde types, which we need to modify. I'll do another pass to look at the actual serialization and deserialization code.
accounts-db/src/accounts_db.rs
Outdated
/// Returns the accounts that were marked obsolete as of the passed in slot | ||
/// or earlier. Returned data includes the slots that the accounts were marked | ||
/// obsolete at | ||
pub fn obsolete_accounts_at_slot(&self, slot: Slot) -> ObsoleteAccounts { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not saying you have to do this here, but when creating one-off pub functions like this, I've found that naming who is calling the fn, to be helpful.
For here, that'd be something like obsolete_accounts_for_snapshots()
.
The main purpose being to inform others they should not call this function.
#[derive(Debug, Default, Serialize, Deserialize)] | ||
pub(crate) struct SerdeObsoleteAccounts { | ||
/// The ID of the associated account file. Used for verification to ensure the restored | ||
/// obsolete accounts correspond to the correct account file | ||
pub id: SerializedAccountsFileId, | ||
/// The number of obsolete bytes in the storage. These bytes are removed during archive | ||
/// serialization/deserialization but are present when restoring from directories. This value | ||
/// is used to validate the size when creating the accounts file. | ||
pub bytes: u64, | ||
/// A list of accounts that are obsolete in the storage being restored. | ||
pub accounts: ObsoleteAccounts, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here's where we need to add Serialize and Deserialize. Plus the frozen ABI stuff.
e.g.
#[cfg_attr(
feature = "frozen-abi",
derive(AbiExample),
frozen_abi(digest = "TODO")
)]
#[derive(Debug, Clone, PartialEq, Default, Serialize, Deserialize)]
struct SerdeObsoleteAccounts{
id: SerializedAccountsFileId,
bytes: u64,
accounts: Vec<(Offset, usize, Slot)>
}
Note, we could leave the usize
in the obsolete accounts tuple, or change to a u64. Either is probably fine (I think the mapping to u64 would end up being a noop once compiled?).
In either case, we'll want to set the bincode options to use fixed int encoded.
e.g.:
bincode::DefaultOptions::new().with_fixint_encoding()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't implemented the abi stuff yet. I'll take a look tomorrow.
/// is used to validate the size when creating the accounts file. | ||
pub bytes: u64, | ||
/// A list of accounts that are obsolete in the storage being restored. | ||
pub accounts: ObsoleteAccounts, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use Vec<(Offset, usize, slot)> directly here and remove Serialize/Deserialize from ObsoleteAccounts.
runtime/src/snapshot_utils.rs
Outdated
let obsolete_accounts_file = fs::File::create(&obsolete_accounts_path)?; | ||
let mut file_stream = BufWriter::new(obsolete_accounts_file); | ||
|
||
for (slot, obsolete_accounts) in obsolete_accounts_map { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hashmap has non-deterministic order.
how about sort before serializing to ensure deterministic snapshot files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not recommend this. We have many hashmaps in the snapshot already, and we don't sort those.
But we should not be serializing item by item. There are serde helpers for serializing the whole hashmap. See solana_runtime::serde_snapshot::utils::serialize_iter_as_seq()
et al.
overall LGTM. The main fix is brook's comment for Ser/Der type boundary. And a minor suggestion for deterministic order for reproducible snapshots. |
accounts-db/src/lib.rs
Outdated
pub use { | ||
buffered_reader::large_file_buf_reader, file_io::validate_memlock_limit_for_disk_io, | ||
obsolete_accounts::ObsoleteAccounts, | ||
obsolete_accounts::{ObsoleteAccounts, ObsoleteAccountItem}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had to make this public for access in storage.rs
|
||
#[derive(Debug, Clone, PartialEq)] | ||
struct ObsoleteAccountItem { | ||
pub struct ObsoleteAccountItem { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, public for access in storage.rs
runtime/src/serde_snapshot/tests.rs
Outdated
#[test_case(0, 0)] | ||
#[test_case(1, 0)] | ||
#[test_case(10, 15)] | ||
fn test_serialize_and_deserialize_obsolete_accounts( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved this to test in the latest review.
runtime/src/serde_snapshot.rs
Outdated
(serializable_bank, serializable_accounts_db, extra_fields).serialize(serializer) | ||
} | ||
|
||
pub(crate) fn serialize_obsolete_accounts<W>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really clear where the right place for this to live is, the ser/des functions are spread all over the place. Any feedback?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... it's not great... We could create a new file runtime/src/serde_snapshot/obsolete_accounts.rs
and put everything in there. The Serde struct, the serialize/deserialize functions, and the tests.
I kind of like that. Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, makes sense.
runtime/src/serde_snapshot.rs
Outdated
let bincode = bincode::DefaultOptions::new().with_fixint_encoding(); | ||
bincode.serialize_into( | ||
stream, | ||
&utils::serialize_iter_as_tuple(obsolete_accounts_map.iter()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to serialize as tuple
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just looked at the serialize_status_cache()
, and I think we can pass obsoelte_accounts_map
to serialize_into()
and it'll just work?
&utils::serialize_iter_as_tuple(obsolete_accounts_map.iter()), | |
obsolete_accounts_map, |
Minimally clippy is happy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If clippy is happy, it probably works!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried it out, it doesn't work:
It is because it is being serialized as a hashmap and deserialized as a dashmap.
I could use a dashmap on the serialization side, but dashmaps do not suppot serialize/deseralize, so it would need something like serialize_iter_as_tuple anyways in that situation.
Dashmap is definitely needed on the deserialize size as performance testing found. It could be deseralized as a hashmap and then converted to a dashmap, but that seems less efficient.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a serde
feature for DashMap. I haven't used it before. Maybe it'll work here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But otherwise, yeah I agree with you. Probably don't want to do extra allocations/conversions unnecessarily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passes testing. I will check performance in a bit
Do we need the slots info to ser/des, i.e. Vec<(Offset, usize, slot)> --> Vec<(Offset, usize)>? |
runtime/src/serde_snapshot/tests.rs
Outdated
#[test_case(10, 15)] | ||
fn test_serialize_and_deserialize_obsolete_accounts( | ||
num_storages: u64, | ||
num_accounts_per_storage: usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wdyt
num_accounts_per_storage: usize, | |
num_obsolete_accounts_per_storage: usize, |
runtime/src/serde_snapshot/tests.rs
Outdated
// Deserialize the obsolete accounts | ||
let cursor = Cursor::new(buf.as_slice()); | ||
let mut reader = BufReader::new(cursor); | ||
let deserialized_accounts = deserialize_obsolete_accounts(&mut reader).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wdyt
let deserialized_accounts = deserialize_obsolete_accounts(&mut reader).unwrap(); | |
let deserialized_obsolete_accounts = deserialize_obsolete_accounts(&mut reader).unwrap(); |
runtime/src/serde_snapshot.rs
Outdated
obsolete_accounts.id, | ||
)); | ||
} | ||
let accounts = ObsoleteAccounts { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wdyt
let accounts = ObsoleteAccounts { | |
let obsolete_accounts = ObsoleteAccounts { |
runtime/src/serde_snapshot.rs
Outdated
let bincode = bincode::DefaultOptions::new().with_fixint_encoding(); | ||
bincode.serialize_into( | ||
stream, | ||
&utils::serialize_iter_as_tuple(obsolete_accounts_map.iter()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just looked at the serialize_status_cache()
, and I think we can pass obsoelte_accounts_map
to serialize_into()
and it'll just work?
&utils::serialize_iter_as_tuple(obsolete_accounts_map.iter()), | |
obsolete_accounts_map, |
Minimally clippy is happy.
let consumed_size = file_stream.stream_position()?; | ||
Ok(consumed_size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to check the size and set some limit, similar to serialize_snapshot_data_file_capped()
.
Likely the same on the deserialize side too (arguably more important on the deserialize side).
let mut snapshot_storages = Vec::new(); | ||
for i in 0..num_storages { | ||
let storage = Arc::new(AccountStorageEntry::new( | ||
&PathBuf::new(), | ||
i, // Incrementing slot | ||
i as u32, // Incrementing id | ||
1024, | ||
AccountsFileProvider::AppendVec, | ||
StorageAccess::File, | ||
)); | ||
snapshot_storages.push(storage); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let snapshot_storages: Vec<_> = (0..num_storages).map(|i| {
// create storage
})
.collect();
- Minor cleanup - Moved obsolete_accounts to own module with tests
- Added Type SerdeObsoleteAccountsMap - Consolidated some function under SerdeObsoleteAccountsMap - Simplified serdes functions. Added serdes_snapshot::serialize_into
Changing to a dashmap ended up being a big change:
Not sure they best way to review this. Might be best to review the entire thing. |
frozen_abi(digest = "12qimMBghYs9dL4nhws7Xe1B5MXWBV75VTMTGLHxevYE") | ||
)] | ||
#[derive(Serialize, Deserialize, Debug)] | ||
pub(crate) struct SerdeObsoleteAccountsMap { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the new type added, and the frozen_abi
|
||
impl SerdeObsoleteAccountsMap { | ||
/// Creates a new `SerdeObsoleteAccountsMap` from a list of storage entries and a snapshot slot. | ||
pub(crate) fn new_from_storages( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this new and pushed the complexity of creating it into here.
} | ||
|
||
/// Removes and returns the obsolete accounts data for a given slot. | ||
pub(crate) fn remove(&self, slot: &Slot) -> Option<(ObsoleteAccounts, AccountsFileId, usize)> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the transformation to a more general type here as well.
runtime/src/serde_snapshot.rs
Outdated
.deserialize_from::<R, T>(reader) | ||
} | ||
|
||
pub fn serialize_into<W, T>(writer: W, value: &T) -> bincode::Result<()> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this function and made the above one public.
It likely makes sense to have the with_limit parameter be passed in, but I think that would be a separate refactor.
let updated_len = current_len + obsolete_accounts.bytes as usize; | ||
let id = id as SerializedAccountsFileId; | ||
if obsolete_accounts.id != id { | ||
let updated_len = current_len + obsolete_accounts.2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more clear if you do a diff to the last two commits, but there was a translation from the SerdeObsoleteAccounts to ObsoleteAccounts performed in this function. I moved the translation into obsolete_accounts.rs.
Generally I would like to avoid the .0, .1, .2 calls here, but the other options I see are
- passing it in as 3 individual options. This seems very confusing as they are tied together.
- Adding a struct just for this function call...
snapshot_slot: Slot, | ||
) -> Self { | ||
let map = DashMap::with_capacity(snapshot_storages.len()); | ||
snapshot_storages.par_iter().for_each(|storage| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed this to par_iter. Providing large benefits:
without par_iter
./agave-validator.log:[2025-10-14T18:44:49.214644854Z INFO solana_core::snapshot_packager_service] Saving obsolete accounts...
./agave-validator.log:[2025-10-14T18:44:49.497557413Z INFO solana_core::snapshot_packager_service] Saving obsolete accounts... Done in 282.908062ms
With par_iter
./agave-validator.log:[2025-10-14T19:36:40.040171417Z INFO solana_core::snapshot_packager_service] Saving obsolete accounts...
./agave-validator.log:[2025-10-14T19:36:40.141304000Z INFO solana_core::snapshot_packager_service] Saving obsolete accounts... Done in 101.127826ms
Problem
Summary of Changes
Note: After this change there will be one more PR to hook up save/restore of obsolete accounts, remove the legacy completers, rev the fastboot version and remove validator constraints.
Fixes #