Skip to content

Conversation

kskalski
Copy link

@kskalski kskalski commented Oct 15, 2025

Problem

MAX_GENESIS_ARCHIVE_UNPACKED_SIZE constant is defined in code that should mostly perform untar operation and it's way high-level / business logic related to be there.
And it's used only in test that could as well use / test for tigher value.

The constant is used across several crates, but most of those are in ledger / ledger-tool

Summary of Changes

Move to ledger/genesis_utils.rs

(test for hitting max size when unpacking will come separately - right now no test is actually reliant on the constant)

Copy link

@steviez steviez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree that this constant doesn't belong in accounts-db, there is still stuff like unpack_genesis_archive() that lives in accounts-db/src/hardened_unpack.rs. I think the constant and the functions should probably live together, but there is some shared code between the unpack-genesis & unpack-snapshot code correct ? If so, how do you plan to reconcile that ?

@codecov-commenter
Copy link

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.1%. Comparing base (0199a3a) to head (2233fed).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff            @@
##           master    #8489    +/-   ##
========================================
  Coverage    83.1%    83.1%            
========================================
  Files         840      840            
  Lines      367669   367665     -4     
========================================
+ Hits       305719   305855   +136     
+ Misses      61950    61810   -140     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@kskalski
Copy link
Author

While I agree that this constant doesn't belong in accounts-db, there is still stuff like unpack_genesis_archive() that lives in accounts-db/src/hardened_unpack.rs. I think the constant and the functions should probably live together, but there is some shared code between the unpack-genesis & unpack-snapshot code correct ? If so, how do you plan to reconcile that ?

upack_genesis_archive and related functions all take max_genesis_archive_unpacked_size, so technically that code can be seen as generic utility for unpacking with specified limit and the constant doesn't belong to that module.
Aside from that, the actual limit used is subject to specific binary, its flags and overrides, so in fact the constant should be seen as "default" limit and I would like to move it closest to the common denominator of uses.

It's possible that its place should be in agave-snapshot... I don't yet have a clear vision for the exact scope of that crate - maybe it should be just utils or maybe it should gather all the snapshot-related settings too (or at least their defaults). Then consolidating "snapshots" and "genesis" is possibly yet another step, AFAIU genesis is a special kind of snapshot that is used as starting point - for now I would try not to mix too much genesis code into agave-snapshot.

@kskalski
Copy link
Author

As for the code for genesis and snapshot processing in hardened_unpack module - yes, there is a lot in common / shared, but there is ton of custom / wrapping logic for genesis, like allowing different files in the archive, applying different sanity limits, etc.

I think at minimum those should be split into separate module for tar unpacking and customizing that process for regular / genesis snapshots - most of that will likely stay in agave-snapshots, just different modules, but maybe we could also aim for strict removal of genesis customizations into a different place and keep agave-snapshots more generic.

@kskalski kskalski marked this pull request as ready for review October 15, 2025 08:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants