Skip to content

inact1v1ty/solana-merkle

Repository files navigation

Merkle Tree Implementation for Solana (Vanilla, Page-Based Optimization)

Overview

This repository holds an efficient implementation of on-chain Merkle Tree calculation implemented using vanilla Solana.

The main optimization is making the Merkle Tree page-based.

The repo consists of 4 components:

  • merkle_program - The on-chain program.
  • merkle_common - Code common between on-chain program and off-chain SDK, tree calculation, hashing and etc.
  • merkle_sdk - SDK for communicating with merkle_program in an easier and more ergonomic way.
  • merkle_cli - CLI for interaction with the program.

There is a deployed instance of the program on devnet with address mrktCtrz8auQHeBbdwoFmJ8TbAcR55TkRqaGF5KQ5Td, it is initialized and used by default in the CLI.

Developed with solana-ci 2.1.21.


Merkle Tree requirements

  • Insert leaves dynamically into the tree (append-only).
  • Recalculate and store the root after each insertion.
  • Emit event messages with the updated root (via msg! logs).
  • Allow later generation of Merkle proofs for leaves.

What does it mean - page-based?

Instead of simply storing all leaf hashes and recalculating the whole tree each time, this implemetation stores pages of configurable number of leaves that keep their hashes, so that only a small portion of the tree must be recalculated on each insertion.

On-chain data structures

// or 8, 16, 64, depending on expected data size,
// total leaf count and rent/realloc considerations.
const PAGE_SIZE: usize = 32;

/// Global state for the Merkle tree, storing the root and page information.
#[derive(BorshSerialize, BorshDeserialize, Debug, Clone, PartialEq)]
pub struct MerkleTreeState {
    /// The current root hash of the entire Merkle tree.
    pub root: [u8; 32],
    /// Public keys of the accounts storing the Merkle pages.
    pub pages: Vec<MerklePageState>,
    /// Total number of leaves inserted into the tree across all pages.
    pub total_leaf_count: u32,
}

#[derive(BorshSerialize, BorshDeserialize, Debug, Clone, PartialEq)]
pub struct MerklePageState {
    pub pubkey: Pubkey,
    pub page_hash: [u8; 32],
}

/// Represents a single page of the Merkle tree, storing a batch of leaf hashes
/// in a fixed-size array.
#[derive(BorshSerialize, BorshDeserialize, Debug, Clone, PartialEq)]
pub struct MerklePage {
    /// Fixed-size array storing the leaf hashes in this page.
    pub leaves: [[u8; 32]; PAGE_CAPACITY],
    /// The number of leaves currently stored in this page.
    pub leaf_count: u32,
}

Why page-based?

There are two main benefits:

Avoid realloc and account size limits

Storing all leaves in one account will soon result in hitting Solana memory limits.

Minimize hash operations amount on inserting new leaf

Because page becomes immutable after becoming full, we can save a lot of computation effort by saving each page's hash. The efficiency increase is huge!

See this comparison in number of hashing operations for recalculating Merkle Tree root with recalculating the whole tree and using the page-based approach:

Page based vs All leaves

Here is comparison in number of operations using different page sizes:

Page based different page sizes

In this implemetation, 32 leaves per page is used as the most balanced default.


Hashing

SHA-256 is used for all hashing.

Second preimage attack

In order to protect from the second preimage attack, a 0x00 byte is prepended to a leaf when hashing, while 0x01 is prepended when computing internal (branch) hashes, as defined in Certificate Transparency.


Insertion Workflow

  1. If the current page is full (leaf_count == PAGE_SIZE), create a new MerklePage account.
  2. Update the page by writing the new leaf at position leaf_count.
  3. Increment leaf_count and tree's total_leaf_count by 1.
  4. Recompute the root of last page and global Merkle Tree root.

Example Insert

if page.leaf_count < PAGE_SIZE as u32 {
    page.leaves[page.leaf_count as usize] = new_leaf;
    page.leaf_count += 1;
} else {
    // Create a new page
}

Proof Generation

Here is an example of tree layout if PAGE_CAPACITY == 8 and there are 19 leaves.

Merkle Tree Example

Because of the page-based tree example, we construct the proof in two steps:

  1. Generate the proof from the leaf to the page hash.
  2. Generate the proof from the page hash to the main root.

This way we can both:

  • Recalculate the tree root in an optimal amount of hash operations.
  • Verify the tree root in an optimal amount of hash operations - the verification is dealing with one optimal proof array just like in an ordinary Merkle Tree.

Merkle CLI Commands

1. initialize

Runs the Initialize instruction of the merkle_program.

Usage:

initialize --signer <path_to_signer_file> [--address <program_address>] [-u <URL_OR_MONIKER>]

Arguments:

  • --signer <path_to_signer_file>: Path to the signer's wallet file.
  • --address <program_address> (optional): The address of the deployed program. If not provided, it might use a default or configured address.
  • -u, --url <URL_OR_MONIKER> (optional): URL for Solana's JSON RPC or moniker (e.g., mainnet-beta, testnet, devnet, localhost). Defaults to devnet.

2. rand

Generates a random 16-byte hexadecimal string (32 symbols).

Usage:

rand

3. insert

Runs the InsertLeaf instruction of the merkle_program. It can either use a provided 32-symbol hexadecimal string or generate a random one.

Usage:

insert --signer <path_to_signer_file> [--rand | <hex_of_32_symbols>] [-u <URL_OR_MONIKER>]

Arguments:

  • --signer <path_to_signer_file>: Path to the signer's wallet file.
  • --rand: If specified, a random 32-symbol hex string will be generated and used.
  • <hex_of_32_symbols>: A 32-symbol hexadecimal string to be inserted.
  • -u, --url <URL_OR_MONIKER> (optional): URL for Solana's JSON RPC or moniker (e.g., mainnet-beta, testnet, devnet, localhost). Defaults to devnet.

You should either provide either the 32-symbol hexadecimal string or --rand argument.

4. verify

Runs the VerifyProof instruction of the merkle_program using the provided hexadecimal string as data.

Usage:

verify --signer <path_to_signer_file> <hex_of_32_symbols> [-u <URL_OR_MONIKER>]

Arguments:

  • --signer <path_to_signer_file>: Path to the signer's wallet file.
  • <hex_of_32_symbols>: A 32-symbol hexadecimal string to be used for verification.
  • -u, --url <URL_OR_MONIKER> (optional): URL for Solana's JSON RPC or moniker (e.g., mainnet-beta, testnet, devnet, localhost). Defaults to devnet.

About

Implementation of a Merkle tree on Solana without using Anchor

Resources

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages