
Support value data deduplication #7

@toru

Summary

RadixDB currently stores record values directly within nodes, so a value shared by many keys is stored again for every key. Since developers primarily choose Radix trees for their space efficiency, RadixDB should optimize storage further by deduplicating value data.

Implementation

Introduce a deduplication mechanism in which nodes reference values by cryptographic hash and the actual values are stored centrally. Use SHA-256 to derive an identifier for each value, and maintain the hash-value pairs in a shared storage area within the database file. One exception: when a value is smaller than the hash itself (32 bytes for SHA-256), replacing it with a hash would save nothing, so the value should remain directly within the node.
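
To make the proposal concrete, here is a minimal sketch of the idea in Python. It is not RadixDB code: `ValueStore`, `encode_value`, and `decode_value` are hypothetical names, the shared storage area is modeled as an in-memory dictionary rather than a region of the database file, and the `(is_inline, payload)` tuple only stands in for whatever flag a real node layout would use.

```python
import hashlib

# Size of a SHA-256 digest in bytes (32).
HASH_SIZE = hashlib.sha256().digest_size


class ValueStore:
    """Hypothetical stand-in for the shared storage area: digest -> value."""

    def __init__(self):
        self._blobs = {}

    def put(self, value: bytes) -> bytes:
        """Store the value once and return its SHA-256 digest."""
        digest = hashlib.sha256(value).digest()
        # setdefault keeps the first copy, so a repeated value adds nothing.
        self._blobs.setdefault(digest, value)
        return digest

    def get(self, digest: bytes) -> bytes:
        return self._blobs[digest]

    def __len__(self) -> int:
        return len(self._blobs)


def encode_value(store: ValueStore, value: bytes):
    """Return the (is_inline, payload) pair a node would hold."""
    if len(value) < HASH_SIZE:
        # Smaller than the hash: keep the value directly in the node.
        return (True, value)
    return (False, store.put(value))


def decode_value(store: ValueStore, slot) -> bytes:
    """Resolve a node slot back to the original value."""
    is_inline, payload = slot
    return payload if is_inline else store.get(payload)


if __name__ == "__main__":
    store = ValueStore()
    shared = b"x" * 64
    slot_a = encode_value(store, shared)  # first key
    slot_b = encode_value(store, shared)  # second key, same value
    print(slot_a == slot_b, len(store))
```

In this sketch, two keys holding the same 64-byte value both end up with the same 32-byte digest in their nodes, and the shared store holds a single copy. A real implementation would also need to decide how entries in the shared area are reclaimed when the last referencing key is deleted, which this sketch does not address.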

Desired Outcome

For datasets in which many keys share identical values, deduplication will significantly reduce storage requirements for the developer's project. This enhancement aligns with the space-efficient properties of Radix trees and can be especially valuable for IoT applications.

Metadata

Labels: enhancement (New feature or request)
