Summary
RadixDB currently stores record values directly within nodes, so a value shared by many records is stored once per record. Since developers primarily choose radix trees for their space efficiency, RadixDB's storage should be optimized further by deduplicating value data.
Implementation
Introduce a data deduplication mechanism in which nodes reference values by their cryptographic hash, while the actual values are stored centrally. Use SHA-256 to derive an identifier for each value, and maintain the hash-value pairs in a shared storage area within the database file. As an exception, when a value is smaller than the hash itself (32 bytes for SHA-256), a reference would take more space than the value, so the value should remain directly within the node.
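The scheme described above can be illustrated with a minimal in-memory sketch. It is not RadixDB's code or on-disk format: the `ValueStore` class, its `put` and `get` methods, and the `(is_inline, payload)` pair returned to the node are all hypothetical names chosen for illustration, and a plain dictionary stands in for the shared storage area in the database file.

```python
import hashlib
from typing import Dict, Tuple

# Size of a SHA-256 digest in bytes (32).
HASH_SIZE = hashlib.sha256().digest_size


class ValueStore:
    """Hypothetical stand-in for the shared hash-value storage area."""

    def __init__(self) -> None:
        # Maps SHA-256 digest -> value; each distinct value is kept once.
        self._values: Dict[bytes, bytes] = {}

    def put(self, value: bytes) -> Tuple[bool, bytes]:
        """Return (is_inline, payload), i.e. what a node would store."""
        if len(value) < HASH_SIZE:
            # Smaller than the hash: keep the value directly in the node.
            return True, value
        digest = hashlib.sha256(value).digest()
        # Store the value only if this digest has not been seen before.
        self._values.setdefault(digest, value)
        return False, digest

    def get(self, is_inline: bool, payload: bytes) -> bytes:
        """Resolve what a node stores back into the original value."""
        return payload if is_inline else self._values[payload]

    def __len__(self) -> int:
        # Number of distinct values held in the shared area.
        return len(self._values)
```

In this sketch, two records with the same 100-byte value would each hold the same 32-byte digest, and the shared area would hold the value once. A real implementation would also need to decide how to reclaim a shared value once no node references it (for example with reference counts), which the sketch leaves out.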
Desired Outcome
For datasets in which many records share the same values, data deduplication can significantly reduce the storage requirements of the developer's project. This enhancement aligns with the space-efficient properties of radix trees and can be especially valuable for IoT applications.