Background
Recall that each source row is modeled as a sequence of cell values. Ignoring the cell value types (strings, blobs, numbers, whatever) for the moment, we compute each source row's SHA-256 hash from the concatenation of the row's cell hashes. However, since we don't want the hash of a cell's value to leak information, we salt each cell's value with a pseudorandom salt. The salt itself is the SHA-256 hash of a ledger-wide secret key and the cell's row/column coordinates.
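As a rough illustration of the scheme just described, here is a minimal sketch. The class and method names are hypothetical, and the byte layout (secret before coordinates, salt before value, big-endian coordinates) is an assumption made for the example, not the layout the library actually uses.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch of the present per-cell salting scheme. */
public class SaltedRowHash {

    /** SHA-256 over the concatenation of the given byte arrays. */
    static byte[] sha256(byte[]... parts) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (byte[] p : parts) md.update(p);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every JVM
        }
    }

    /** Cell salt = SHA-256(ledger-wide secret, row/column coordinates). Encoding is assumed. */
    static byte[] cellSalt(byte[] secret, long row, int col) {
        byte[] coords = ByteBuffer.allocate(12).putLong(row).putInt(col).array();
        return sha256(secret, coords);
    }

    /** Salted cell hash. The salt-then-value order is an assumption. */
    static byte[] cellHash(byte[] salt, byte[] value) {
        return sha256(salt, value);
    }

    /** Row hash = SHA-256 of the concatenated cell hashes. */
    static byte[] rowHash(byte[] secret, long row, byte[][] cells) {
        byte[][] hashes = new byte[cells.length][];
        for (int col = 0; col < cells.length; col++)
            hashes[col] = cellHash(cellSalt(secret, row, col), cells[col]);
        return sha256(hashes);
    }
}
```

Because the row hash depends only on the cell hashes, a prover can hand over a cell's 32-byte hash in place of its salt and value without changing the result.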
The upshot of this hashing strategy is that, in proving a row's cell values, we can safely redact any individual cell value by substituting it with the cell's hash. This salting, of course, bloats the hash proofs that morsels contain: an extra 32 bytes per cell. This adds up, so any way to reduce the size of such proofs is worth exploring.
Minimize Necessary Salt For No Redact Case
If a morsel file (.mrsl) contains source rows in which no cell value is redacted, provide a proof mode that discloses only a single row-seed salt for each such row. If any cell in a row is redacted, the row's seed salt is not disclosed, and each unredacted cell value is accompanied by its own salt (i.e. the row falls back to the present behavior).
Motivation
This feature will be useful in text processing applications. Lines naturally translate to rows; words and tokens to cells. Often, a line's words are not redacted. And since words themselves can be short and plentiful, the savings should be worthwhile.
Implementation
The following will need to change:
- Add one level of indirection in the middle of the hash computation: table salt -> row salt -> cell salt.
- Update morsel file format.
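One possible shape for the extra level of indirection is sketched below. All names are hypothetical, and the encodings (8-byte big-endian row number, 4-byte column index, salt hashed before the value) are assumptions for the example; the real derivation and the morsel format would be settled in the implementation.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch: table salt -> row salt -> cell salt. */
public class RowSaltSketch {

    static byte[] sha256(byte[]... parts) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (byte[] p : parts) md.update(p);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Row salt derived from the ledger-wide table salt and the row number. */
    static byte[] rowSalt(byte[] tableSalt, long row) {
        return sha256(tableSalt, ByteBuffer.allocate(8).putLong(row).array());
    }

    /** Cell salt derived from the row salt and the column index. */
    static byte[] cellSalt(byte[] rowSalt, int col) {
        return sha256(rowSalt, ByteBuffer.allocate(4).putInt(col).array());
    }

    static byte[] cellHash(byte[] cellSalt, byte[] value) {
        return sha256(cellSalt, value);
    }

    /** No-redact mode: the verifier gets one 32-byte row salt plus every cell value. */
    static byte[] rowHashFromRowSalt(byte[] rowSalt, byte[][] values) {
        byte[][] hashes = new byte[values.length][];
        for (int col = 0; col < values.length; col++)
            hashes[col] = cellHash(cellSalt(rowSalt, col), values[col]);
        return sha256(hashes);
    }

    /**
     * Fallback mode: values[col] == null marks a redacted cell, in which case
     * hashes[col] carries its cell hash; otherwise salts[col] carries its salt.
     */
    static byte[] rowHashFromCells(byte[][] salts, byte[][] values, byte[][] hashes) {
        byte[][] out = new byte[values.length][];
        for (int col = 0; col < values.length; col++)
            out[col] = values[col] == null ? hashes[col] : cellHash(salts[col], values[col]);
        return sha256(out);
    }
}
```

Both modes must yield the same row hash. In this sketch the row salt has to stay hidden whenever a cell is redacted, since anyone holding it could derive the redacted cell's salt and test guesses for its value against the disclosed hash.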
Notes
Presently, the io.crums.sldg.src package knows little about the TableSalt abstraction (even though it lives there). That's been a good thing; keep it that way.