So, it's now known that as of CouchDB 0.10, document revs are a monotonically incrementing value plus a deterministic MD5 hash of the document at that moment in time. (more info: http://jchrisa.net/drl/_design/sofa/_list/post/post-page?startkey=%5B%22Deep-Couch-Deterministic-Revs-for-Idempotent-PUTs%22%5D)
The way Txn currently works is that it GETs the id, then uses obj_diff to decide whether we need to do a PUT. While this is a fine approach, its cost in I/O and computation varies with the size of the document.
I'd like to propose an alternative approach, using HEAD. Here's how it would work:
- txn does a HEAD request for a doc id. On success, the ETag header carries the rev value (1-122ade142b..)
- we strip the monotonic prefix (the 1-), and what we have left is the deterministic content hash
- we define a node.js implementation of the couchdb rev hash function, and pass our candidate object into it to get a hash value.
- we compare the fetched hash with our locally computed hash
- if the hashes match, nothing has changed and we can skip the PUT. If they differ, we could do one of two things: either do a full GET and obj_diff (like normal) and then a PUT, or simply do a PUT, because we already know the content is different.
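
To make the steps above concrete, here is a rough sketch of the HEAD-and-compare flow in Node.js. The names `headEtag`, `parseEtagRev`, `needsPut` and `placeholderRevHash` are made up for illustration and are not part of Txn. `placeholderRevHash` is only a stand-in (MD5 over `JSON.stringify`) so the sketch runs; it is not CouchDB's actual rev hash and will not match real server revs. A faithful port of the server's function would have to replace it.

```javascript
const crypto = require("crypto");
const http = require("http");

// Issue a HEAD request for a document URL and resolve with its ETag header.
// (Hypothetical helper; assumes a plain-http CouchDB endpoint.)
function headEtag(docUrl) {
  return new Promise((resolve, reject) => {
    const req = http.request(docUrl, { method: "HEAD" }, (res) => {
      res.resume(); // HEAD has no body, but make sure the socket is released
      if (res.statusCode === 200 && res.headers.etag) resolve(res.headers.etag);
      else reject(new Error("HEAD failed with status " + res.statusCode));
    });
    req.on("error", reject);
    req.end();
  });
}

// Split an ETag such as '"1-122ade142b"' into the monotonic prefix and the hash.
function parseEtagRev(etag) {
  const rev = etag.replace(/^W\//, "").replace(/^"|"$/g, "");
  const dash = rev.indexOf("-");
  if (dash < 1) throw new Error("unexpected rev format: " + rev);
  return {
    rev,
    generation: Number(rev.slice(0, dash)),
    hash: rev.slice(dash + 1),
  };
}

// STAND-IN ONLY: not CouchDB's rev hash. Swap in a real port of the server's
// function before comparing against revs returned by an actual CouchDB.
function placeholderRevHash(doc) {
  return crypto.createHash("md5").update(JSON.stringify(doc)).digest("hex");
}

// True when the locally computed hash differs from the hash in the ETag,
// i.e. when a PUT (or a full GET + obj_diff) is warranted.
function needsPut(etag, candidate, revHash = placeholderRevHash) {
  return revHash(candidate) !== parseEtagRev(etag).hash;
}
```

Usage would look something like `headEtag(url).then((etag) => needsPut(etag, candidate, realRevHash))`, where `realRevHash` is the Node.js port from the third step. Passing the hash function in as a parameter keeps the comparison logic testable before that port exists.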
The benefit here is that we can rely on a much simpler diff comparator and use a lot less I/O during the comparison step.
Thoughts?