To provide a workaround for these rare but possible use cases
- Support Scenario where S3 Path is both a File and a Directory #170
- Support Scenario where S3 Path is a File and used to correspond to a Directory and still has version of key there #172
and under the assumption that
- this is a rare situation (hence @aaronkanzer is OK with just skipping such problematic paths)
and relying on the fact that we already have `.s3invsync.versions.json`, where we store version information per "file" in any folder, I think we can work around this by extending a record there with an optional boolean "folder" field, e.g.
"{key}": {
"version_id": "...",
"etag": "...",
"folder": true
},
which would signify that there is also a `{key}/` folder on S3. Locally we would keep that folder as `{key}.s3invsync/` and use it as the prefix for any path under it.
- whenever that `{key}` latest version becomes a `DeleteMarker` (removed), we simply `mv {key}.s3invsync {key}`, and remove that `{key}` record from the corresponding `.s3invsync.versions.json`.
- if a `{key}` file appears while there is still `{key}/` with some files under it, we `mv {key} {key}.s3invsync` first for that folder and add `"folder": true` to the `.s3invsync.versions.json`
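
The two transitions above could be sketched roughly as follows. This is only an illustration in Python, not s3invsync code: the function names `on_file_created` and `on_delete_marker`, the helpers, and the choice to drop the record on every delete marker are all assumptions made for the example, and downloading the file body itself is left out.

```python
import json
from pathlib import Path

VERSIONS_FILE = ".s3invsync.versions.json"
FOLDER_SUFFIX = ".s3invsync"  # suffix taken from the proposal above


def _load(directory: Path) -> dict:
    path = directory / VERSIONS_FILE
    return json.loads(path.read_text()) if path.exists() else {}


def _save(directory: Path, records: dict) -> None:
    (directory / VERSIONS_FILE).write_text(json.dumps(records, indent=2))


def on_file_created(directory: Path, key: str, version_id: str, etag: str) -> None:
    """Hypothetical hook: a file `key` appears, possibly next to a `key/` folder."""
    target = directory / key
    collided = target.is_dir()
    if collided:
        # Move the folder aside first so the plain name is free for the file.
        target.rename(directory / (key + FOLDER_SUFFIX))
    records = _load(directory)
    record = {"version_id": version_id, "etag": etag}
    # Keep the flag if an earlier version of the file already displaced the folder.
    if collided or records.get(key, {}).get("folder"):
        record["folder"] = True
    records[key] = record
    _save(directory, records)


def on_delete_marker(directory: Path, key: str) -> None:
    """Hypothetical hook: the latest version of `key` became a DeleteMarker."""
    records = _load(directory)
    record = records.pop(key, None)  # assumption: the record is always dropped
    target = directory / key
    if target.is_file():
        target.unlink()
    renamed = directory / (key + FOLDER_SUFFIX)
    if record and record.get("folder") and renamed.is_dir():
        # Restore the folder to its original, unsuffixed name.
        renamed.rename(target)
    _save(directory, records)
```

Both functions only touch the one `.s3invsync.versions.json` in the directory that contains `{key}`, which matches the idea that the extra bookkeeping stays local to the affected path.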
Features:
- If `{key}` file (not deleted) is the most recent version (ref: Support Scenario where S3 Path is a File and used to correspond to a Directory and still has version of key there #172) -- we still have access to it just fine under non-modified path, and under `{key}.s3invsync/` we just have older versions (require logic to comprehend anyways)
- If it is a "legit" zarr, there would be no `{key}` file for an existing `{key}/` folder (which would be renamed); and thus we would not have that `{key}.s3invsync/` folder whenever there is an expectation for having `{key}/` directory. So we should be good for that too.
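
To make the resulting layout concrete, here is a minimal Python sketch of how an S3 key might be mapped to its local path under this scheme. The function `local_path` and the `folder_keys` set (keys whose record carries `"folder": true`) are hypothetical names introduced only for illustration.

```python
from __future__ import annotations

FOLDER_SUFFIX = ".s3invsync"  # suffix taken from the proposal above


def local_path(s3_key: str, folder_keys: set[str]) -> str:
    """Map an S3 key to a local relative path.

    `folder_keys` is assumed to hold the full S3 keys that are both a file
    and a prefix, i.e. those marked with "folder": true.
    """
    parts = s3_key.split("/")
    mapped = []
    prefix = ""
    for i, part in enumerate(parts):
        prefix = part if not prefix else f"{prefix}/{part}"
        is_last = i == len(parts) - 1
        # Only directory components are renamed; the file keeps its own name.
        if not is_last and prefix in folder_keys:
            mapped.append(part + FOLDER_SUFFIX)
        else:
            mapped.append(part)
    return "/".join(mapped)


folder_keys = {"zarr/data"}
print(local_path("zarr/data", folder_keys))      # the file itself, unmodified path
print(local_path("zarr/data/0/0", folder_keys))  # an object under the colliding prefix
```

Paths that never collide are returned unchanged, so a "legit" zarr with no `{key}` file would not see any `.s3invsync` component.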
Cons:
- some performance hit, since paths would now need additional handling, but I think it could be quite minimal since, once again, this is a rare situation
WDYT @jwodder?