@jwodder just filing an Issue to brainstorm:
When running s3invsync, it takes quite some time to get back to the point much further along in manifest.json where I would like a subsequent backup to continue. I have been using --path-filter as a human-readable mechanism, but I'm curious whether you have any thoughts on how I could get s3invsync to start at a specific row/delimiter.
I understand that it must at least do a regex comparison per row; however, I am wondering whether there could be some sort of f.seek()-like behavior where a pointer could be set to start somewhere specific.
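A minimal Python sketch of the f.seek() idea, for illustration only. It is not how s3invsync works, and `read_from_offset` is a hypothetical helper name. It assumes an uncompressed, line-oriented listing file; gzip-compressed listings could not be resumed this cheaply, since seeking in a gzip stream means decompressing from the start.

```python
def read_from_offset(path, offset=0):
    """Yield (next_offset, row) pairs, starting at byte `offset`.

    `next_offset` is the byte position just past the row that was
    yielded, so saving it lets a later run resume from the next row.
    """
    # Binary mode keeps tell()/seek() as plain byte offsets.
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line:
                break  # end of file
            yield f.tell(), line.rstrip(b"\n").decode("utf-8")
```

A first run would record the last `next_offset` it finished processing, and a later run would pass that value as `offset` to skip every earlier row without re-reading or re-matching it.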
Curious to get your thoughts here. For context, here is a sample command for what I am running, where PREFIX is determined from an array of processes on the MIT Engaging Cluster:
s3invsync --path-filter "zarr/${PREFIX}[a-z0-9-]*" --allow-new-nonempty --ignore-errors all \
s3://linc-brain-mit-prod-us-east-2/linc-brain-mit-prod-us-east-2/production-configuration/ \
/orcd/data/linc/001/s3lincbrain/