Commit 17e9e86

Merge pull request #8785 from nadavMiz/multipart-upload-document
NSFS | versioning | add comment and documentation about multipart upload version-id
2 parents a860d8a + ffdc9d1 commit 17e9e86

2 files changed: 13 additions, 2 deletions

docs/design/NsfsVersioning.md

Lines changed: 6 additions & 0 deletions
@@ -89,6 +89,12 @@ In order to support best effort on scale of these scenarios, for POSIX file syst
 5. else - unlink unique_tmp_path
 ```
 
+### Multipart upload version order
+According to AWS specifications, multipart upload version time should be calculated based on multipart upload creation time rather than completion time (see [AWS Documentation](https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html#distributedmpupload)).
+On the other hand, for directory buckets, the object creation time is the completion date of the multipart upload (see [AWS Documentation](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-express-using-multipart-upload.html#s3-express-distributedmpupload)).
+There are performance issues for calculating the latest version after complete multipart when using creation time.
+In our design, due to the performance issues and to be aligned with AWS directory buckets, the version-id time is calculated based on completion time.
+
 ## OUT OF SCOPE
 ### TODO
 * Add GPFS design.
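
The difference between the two timestamp policies described in the added section can be illustrated with a small sketch. This is not NooBaa code: the record fields (`created_at`, `completed_at`), the helper names, and the numeric timestamps are all hypothetical, chosen only to show how the "latest version" can differ when a plain PUT lands between the start and the completion of a multipart upload.

```javascript
'use strict';

// Hypothetical upload records for the same key (illustrative fields, not NooBaa's).
// The multipart upload starts first (t=100) but completes last (t=300);
// a regular PUT is written in between (t=200).
const uploads = [
    { name: 'mpu', created_at: 100, completed_at: 300 },
    { name: 'put', created_at: 200, completed_at: 200 },
];

// Pick the timestamp used to order a version under a given policy.
function version_time(upload, policy) {
    return policy === 'creation' ? upload.created_at : upload.completed_at;
}

// Return the upload that would be treated as the latest version.
function latest_version(list, policy) {
    return list.reduce((latest, cur) =>
        (version_time(cur, policy) > version_time(latest, policy) ? cur : latest));
}

console.log(latest_version(uploads, 'creation').name);   // put
console.log(latest_version(uploads, 'completion').name); // mpu
```

Under the completion-time policy the object that has just been completed is the newest one in this scenario, which is one plausible reading of why the design avoids the extra work of locating the latest version that creation-time ordering would require.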

src/sdk/namespace_fs.js

Lines changed: 7 additions & 2 deletions
@@ -1824,15 +1824,20 @@ class NamespaceFS {
 }
 }
 
-// iterate over multiparts array -
+// iterate over multiparts array -
 // 1. if num of unique sizes is 1
 // 1.1. if this is the last part - link the size file and break the loop
 // 1.2. else, continue the loop
 // 2. if num of unique sizes is 2
 // 2.1. if should_copy_file_prefix
-// 2.1.1. if the cur part is the last, link the previous part file to upload_path and copy the last part (tail) to upload_path
+// 2.1.1. if the cur part is the last, link the previous part file to upload_path and copy the last part (tail) to upload_path
 // 2.1.2. else - copy the prev part size file prefix to upload_path
 // 3. copy bytes of the current's part size file
+// NOTE on versioning - according to general AWS specifications, the version_id time should be based on when we created the upload.
+// for directory buckets, on AWS, the object creation time is the completion date of the multipart upload
+// in our design we decided to do it based on when the upload was completed.
+// see https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html#distributedmpupload
+// see https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-express-using-multipart-upload.html#s3-express-distributedmpupload
 async complete_object_upload(params, object_sdk) {
 const part_size_to_fd_map = new Map(); // { size: fd }
 let read_file;
