
Commit 1877f3b

- doc updates
1 parent 400281c commit 1877f3b

File tree

- args.go
- docs/perfmode.rst
- docs/resumable_transfers.rst
- sources/s3info.go

4 files changed: +71 -16 lines

args.go

Lines changed: 6 additions & 2 deletions
@@ -522,7 +522,7 @@ func (p *paramParserValidator) pvSourceInfoForS3IsReq() error {
 	burl, err := url.Parse(p.params.sourceURIs[0])
 
 	if err != nil {
-		return fmt.Errorf("Invalid S3 endpoint URL. Parsing error: %v.\nThe format is s3://[END_POINT]/[BUCKET]/[OBJECT]", err)
+		return fmt.Errorf("Invalid S3 endpoint URL. Parsing error: %v.\nThe format is s3://[END_POINT]/[BUCKET]/[PREFIX]", err)
 	}
 
 	p.params.s3Source.endpoint = burl.Hostname()
@@ -533,10 +533,14 @@ func (p *paramParserValidator) pvSourceInfoForS3IsReq() error {
 
 	segments := strings.Split(burl.Path, "/")
 
+	if len(segments) < 2 {
+		return fmt.Errorf("Invalid S3 endpoint URL. Bucket not specified. The format is s3://[END_POINT]/[BUCKET]/[PREFIX]")
+	}
+
 	p.params.s3Source.bucket = segments[1]
 
 	if p.params.s3Source.bucket == "" {
-		return fmt.Errorf("Invalid source S3 URI. Bucket name could be parsed")
+		return fmt.Errorf("Invalid source S3 URI. Bucket name could not be parsed")
	}
 
 	prefix := ""
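
The new guard in the second hunk matters because indexing segments[1] on a URL with an empty path would panic. Below is a minimal standalone sketch of the same parsing logic; the parseS3Source helper is illustrative only, not a function from the BlobPorter codebase:

```
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseS3Source mirrors the validation in the hunk above: the expected
// form is s3://[END_POINT]/[BUCKET]/[PREFIX], and parsing fails fast
// when the bucket segment is missing.
func parseS3Source(uri string) (endpoint, bucket, prefix string, err error) {
	burl, err := url.Parse(uri)
	if err != nil {
		return "", "", "", fmt.Errorf("invalid S3 endpoint URL: %v", err)
	}

	// For s3://host/bucket/prefix, burl.Path is "/bucket/prefix", so
	// Split yields ["", "bucket", "prefix", ...]. For s3://host the
	// path is empty and Split yields [""], so reading segments[1]
	// without the length check would panic with index out of range.
	segments := strings.Split(burl.Path, "/")
	if len(segments) < 2 {
		return "", "", "", fmt.Errorf("bucket not specified")
	}

	bucket = segments[1]
	if bucket == "" {
		return "", "", "", fmt.Errorf("bucket name could not be parsed")
	}
	if len(segments) > 2 {
		prefix = strings.Join(segments[2:], "/")
	}
	return burl.Hostname(), bucket, prefix, nil
}

func main() {
	for _, u := range []string{"s3://s3.amazonaws.com/mybucket/data/", "s3://s3.amazonaws.com"} {
		e, b, p, err := parseS3Source(u)
		fmt.Printf("%s -> endpoint=%q bucket=%q prefix=%q err=%v\n", u, e, b, p, err)
	}
}
```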

docs/perfmode.rst

Lines changed: 5 additions & 14 deletions
@@ -1,14 +1,7 @@
 Performance Mode
 ======================================
-
-If you want to maximize performance, and your source and target are public HTTP based end-points (Blob, S3, and HTTP), running the transfer in a high bandwidth environment such as a VM on the cloud, is strongly recommended. This recommendation comes from the fact that blob to blob, S3 to blob or HTTP to blob transfers are bidirectional where BlobPorter downloads the data (without writing to disk) and uploads it as it is received.
-
-When running in the cloud, consider the region where the transfer VM ( where BlobPorter will be running), will be deployed. When possible, deploy the transfer VM in the same the same region as the target of the transfer. Running in the same region as the target minimizes the transfer costs (egress from the VM to the target storage account) and the network performance impact (lower latency) for the upload operation.
-
-For downloads or uploads of multiple or large files the disk i/o could be the constraining resource that slows down the transfer. And often identifying if this is the case, is a cumbersome process. But if done, it could lead to informed decisions about the environment where BlobPorter runs.
-
-To help with this indentification process, BlobPorter has a performance mode that uploads random data generated in memory and measures the performance of the operation without the impact of disk i/o.
-The performance mode for uploads could help you identify the potential upper limit of throughput that the network and the target storage account can provide.
+BlobPorter has a performance mode that uploads random data generated in memory and measures the performance of the operation without the impact of disk i/o.
+The performance mode for uploads could help you identify the potential upper limit of throughput that the network and the target storage account can provide.
 
 For example, the following command will upload 10 x 10GB files to a storage account.
 
@@ -24,19 +17,17 @@ blobporter -f "1GB:10" -c perft -t perf-blockblob -g 20
 
 Similarly, for downloads, you can simulate downloading data from a storage account without writing to disk. This mode could also help you fine-tune the number of readers (-r option) and get an idea of the maximum download throughput.
 
-The following command will download the data we previously uploaded.
+The following command downloads the data previously uploaded.
 
 ```
 export SRC_ACCOUNT_KEY=$ACCOUNT_KEY
 blobporter -f "https://myaccount.blob.core.windows.net/perft" -t blob-perf
 ```
 
-Then you can try downloading to disk.
+Then you can download to disk.
 
 ```
 blobporter -c perft -t blob-file
 ```
 
-If the performance difference is significant then you can conclude that disk i/o is the bottleneck, at which point you can consider an SSD backed VM.
-
-
+The performance difference will give you a measurement of the impact of disk i/o.
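
The measurement technique the doc describes, timing a transfer whose source is random bytes generated in memory rather than files on disk, can be sketched in a few lines of Go. This is only an illustration of the idea, not BlobPorter's implementation; io.Discard stands in for the upload target:

```
package main

import (
	"crypto/rand"
	"fmt"
	"io"
	"time"
)

func main() {
	// 64 MB of random data generated once, in memory; no disk reads.
	buf := make([]byte, 64*1024*1024)
	if _, err := rand.Read(buf); err != nil {
		panic(err)
	}

	// Time repeated writes of the buffer to a sink. Swapping io.Discard
	// for a real uploader would measure network and target throughput
	// with disk i/o taken out of the equation.
	const rounds = 16
	start := time.Now()
	for i := 0; i < rounds; i++ {
		if _, err := io.Discard.Write(buf); err != nil {
			panic(err)
		}
	}
	elapsed := time.Since(start).Seconds()
	mb := float64(rounds*len(buf)) / (1024 * 1024)
	fmt.Printf("wrote %.0f MB in %.3fs (%.1f MB/s)\n", mb, elapsed, mb/elapsed)
}
```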

docs/resumable_transfers.rst

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+Resumable Transfers
+======================================
+BlobPorter supports resumable transfers. To enable this feature you need to set the -l option with a path to the transfer status file.
+
+```
+blobporter -f "manyfiles/*" -c "many" -l mylog
+```
+
+The transfer status file contains entries for when a file is queued and when it was successfully transferred.
+
+The log entries are created with the following tab-delimited format:
+
+```
+[Timestamp] [Filename] [Status (1:Started,2:Completed,3:Ignored)] [Size] [Transfer ID]
+```
+
+The following output from a transfer status file shows that three files were included in the transfer (file10, file11 and file15).
+However, only two were successfully transferred: file10 and file11.
+
+```
+2018-03-05T03:31:13.034245807Z file10 1 104857600 938520246_mylog
+2018-03-05T03:31:13.034390509Z file11 1 104857600 938520246_mylog
+2018-03-05T03:31:13.034437109Z file15 1 104857600 938520246_mylog
+2018-03-05T03:31:25.232572306Z file10 2 104857600 938520246_mylog
+2018-03-05T03:31:25.591239355Z file11 2 104857600 938520246_mylog
+```
+
+In case of failure, you can reference the same status file and BlobPorter will skip files that were already transferred.
+
+Consider the previous scenario. After executing the transfer again, new entries are appended only for the missing file (file15).
+
+```
+2018-03-05T03:31:13.034245807Z file10 1 104857600 938520246_mylog
+2018-03-05T03:31:13.034390509Z file11 1 104857600 938520246_mylog
+2018-03-05T03:31:13.034437109Z file15 1 104857600 938520246_mylog
+2018-03-05T03:31:25.232572306Z file10 2 104857600 938520246_mylog
+2018-03-05T03:31:25.591239355Z file11 2 104857600 938520246_mylog
+2018-03-05T03:54:33.660161772Z file15 1 104857600 495675852_mylog
+2018-03-05T03:54:34.579295059Z file15 2 104857600 495675852_mylog
+```
+
+When the transfer is successful, a summary is created at the end of the transfer status file.
+
+```
+----------------------------------------------------------
+Transfer Completed----------------------------------------
+Start Summary---------------------------------------------
+Last Transfer ID:495675852_mylog
+Date:Mon Mar 5 03:54:34 UTC 2018
+File:file15 Size:104857600 TID:495675852_mylog
+File:file10 Size:104857600 TID:938520246_mylog
+File:file11 Size:104857600 TID:938520246_mylog
+Transferred Files:3 Total Size:314572800
+End Summary-----------------------------------------------
+```
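
Because the status file is plain tab-delimited text, it is easy to inspect programmatically. Below is a minimal, hypothetical Go sketch (not part of BlobPorter) that lists the completed files from a status log, assuming the five-field layout documented above:

```
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// Status codes as documented: 1 started, 2 completed, 3 ignored.
const statusCompleted = "2"

func main() {
	f, err := os.Open("mylog")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		// Expected fields: timestamp, filename, status, size, transfer ID.
		fields := strings.Split(scanner.Text(), "\t")
		if len(fields) != 5 {
			continue // skip the human-readable summary block
		}
		if fields[2] == statusCompleted {
			fmt.Printf("completed: %s (%s bytes, transfer %s)\n", fields[1], fields[3], fields[4])
		}
	}
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```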

sources/s3info.go

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@ type s3InfoProvider struct {
 func newS3InfoProvider(params *S3Params) (*s3InfoProvider, error) {
 	s3client, err := minio.New(params.Endpoint, params.AccessKey, params.SecretKey, true)
 
+
 	if err != nil {
 		log.Fatalln(err)
 	}
