Commit 471a5f7

- Batch transfers: 200 files at a time.
- New options related to batch transfers:
  - -x: number of files per batch transfer (default 200).
  - -h: handles per file (default 2).
  - On Linux, the maximum number of handles is calculated as the number of sources times the number of handles per file.
- Preprocessing of sources occurs concurrently.
- Default block size for blocks is 8MB instead of 4MB.
- Reduced the default per-core number of readers and workers (5 and 8).
- The -p option keeps the storage account's directory structure when downloading from the storage account.

1 parent 45aa228 commit 471a5f7
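
For context, a hedged sketch of how the options introduced in this commit might be combined from the command line. The -x, -h, and -p flags are taken from the commit message above; the file paths, container name, and values are placeholders, not part of this commit.

```bash
# Upload two files using the new batching options: 300 files per batch
# transfer (-x) and 2 handles per file (-h). Paths and container are placeholders.
export ACCOUNT_NAME=<STORAGE_ACCOUNT_NAME>
export ACCOUNT_KEY=<STORAGE_ACCOUNT_KEY>
./blobporter -f /datadrive/f1.tar -f /datadrive/f2.md -c mycontainer -x 300 -h 2

# Download the container back, keeping the storage account's directory
# structure locally (-p), as described in the last bullet above.
./blobporter -p -c mycontainer -t blob-file
```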

16 files changed: +706 additions, -222 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -12,6 +12,7 @@ _build/windows_amd64
 # Architecture specific extensions/prefixes
 *.[568vq]
 [568vq].out
+debug
 
 *.cgo1.go
 *.cgo2.c

.vscode/launch.json

Lines changed: 5 additions & 2 deletions
@@ -11,8 +11,11 @@
       "port": 2345,
       "host": "127.0.0.1",
       "program": "${fileDirname}",
-      "env": {},
-      "args": [" -f \"http://video.ch9.ms/ch9/d36c/6cfce1e3-63e5-47a7-b3b6-8c813a23d36c/SnackPackPlanetXamarin.mp3\" -f \"http://video.ch9.ms/ch9/d36c/6cfce1e3-63e5-47a7-b3b6-8c813a23d36c/SnackPackPlanetXamarin.mp3\" -n v1.mp3 -n v2.mp3 -t http-file "] ,
+      "env": {
+        "ACCOUNT_NAME": "storagejaa",
+        "ACCOUNT_KEY": "/mXWG4aXMWvftzR+Ed5URccwrDSvvv5cUsKCqq5gbyPXvtlyfQ9gjq462kuH/2dErjwR8QcAS74vLaNXynd74g=="
+      },
+      "args": ["-c", "many", "-t", "blob-file", "-p"],
       "showLog": true
     }
   ]
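
For reference, a sketch of how the transfer configured in this debug profile could be run from a shell. It assumes the same environment variable names used in the configuration above; the storage key is shown as a placeholder rather than the value in the file.

```bash
# Rough shell equivalent of the debug configuration above.
# Replace the placeholders with your own storage account credentials.
export ACCOUNT_NAME=storagejaa
export ACCOUNT_KEY=<STORAGE_ACCOUNT_KEY>
./blobporter -c many -t blob-file -p
```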

README.md

Lines changed: 19 additions & 5 deletions
@@ -29,7 +29,7 @@ Sources and targets are decoupled, this design enables the composition of variou
 Download, extract and set permissions:
 
 ```bash
-wget -O bp_linux.tar.gz https://github.com/Azure/blobporter/releases/download/v0.5.04/bp_linux.tar.gz
+wget -O bp_linux.tar.gz https://github.com/Azure/blobporter/releases/download/v0.5.10/bp_linux.tar.gz
 tar -xvf bp_linux.tar.gz linux_amd64/blobporter
 chmod +x ~/linux_amd64/blobporter
 cd ~/linux_amd64
@@ -46,7 +46,7 @@ export ACCOUNT_KEY=<STORAGE_ACCOUNT_KEY>
 ```
 ### Windows
 
-Download [BlobPorter.exe](https://github.com/Azure/blobporter/releases/download/v0.5.04/bp_windows.zip)
+Download [BlobPorter.exe](https://github.com/Azure/blobporter/releases/download/v0.5.10/bp_windows.zip)
 
 Set environment variables (if using the command prompt):
 
@@ -84,6 +84,7 @@ If you want to rename multiple files, you can use the -n option:
 
 `./blobporter -f /datadrive/f1.tar -f /datadrive/f2.md -n b1 -n b2 -c mycontainer`
 
+
 ### Upload to Azure Page Blob Storage
 
 Same as uploading to block blob storage, but with the transfer definiton (-t option) set to file-pageblob.
@@ -127,6 +128,11 @@ Without the -n option all files in the container will be downloaded.
 
 `./blobporter -c mycontainer -t blob-file`
 
+By default, files are downloaded to the same directory where you are running blobporter. If you want to keep the same directory structure as the storage account, use the -p option.
+
+`./blobporter -p -c mycontainer -t blob-file`
+
+
 ### Download a file via HTTP to a local file
 
 `./blobporter -f "http://mysource/file.bam" -n /datadrive/file.bam -t http-file`
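
A small, hedged illustration of the -p behavior added in the hunk above; the blob name and resulting local path are hypothetical and only meant to show that the container's directory structure is mirrored when -p is set.

```bash
# Download everything in mycontainer, mirroring its virtual directory layout.
./blobporter -p -c mycontainer -t blob-file

# Hypothetical outcome: a blob named logs/2017/05/app.log would be written to
# ./logs/2017/05/app.log; without -p it would land in the current directory.
```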
@@ -171,15 +177,23 @@ Without the -n option all files in the container will be downloaded.
 
 ## Performance Considerations
 
-By default, BlobPorter creates 6 readers and 9 workers for each core on the computer. You can overwrite these values by using the options -r (number of readers) and -g (number of workers). When overriding these options there are few considerations:
+By default, BlobPorter creates 5 readers and 8 workers for each core on the computer. You can override these values by using the options -r (number of readers) and -g (number of workers). When overriding these options there are a few considerations:
 
 - If during the transfer the buffer level is constant at 000%, workers could be waiting for data. Consider increasing the number of readers. If the level is 100% the opposite applies; increasing the number of workers could help.
 
 - In BlobPorter, each reader or worker correlates to one goroutine. Goroutines are lightweight and a Go program can create a high number of goroutines, however, there's a point where the overhead of context switching impacts overall performance. Increase these values in small increments, e.g. 5.
 
-- For transfers from fast disks (SSD) or HTTP sources a lesser number readers or workers could provide the same performance than the default values. You could reduce these values if you want to minimize resource utilization. Lowering these numbers reduces contention and the likelihood of experiencing throttling conditions.
+- For transfers from fast disks (SSD) or HTTP sources, reducing the number of readers or workers could provide better performance than the default values. Reduce these values if you want to minimize resource utilization. Lowering these numbers reduces contention and the likelihood of experiencing throttling conditions.
+
+- Starting with version 0.5.10:
+
+  - Transfers are batched. Each batch transfer will concurrently read and transfer up to 200 files (the default value) from the source. The batch size can be modified using the -x option; the maximum value is 500.
+
+  - Blobs smaller than the block size are transferred in a single operation. With relatively small files (<32MB), performance may be higher if you set a block size equal to the size of the files. Setting the number of workers and readers to the number of files could also yield performance gains.
+
+## Issues and Feedback
 
-- In Linux, BlobPorter reduces the number of readers if the number of open files during the transfer is greater than 1024. Linux restricts the number of files open by a process and since each reader holds a handle to the file to transfer, you can reach this limit if you want transfer multiple files even with a relatively low number of readers. For example, if you have 10 readers and want to transfer more than 102 files you will reach this limit. In this case BlobPorter will issue a warning displaying the new number of readers. If the resulting number of readers impacts performance, consider running multiple instances of BlobPorter with a smaller source list.
+If you have a question or find a bug, open a new issue in this repository. BlobPorter is an OSS project maintained by the contributors.
 
 ## Contribute
 
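
A hedged tuning sketch based on the considerations above, for a transfer of a few relatively small files. The -r, -g, and -x flags come from this section of the README; the specific values are illustrative and worth adjusting while watching the reported buffer level.

```bash
# Two small files: per the guidance above, matching readers (-r) and
# workers (-g) to the number of files may yield gains. Values are illustrative.
./blobporter -f /datadrive/small1.csv -f /datadrive/small2.csv -c mycontainer -r 2 -g 2

# For long source lists, raise the batch size so each batch transfer covers
# more files (default 200, maximum 500):
#   ./blobporter <sources...> -c mycontainer -x 500
```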
