Optimize blobFS transfers by eliminating redundant API calls and enabling parallelism #3288

Copilot · 2025-11-10T18:33:35Z

Description

BlobFS to blobFS transfers with large folder hierarchies are significantly slower than expected due to redundant API calls during both enumeration and transfer phases. This PR implements two optimizations to address performance bottlenecks in HNS-enabled storage accounts.

Feature / Bug Fix: Performance optimization for HNS-enabled storage account enumeration and transfer
Related Links:
- Issues

Changes

1. Enumeration Phase Optimization (cmd/zc_traverser_blob.go)

For HNS-enabled accounts (isDFS == true):

Skip GetProperties calls for virtual directories - In parallelList, virtual directories (BlobPrefixes) are directly enqueued as folders without validation, eliminating O(n) API calls for n folders
Skip hdi_isfolder metadata checks - In both parallelList and serialList, blob items bypass folder stub detection since folders exist as virtual directories, not as blobs with metadata

2. Transfer Phase Parallelism Optimization (ste/folderCreationTracker.go)

Enable parallel folder creation for BlobFS→BlobFS transfers - Extended the folder creation lock-skipping logic (previously only for Local→BlobFS) to include BlobFS as a source location
This eliminates lock contention when multiple files are uploaded to the same directory, allowing full parallelism during transfers
Safe because BlobFS directoryClient.Create() uses IfNoneMatch: ETagAny (idempotent) and PathAlreadyExists errors are handled as success

Before:

// Enumeration: For every virtual directory
pResp, err := blobClient.GetProperties(t.ctx, nil)  // Extra API call
if !t.doesBlobRepresentAFolder(pResp.Metadata) { ... }

// Transfer: Serialized folder creation
skipFolderCreationLock := buildmode.IsMover && plan.FromTo.From() == common.ELocation.Local() && ...

After:

// Enumeration: Skip checks for HNS accounts
if t.isDFS {
    storedObject := newStoredObject(...)  // Direct folder creation
} else {
    // Original behavior for non-HNS accounts
}

// Transfer: Parallel folder creation for BlobFS→BlobFS
skipFolderCreationLock := buildmode.IsMover &&
    (plan.FromTo.From() == common.ELocation.Local() || plan.FromTo.From() == common.ELocation.BlobFS()) && ...

Performance Impact:

Enumeration: Eliminates O(n) GetProperties API calls for n folders
Transfer: Enables full parallelism for files uploaded to the same directory (N parallel operations vs N serialized operations)

Non-HNS accounts retain original behavior for both optimizations.

Type of Change

How Has This Been Tested?

Existing unit tests pass (including folder creation tests)
Build verification successful
CodeQL security scan shows no vulnerabilities
Verified that the test failure in Test500FollowedBy412Logic is pre-existing and requires Azure credentials

Fixes #3287

Original prompt

This section details on the original issue you should resolve

<issue_title>Test bug</issue_title>
<issue_description>When doing blobFS to blobFS transfer, datasets with large amounts of folders transfer at a rate significantly slower than an FNS to FNS pair. This is due to the use of blob APIs, as detection of a folder requires many additional calls. We could speed this up with DFS APIs that grab all needed information in one call, or some other solution.</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes Test bug #3287

💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

When isDFS is true (HNS-enabled accounts), optimize blob traverser to: 1. Skip GetProperties calls for virtual directories - they're always real folders in HNS accounts 2. Skip checking blob metadata for hdi_isfolder - folders are represented as virtual directories, not as blobs This significantly improves enumeration performance for blobFS to blobFS transfers with large folder structures by eliminating many additional API calls. Co-authored-by: otendolkar <185434248+otendolkar@users.noreply.github.com>

otendolkar · 2025-11-10T19:39:11Z

@copilot This is mostly for enumeration right, is data transfer also slowed down for HNS folders? If not just let me know

Copilot · 2025-11-10T19:41:27Z

@copilot This is mostly for enumeration right, is data transfer also slowed down for HNS folders? If not just let me know

Correct, this optimization only affects the enumeration phase. The blob traverser is responsible for discovering and listing folders/files before transfer begins. My changes eliminate the GetProperties API calls during this discovery phase for HNS-enabled accounts.

The actual data transfer speed is not impacted by these changes - that happens later in the transfer engine after enumeration completes.

otendolkar · 2025-11-11T16:46:11Z

@copilot would it be good to find places in the transfer itself that might be slowed down for hns folders and improve that? Can you find

Copilot · 2025-11-11T16:52:52Z

@copilot would it be good to find places in the transfer itself that might be slowed down for hns folders and improve that? Can you find

I found a performance bottleneck in the transfer phase for HNS folders:

Issue: In ste/sender-blobFS.go, the Prologue method (line 155) makes a separate API call to ensure the parent directory exists before creating each file. The code comment explicitly states this "burns an extra IO operation" but is needed for the folderCreationTracker.

Impact: For datasets with many files, this means N extra API calls (where N = number of files), as each file transfer calls doEnsureDirExists even when the directory likely already exists.

Current behavior:

// Line 155: Extra API call per file
err := u.doEnsureDirExists(u.parentDirClient)

// Line 162: File creation that could auto-create parent directories
_, err = u.getFileClient().Create(...)

The Create call itself can create parent directories automatically in HNS accounts, but the current implementation makes a separate check first to support folder tracking logic.

Potential optimization: The folderCreationTracker already has an IsFolderAlreadyCreated check (line 74 in folderCreationTracker.go). We could:

Check the tracker first before making the API call
Skip the separate directory creation call if the tracker shows it was already created
This would eliminate redundant API calls for files in the same directory

However, this requires careful analysis of the folder tracking logic to ensure overwrite and folder property semantics remain correct. Would you like me to implement this optimization?

otendolkar · 2025-11-11T20:15:59Z

@copilot Is there something that changes less of the code, like maybe some parallelism we could exploit?

Copilot · 2025-11-11T20:18:14Z

@copilot Is there something that changes less of the code, like maybe some parallelism we could exploit?

Yes, there's a simpler optimization using existing parallelism mechanisms.

Current behavior: The folder creation tracker uses a lock to serialize folder creation for BlobFS to BlobFS transfers. This is only skipped for Local-to-BlobFS transfers (line 21-24 in ste/folderCreationTracker.go).

Optimization: Extend the lock-skipping logic to include BlobFS-to-BlobFS transfers. This allows parallel file uploads to the same directory without serialization.

Why it's safe:

The directoryClient.Create() call already has IfNoneMatch: ETagAny which makes it idempotent
The PathAlreadyExists error is already handled as success (line 227-228 in sender-blobFS.go)
HNS accounts have native directory support, making concurrent creation safe

Code change (single line modification in ste/folderCreationTracker.go):

// Line 21-24, extend to include BlobFS source:
skipFolderCreationLock := buildmode.IsMover && 
    (plan.FromTo.From() == common.ELocation.Local() || plan.FromTo.From() == common.ELocation.BlobFS()) &&
    (plan.FromTo.To() == common.ELocation.File() || 
     plan.FromTo.To() == common.ELocation.Blob() || 
     plan.FromTo.To() == common.ELocation.BlobFS())

This enables better parallelism with minimal code changes. Would you like me to implement this?

otendolkar · 2025-11-11T20:20:12Z

@copilot Could you explain it very clearly in detail once more

Copilot · 2025-11-11T20:21:36Z

@copilot Could you explain it very clearly in detail once more

Detailed Explanation of the Parallelism Optimization

The Problem

When transferring files in BlobFS, each file transfer must ensure its parent directory exists. Currently, this happens with locking that serializes operations:

Step-by-step for each file transfer:

File transfer starts (e.g., copying folder1/file1.txt)
Acquire a global lock for folder creation
Check if folder1 was already created
If not created, make API call: directoryClient.Create("folder1")
Release the lock
Create the file: fileClient.Create("folder1/file1.txt")

The bottleneck: If 100 files are being uploaded to folder1 in parallel, they all wait in line at step 2. Only one file transfer can execute steps 2-5 at a time, even though the directory creation API is safe to call concurrently.

Current Lock Logic (line 21-24 in `ste/folderCreationTracker.go`)

skipFolderCreationLock := buildmode.IsMover && 
    plan.FromTo.From() == common.ELocation.Local() &&
    (plan.FromTo.To() == common.ELocation.File() || 
     plan.FromTo.To() == common.ELocation.Blob() || 
     plan.FromTo.To() == common.ELocation.BlobFS())

This only skips the lock for Local → BlobFS transfers.

Proposed Change

skipFolderCreationLock := buildmode.IsMover && 
    (plan.FromTo.From() == common.ELocation.Local() || 
     plan.FromTo.From() == common.ELocation.BlobFS()) &&  // Added BlobFS source
    (plan.FromTo.To() == common.ELocation.File() || 
     plan.FromTo.To() == common.ELocation.Blob() || 
     plan.FromTo.To() == common.ELocation.BlobFS())

Now also skips the lock for BlobFS → BlobFS transfers.

Why This is Safe

The directory creation API call (line 224 in sender-blobFS.go) uses:

directoryClient.Create(..., IfNoneMatch: ETagAny)

This is idempotent - multiple concurrent calls will:

First call: Creates the directory successfully
Subsequent calls: Return PathAlreadyExists error
The error handler (line 227-228) treats PathAlreadyExists as success

So 100 parallel file transfers can all safely call directory creation simultaneously without corruption.

Performance Impact

Before: N files to same directory = N serialized operations
After: N files to same directory = N parallel operations (first creates, rest get harmless error)

This eliminates lock contention for BlobFS→BlobFS transfers, allowing full parallelism.

otendolkar · 2025-11-11T21:09:47Z

@copilot So is that lock there because other paths are not idempotent?

Copilot · 2025-11-11T21:12:36Z

@copilot So is that lock there because other paths are not idempotent?

Correct. The lock exists because different storage types have different guarantees:

BlobFS (HNS/ADLS Gen2):

Uses directoryClient.Create() with IfNoneMatch: ETagAny
Idempotent: Returns PathAlreadyExists on concurrent calls, handled as success
Safe for parallel creation without locks

Azure Files:

Uses directoryClient.Create() without conditional headers
Returns StatusConflict (HTTP 409) if directory exists
Also idempotent: The conflict error is ignored as success (line 673-674 in sender-azureFile.go)
Could also benefit from lock-free operation

Blob Storage (non-HNS):

Creates folder stubs as zero-byte blobs with hdi_isfolder metadata
Calls GetProperties() first to check if folder exists and handle overwrites
Must check ShouldSetProperties() for overwrite logic
May need to delete existing blob if overwriting
Not idempotent: Requires check-before-create pattern for proper overwrite handling

The lock was added as a conservative safety measure to handle the most complex case (Blob storage). However, for BlobFS and potentially Azure Files, the underlying APIs are idempotent and the lock only serializes operations unnecessarily.

The proposed change to skip the lock for BlobFS→BlobFS is safe because BlobFS directory creation is inherently idempotent.

otendolkar · 2025-11-11T21:19:49Z

@copilot sure make this change

Extend the folder creation lock-skipping logic to include BlobFS as a source location. This allows parallel file uploads to the same directory without serialization during BlobFS to BlobFS transfers. The change is safe because: - BlobFS directoryClient.Create() uses IfNoneMatch: ETagAny (idempotent) - PathAlreadyExists errors are handled as success - HNS accounts have native directory support for concurrent creation This eliminates lock contention and enables full parallelism for file transfers to the same directory in BlobFS to BlobFS scenarios. Co-authored-by: otendolkar <185434248+otendolkar@users.noreply.github.com>

Copilot · 2025-11-11T21:24:08Z

@copilot sure make this change

Implemented in commit aee8160.

The change extends the folder creation lock-skipping logic to include BlobFS as a source, enabling parallel folder creation for BlobFS→BlobFS transfers. This eliminates lock contention when multiple files are uploaded to the same directory, allowing full parallelism during transfers.

Initial plan

f15269e

Copilot AI assigned Copilot and otendolkar Nov 10, 2025

Copilot started work on behalf of otendolkar November 10, 2025 18:34 View session

Copilot AI changed the title ~~[WIP] Fix slow dataset transfer rate in blobFS~~ Optimize blobFS enumeration by eliminating redundant GetProperties calls Nov 10, 2025

Copilot AI requested a review from otendolkar November 10, 2025 18:57

Copilot finished work on behalf of otendolkar November 10, 2025 18:57

Copilot started work on behalf of otendolkar November 10, 2025 19:39 View session

Copilot finished work on behalf of otendolkar November 10, 2025 19:42

Copilot started work on behalf of otendolkar November 11, 2025 16:46 View session

Copilot finished work on behalf of otendolkar November 11, 2025 16:54

Copilot started work on behalf of otendolkar November 11, 2025 20:16 View session

Copilot finished work on behalf of otendolkar November 11, 2025 20:19

Copilot started work on behalf of otendolkar November 11, 2025 20:20 View session

Copilot finished work on behalf of otendolkar November 11, 2025 20:23

Copilot started work on behalf of otendolkar November 11, 2025 21:10 View session

Copilot finished work on behalf of otendolkar November 11, 2025 21:14

Copilot started work on behalf of otendolkar November 11, 2025 21:20 View session

Copilot AI changed the title ~~Optimize blobFS enumeration by eliminating redundant GetProperties calls~~ Optimize blobFS transfers by eliminating redundant API calls and enabling parallelism Nov 11, 2025

Copilot finished work on behalf of otendolkar November 11, 2025 21:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Optimize blobFS transfers by eliminating redundant API calls and enabling parallelism #3288

Optimize blobFS transfers by eliminating redundant API calls and enabling parallelism #3288

Copilot AI commented Nov 10, 2025 •

edited

Loading

Uh oh!

otendolkar commented Nov 10, 2025

Uh oh!

Copilot AI commented Nov 10, 2025

Uh oh!

otendolkar commented Nov 11, 2025

Uh oh!

Copilot AI commented Nov 11, 2025

Uh oh!

otendolkar commented Nov 11, 2025

Uh oh!

Copilot AI commented Nov 11, 2025

Uh oh!

otendolkar commented Nov 11, 2025

Uh oh!

Copilot AI commented Nov 11, 2025

Uh oh!

otendolkar commented Nov 11, 2025

Uh oh!

Copilot AI commented Nov 11, 2025

Uh oh!

otendolkar commented Nov 11, 2025

Uh oh!

Copilot AI commented Nov 11, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Optimize blobFS transfers by eliminating redundant API calls and enabling parallelism #3288

Are you sure you want to change the base?

Optimize blobFS transfers by eliminating redundant API calls and enabling parallelism #3288

Conversation

Copilot AI commented Nov 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Changes

Type of Change

How Has This Been Tested?

Comments on the Issue (you are @copilot in this section)

Uh oh!

otendolkar commented Nov 10, 2025

Uh oh!

Copilot AI commented Nov 10, 2025

Uh oh!

otendolkar commented Nov 11, 2025

Uh oh!

Copilot AI commented Nov 11, 2025

Uh oh!

otendolkar commented Nov 11, 2025

Uh oh!

Copilot AI commented Nov 11, 2025

Uh oh!

otendolkar commented Nov 11, 2025

Uh oh!

Copilot AI commented Nov 11, 2025

Detailed Explanation of the Parallelism Optimization

The Problem

Current Lock Logic (line 21-24 in ste/folderCreationTracker.go)

Proposed Change

Why This is Safe

Performance Impact

Uh oh!

otendolkar commented Nov 11, 2025

Uh oh!

Copilot AI commented Nov 11, 2025

Uh oh!

otendolkar commented Nov 11, 2025

Uh oh!

Copilot AI commented Nov 11, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Copilot AI commented Nov 10, 2025 •

edited

Loading

Current Lock Logic (line 21-24 in `ste/folderCreationTracker.go`)