Write support for blob handles with pending payload #24458

ChumpChief · 2025-04-25T00:58:29Z

Ports the remainder of the prototype in #24158, which is write support for blob handles with pending payload.

Mostly unchanged as compared to the prototype, other than updating for the renames taken in #24320.

Copilot

Pull Request Overview

This PR ports write support for blob handles with pending payload by adding a new flag (createBlobPayloadPending) and updating tests, API signatures, and internal handling of blob payload states. Key changes include:

Updating test files to run with and without pending payload support.
Adding new type guards and API exports (e.g. isFluidHandlePayloadPending) to support pending payload states.
Updating blob manager logic to handle payload state transitions (local, shared, failed).

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.


File	Description
packages/test/test-end-to-end-tests/src/test/blobsisAttached.spec.ts	Refactored imports and wrapped tests in a loop for createBlobPayloadPending variations.
packages/test/test-end-to-end-tests/src/test/blobs.spec.ts	Updated test container config and test descriptions to pass the new flag.
packages/runtime/runtime-utils/*	Export and add new type guard for handling pending blob payloads.
packages/runtime/container-runtime/src/*	Updated blob manager methods and tests to support blob handles with pending payloads.
packages/common/core-interfaces/*	Added new interfaces and types for the pending payload feature.

packages/runtime/container-runtime/src/test/blobManager.spec.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

anthony-murphy · 2025-04-25T16:51:38Z

packages/common/core-interfaces/src/handles.ts

+ * @legacy
+ * @alpha
+ */
+export type PayloadState = "local" | "shared" | "pending" | "failed";


i'm trying to think through the usage here. what are the supported state transitions? how do users reason about them? what is the experience for local clients vs remote clients? are they symmetrical? should they be, or do clients need custom code for local and remote?

i think this would be easier to review if the implementation were split from api, as the api will likely have significantly more comments than the internals. getting the internals in would also unblock re-enabling testing with staging mode.

i think we should remove 'local' at this layer and only have "persisted" | "pending" | "failed". anything related to local should be moved to a blob handle type, so that local information is only available on the strongly typed local object returned from the call to upload.

For visibility - had a chat with Tony on the side, we have a bit better understanding of each other now.

Re: the question above, the states are not symmetrical for local/remote nor should they be, since the states are wholly different between the two (both in what can be detected and what reaction should be taken), with the exception of "shared" state which is where they reach parity. I've updated the documentation a bit to clarify as well.

In our conversation Tony suggested splitting the interface instead, such that local/remote have different types. I'm thinking more on this option or if there's any preferable alternative here.

Also just realized that you can use mermaid diagrams in github comments, so including the current state diagram here too:

---
title: Handles with pending payload lifecycle
---
flowchart LR
    subgraph local["local only"]
        direction LR
        local_state["local"]
        failed_state["failed (emit 'failed')"]
        local_state -- "fail publish" --> failed_state
    end
    subgraph remote["remote only"]
        direction LR
        pending_state["pending"]
    end
    shared_state["shared (emit 'shared')"]
    local ~~~ shared_state
    shared_state ~~~ remote
    local_state -- "publish" --> shared_state
    pending_state -- "observe publish" --> shared_state
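For readers who prefer code to diagrams, the transitions in the diagram can be written as a small lookup table. This is only a sketch of the diagram as posted at this point in the review; the type and function names are hypothetical and it is not the BlobManager implementation.

```typescript
// Hypothetical names; this mirrors the diagram, not the shipped code.
type PayloadStateSketch = "local" | "pending" | "shared" | "failed";

// Allowed transitions, keyed by the current state.
const transitions: Record<PayloadStateSketch, readonly PayloadStateSketch[]> = {
	local: ["shared", "failed"], // local client only: "publish" / "fail publish"
	pending: ["shared"], // remote client only: "observe publish"
	shared: [], // no outgoing edge in the diagram
	failed: [], // no outgoing edge in the diagram
};

function canTransition(from: PayloadStateSketch, to: PayloadStateSketch): boolean {
	return transitions[from].indexOf(to) >= 0;
}
```

Written this way, the asymmetry is visible at a glance: "failed" is reachable only from "local", and "pending" has a single exit to "shared".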


i understand the remote case a bit better now, but i'm still not sure the api as-is is ergonomic for the remote case

Re: splitting the interface, I'm struggling to find a case where it improves the customer experience.

When creating a BlobHandle via uploadBlob(), the customer immediately gets the handle back in local state (and it will not exit this state until they elect to attach the handle, giving them plenty of time to register listeners). They need the full interface except "pending" state, since they likely want to observe all of the possible transitions and states through the upload process. It's true we have type knowledge here that the handle will never enter "pending" state, but at the same time the customer doesn't need to do anything special to avoid the "pending" state; it doesn't get in their way.

const myBlobHandle = containerRuntime.uploadBlob(blob);
const onFailed = (error) => {
    /* Some remedy here, e.g. delete the handle */
    myBlobHandle.events.off("failed", onFailed);
    myBlobHandle.events.off("shared", onShared);
};
const onShared = () => {
    /* Here maybe clear a "might lose data" flag to warn users before closing the tab */
    myBlobHandle.events.off("failed", onFailed);
    myBlobHandle.events.off("shared", onShared);
};
myBlobHandle.events.on("failed", onFailed);
myBlobHandle.events.on("shared", onShared);
/* Here maybe set a "might lose data" flag to warn users before closing the tab */
myMap.set("someKey", myBlobHandle);

To acquire a RemoteFluidObjectHandle, the customer is likely pulling it from a location where it might be mixed in with BlobHandles in normal app operation. For example, if I get an IFluidHandle out of a SharedMap it is probably dangerous to assume whether that IFluidHandle is actually a BlobHandle or RemoteFluidObjectHandle. The customer can't ignore that both are possibilities with whatever logic they're applying. However, I don't think this is a burden for them. If we assume they are doing their local failure watching at the point of calling uploadBlob as suggested above, then this portion is probably just looking to run the remote cleanup heuristic.

// Here in the code we don't know whether the handle is sourced locally or remote
const someMaybeRemoteHandle = myMap.get<IFluidHandle>("someOtherKey")!;
// ...So we just check for "pending", e.g. rather than !"shared" - these are the only ones we want to run a heuristic on
if (someMaybeRemoteHandle.payloadState === "pending") {
    const shouldGiveUpP = new Promise<boolean>((resolve) => {
        const onShared = () => {
            resolve(false);
            someMaybeRemoteHandle.events.off("shared", onShared);
        };
        someMaybeRemoteHandle.events.on("shared", onShared);
        setTimeout(() => {
            resolve(true);
            someMaybeRemoteHandle.events.off("shared", onShared);
        }, someDelayMs);
    });
    shouldGiveUpP
        .then((shouldCleanUp) => { /* if true, do some cleanup */ })
        .catch((error) => { /* ... */ });
}

These are obviously just examples, but I don't see where having a narrower interface necessarily simplifies. If there are other better examples would be glad to take a look though.
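The remote heuristic above could also be factored into a reusable helper. The sketch below assumes a minimal handle shape (`payloadState` plus `events.on`/`events.off` for "shared") taken from the snippet in this comment; the interface and function names are hypothetical. Unlike the snippet, it resolves `true` when the payload is shared and `false` on timeout, and it clears the timer once the event fires.

```typescript
// Minimal stand-in for the handle surface used in the snippet above (hypothetical shape).
interface PendingPayloadHandleSketch {
	readonly payloadState: "pending" | "shared";
	readonly events: {
		on(event: "shared", listener: () => void): void;
		off(event: "shared", listener: () => void): void;
	};
}

/**
 * Resolves true if the payload is already shared or becomes shared within timeoutMs,
 * and false if the timeout elapses first.
 */
function waitForShared(handle: PendingPayloadHandleSketch, timeoutMs: number): Promise<boolean> {
	if (handle.payloadState === "shared") {
		return Promise.resolve(true);
	}
	return new Promise<boolean>((resolve) => {
		const onShared = (): void => {
			clearTimeout(timer); // avoid leaving a timer running after success
			handle.events.off("shared", onShared);
			resolve(true);
		};
		const timer = setTimeout(() => {
			handle.events.off("shared", onShared);
			resolve(false);
		}, timeoutMs);
		handle.events.on("shared", onShared);
	});
}
```

A caller would then write something like `waitForShared(handle, someDelayMs).then((shared) => { if (!shared) { /* cleanup */ } })`.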

Sorry I just realized I had missed your tsplayground - responses inline here

based on the responses i feel pretty strongly we should split the local and remote experiences. it's very confusing that some states and events apply only to local or only to remote. I can see customers writing code where they wait for failure on remote handles, which is impossible, but nothing in the interface or code will prevent it. I'm of the opinion that when possible, it should be clear from the code itself how to use an api, and it seems quite achievable here with little change.

for the local api, i would still push for a combined event which covers both shared and failed states, to avoid the need to write code that crosses event handlers, as that type of code gets very complex, especially when you take in further requirements like tracking close/dispose to give up.
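To illustrate what a split might buy at the type level, here is a rough sketch using a discriminated union. Everything in it is an assumption made for illustration: the `kind` discriminant, the interface names, and the property names are hypothetical and are not the types that were eventually merged.

```typescript
// All names are hypothetical; this only illustrates the "split local from remote" idea.
interface LocalHandleSketch {
	readonly kind: "local";
	readonly shareState: "local" | "shared" | "failed";
}
interface RemoteHandleSketch {
	readonly kind: "remote";
	readonly payloadState: "pending" | "shared";
}
type BlobHandleSketch = LocalHandleSketch | RemoteHandleSketch;

function nextStep(handle: BlobHandleSketch): string {
	if (handle.kind === "remote") {
		// Remote handles have no "failed" state, so comparing payloadState to "failed"
		// here would be rejected by the compiler.
		return handle.payloadState === "pending" ? "run the timeout heuristic" : "nothing to do";
	}
	if (handle.shareState === "failed") {
		return "retry or delete the handle";
	}
	return handle.shareState === "local" ? "attach and wait for the share outcome" : "nothing to do";
}
```

A handle pulled out of a map of mixed handles can still be processed by one function, but waiting for failure on a remote handle no longer type-checks.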

I'm also curious what @yann-achard-MS and @znewton think about this, as they will be helping partners onboard to these apis and their extensions.

I would also consider punting on the remote half of the api, since it's not hooked up yet, so there's no need to commit to that api yet anyway.

anthony-murphy · 2025-04-25T20:56:17Z

packages/common/core-interfaces/src/handles.ts

+	(event: "shared", listener: () => void);
+	/**
+	 * Emitted for locally created handles when the payload fails sharing to remote collaborators.
+	 */
+	(event: "failed", listener: (error: unknown) => void);


i think we should remove both of these, and just have a single event for progress. Maybe something like:
(event: "progress", listener: (progress: {type:"update"} | {type:"persisted"} | {type:"failed", error?: Error}) => void)

this ensures the event is applicable to all clients, and it lets the handle implementation short-circuit the event if someone registers while the state is already terminal (persisted or failed). we do this for a number of events like connected, as we saw users frequently building dangling event handles that never resolved as the connection state was already connected.
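To make the proposal concrete, here is a minimal sketch of a single "progress" event with a discriminated-union payload, including the replay-on-register behavior described above. The class and type names are hypothetical, and the synchronous replay is exactly the part that is contested later in this thread, so treat it as an illustration of the idea rather than a recommended design.

```typescript
// Hypothetical sketch of the proposed single "progress" event.
type ProgressSketch =
	| { type: "update" }
	| { type: "persisted" }
	| { type: "failed"; error?: Error };

class ProgressSourceSketch {
	private terminal: ProgressSketch | undefined;
	private readonly listeners: Array<(progress: ProgressSketch) => void> = [];

	public on(listener: (progress: ProgressSketch) => void): void {
		if (this.terminal !== undefined) {
			// Already settled: replay the terminal result instead of leaving the listener dangling.
			listener(this.terminal);
			return;
		}
		this.listeners.push(listener);
	}

	public emit(progress: ProgressSketch): void {
		if (this.terminal !== undefined) {
			return; // ignore anything emitted after a terminal state
		}
		const isTerminal = progress.type !== "update";
		if (isTerminal) {
			this.terminal = progress;
		}
		for (const listener of this.listeners.slice()) {
			listener(progress);
		}
		if (isTerminal) {
			this.listeners.length = 0; // nothing further will be delivered
		}
	}
}
```

A consumer would branch on `progress.type` inside one handler, which is the ergonomics trade-off debated in the replies that follow.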

something like progress is probably optional for now, but someday i could see us using signals in the blob manager to broadcast progress updates.

I hesitate to merge since the customer reaction to a failure likely has no overlap with their reaction to successful completion. In general it's more ergonomic for a customer to register specifically for the event they're interested in observing (rather than forcing them to conditionalize within their handler).

The short-circuiting behavior you mention is an antipattern, we removed many of those over time (e.g. #14371, not sure if we got them all though). See #14272 (comment) and other comments in that PR for more context.

i find it more ergonomic to only need a single event when both events must always be considered, as i don't think there is ever a case where shared or failed are useful separately, so the current design forces a complicated multi event pattern. if there is a single event pattern, combining also doesn't make that more difficult.

packages/runtime/container-runtime/src/blobManager/blobManager.ts

anthony-murphy · 2025-04-30T17:56:06Z

packages/runtime/container-runtime/src/blobManager/blobManager.ts

+		return (this._events ??= createEmitter<IFluidHandlePayloadPendingEvents>());
+	}
+
+	private _state: PayloadState = "local";


what will this do on a remote client right now?

Nothing since remote clients don't instantiate BlobHandles (they will only have RemoteFluidObjectHandles). When we add support in RemoteFluidObjectHandles they'll have their own corresponding state/events for "pending" -> "shared".

so isFluidHandlePayloadPending will always return false for a remote blob handle?

Not in the future but yes for this PR (since we haven't implemented observability for remote handles yet). It will always return true when the payload status is inspectable/observable.

Once observability is moved to IFluidHandle then the typeguard isn't needed anymore and can just be removed.
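As a rough illustration of how such a type guard can work, the sketch below checks for an observable payload state on a simplified handle. The interfaces are hypothetical stand-ins, not the real core-interfaces types, and the function is not the actual `isFluidHandlePayloadPending` implementation.

```typescript
// Simplified, hypothetical stand-ins for the real handle interfaces.
interface HandleSketch {
	readonly isAttached: boolean;
}
interface PayloadObservableHandleSketch extends HandleSketch {
	readonly payloadState: "pending" | "shared";
}

// Returns true only when the handle exposes an inspectable payload state.
function hasObservablePayloadState(handle: HandleSketch): handle is PayloadObservableHandleSketch {
	const state = (handle as Partial<PayloadObservableHandleSketch>).payloadState;
	return state === "pending" || state === "shared";
}
```

With a guard like this, a handle that does not expose payload state simply fails the check, which matches the "always false for remote handles in this PR" behavior described above.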

github-actions · 2025-05-08T20:24:07Z

🔗 No broken links found! ✅

Your attention to detail is admirable.

linkcheck output


> fluid-framework-docs-site@0.0.0 ci:check-links /home/runner/work/FluidFramework/FluidFramework/docs
> start-server-and-test "npm run serve -- --no-open" 3000 check-links

1: starting server using command "npm run serve -- --no-open"
and when url "[ 'http://127.0.0.1:3000' ]" is responding with HTTP status code 200
running tests using command "npm run check-links"


> fluid-framework-docs-site@0.0.0 serve
> docusaurus serve --no-open

[SUCCESS] Serving "build" directory at: http://localhost:3000/

> fluid-framework-docs-site@0.0.0 check-links
> linkcheck http://localhost:3000 --skip-file skipped-urls.txt

Crawling...

Stats:
  195689 links
    1565 destination URLs
    1797 URLs ignored
       0 warnings
       0 errors

ChumpChief added 5 commits April 24, 2025 14:20

Port core-interfaces portion

34eac3e

Port runtime-utils portion

f715d3a

Port BlobManager changes

6f7beca

Fixup tests

6510b24

Add end-to-end-tests

5e8872d

Copilot AI review requested due to automatic review settings April 25, 2025 00:58

ChumpChief requested a review from a team as a code owner April 25, 2025 00:58

github-actions bot added the base: main, area: runtime, area: tests, and public api change labels Apr 25, 2025

Merge branch 'main' into PendingPayloadWrite

854d288

Copilot AI reviewed Apr 25, 2025

packages/runtime/container-runtime/src/test/blobManager.spec.ts (outdated, resolved)

Update packages/runtime/container-runtime/src/test/blobManager.spec.ts

aff9689

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

ChumpChief requested review from anthony-murphy, markfields, znewton, yann-achard-MS and dannimad April 25, 2025 01:44

anthony-murphy reviewed Apr 25, 2025

packages/runtime/container-runtime/src/blobManager/blobManager.ts (outdated, resolved)

ChumpChief added 6 commits April 28, 2025 09:22

Comment clarification

9770df6

Switch handles over to new event emitter

142aebb

Convert BlobManager to new event emitter

b280f92

Defer instantiation of event emitter

dff0849

Merge remote-tracking branch 'upstream/main' into PendingPayloadWrite

b100d89

Fix tests

29a334c

anthony-murphy reviewed Apr 30, 2025

packages/runtime/container-runtime/src/blobManager/blobManager.ts (resolved)

anthony-murphy reviewed Apr 30, 2025


ChumpChief added 5 commits May 1, 2025 13:28

Merge remote-tracking branch 'upstream/main' into PendingPayloadWrite

9d24881

Merge remote-tracking branch 'upstream/main' into PendingPayloadWrite

a93261f

Split local payload state out from remote payload state

516f2d7

Merge remote-tracking branch 'upstream/main' into PendingPayloadWrite

0c78871

Settle on final names where possible

7021f99

anthony-murphy approved these changes May 8, 2025


ChumpChief merged commit 9416ca9 into microsoft:main May 8, 2025
37 checks passed

ChumpChief deleted the PendingPayloadWrite branch May 8, 2025 22:12
