Skip to content

Upsert memo support #858

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 12, 2025
Merged

Upsert memo support #858

merged 4 commits into from
May 12, 2025

Conversation

maciejdudko
Copy link
Contributor

What was changed

Added workflow.upsert_memo function to support memo upsert functionality.

Why?

Feature request: temporalio/features#119

Checklist

  1. Closes [Feature Request] Upsert memo support #190

  2. How was this tested:

Modified test_workflow_memo to also test upsert operation.

  1. Any docs updates needed?

@maciejdudko maciejdudko requested a review from a team as a code owner May 8, 2025 20:28
@CLAassistant
Copy link

CLAassistant commented May 8, 2025

CLA assistant check
All committers have signed the CLA.

@@ -218,8 +218,11 @@ def __init__(self, det: WorkflowInstanceDetails) -> None:
self._current_history_length = 0
self._current_history_size = 0
self._continue_as_new_suggested = False
self._raw_memo: Optional[
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is _raw_memo needed here if it's already available in info? Also, we need to make sure we update the info's raw memo info on upsert. Note, we need to make sure we update the same dict, not recreate it (which you can't really anyways because it's a frozen data class).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I thought _info is read-only, that's why I made the other one. I've updated PR to operate on _info.raw_memo directly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 While the top level fields are read-only, we do mutate a couple of maps in there (memo and search attributes)

Comment on lines 1114 to 1121
# Clearing cached value, will be regenerated on next workflow_memo() call.
self._untyped_converted_memo = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should update the existing dict with the updates IMO. A user may have assigned it to a variable. It also doesn't need to go back through the payload conversion process because you can just use the objects directly that you were given by the user. We should also document on memo() that the result can be mutated with the results of upsert.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like an idea of handing over references to mutable state. I'd imagine most code would expect it to not change. Handing over a read-only copy feels safer and less surprising. Additionally, if the user ever modified the returned dict, our internal state would get desynchronized with what's in the command buffer and that smells like all kinds of trouble.

I deliberately round-trip the memo value for two reasons: to keep the result of the untyped memo read consistent regardless of how the memo was obtained (from server vs. from recent upsert), and to allow deserialization with different type hints than the original value (we already offer that ability in current SDK version, somebody somewhere probably depends on it).

Copy link
Member

@cretz cretz May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like an idea of handing over references to mutable state. I'd imagine most code would expect it to not change. Handing over a read-only copy feels safer and less surprising. Additionally, if the user ever modified the returned dict, our internal state would get desynchronized with what's in the command buffer and that smells like all kinds of trouble.

But we already do this with search attributes and this happens in other SDKs. I think it's actually more surprising to users when multiple instances of a workflow's memo are independent. Sure in some languages where immutable collections are more common it can make more sense to copy-on-write, but not here (and again we don't with other mutable collections like search attributes). We do have a type that tells people the map is an immutable view of it, so they shouldn't mutate, but we can't stop them.

I deliberately round-trip the memo value for two reasons: to keep the result of the untyped memo read consistent regardless of how the memo was obtained (from server vs. from recent upsert), and to allow deserialization with different type hints than the original value (we already offer that ability in current SDK version, somebody somewhere probably depends on it).

I think we can keep the behavior of lazily mutating the typed map only if it had been accessed before, but instead let it continue to be lazily created from raw (which we also update here) if it hasn't been accessed. All of the existing functionality for individual memo values by type hint is retained because it works off of raw. Only this one workflow_memo method does this deserialization with default type hints (so often dicts and arrays and such) and that should continue to work.

workflow.memo() should return the same object every time for clarity and consistency.

Comment on lines +1125 to +1127
self._untyped_converted_memo[k] = self._payload_converter.from_payload(
v
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self._untyped_converted_memo[k] = self._payload_converter.from_payload(
v
)
self._untyped_converted_memo[k] = updates[k]

I don't think you should go back through the converter, it's fine if we use the value as the user gave it to us

Copy link
Member

@cretz cretz May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussion, to have the same value whether they have called memo() before or not, we do need to re-convert here. Hopefully people don't use memo() and they use type-hint-based ones instead. Feel free to add comment mentioning that we re-convert because we don't want the addition of memo() earlier in the workflow to affect the this value.

@maciejdudko maciejdudko merged commit 257f143 into temporalio:main May 12, 2025
14 checks passed
@maciejdudko maciejdudko deleted the upsert-memo branch May 12, 2025 21:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request] Upsert memo support
3 participants