Activity ScheduleToClose timeout (Activity complete after timeout) #1912
Comments
Interesting. The deadline calculated on the worker is based on the server start time of the task, because that is how the server uses it. The server will time out the activity after the same period; the worker-side deadline is just a safeguard to keep activities from running forever. The clock starts when the server starts the task, not when the worker starts it. I am not sure we support situations of significant clock skew. I think there is an expectation that server and worker clocks are reasonably accurate and that timeouts are set high enough not to be hit except in rare/failure scenarios. I will confer with the team, but we'd strongly recommend clock accuracy and timeouts high enough to overcome any skew difference.
After conferring with the team, we may be able to make the deadline relative for the start-to-close timeout, but we cannot for schedule-to-close.
We have opened #1926 to warn when SDK and server clocks vary significantly, and we have opened #1930 to make the start-to-close timeout relative to local time instead of server time. We cannot do this for schedule-to-close, because there is no local-time equivalent of when the activity was first scheduled to base the timeout on, so we have to use the server time. (Closing this issue in favor of those two issues, but feel free to keep commenting.)
Expected Behavior
In a three-node cluster using Temporal, the time on each node is not synchronized. The Temporal server is deployed on the main control node, where Service A starts a workflow and sets configurations such as ScheduleToCloseTimeout and other timeout settings. When the task is scheduled to execute on Service B on node B, the local time on node B is later than that on node A. As a result, when Service B executes the task, its local time has already exceeded the timeout. It is expected that no error will be reported.
Actual Behavior
The activity returns the error: Activity complete after timeout.
File: internal/internal_task_handlers.go
I think the code should use context.WithTimeout instead of context.WithDeadline(ctx, info.deadline).
Steps to Reproduce the Problem
1. Use Temporal in a cluster with two or more nodes, with the Temporal server running on the main control node.
2. Configure the task timeout and allow the task to be scheduled for execution on another node.
3. Change the clocks on the hosts in the cluster so that they are unsynchronized, with a time difference that exceeds the configured task timeout.
Specifications