
activity Schedule To Close timeout (Activity complete after timeout) #1912


Closed
Pharaohsk opened this issue Apr 13, 2025 · 3 comments
Labels: bug (Something isn't working)

Comments

@Pharaohsk

Expected Behavior

In a three-node cluster running Temporal, the clocks on the nodes are not synchronized. The Temporal server is deployed on the main control node, where Service A starts a workflow and configures timeouts such as ScheduleToCloseTimeout. When the activity is scheduled to execute on Service B on node B, whose local clock is ahead of node A's, Service B's local time has already passed the computed deadline by the time it runs the task. The expectation is that no error is reported in this case.
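
For reference, here is a minimal sketch of the kind of configuration involved, using the public Go SDK API; the workflow and activity names are made up for illustration and are not the reporter's actual code:

    package sample

    import (
        "time"

        "go.temporal.io/sdk/workflow"
    )

    // SampleWorkflow is a hypothetical stand-in for the workflow started by
    // Service A; only the timeout fields matter for this report.
    func SampleWorkflow(ctx workflow.Context) error {
        ao := workflow.ActivityOptions{
            // Measured from when the server first schedules the activity,
            // on the server's clock.
            ScheduleToCloseTimeout: 30 * time.Second,
            // Measured from when an individual attempt starts.
            StartToCloseTimeout: 10 * time.Second,
        }
        ctx = workflow.WithActivityOptions(ctx, ao)

        var result string
        return workflow.ExecuteActivity(ctx, "SampleActivity").Get(ctx, &result)
    }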

Actual Behavior

The activity returns an error: "Activity complete after timeout."
File: internal/internal_task_handlers.go

	info := getActivityEnv(ctx)
	ctx, dlCancelFunc := context.WithDeadline(ctx, info.deadline)
	defer dlCancelFunc()

	output, err := activityImplementation.Execute(ctx, t.Input)
	// Check if context canceled at a higher level before we cancel it ourselves
	isActivityCancel := ctx.Err() == context.Canceled

	dlCancelFunc()
	if <-ctx.Done(); ctx.Err() == context.DeadlineExceeded {
		ath.logger.Info("Activity complete after timeout.",
			tagWorkflowID, t.WorkflowExecution.GetWorkflowId(),
			tagRunID, t.WorkflowExecution.GetRunId(),
			tagActivityType, activityType,
			tagAttempt, t.Attempt,
			tagResult, output,
			tagError, err,
		)
		return nil, ctx.Err()
	}

I think the code should use context.WithTimeout instead of context.WithDeadline(ctx, info.deadline).
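
To make the distinction concrete: context.WithTimeout is documented as shorthand for WithDeadline(parent, time.Now().Add(timeout)), so it anchors the deadline to the local clock at call time. A minimal sketch, with illustrative names:

    package sample

    import (
        "context"
        "time"
    )

    // buildContexts contrasts the two constructors; serverStartTime and
    // timeout are hypothetical inputs.
    func buildContexts(parent context.Context, serverStartTime time.Time, timeout time.Duration) {
        // WithDeadline pins an absolute instant derived from the server's
        // clock; local clock skew can put that instant in the past.
        ctxAbs, cancelAbs := context.WithDeadline(parent, serverStartTime.Add(timeout))
        defer cancelAbs()

        // WithTimeout anchors the deadline to the worker's local clock, so it
        // is immune to cross-host skew, but it no longer accounts for time the
        // task already spent queued on the server.
        ctxRel, cancelRel := context.WithTimeout(parent, timeout)
        defer cancelRel()

        _, _ = ctxAbs, ctxRel
    }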

Steps to Reproduce the Problem

1. Use Temporal in a cluster with two or more nodes, with the Temporal server running on the main control node.
2. Configure the activity timeouts and allow the task to be scheduled for execution on another node.
3. Skew the clocks on the hosts in the cluster so that the time difference exceeds the configured timeout.

Specifications

  • Version: v1.22.0
  • Platform: Go 1.18, Linux
@Pharaohsk added the bug label Apr 13, 2025
@cretz (Member) commented Apr 14, 2025

Interesting. The deadline calculated on the worker is based on the server's start time for the task, because that is how the server uses it. The server will time out the activity after the same period; the worker-side deadline just keeps activities from running forever. The clock starts when the server starts the task, not when the worker starts it. I am not sure we support situations of significant clock skew. The expectation is that server and worker clocks are reasonably accurate and that timeouts are set high enough to be hit only in rare/failure scenarios.
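
A toy reproduction of the arithmetic, with made-up numbers: if the worker's clock runs 60 seconds ahead of the server's and the timeout is 30 seconds, the absolute deadline is already 30 seconds in the local past when the context is created.

    package main

    import (
        "context"
        "fmt"
        "time"
    )

    func main() {
        // From the skewed worker's point of view, the server started the task
        // 60 seconds "ago" even though it only just arrived.
        serverStart := time.Now().Add(-60 * time.Second)
        deadline := serverStart.Add(30 * time.Second) // 30s in the local past

        ctx, cancel := context.WithDeadline(context.Background(), deadline)
        defer cancel()

        <-ctx.Done()
        fmt.Println(ctx.Err()) // context.DeadlineExceeded before the activity runs
    }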

I will confer with the team, but we would strongly recommend keeping clocks accurate and setting timeouts high enough to absorb any skew.

@cretz (Member) commented Apr 18, 2025

After conferring with the team: we may be able to make the deadline relative for the start-to-close timeout, but we cannot do so for schedule-to-close.

@cretz (Member) commented Apr 23, 2025

We have opened #1926 to warn when SDK and server clocks differ significantly, and #1930 to make the start-to-close timeout relative to local time instead of server time. We cannot do the same for schedule-to-close, because there is no local-time equivalent of the moment the activity was first scheduled to base the timeout on, so we have to use the server time.
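
A sketch of why the two timeouts differ here (the function and parameter names are illustrative, not the SDK's internals): the worker itself observes when an attempt starts, so start-to-close can be anchored to the local clock, while scheduling happens only on the server.

    package sample

    import "time"

    func deadlines(serverScheduledTime time.Time, startToClose, scheduleToClose time.Duration) (time.Time, time.Time) {
        // Start-to-close: the worker sees the attempt begin, so it can anchor
        // the deadline to its own clock (the approach proposed in #1930).
        startToCloseDeadline := time.Now().Add(startToClose)

        // Schedule-to-close: the activity was scheduled on the server, possibly
        // before any worker saw it, so the server's timestamp is the only anchor.
        scheduleToCloseDeadline := serverScheduledTime.Add(scheduleToClose)

        return startToCloseDeadline, scheduleToCloseDeadline
    }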

(closing issue in favor of those two issues, but feel free to keep commenting)
