Failed orchestrations in ACA don't restart but are stuck #3226

@lilyjma

A customer running Durable Functions with the MSSQL backend in Azure Container Apps (ACA) reported that failed orchestrations are not restarted and do not resume where they left off when the app starts again. Instead, they stay stuck.

Here's what they reported:

- Exception: Cannot access a disposed object. Object name: 'IServiceProvider'.

Stack:
at Microsoft.Extensions.DependencyInjection.ServiceLookup.ThrowHelper.ThrowObjectDisposedException()
at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateScope()
at Microsoft.Azure.Functions.Worker.DefaultFunctionContext.get_InstanceServices()

 This exception happens when the Container App scales in and orchestrator instances are interrupted mid-execution. Normally, for example when a Function App restarts, the orchestrator resumes where it left off once the app is running again. When the Container App scales in, however, the orchestrator throws the exception above. Because the exception is not caught, the orchestrator instance is put into a "failed" state. Failed orchestrators do not resume after startup, and there is no automatic retry mechanism.

 The impact here is that the work that the orchestrator instance was supposed to execute will not be done.
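
To make the reported behavior easier to reason about, here is a toy Python model of it. It is only a sketch under stated assumptions, not the Durable Functions or Durable Task implementation. `Instance`, `run_episode`, `HostDisposedError`, and the `abandon_on_shutdown` flag are all hypothetical names invented for illustration. The model assumes that an interrupted instance can be resumed from its recorded history, while an instance marked failed is terminal and is never picked up again.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "Pending"
    RUNNING = "Running"
    COMPLETED = "Completed"
    FAILED = "Failed"


class HostDisposedError(Exception):
    """Hypothetical stand-in for the ObjectDisposedException seen during scale-in."""


@dataclass
class Instance:
    instance_id: str
    steps_total: int
    history: list = field(default_factory=list)  # durable record of finished steps
    status: Status = Status.PENDING


def run_episode(instance, host_disposed_at=None, abandon_on_shutdown=False):
    """Resume after the recorded history and run the remaining steps."""
    if instance.status in (Status.COMPLETED, Status.FAILED):
        return instance.status  # terminal states are never picked up again
    instance.status = Status.RUNNING
    try:
        for step in range(len(instance.history), instance.steps_total):
            if host_disposed_at is not None and step == host_disposed_at:
                raise HostDisposedError("Cannot access a disposed object.")
            instance.history.append(step)
        instance.status = Status.COMPLETED
    except HostDisposedError:
        if abandon_on_shutdown:
            # Assumed alternative: release the work so a later run can resume it.
            instance.status = Status.PENDING
        else:
            # Reported behavior: the unhandled exception marks the instance failed.
            instance.status = Status.FAILED
    return instance.status


# Reported scenario: scale-in interrupts step 1, then the app starts again.
stuck = Instance("a", steps_total=3)
run_episode(stuck, host_disposed_at=1)
run_episode(stuck)

# Assumed alternative: the interruption is treated as transient.
resumed = Instance("b", steps_total=3)
run_episode(resumed, host_disposed_at=1, abandon_on_shutdown=True)
run_episode(resumed)

print(stuck.status.value, stuck.history)
print(resumed.status.value, resumed.history)
```

In this model the first instance stays failed with only its first step recorded, which mirrors the stuck orchestrations in the report. The second instance finishes all three steps on the next run. Whether the real runtime can tell a host-shutdown exception apart from a genuine orchestrator failure is exactly the open question in this issue.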

Support ticket reference: 2507040050000823

There's an issue in GitHub that seems to be related to this. Based on the comment, perhaps the issue lies in drain mode not being properly implemented for Function apps on ACA.

Labels: Reliability (Durable functions get stuck or don’t run as expected.)
