Getting timeouts on publishing many events in parallel #113

@Dylan-DutchAndBold

We are having issues on our production systems where we utilise Rebus with RabbitMQ.

Problematic scenario

A third-party system posts events over HTTP to our service, which takes each one and publishes a corresponding event using Rebus.

The 3rd party system fires around 100 HTTP calls to our system at once, and unfortunately this results in timeout errors from Rebus/RabbitMQ.

This should not be an uncommon scenario.

The exception

[2023-12-08 13:47:20Z] fail: Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware[1]
      An unhandled exception has occurred while executing the request.
System.TimeoutException: The operation has timed out.
   at RabbitMQ.Util.BlockingCell`1.WaitForValue(TimeSpan timeout)
   at RabbitMQ.Client.Impl.SimpleBlockingRpcContinuation.GetReply(TimeSpan timeout)
   at RabbitMQ.Client.Impl.ModelBase.ModelRpc(MethodBase method, ContentHeaderBase header, Byte[] body)
   at RabbitMQ.Client.Framing.Impl.Model._Private_ChannelOpen(String outOfBand)
   at RabbitMQ.Client.Framing.Impl.AutorecoveringConnection.CreateNonRecoveringModel()
   at RabbitMQ.Client.Framing.Impl.AutorecoveringConnection.CreateModel()
   at Rebus.RabbitMq.RabbitMqTransport.CreateChannel()
   at Rebus.Internals.WriterModelPoolPolicy.Create()
   at Rebus.Internals.ModelObjectPool.Get()
   at Rebus.RabbitMq.RabbitMqTransport.SendOutgoingMessages(IEnumerable`1 outgoingMessages, ITransactionContext context)
   at Rebus.Transport.AbstractRebusTransport.<>c__DisplayClass3_1.<<Send>b__1>d.MoveNext()

Sample project for reproduction

We have set up a sample project which reproduces this error. The test scenario needs a little more than 100 simultaneous requests to fail on my local system, so I have set it to 1000. Unfortunately, the failure only occurs in a scenario similar to our production system, meaning in the context of an HTTP call being handled by .NET.
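For readers who do not want to clone the sample project, the burst of simultaneous requests can be approximated with a small load generator. This is only a minimal sketch and not the code from the repository: the `BurstClient` name, the JSON payload, and the target URL in the usage line are assumptions made for illustration.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public static class BurstClient
{
    // Starts `count` POST requests without awaiting in between, then waits for all of them.
    // Returns the number of requests that ended with a success status code.
    public static async Task<int> FireAsync(HttpClient client, Uri target, int count)
    {
        var tasks = Enumerable.Range(0, count).Select(async i =>
        {
            try
            {
                // Hypothetical payload; the real third-party body is not shown in this issue.
                using var body = new StringContent($"{{\"id\":{i}}}", Encoding.UTF8, "application/json");
                using var response = await client.PostAsync(target, body);
                return response.IsSuccessStatusCode;
            }
            catch (Exception e) when (e is HttpRequestException or TaskCanceledException)
            {
                return false; // connection failure or client-side timeout
            }
        }).ToList(); // ToList forces every request to start immediately

        var results = await Task.WhenAll(tasks);
        return results.Count(ok => ok);
    }
}
```

Assuming the service listens on a hypothetical `http://localhost:5000/events`, a call such as `await BurstClient.FireAsync(new HttpClient(), new Uri("http://localhost:5000/events"), 1000)` would fire the burst and report how many requests succeeded.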

We tried to reproduce the error in isolation, outside an HTTP context, but there it does not fail with the timeout. However, these tests still show that publishing 1000 messages in parallel takes a very long time to complete: ~40 seconds with Rebus, compared to ~2 seconds with a similar library (MassTransit).
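The kind of timing comparison described above can be sketched with a small library-agnostic harness. This is an illustration under assumptions, not the measurement code from the sample project: `ParallelTiming` and `MeasureAsync` are hypothetical names, and the publish delegate stands in for whichever bus is being measured.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelTiming
{
    // Starts `count` invocations of `publish` up front and returns the
    // wall-clock time until all of the returned tasks have completed.
    public static async Task<TimeSpan> MeasureAsync(int count, Func<int, Task> publish)
    {
        var stopwatch = Stopwatch.StartNew();

        // ToArray forces every publish call to start before any is awaited.
        var tasks = Enumerable.Range(0, count).Select(publish).ToArray();
        await Task.WhenAll(tasks);

        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

With Rebus, the delegate could be something like `i => bus.Publish(new SomethingHappened(i))`, where `bus` is an `IBus` instance and `SomethingHappened` is a hypothetical event type; the same harness can wrap the equivalent MassTransit call for a side-by-side number.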

https://github.com/Dylan-DutchAndBold/demonstrate-rebus-timeout-issue

Version information

| Software | Version |
| --- | --- |
| Rebus | 9.0.1 |
| Rebus.ServiceProvider | 10.0.0 |
| Rebus.RabbitMq | 9.0.1 |
| RabbitMQ | 3.12.10 |
| .NET | 7 |
