[BUG]: Updating to 2.14.0 lead to high memory and termination of karafka worker #4626
Comments
Putting aside the bloat issue itself, it seems based on what @wahlg told me, that the patches for tracing are applied even when datadog tracing is not in use (and only the internal karafka dd tracing is):
def inspect_class(klass)
  puts "Class: #{klass}"
  # Get included modules
  puts "\nIncluded modules:"
  klass.included_modules.each { |m| puts " #{m}" }
  # Get ancestors (includes superclasses and included modules)
  puts "\nAncestors (superclasses and included modules):"
  klass.ancestors.each { |a| puts " #{a}" }
  # Get the singleton class (eigenclass)
  singleton = klass.singleton_class
  # Get modules extended to the class (included in singleton class)
  puts "\nExtended modules (included in singleton class):"
  singleton.included_modules.each { |m| puts " #{m}" unless m == Class }
  # For prepended modules, we need to check the ancestors
  puts "\nPrepended modules (appear before the class in ancestors):"
  prepended = []
  klass.ancestors.each_with_index do |ancestor, i|
    if ancestor == klass && i > 0
      # All ancestors before this index are prepended
      prepended = klass.ancestors[0...i]
      break
    end
  end
  prepended.each { |m| puts " #{m}" if m.is_a?(Module) && !m.is_a?(Class) }
end
inspect_class(Karafka::Messages::Messages)
inspect_class(Karafka::Instrumentation::Monitor)
results in:
Class: Karafka::Messages::Messages
Included modules:
Karafka::Pro::Cleaner::Messages::Messages
Datadog::Tracing::Contrib::Karafka::MessagesPatch
Enumerable
ActiveSupport::Dependencies::RequireDependency
PP::ObjectMixin
ActiveSupport::ToJsonWithActiveSupportEncoder
ActiveSupport::Tryable
JSON::Ext::Generator::GeneratorMethods::Object
Kernel
Ancestors (superclasses and included modules):
Karafka::Pro::Cleaner::Messages::Messages
Datadog::Tracing::Contrib::Karafka::MessagesPatch
Karafka::Messages::Messages
Enumerable
ActiveSupport::Dependencies::RequireDependency
Object
PP::ObjectMixin
ActiveSupport::ToJsonWithActiveSupportEncoder
ActiveSupport::Tryable
JSON::Ext::Generator::GeneratorMethods::Object
Kernel
BasicObject
Extended modules (included in singleton class):
ActiveSupport::DescendantsTracker::ReloadedClassesFiltering
Zeitwerk::ConstAdded
Module::Concerning
ActiveSupport::Dependencies::RequireDependency
PP::ObjectMixin
ActiveSupport::ToJsonWithActiveSupportEncoder
ActiveSupport::Tryable
JSON::Ext::Generator::GeneratorMethods::Object
Kernel
Prepended modules (appear before the class in ancestors):
Karafka::Pro::Cleaner::Messages::Messages
Datadog::Tracing::Contrib::Karafka::MessagesPatch
Class: Karafka::Instrumentation::Monitor
Included modules:
Datadog::Tracing::Contrib::Karafka::Monitor
ActiveSupport::Dependencies::RequireDependency
PP::ObjectMixin
ActiveSupport::ToJsonWithActiveSupportEncoder
ActiveSupport::Tryable
JSON::Ext::Generator::GeneratorMethods::Object
Kernel
Ancestors (superclasses and included modules):
Datadog::Tracing::Contrib::Karafka::Monitor
Karafka::Instrumentation::Monitor
Karafka::Core::Monitoring::Monitor
ActiveSupport::Dependencies::RequireDependency
Object
PP::ObjectMixin
ActiveSupport::ToJsonWithActiveSupportEncoder
ActiveSupport::Tryable
JSON::Ext::Generator::GeneratorMethods::Object
Kernel
BasicObject
Extended modules (included in singleton class):
ActiveSupport::DescendantsTracker::ReloadedClassesFiltering
Zeitwerk::ConstAdded
Module::Concerning
ActiveSupport::Dependencies::RequireDependency
PP::ObjectMixin
ActiveSupport::ToJsonWithActiveSupportEncoder
ActiveSupport::Tryable
JSON::Ext::Generator::GeneratorMethods::Object
Kernel
Prepended modules (appear before the class in ancestors):
Datadog::Tracing::Contrib::Karafka::Monitor
with:
Datadog::Tracing::Contrib::Karafka::MessagesPatch
Datadog::Tracing::Contrib::Karafka::Monitor
being present. |
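A side note on reading the output above: Module#included_modules also returns prepended modules, which is why the Datadog patches show up under "Included modules" as well. A shorter way to list only what sits in front of a class in its ancestor chain is sketched below. ExamplePatch and ExampleTarget are hypothetical stand-ins, not Karafka or Datadog classes.

```ruby
# Returns the modules prepended to klass: everything that precedes
# klass itself in its ancestor chain.
def prepended_modules(klass)
  klass.ancestors.take_while { |ancestor| ancestor != klass }
end

# Hypothetical stand-ins for a patch module and the class it patches.
module ExamplePatch; end

class ExampleTarget
  prepend ExamplePatch   # lands before ExampleTarget in ancestors
  include Comparable     # lands after ExampleTarget in ancestors
end

puts prepended_modules(ExampleTarget).inspect
```

In a Karafka console the same helper could be called as prepended_modules(Karafka::Messages::Messages), assuming the class is loaded.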
Hey @wahlg, thanks for reporting this. I would like to ask you about auto instrumentation: do you have something like gem 'datadog', require: 'datadog/auto_instrument' in your Gemfile, or just require 'datadog/auto_instrument' somewhere in the code? |
Hi @Strech this is the current entry in the Gemfile:
|
I added the configuration described in this issue to the test app in https://github.com/p-datadog/karafka-test and I do see what appears to be constant growth in memory consumed by the server process, however:
That turns on a lot of functionality and if I, for example, remove profiling from this list the growth in process memory size appears to slow down. With the upstream instrumentation removed I think there is still continuous growth in process memory size but it is very slow, possibly just due to heap fragmentation? |
Karafka has no known memory leaks and I've been running 20-30k msg/s processing to debug other stuff lately. The heap fragmentation may occur but it is very slow and stabilizes after a while (depending on processing nature and payloads, etc). Though as far as I know, the thing is, with updated (patched?) dd things are much worse. |
My recent run at 10-20k msg/s increased memory usage by 9.5MB over a period of 5 hours. P.S. I am by no means saying it is not Karafka related, of course, and if it is, I will fix it asap. |
I do not work on tracing part of the library and could be wrong, but perhaps the issue is that https://github.com/DataDog/dd-trace-rb/blob/master/lib/datadog/tracing/contrib/karafka/monitor.rb#L21 uses the block form of |
For some more context, the task where we are running karafka has 2GB of available memory. Prior to upgrading the datadog gem, when running karafka in production with all the other datadog instrumentation enabled (profiling, tracing of other dependencies, etc) the memory usage hovers around 40%. On the datadog gem version 2.14.0, if I enable karafka tracing through karafka.rb, memory grows linearly from the moment the task starts until it reaches 100% and is ultimately killed due to out-of-memory. |
This requests If you do not wish to instrument everything by default, you can remove the I ran the above code inspecting included modules and it does not show
And shows the output provided above with Datadog modules added to Karafka with
(The bundler require is needed due to karafka/karafka#2590 and I had to manually bypass the Rails-delayed activation with
Since it looks like karafka is bringing in Rails? |
@p-datadog that makes sense. Is there a way to enable auto-instrumentation but exclude specific items? Or is there a way to list everything that is enabled via auto-instrumentation that might not be explicitly configured in the datadog initializer? I just want to make sure I don't inadvertently disable something that was previously enabled. |
Currently with the datadog auto-instrumentation enabled, and the karafka.rb config for datadog traces disabled, we are seeing normal memory usage and traces being recorded for karafka consumer processes. So I think we are in good shape now. Thanks all for the help on this issue |
@p-datadog in that case, WDYT about helping me figure out how to detect that DD Karafka is in use, so I can issue a warning to users when they use both? |
@mensfeld You can use the Karafka patcher interface to check whether it was patched:
Datadog::Tracing::Contrib::Karafka::Patcher.patched? # => true/false
@wahlg We already have some discussion around auto-instrument and configuration compatibility. But technically, you should not experience high memory usage in the first place, no matter how you instrument. I think we will launch an investigation into this contrib (but I can't guarantee its priority). Glad that the issue is resolved, but I would keep the ticket open in order to track it. |
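For illustration, the check above could be wrapped roughly as follows. This is only a sketch: Patcher.patched? is the call named in the comment, while datadog_karafka_patched?, warn_on_double_tracing and the karafka_listener_in_use argument are hypothetical names, and how Karafka would know its own Datadog listener is subscribed is left to the caller.

```ruby
# Returns true only when the Datadog Karafka patcher constant is loaded and
# reports that it has patched Karafka. Returns false when the datadog gem
# (or this contrib) is not loaded at all.
def datadog_karafka_patched?
  return false unless defined?(::Datadog::Tracing::Contrib::Karafka::Patcher)

  patcher = ::Datadog::Tracing::Contrib::Karafka::Patcher
  !!(patcher.respond_to?(:patched?) && patcher.patched?)
end

# Hypothetical helper: warns when both integrations appear to be active.
# karafka_listener_in_use is supplied by the caller (an assumption here).
# Returns true if a warning was written, false otherwise.
def warn_on_double_tracing(karafka_listener_in_use, io: $stderr)
  return false unless karafka_listener_in_use && datadog_karafka_patched?

  io.puts(
    "Both the Datadog Karafka integration and the Karafka Datadog listener " \
    "appear to be active; consider using only one of them."
  )
  true
end
```

The defined? guard keeps the check safe in applications that do not bundle the datadog gem.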
@mensfeld What about storing the span created in https://github.com/karafka/karafka/blob/master/lib/karafka/instrumentation/vendors/datadog/logger_listener.rb#L51 in thread/fiber-local storage so that it can be finished later? Would that work? |
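A minimal sketch of that idea, not Karafka's or Datadog's actual code: ExampleSpan, SpanStore, on_work_started and on_work_finished are all hypothetical names, and the real listener would store whatever span object the tracer returns.

```ruby
# Hypothetical stand-in for a tracer span.
class ExampleSpan
  attr_reader :name

  def initialize(name)
    @name = name
    @finished = false
  end

  def finish
    @finished = true
  end

  def finished?
    @finished
  end
end

# Keeps a stack of open spans in fiber-local storage, so the handler that
# starts a span and the handler that finishes it need not share a variable.
module SpanStore
  KEY = :example_open_spans

  def self.push(span)
    (Thread.current[KEY] ||= []) << span
    span
  end

  # Returns the most recently pushed span, or nil if none is open.
  def self.pop
    stack = Thread.current[KEY]
    stack&.pop
  end
end

# Called when a unit of work begins: open a span and remember it.
def on_work_started(name)
  SpanStore.push(ExampleSpan.new(name))
end

# Called when the work ends: finish the remembered span, if any.
def on_work_finished
  span = SpanStore.pop
  span&.finish
  span
end
```

Note that Thread.current[] is fiber-local in Ruby; Thread#thread_variable_get and Thread#thread_variable_set would be the thread-local alternative. Which one fits depends on whether the start and finish events are guaranteed to fire on the same fiber.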
Tracer Version(s)
2.14.0
Ruby Version(s)
ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +YJIT +PRISM [arm64-darwin24]
Relevant Library and Version(s)
No response
Bug Report
We have a Rails app that uses the karafka gem for kafka message consumption.
We have been using the built-in karafka tracing described here
We recently upgraded our datadog tracing version to 2.14.0. We did NOT enable the karafka tracing that this provides (i.e. we did NOT add
c.tracing.instrument :karafka
to our datadog initializer). However, after deploying the app, we started to see very high memory usage in the container running karafka, which ultimately led to the container being repeatedly killed for running out of memory. We also saw traces recorded by the worker lasting upwards of 90 minutes.
Once we disabled the karafka instrumentation for traces above, memory usage returned to normal. However, we now have no traces, and were not expecting this datadog upgrade to be a breaking change for our application.
Is there a way we can disable any instrumentation that the datadog gem is providing, and just use the tracing packaged with the karafka gem?
Thanks
Reproduction Code
Add this to karafka.rb, without adding any karafka tracing in Datadog.configure
Configuration Block
Error Logs
No response
Operating System
No response
How does Datadog help you?
No response