Improve memory use of rabbit_mgmt_gc #13898

Merged
merged 2 commits into from
May 16, 2025

the-mikedavis
Collaborator

The main change here is to hibernate the rabbit_mgmt_gc gen_server after it completes each GC run. It's an ideal process for hibernation since it wakes up periodically (every 2 minutes by default) to do some work and is then completely idle.

Especially when the broker is mostly idle, this server may not perform enough work to be GC'd naturally, so its process memory use can grow steadily over time. It can reach tens of MB because of the work it does during each GC run: it creates a set out of metadata entities like vhosts, queues and exchanges. For example, on an idle single-node broker with 50k exchanges, rabbit_mgmt_gc can creep up to around 50MB. With the hibernate change it stays at around 1KB between GC runs.
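A minimal sketch of the hibernation pattern described above, not the actual `rabbit_mgmt_gc` code. The module name `periodic_worker`, the `tick` message and the `do_work/0` helper are hypothetical; the only parts taken from OTP are the `hibernate` element in the `gen_server` callback return values and `erlang:send_after/3`.

```erlang
-module(periodic_worker).
-behaviour(gen_server).

-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Hypothetical interval, chosen to match the 2 minute default mentioned above.
-define(INTERVAL_MS, 120000).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

init([]) ->
    _ = erlang:send_after(?INTERVAL_MS, self(), tick),
    %% Hibernate right away: nothing to do until the first timer message.
    {ok, #{}, hibernate}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(tick, State) ->
    ok = do_work(),
    _ = erlang:send_after(?INTERVAL_MS, self(), tick),
    %% `hibernate` makes the process garbage-collect and shrink its heap
    %% before it waits for the next message, so the large temporary
    %% terms built in do_work/0 do not linger while the process is idle.
    {noreply, State, hibernate};
handle_info(_Other, State) ->
    {noreply, State}.

%% Stand-in for the periodic work: builds a large temporary set.
do_work() ->
    Set = sets:from_list(lists:seq(1, 50000), [{version, 2}]),
    _ = sets:size(Set),
    ok.
```

Hibernation has a cost: waking up requires rebuilding the heap, so it suits processes that are idle for long stretches, such as one that runs every couple of minutes.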

I've also updated the sets usage here from gb_sets to sets v2, which is faster and more memory efficient.
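As a rough illustration of what moving from `gb_sets` to `sets` v2 looks like, here is a sketch that builds a v2 set and uses it for membership checks. The module and function names are hypothetical, and treating the set as a lookup structure for finding stale keys is an assumption about how such a set might be used, not a description of the actual patch. The `{version, 2}` option requires OTP 24 or later.

```erlang
-module(sets_v2_sketch).
-export([stale_keys/2]).

%% Returns the keys in TrackedKeys that are absent from LiveNames.
%% With gb_sets this would use gb_sets:from_list/1 and
%% gb_sets:is_element/2 instead.
stale_keys(LiveNames, TrackedKeys) ->
    %% A version 2 set is backed by a map, so construction and
    %% membership tests are cheap.
    Live = sets:from_list(LiveNames, [{version, 2}]),
    [K || K <- TrackedKeys, not sets:is_element(K, Live)].
```

For example, `sets_v2_sketch:stale_keys([a, b], [a, c, b])` returns `[c]`.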

tprof comparison of sets...

    > List = lists:seq(1, 50_000).
    > tprof:profile(sets, from_list, [List, [{version, 2}]], #{type => call_memory}).

    ****** Process <0.94.0>  --  100.00% of total ***
    FUNCTION          CALLS   WORDS   PER CALL  [     %]
    maps:from_keys/2      1  184335  184335.00  [100.00]
                             184335             [ 100.0]
    ok
    > tprof:profile(gb_sets, from_list, [List], #{type => call_memory}).

    ****** Process <0.97.0>  --  100.00% of total ***
    FUNCTION                  CALLS   WORDS   PER CALL  [    %]
    lists:rumergel/3              1       2       2.00  [ 0.00]
    gb_sets:from_ordset/1         1       3       3.00  [ 0.00]
    lists:reverse/2               1  100000  100000.00  [16.76]
    lists:usplit_1/5          49999  100002       2.00  [16.76]
    gb_sets:balance_list_1/2  65535  396605       6.05  [66.48]
                                     596612             [100.0]
To reproduce on main, create or import 50K exchanges and monitor rabbit_mgmt_gc with observer_cli or recon. Its memory usage will gradually climb into the megabytes. Decrease the default time between wake-ups (rabbit_mgmt_gc.erl:23) to see it happen faster.
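If neither observer_cli nor recon is at hand, the same figure can be read with plain OTP calls. The helper below is a hypothetical sketch, and it assumes the process is locally registered under the name `rabbit_mgmt_gc`; `whereis/1` and `erlang:process_info/2` are standard BIFs.

```erlang
-module(proc_mem).
-export([memory_of/1]).

%% Returns the memory use in bytes of the process registered as Name,
%% or `undefined` if no such process is registered or it has exited.
memory_of(Name) when is_atom(Name) ->
    case whereis(Name) of
        undefined ->
            undefined;
        Pid ->
            case erlang:process_info(Pid, memory) of
                {memory, Bytes} -> Bytes;
                undefined -> undefined
            end
    end.
```

Calling `proc_mem:memory_of(rabbit_mgmt_gc)` from a shell on the broker node a few times, a GC interval apart, should show whether the value keeps growing or drops back after each run.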

The `rabbit_mgmt_gc` gen_server performs garbage collections
periodically. When doing so it can create fairly large terms, for
example by creating a set out of `rabbit_exchange:list_names/0`. With
many exchanges the process memory usage can climb steadily, especially
when the management agent is mostly idle, since `rabbit_mgmt_gc` won't
hit enough reductions to cause a full-sweep GC on itself. Since the
process is only active periodically (once every 2min by default) we can
hibernate it to GC the terms it created.

This can save tens of MB of memory in situations where there are very
many pieces of metadata (exchanges, vhosts, queues, etc.). For example,
on an idle single-node broker with 50k exchanges, `rabbit_mgmt_gc` can
hover around 50MB before being naturally GC'd. With this patch the
process memory usage stays at around 1KB between `start_gc` timer
messages.

`sets` v2 was not yet available when this module was written. Compared
to `gb_sets`, v2 `sets` are faster and more memory efficient, as the
`tprof` comparison above shows.
@the-mikedavis the-mikedavis self-assigned this May 16, 2025
@michaelklishin michaelklishin added this to the 4.1.1 milestone May 16, 2025
Collaborator

@michaelklishin michaelklishin left a comment

Great find.

Thank you, @the-mikedavis.

@michaelklishin michaelklishin merged commit a0c6a0b into main May 16, 2025
271 checks passed
@michaelklishin michaelklishin deleted the md/rabbit_mgmt_gc-memory branch May 16, 2025 23:58
michaelklishin added a commit that referenced this pull request May 17, 2025
Improve memory use of `rabbit_mgmt_gc` (backport #13898)
@michaelklishin michaelklishin modified the milestones: 4.1.1, 4.2.0 May 17, 2025
2 participants