Introduce a few GC controls to limit the heap size when running benchmarks #58487

Open: d-netto wants to merge 1 commit into master

d-netto (Member) commented May 21, 2025

We will benefit from having more control over Julia's heap size when benchmarking MMTk.

This PR introduces two heap-limit flags:

  • --hard-heap-limit: Set a hard limit on the heap size: if we ever go above this limit, we will abort.
  • --upper-bound-for-heap-target-increment: Set an upper bound on how much the heap target can increase between consecutive collections.

Note that they are behind a GC_ENABLE_HIDDEN_CTRLS build-time flag, so these options won't be available for most Julia users.

It may be a bit tricky to test this, given that the flags are only enabled if you define GC_ENABLE_HIDDEN_CTRLS.
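
As a rough illustration of the gating described above, a hard-limit check that is compiled in only when `GC_ENABLE_HIDDEN_CTRLS` is defined could look like the sketch below. Only the macro name comes from this description; `hard_heap_limit`, `exceeds_hard_limit`, and `check_hard_heap_limit` are hypothetical names, and the actual PR code may be structured differently.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch: these identifiers are illustrative, not Julia's. */
#ifdef GC_ENABLE_HIDDEN_CTRLS
static uint64_t hard_heap_limit = 0; /* 0 means "no hard limit set" */
#endif

/* Pure predicate: true when a limit is set and the heap has gone above it. */
int exceeds_hard_limit(uint64_t heap_size, uint64_t hard_limit)
{
    return hard_limit != 0 && heap_size > hard_limit;
}

/* Aborts the process when the heap is above the hard limit.
 * Without GC_ENABLE_HIDDEN_CTRLS this compiles to a no-op. */
void check_hard_heap_limit(uint64_t heap_size)
{
#ifdef GC_ENABLE_HIDDEN_CTRLS
    if (exceeds_hard_limit(heap_size, hard_heap_limit)) {
        fprintf(stderr, "heap size exceeded the hard heap limit\n");
        abort();
    }
#else
    (void)heap_size; /* hidden controls are not built in */
#endif
}
```

Keeping the comparison in a separate pure function makes the limit logic testable even in builds where the flag is not defined.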

d-netto added the GC, needs tests, and GC: MMTK labels May 21, 2025
qinsoon (Contributor) commented May 22, 2025

What we discussed:

  • Be able to run Julia with a fixed heap size: it means there will be no GC triggered before reaching the fixed heap size, the heap size won't get resized in any way, the heap will never grow larger than the given heap size (the process aborts if out-of-memory). This way, you can hold the heap size constant and measure how performance and utilization change with other factors (such as the GC algorithm). This is not intended for production workloads, but it is generally a very useful way to measure GC performance. It is also useful for determining how much memory a workload needs.
  • Be able to run Julia GC on every X bytes/KB/MB allocation. The heap size may change and may grow, but GCs are guaranteed to trigger at a frequency determined by the allocation volume. This is useful in some cases, but overall it is less important than the fixed heap size option.

It is unclear to me how these two flags would deliver the above behaviors.

d-netto (Member, Author) commented May 22, 2025

> it means there will be no GC triggered before reaching the fixed heap size, the heap size won't get resized in any way, the heap will never grow larger than the given heap size (the process aborts if out-of-memory)

--hard-heap-limit seems to be doing precisely this. Could you clarify your comment?

> Be able to run Julia GC on every X bytes/KB/MB allocation

Right now, we are letting MemBalancer compute the heap target for the next GC, and if we see that the heap size increment is greater than --upper-bound-for-heap-target-increment, then we cap it.

Should be a fairly easy change to make the heap size increment exactly --upper-bound-for-heap-target-increment, though.
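
To make the two variants concrete, here is a minimal sketch, assuming the increment is measured from the previous heap target to the newly proposed one. The function names are hypothetical and do not come from the PR; `proposed_target` stands in for whatever MemBalancer computes.

```c
#include <stdint.h>

/* Hypothetical helper: cap how far the heap target may grow in one step.
 * A proposal that does not grow the target is passed through unchanged. */
uint64_t next_heap_target_capped(uint64_t current_target,
                                 uint64_t proposed_target,
                                 uint64_t max_increment)
{
    if (proposed_target <= current_target)
        return proposed_target;
    uint64_t increment = proposed_target - current_target;
    if (increment > max_increment)
        increment = max_increment; /* apply the upper bound */
    return current_target + increment;
}

/* Hypothetical alternative: ignore the proposal and always grow the
 * target by exactly the configured increment. */
uint64_t next_heap_target_exact(uint64_t current_target, uint64_t increment)
{
    return current_target + increment;
}
```

With a current target of 100 and a bound of 50, a proposal of 500 is capped to 150, while a proposal of 120 is kept as is. Overflow handling is omitted for brevity.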

qinsoon (Contributor) commented May 22, 2025

> > it means there will be no GC triggered before reaching the fixed heap size, the heap size won't get resized in any way, the heap will never grow larger than the given heap size (the process aborts if out-of-memory)
>
> --hard-heap-limit seems to be doing precisely this. Could you clarify your comment?

The heap target is initialized as the default_collect_interval:

jl_atomic_store_relaxed(&gc_heap_stats.heap_target, default_collect_interval);

And a GC is triggered based on the heap target:

if (jl_atomic_load_relaxed(&gc_heap_stats.heap_size) >= jl_atomic_load_relaxed(&gc_heap_stats.heap_target) || jl_gc_debug_check_other()) {

The first GC is triggered by the default_collect_interval rather than the fixed heap size. Did I miss anything?
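
The concern can be reproduced with a small model of the trigger condition. This is a sketch under stated assumptions: `should_collect` mirrors only the size comparison from the snippet above (the `jl_gc_debug_check_other()` branch is left out), and `first_gc_heap_size` is a hypothetical helper that grows a simulated heap in fixed steps.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified trigger: collect once the heap size reaches the target. */
int should_collect(uint64_t heap_size, uint64_t heap_target)
{
    return heap_size >= heap_target;
}

/* Hypothetical helper: grow a simulated heap by `step` bytes at a time
 * and return the heap size at which the first collection would fire. */
uint64_t first_gc_heap_size(uint64_t heap_target, uint64_t step)
{
    assert(step > 0);
    uint64_t heap_size = 0;
    while (!should_collect(heap_size, heap_target))
        heap_size += step;
    return heap_size;
}
```

If the target starts at a small collect interval, the first GC fires at that interval no matter how large the hard limit is. For a true fixed-heap mode, the initial target would have to equal the fixed heap size instead.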

qinsoon (Contributor) commented May 22, 2025

> > Be able to run Julia GC on every X bytes/KB/MB allocation
>
> Right now, we are letting MemBalancer compute the heap target for the next GC, and if we see that the heap size increment is greater than --upper-bound-for-heap-target-increment, then we cap it.
>
> Should be a fairly easy change to make the heap size increment exactly --upper-bound-for-heap-target-increment, though.

It is not about the heap target or about resizing. It is an additional way to control GC triggering. For example, if a workload allocates 10 MB in total and we set the option to 2 MB, we expect to see ~5 GCs. If we set the option to 0.1 MB, we expect to see ~100 GCs. It is not related to the heap size or to heap resizing.

However, I am not sure if we urgently need this for Julia's stock GC, as we can do so with MMTk, and we can use MMTk to understand a workload's behavior. Maybe @udesou can clarify if he plans to evaluate the stock GC using this feature.
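
The allocation-interval trigger described in this comment can be sketched as a byte counter that is independent of the heap target. All names here (`alloc_trigger_t`, `alloc_trigger_record`, `count_gcs`) are hypothetical, and the sketch only counts where a collection would be requested rather than running one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state for a "GC every X bytes allocated" trigger. */
typedef struct {
    uint64_t interval;            /* bytes allocated between GCs */
    uint64_t allocated_since_gc;  /* bytes allocated since the last GC */
    uint64_t gc_count;            /* number of GCs requested so far */
} alloc_trigger_t;

void alloc_trigger_init(alloc_trigger_t *t, uint64_t interval)
{
    assert(interval > 0);
    t->interval = interval;
    t->allocated_since_gc = 0;
    t->gc_count = 0;
}

/* Record one allocation; request a GC and reset the counter once the
 * bytes allocated since the last GC reach the interval. */
void alloc_trigger_record(alloc_trigger_t *t, uint64_t nbytes)
{
    t->allocated_since_gc += nbytes;
    if (t->allocated_since_gc >= t->interval) {
        t->gc_count++;
        t->allocated_since_gc = 0;
    }
}

/* Simulate a workload allocating `total_bytes` in `chunk`-sized pieces
 * and return how many GCs the trigger would request. */
uint64_t count_gcs(uint64_t total_bytes, uint64_t chunk, uint64_t interval)
{
    assert(chunk > 0);
    alloc_trigger_t t;
    alloc_trigger_init(&t, interval);
    uint64_t remaining = total_bytes;
    while (remaining > 0) {
        uint64_t n = remaining < chunk ? remaining : chunk;
        alloc_trigger_record(&t, n);
        remaining -= n;
    }
    return t.gc_count;
}
```

Using decimal megabytes and 1000-byte allocations, `count_gcs(10000000, 1000, 2000000)` yields 5 and `count_gcs(10000000, 1000, 100000)` yields 100, matching the ~5 and ~100 GCs in the example above.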

d-netto (Member, Author) commented May 22, 2025

Tried to add some of your suggestions in the last commit (but still need to adjust docstrings and flag names).

d-netto marked this pull request as draft May 22, 2025 03:55
d-netto force-pushed the dcn-gc-hidden-controls branch 7 times, most recently from 6697a5b to 6114b75 May 26, 2025 16:53
d-netto removed the needs tests label May 26, 2025
d-netto marked this pull request as ready for review May 26, 2025 21:09
gbaraldi (Member) commented

The code looks fine. I don't think the ifdefs are needed, because this doesn't look like a performance-sensitive piece of code. I'm still concerned about adding flags just for benchmarking against a target that the GC itself doesn't really care about, but we can remove them in the future.

d-netto force-pushed the dcn-gc-hidden-controls branch 6 times, most recently from 9eb3e34 to 88d649b May 28, 2025 18:59
d-netto force-pushed the dcn-gc-hidden-controls branch from 88d649b to 735b5d9 May 28, 2025 19:00