
8357471: GenShen: Share collector reserves between young and old #25357


Open

kdnilsen wants to merge 38 commits into master

Conversation


@kdnilsen kdnilsen commented May 21, 2025

GenShen independently reserves memory to hold evacuations into the young and old generations. We have found that, under duress, it is sometimes difficult for mixed evacuations to make progress: the reserves in old are too small, and we cannot expand old because young is running GC so frequently that it never has the excess memory required to justify expanding old (and shrinking young).

This PR exploits the fact that the reserves in young are often much larger than young requires to carry out its anticipated next GC cycle. In this case, we can share the young collector reserves with the old generation. This allows much more effective operation of mixed evacuations when GC is running at or near its full capacity.
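In rough terms, the sharing works as sketched below. This is only an illustration of the idea; the names and structure are not the PR's actual code, and the real computation also has to respect region-size alignment and the old evacuation percentage.

#include <cstddef>

// Illustrative sketch only: lend surplus young collector reserve to old so that
// mixed evacuations can make progress.  Hypothetical names, not GenShen's API.
size_t share_young_reserve_with_old(size_t young_reserve,   // reserved for young evacuations
                                    size_t young_demand,    // expected need of the next young cycle
                                    size_t old_reserve) {   // already reserved for old evacuations
  if (young_reserve > young_demand) {
    // Young holds more reserve than its next cycle is expected to use; the
    // surplus can back old (mixed) evacuations instead of sitting idle.
    old_reserve += young_reserve - young_demand;
  }
  return old_reserve;
}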

The following spreadsheet snapshots highlight the benefits of this change. In the control run with a 6G heap, we perform a large number of mixed evacuations, but each mixed evacuation has very low productivity (e.g., one region at a time). This causes excessive delays in reclaiming garbage from old, which is required to shrink old and expand young, and it is why we see a large number of unproductive GC cycles, many of which degenerate and a few of which upgrade to full GC. In the experiment at the same 6G heap size, there are far fewer mixed cycles, but each is much more productive. The total number of GC cycles decreases significantly.

[image: spreadsheet comparison at 6G heap size]

With 7G heap size, the benefits of this PR manifest as a decrease in mixed evacuations, which also allows us to decrease total GC cycles. By more quickly reclaiming old garbage, we are able to more quickly expand young, which decreases the number of young GC cycles. This reduces CPU load. The impact on response times is not as significant as with the 6G heap size. We see slight improvement at p50-p99.9, with slight degradation at p99.99 through p100.

[image: spreadsheet comparison at 7G heap size]

At 8G heap size, the GC is not stressed at all. We see approximately the same number of GC cycles, slight degradation of response times at p50-p99, and slight improvement at p99.9-p100.

[image: spreadsheet comparison at 8G heap size]

The command line for these comparisons follows:

            ~/github/jdk.share-collector-reserves/build/linux-x86_64-server-release/images/jdk/bin/java \
                -XX:+UnlockExperimentalVMOptions \
                -XX:-ShenandoahPacing \
                -XX:+AlwaysPreTouch -XX:+DisableExplicitGC -Xms$s -Xmx$s \
                -XX:ShenandoahMinimumOldTimeMs=25 \
                -XX:ShenandoahFullGCThreshold=1024 \
                -XX:+UseShenandoahGC -XX:ShenandoahGCMode=generational \
                -Xlog:"gc*=info,ergo" \
                -Xlog:safepoint=trace -Xlog:safepoint=debug -Xlog:safepoint=info \
                -XX:+UnlockDiagnosticVMOptions \
                -jar ~/github/heapothesys/Extremem/src/main/java/extremem.jar \
                -dInitializationDelay=45s \
                -dDictionarySize=3000000 \
                -dNumCustomers=300000 \
                -dNumProducts=60000 \
                -dCustomerThreads=500 \
                -dCustomerPeriod=7s \
                -dCustomerThinkTime=1s \
                -dKeywordSearchCount=4 \
                -dServerThreads=5 \
                -dServerPeriod=5s \
                -dProductNameLength=10 \
                -dBrowsingHistoryQueueCount=5 \
                -dSalesTransactionQueueCount=5 \
                -dProductDescriptionLength=32 \
                -dProductReplacementPeriod=25s \
                -dProductReplacementCount=10 \
                -dCustomerReplacementPeriod=30s \
                -dCustomerReplacementCount=1000 \
                -dBrowsingExpiration=1m \
                -dPhasedUpdates=true \
                -dPhasedUpdateInterval=90s \
                -dSimulationDuration=25m \
                -dResponseTimeMeasurements=100000 \
                >$t.genshen.share-reserves.$r-evac-ratio.$s.out 2>$t.genshen.share-reserves.$r-evac-ratio.$s.err
            gzip $t.genshen.share-reserves.$r-evac-ratio.$s.out $t.genshen.share-reserves.$r-evac-ratio.$s.err

We have tested this patch through our performance pipeline. Both aarch64 and x86 show similar results: a small increase in concurrent evacuation time on the graphchi benchmark, with slight improvements in other metrics on a number of other test workloads:

Genshen aarch64
-------------------------------------------------------------------------------------------------------
+16.35% graphchi/concurrent_evacuation p=0.00000
  Control:      1.895ms (+/-392.33us)        306
  Test:         2.205ms (+/-401.72us)        124

-33.43% specjbb2015/concurrent_marking_old p=0.00213
  Control:    513.923ms (+/-225.22ms)        338
  Test:       385.169ms (+/-231.25ms)         38

-28.58% specjbb2015/cm_parallel_mark_old p=0.00833
  Control:      1.022s  (+/-446.83ms)        333
  Test:       794.476ms (+/-440.83ms)         35

-25.31% crypto.aes/shenandoahfinalupdaterefs_stopped p=0.00000
  Control:      0.113ms (+/-  0.01ms)        285
  Test:         0.090ms (+/-  0.02ms)        158

-18.52% scimark.fft.small/shenandoahfinalupdaterefs_stopped p=0.00000
  Control:      0.106ms (+/-  0.01ms)        474
  Test:         0.090ms (+/-  0.01ms)        180

-15.29% hyperalloc_a3072_o2048/concurrent_marking_old p=0.00103
  Control:    384.599ms (+/- 76.47ms)        277
  Test:       333.581ms (+/- 89.51ms)         55

-15.28% hyperalloc_a3072_o2048/cm_total_old p=0.00105
  Control:    768.676ms (+/-152.94ms)        277
  Test:       666.786ms (+/-178.97ms)         55

-15.28% hyperalloc_a3072_o2048/cm_parallel_mark_old p=0.00105
  Control:    768.676ms (+/-152.94ms)        277
  Test:       666.786ms (+/-178.97ms)         55


Shenandoah aarch64
-------------------------------------------------------------------------------------------------------
-12.07% extremem-phased/update_references p=0.00050
  Control:    479.826ms (+/- 52.78ms)         23
  Test:       428.148ms (+/-  2.26ms)          3

Genshen x86
-------------------------------------------------------------------------------------------------------
+16.35% graphchi/concurrent_evacuation p=0.00000
  Control:      1.895ms (+/-392.33us)        306
  Test:         2.205ms (+/-401.72us)        124

-33.43% specjbb2015/concurrent_marking_old p=0.00213
  Control:    513.923ms (+/-225.22ms)        338
  Test:       385.169ms (+/-231.25ms)         38

-28.58% specjbb2015/cm_parallel_mark_old p=0.00833
  Control:      1.022s  (+/-446.83ms)        333
  Test:       794.476ms (+/-440.83ms)         35

-25.31% crypto.aes/shenandoahfinalupdaterefs_stopped p=0.00000
  Control:      0.113ms (+/-  0.01ms)        285
  Test:         0.090ms (+/-  0.02ms)        158

-18.52% scimark.fft.small/shenandoahfinalupdaterefs_stopped p=0.00000
  Control:      0.106ms (+/-  0.01ms)        474
  Test:         0.090ms (+/-  0.01ms)        180

-15.29% hyperalloc_a3072_o2048/concurrent_marking_old p=0.00103
  Control:    384.599ms (+/- 76.47ms)        277
  Test:       333.581ms (+/- 89.51ms)         55

-15.28% hyperalloc_a3072_o2048/cm_total_old p=0.00105
  Control:    768.676ms (+/-152.94ms)        277
  Test:       666.786ms (+/-178.97ms)         55

-15.28% hyperalloc_a3072_o2048/cm_parallel_mark_old p=0.00105
  Control:    768.676ms (+/-152.94ms)        277
  Test:       666.786ms (+/-178.97ms)         55


Shenandoah x86
-------------------------------------------------------------------------------------------------------
-12.07% extremem-phased/update_references p=0.00050
  Control:    479.826ms (+/- 52.78ms)         23
  Test:       428.148ms (+/-  2.26ms)          3



Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8357471: GenShen: Share collector reserves between young and old (Task - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25357/head:pull/25357
$ git checkout pull/25357

Update a local copy of the PR:
$ git checkout pull/25357
$ git pull https://git.openjdk.org/jdk.git pull/25357/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 25357

View PR using the GUI difftool:
$ git pr show -t 25357

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25357.diff

Using Webrev

Link to Webrev Comment

  • This enables mixed evacuations to be more productive more quickly.
  • Before this commit, the release build made reference to a variable that is only present in debug builds.
  • At init update refs, we transfer remaining memory from the Collector and OldCollector partitions in order to allow more mutator allocations to succeed during update-refs. A conditional transfer is not reliable because soft_max_capacity() constraints may cause the transfer to fail. (A sketch of this idea follows the list.)
  • For non-generational mode, we don't have to set evacuation reserves; they are computed by ShenandoahFreeSet::reserve_regions().
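A minimal, self-contained sketch of that unconditional transfer follows; plain fields stand in for the free-set partitions, and the helper name is illustrative rather than the PR's code.

#include <cstddef>

struct Partitions {
  size_t mutator;        // bytes available for mutator allocations
  size_t collector;      // unused young collector reserve
  size_t old_collector;  // unused old collector reserve
};

// Unconditionally return whatever remains of the collector reserves to the
// mutator at init-update-refs.  A conditional transfer can fail when
// soft_max_capacity() constraints block the resize, leaving mutator
// allocations starved during update-refs.
void transfer_remaining_reserves_to_mutator(Partitions& p) {
  p.mutator      += p.collector + p.old_collector;
  p.collector     = 0;
  p.old_collector = 0;
}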
@bridgekeeper

bridgekeeper bot commented May 21, 2025

👋 Welcome back kdnilsen! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented May 21, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk

openjdk bot commented May 21, 2025

⚠️ @kdnilsen This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request is integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

@openjdk

openjdk bot commented May 21, 2025

@kdnilsen The following labels will be automatically applied to this pull request:

  • hotspot-gc
  • shenandoah

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added hotspot-gc hotspot-gc-dev@openjdk.org shenandoah shenandoah-dev@openjdk.org labels May 21, 2025
@kdnilsen kdnilsen changed the title Share collector reserves 8357471: GenShen: Share collector reserves between young and old May 21, 2025
@openjdk openjdk bot added the rfr Pull request is ready for review label May 21, 2025
@mlbridge

mlbridge bot commented May 21, 2025

Webrevs

young_gen->increase_capacity(bytes_to_transfer);
old_gen->decrease_capacity(bytes_to_transfer);
const size_t new_size = young_gen->max_capacity();
log_info(gc)("Forcing transfer of %zu region(s) from %s to %s, yielding increased size: " PROPERFMT,
Contributor

Can this be gc, ergo like the method to transfer regions to old?

Contributor Author

Agreed. Making this change.

@@ -193,6 +193,19 @@ void ShenandoahGenerationSizer::force_transfer_to_old(size_t regions) const {
regions, young_gen->name(), old_gen->name(), PROPERFMTARGS(new_size));
Contributor

If this is now really only used for in-place promotions, can we change the log message to indicate the region is being promoted? I think when users see messages about things being "forced" in the log, they start to wonder if young/old sizes need to adjusted.

Contributor Author

It actually does have one other use. I'm refactoring to distinguish the different purposes. Replacing ShenandoahGenerationSizer log messages with log_develop_debug messages.

@@ -295,6 +295,7 @@ oop ShenandoahGenerationalHeap::try_evacuate_object(oop p, Thread* thread, Shena

if (copy == nullptr) {
// If we failed to allocate in LAB, we'll try a shared allocation.
#ifdef KELVIN_ORIGINAL
Contributor

This looks like debugging code? Should we back this out?

Contributor Author

Sorry I forgot to clean this up. Refined the comment and removed the #ifdef controls

@@ -1666,7 +1666,12 @@ void ShenandoahHeap::on_cycle_start(GCCause::Cause cause, ShenandoahGeneration*
set_gc_cause(cause);
set_gc_generation(generation);

generation->heuristics()->record_cycle_start();
if (mode()->is_generational()) {
// young-gen heuristics track young, bootstrap, and global GC cycle times
Contributor

This seems like an unrelated change. Mixing in bootstrap and global gc cycle times is likely to increase the average time and make the heuristics more aggressive.

Contributor Author

Good catch. I will remove this code from this PR. (It will come back in a subsequent PR, to support adaptive old evac ratios. There, we consolidate the accounting for all GC in the "young heuristics", but we parameterize each phase so as not to bias triggering. Marking phase time is parameterized by bytes marked, for example.)
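Purely as an illustration of that parameterization (none of these names are from the PR), the heuristic could track a per-byte marking rate rather than raw cycle times, so that bootstrap and global cycles, which mark very different amounts of memory, do not bias the trigger:

#include <cstddef>

struct MarkSample {
  double seconds;       // observed concurrent-mark time for one cycle
  size_t bytes_marked;  // live bytes marked in that cycle
};

// Predict the next mark duration from the historical seconds-per-byte rate.
double predict_mark_seconds(const MarkSample* history, int n, size_t expected_bytes) {
  double total_seconds = 0.0;
  size_t total_bytes = 0;
  for (int i = 0; i < n; i++) {
    total_seconds += history[i].seconds;
    total_bytes   += history[i].bytes_marked;
  }
  if (total_bytes == 0) return 0.0;
  return (total_seconds / (double) total_bytes) * (double) expected_bytes;
}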

@@ -276,6 +276,10 @@ class ShenandoahHeapRegion {
// Return adjusted max heap size
static size_t setup_sizes(size_t max_heap_size);

inline bool is_recycling() {
Contributor

Is this used anywhere?

Contributor Author

Good catch. Removing this.

@@ -374,26 +374,42 @@
"runs out of memory too early.") \
\
product(uintx, ShenandoahOldEvacRatioPercent, 75, EXPERIMENTAL, \
Contributor

Phew, this is a lot of explanatory text and it reads like the target audience are GC developers. If we are going to expose this as a user configurable option, I think the help text should just explain how the behavior changes as this values goes up or down. Something like:

Increasing this allows for more old regions in mixed collections. Decreasing this reduces the number of old regions in mixed collections.

The first sentence makes it seem as though this is the percentage of the entire heap to reserve for old evacuations, but the next clarifies that this is the percentage of the collection set.

Question about this sentence:

A value of 100 allows a mixed evacuation to focus entirely on old-gen memory, allowing no young-gen regions to be collected.

With a setting of 100, would GenShen still "preselect" young regions of tenuring age with sufficient garbage into the collection set?

I also find the name of the option slightly confusing - is it a ratio? or a percentage? Seems like it's really a percentage (though it controls the ratio of reserves used for the collection).

Contributor Author

Thank you for the honest feedback. Way too many words. I've pared the description back and tried to clarify the confusion you describe. I am also renaming the option to ShenandoahOldEvacPercent. Please let me know what you think.
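For illustration only, under the reading discussed above (the value bounds how much of the combined collector reserve may be spent evacuating old regions), the arithmetic is roughly as follows; this is a hypothetical sketch, not the option's authoritative definition:

#include <cstddef>

// Hypothetical reading of ShenandoahOldEvacPercent: cap the share of the
// combined collector reserve that may be dedicated to old-region evacuation.
size_t max_old_evac_bytes(size_t combined_reserve_bytes, unsigned old_evac_percent) {
  return combined_reserve_bytes * old_evac_percent / 100;
}
// Example: with a 400 MB combined reserve and the default of 75, up to 300 MB
// could go to old (mixed) evacuation, leaving at least 100 MB for young.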

void compute_old_generation_balance(size_t old_xfer_limit, size_t old_cset_regions);

// Transfers surplus old regions to young, or takes regions from young to satisfy old region deficit
TransferResult balance_generations();
Contributor

Are we still using this TransferResult thing? Seems like we might be able to delete it with this change.

Contributor Author

You're right. Deleting. Thanks.

"full GC events. Faiure to defragment old-gen memory can also " \
"result in unconstrained expansion of old-gen, and shrinkage of " \
"young gen, causing inefficient high frequency of young-gen GC.") \
product(uintx, ShenandoahOldEvacPercent, 75, EXPERIMENTAL, \
Contributor

This is much easier to read. Thank you for making the change!

heap->young_generation()->set_evacuation_reserve(young_evac_reserve - regions_transferred_to_old * region_size_bytes);
heap->old_generation()->set_evacuation_reserve(old_evac_reserve + regions_transferred_to_old * region_size_bytes);
}
heap->young_generation()->set_evacuation_reserve((size_t) (young_evac_bytes * ShenandoahEvacWaste));
Member
@ysramakrishna ysramakrishna May 31, 2025

So we are using the amount to be evacuated out of young (suitably marked up to account for waste) from the collection set of a specific cycle to predict the same for the next cycle? And similarly for the promotion bytes.

This seems reasonable, but how does that compare with using the live data identified in the most recent marking cycle instead? I can imagine that the former is more accurate under steady-state assumptions and the latter is an overestimate to the extent that not all live data will be evacuated, because some of it sits in mostly live, i.e. densely live, regions. However, it would be interesting to see how they compare and which tracks reality better. Since this is in the nature of a prediction/estimate, one can consider a control algorithm that tries to move the estimate closer to reality by minimizing some historical deviation between marked vs. evacuated bytes.

This need not be done here, but can be considered a future enhancement/experiment.
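One possible shape of such a control loop, entirely hypothetical (the names and the damping constant are assumptions, not anything in this PR):

#include <cstddef>

// Adjust the next reserve estimate by the damped historical ratio of bytes
// actually evacuated to bytes predicted, so a single noisy cycle does not
// swing the estimate.
struct ReserveEstimator {
  double correction = 1.0;              // running evacuated/predicted ratio
  static constexpr double alpha = 0.3;  // damping factor (assumed)

  void record_cycle(size_t predicted_bytes, size_t evacuated_bytes) {
    if (predicted_bytes == 0) return;
    double observed = (double) evacuated_bytes / (double) predicted_bytes;
    correction = (1.0 - alpha) * correction + alpha * observed;
  }

  size_t next_reserve(size_t predicted_bytes, double evac_waste) const {
    return (size_t) ((double) predicted_bytes * correction * evac_waste);
  }
};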

}
heap->young_generation()->set_evacuation_reserve((size_t) (young_evac_bytes * ShenandoahEvacWaste));
heap->old_generation()->set_evacuation_reserve((size_t) (old_evac_bytes * ShenandoahOldEvacWaste));
heap->old_generation()->set_promoted_reserve((size_t) (promo_bytes * ShenandoahPromoEvacWaste));
Member

Note that the census from the most recent mark provides both these bits of information, but it doesn't account for other criteria (i.e., density of liveness) that go into excluding certain regions (and their objects) from the collection set. Therein lies the rub, but armed with historical numbers for each, and for reality, one might be able to predict this well (maybe).

@openjdk openjdk bot removed the rfr Pull request is ready for review label Jun 9, 2025
@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 12, 2025
@bridgekeeper

bridgekeeper bot commented Jun 28, 2025

@kdnilsen This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
