-
Notifications
You must be signed in to change notification settings - Fork 6.1k
8357471: GenShen: Share collector reserves between young and old #25357
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
This enables mixed evacuations to be more productive more quickly.
Before this commit, release build made reference to a variable that is only present in debug builds.
At init update refs, we transfer remaining memory from Collector and OldCollector partitions in order to allow more mutator allocations to succeed during update-refs. A conditional transfer is not reliable because soft_max_capacity() constraints may cause the transfer to fail.
Added tag jdk-25+13 for changeset 11a37c8
Added tag jdk-25+14 for changeset a347ecd
This reverts commit 2412337.
This reverts commit 2aa534a.
For non-generational mode, we don't have to set evacuation reserves. It is computed by ShenandoahFreeSet::reserve_regions()
…n/jdk into share-collector-reserves
A merge conflict was resolved incorrectly. phase5_epilog() should return void.
…n/jdk into share-collector-reserves
👋 Welcome back kdnilsen! A progress list of the required criteria for merging this PR into |
❗ This change is not yet ready to be integrated. |
|
Webrevs
|
young_gen->increase_capacity(bytes_to_transfer); | ||
old_gen->decrease_capacity(bytes_to_transfer); | ||
const size_t new_size = young_gen->max_capacity(); | ||
log_info(gc)("Forcing transfer of %zu region(s) from %s to %s, yielding increased size: " PROPERFMT, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be gc, ergo
like the method to transfer regions to old?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Making this change.
@@ -193,6 +193,19 @@ void ShenandoahGenerationSizer::force_transfer_to_old(size_t regions) const { | |||
regions, young_gen->name(), old_gen->name(), PROPERFMTARGS(new_size)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is now really only used for in-place promotions, can we change the log message to indicate the region is being promoted? I think when users see messages about things being "forced" in the log, they start to wonder if young/old sizes need to adjusted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It actually does have one other use. I'm refactoring to distinguish the different purposes. Replacing ShenandoahGenerationSizer log messages with log_develop_debug messages.
@@ -295,6 +295,7 @@ oop ShenandoahGenerationalHeap::try_evacuate_object(oop p, Thread* thread, Shena | |||
|
|||
if (copy == nullptr) { | |||
// If we failed to allocate in LAB, we'll try a shared allocation. | |||
#ifdef KELVIN_ORIGINAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like debugging code? Should we back this out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I forgot to clean this up. Refined the comment and removed the #ifdef controls
@@ -1666,7 +1666,12 @@ void ShenandoahHeap::on_cycle_start(GCCause::Cause cause, ShenandoahGeneration* | |||
set_gc_cause(cause); | |||
set_gc_generation(generation); | |||
|
|||
generation->heuristics()->record_cycle_start(); | |||
if (mode()->is_generational()) { | |||
// young-gen heuristics track young, bootstrap, and global GC cycle times |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like an unrelated change. Mixing in bootstrap and global gc cycle times is likely to increase the average time and make the heuristics more aggressive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I will remove this code from this PR. (It will come back in a subsequent PR, to support adaptive old evac ratios. There, we consolidate the accounting for all gc in the "young heuristics", but we parameterize each phase so as to not bias triggering. Marking phase time is parameterized with bytes marked, for example,
@@ -276,6 +276,10 @@ class ShenandoahHeapRegion { | |||
// Return adjusted max heap size | |||
static size_t setup_sizes(size_t max_heap_size); | |||
|
|||
inline bool is_recycling() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this used anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Removing this.
@@ -374,26 +374,42 @@ | |||
"runs out of memory too early.") \ | |||
\ | |||
product(uintx, ShenandoahOldEvacRatioPercent, 75, EXPERIMENTAL, \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Phew, this is a lot of explanatory text and it reads like the target audience are GC developers. If we are going to expose this as a user configurable option, I think the help text should just explain how the behavior changes as this values goes up or down. Something like:
Increasing this allows for more old regions in mixed collections. Decreasing this reduces the number of old regions in mixed collections.
The first sentence makes it seem as though this is the percentage of the entire heap to reserve for old evacuations, but the next clarifies that this is the percentage of the collection set.
Question about this sentence:
A value of 100 allows a mixed evacuation to focus entirely on old-gen memory, allowing no young-gen regions to be collected.
With a setting of 100, would GenShen still "preselect" young regions of tenuring age with sufficient garbage into the collection set?
I also find the name of the option slightly confusing - is it a ratio? or a percentage? Seems like it's really a percentage (though it controls the ratio of reserves used for the collection).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for honest feedback. Way too many words. I've pared the description back and tried to clarify the
confusion you describe. Am also renaming to ShenandoahOldEvacPercent. Please let me know what you think.
void compute_old_generation_balance(size_t old_xfer_limit, size_t old_cset_regions); | ||
|
||
// Transfers surplus old regions to young, or takes regions from young to satisfy old region deficit | ||
TransferResult balance_generations(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we still using this TransferResult
thing? Seems like we might be able to delete it with this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. Deleting. Thanks.
"full GC events. Faiure to defragment old-gen memory can also " \ | ||
"result in unconstrained expansion of old-gen, and shrinkage of " \ | ||
"young gen, causing inefficient high frequency of young-gen GC.") \ | ||
product(uintx, ShenandoahOldEvacPercent, 75, EXPERIMENTAL, \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is much easier to read. Thank you for making the change!
heap->young_generation()->set_evacuation_reserve(young_evac_reserve - regions_transferred_to_old * region_size_bytes); | ||
heap->old_generation()->set_evacuation_reserve(old_evac_reserve + regions_transferred_to_old * region_size_bytes); | ||
} | ||
heap->young_generation()->set_evacuation_reserve((size_t) (young_evac_bytes * ShenandoahEvacWaste)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we are using the amount to be evacuated out of young (suitably marked up to account for waste) from the collection set of a specific cycle to predict the same for the next cycle? And similarly for the promotion bytes.
This seems reasonable, but how does that compare with using the live data identified in the most recent marking cycle instead? I can imagine that the former is more accurate under steady state assumptions and the latter is an overestimate to the extent that not all live data will be evacuated because it's in mostly live, i.e. densely live regions. However, it would be interesting to see how they compare and which tracks reality better. Since this is in the nature of a prediction/estimate, once can consider a control algorithm that tries to move the estimate closer based on minimizing some historical deviation between marked vs evacuated.
This need not be done here, but can be considered a future enhancement/experiment.
} | ||
heap->young_generation()->set_evacuation_reserve((size_t) (young_evac_bytes * ShenandoahEvacWaste)); | ||
heap->old_generation()->set_evacuation_reserve((size_t) (old_evac_bytes * ShenandoahOldEvacWaste)); | ||
heap->old_generation()->set_promoted_reserve((size_t) (promo_bytes * ShenandoahPromoEvacWaste)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the census from the most recent mark provides both these bits of information, but doesn't account for other criteria (i.e. liveness denseness) that go into the exclusion of certain regions (and their objects) from the collection set. Therein lies the rub, but armed with historical numbers of each and reality, one might be able to predict this well (may be).
@kdnilsen This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a |
Genshen independently reserves memory to hold evacuations into young and old generations. We have found that under duress, it is sometimes difficult for mixed evacuations to make progress because the reserves in old are too small and we cannot expand old because young is running so frequently that it does not have the excess memory required to justify expansion of old (and shrinking of young).
This PR exploits the fact that the reserves in young are often much larger than young requires to carry out its anticipated next GC cycle. In this case, we can share the young collector reserves with the old generation. This allows much more effective operation of mixed evacuations when GC is running at or near its full capacity.
The following spreadsheet snapshots highlight the benefits of this change. In control with 6G heap size, we perform large numbers of mixed evacuations, but each mixed evacuation has very low productivity (e.g. one region at a time). This causes excessive delays in reclaiming the garbage from old, which is required to shrink old and expand young. This is why we see the large number of unproductive GC cycles, many of which degenerate and a few of which upgrade to full GC. In the experiment with 6G heap size, there are far fewer mixed cycles, but they are each much more productive. The total number of GC cycles decreases significantly.
With 7G heap size, the benefits of this PR manifest as a decrease in mixed evacuations, which also allows us to decrease total GC cycles. By more quickly reclaiming old garbage, we are able to more quickly expand young, which decreases the number of young GC cycles. This reduces CPU load. The impact on response times is not as significant as with the 6G heap size. We see slight improvement at p50-p99.9, with slight degradation at p99.99 through p100.
At 8G heap size, the GC is not at all stressed. We see approximately the same numbers of GC cycles, slight degradation of response times at p50-p99, slight improvement in response times at p99.9-p100.
The command line for these comparisons follows:
We have tested this patch through our performance pipeline. Both aarch64 and x86 show similar results, a small increase in concurrent evacuation on the graphchi benchmark, with slight improvements of other metrics on a number of other test workloads:
Progress
Issue
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25357/head:pull/25357
$ git checkout pull/25357
Update a local copy of the PR:
$ git checkout pull/25357
$ git pull https://git.openjdk.org/jdk.git pull/25357/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 25357
View PR using the GUI difftool:
$ git pr show -t 25357
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25357.diff
Using Webrev
Link to Webrev Comment