Skip to content

Apply effects to otherwise edge in dataflow analysis #142707

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ashivaram23
Copy link
Contributor

@ashivaram23 ashivaram23 commented Jun 19, 2025

This allows ElaborateDrops to remove drops when a match wildcard arm covers multiple no-Drop enum variants. It modifies dataflow analysis to update the MaybeUninitializedPlaces and MaybeInitializedPlaces data for a block reached through an otherwise edge.

Fixes #142705.

@rustbot
Copy link
Collaborator

rustbot commented Jun 19, 2025

r? @petrochenkov

rustbot has assigned @petrochenkov.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 19, 2025
@rust-log-analyzer

This comment has been minimized.

@petrochenkov
Copy link
Contributor

r? compiler

@rustbot rustbot assigned fee1-dead and unassigned petrochenkov Jun 19, 2025
@workingjubilee
Copy link
Member

hm.
@bors2 try @rust-timer queue

@rust-timer

This comment has been minimized.

@rust-bors
Copy link

rust-bors bot commented Jun 19, 2025

⌛ Trying commit 5837955 with merge e3d7e41

To cancel the try build, run the command @bors2 try cancel.

rust-bors bot added a commit that referenced this pull request Jun 19, 2025
Update `MaybeUninitializedPlaces` and `MaybeInitializedPlaces` for `otherwise`

<!-- homu-ignore:start -->
<!--
If this PR is related to an unstable feature or an otherwise tracked effort,
please link to the relevant tracking issue here. If you don't know of a related
tracking issue or there are none, feel free to ignore this.

This PR will get automatically assigned to a reviewer. In case you would like
a specific user to review your work, you can assign it to them by using

    r? <reviewer name>
-->
<!-- homu-ignore:end -->

This allows `ElaborateDrops` to remove drops when a `match` wildcard arm covers multiple no-Drop enum variants. It modifies dataflow analysis to update the `MaybeUninitializedPlaces` and `MaybeInitializedPlaces` data for a block reached through an `otherwise` edge.

This appears to fix #142705, but I don't know for sure if it's actually correct (are there cases where this would be wrong and break things?). I also haven't tested that it actually improves compile times, or machine code output outside of the examples in the issue.
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 19, 2025
@rust-bors
Copy link

rust-bors bot commented Jun 19, 2025

☀️ Try build successful (CI)
Build commit: e3d7e41 (e3d7e41d7e8c513f55398f6e34eb2af7fb7df49f, parent: 8de4c7234dd9b97c9d76b58671343fdbbc9a433e)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (e3d7e41): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

mean range count
Regressions ❌
(primary)
1.1% [0.6%, 1.5%] 2
Regressions ❌
(secondary)
0.6% [0.4%, 0.8%] 4
Improvements ✅
(primary)
-0.5% [-1.2%, -0.2%] 32
Improvements ✅
(secondary)
-0.7% [-1.3%, -0.2%] 19
All ❌✅ (primary) -0.4% [-1.2%, 1.5%] 34

Max RSS (memory usage)

Results (primary -0.2%, secondary 2.8%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
3.5% [2.3%, 4.7%] 3
Regressions ❌
(secondary)
2.8% [2.2%, 3.5%] 2
Improvements ✅
(primary)
-2.4% [-3.6%, -0.5%] 5
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.2% [-3.6%, 4.7%] 8

Cycles

Results (primary 0.9%, secondary -5.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
2.7% [2.6%, 2.9%] 2
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-1.0% [-1.2%, -0.8%] 2
Improvements ✅
(secondary)
-5.0% [-5.0%, -5.0%] 1
All ❌✅ (primary) 0.9% [-1.2%, 2.9%] 4

Binary size

Results (primary -0.1%, secondary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
0.1% [0.0%, 0.2%] 7
Regressions ❌
(secondary)
0.0% [0.0%, 0.1%] 7
Improvements ✅
(primary)
-0.1% [-0.5%, -0.0%] 64
Improvements ✅
(secondary)
-0.1% [-0.1%, -0.0%] 50
All ❌✅ (primary) -0.1% [-0.5%, 0.2%] 71

Bootstrap: 691.904s -> 692.572s (0.10%)
Artifact size: 372.02 MiB -> 371.98 MiB (-0.01%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jun 20, 2025
@ashivaram23
Copy link
Contributor Author

Hmm that's unfortunate. It might help to avoid paying the cost of the extra clone and checks when it won't make a difference, like in the trivial otherwise targets when there's no wildcard. Maybe the otherwise could be handled in a separate pass after collecting variants to kill/add in a SmallVec, in case that helps somehow.

What's more concerning is runtime benchmarks being worse. There are only two, and I don't know how noisy they are, but 8% wall time increase for css-parse-fb is pretty bad. There are also some compile time benchmarks whose graphs show significant increases in wall time spent in LLVM. Does this just not play well with later optimizations?

@rustbot
Copy link
Collaborator

rustbot commented Jun 21, 2025

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

@ashivaram23
Copy link
Contributor Author

The last commit makes it so that there's an option in the MaybeUninitializedPlaces and MaybeInitializedPlaces builders for whether or not to update the otherwise block's lattice element, and it's set to only do so for MaybeInitializedPlaces in RemoveUninitDrops and ElaborateDrops. That should be enough to clean up some drops while hopefully being faster and not messing with other passes in ways that cause runtime slowdowns.

@workingjubilee
Copy link
Member

the runtime benchmarks are unfortunately very noisy

move_data: &MoveData<'tcx>,
enum_place: mir::Place<'tcx>,
active_variant: VariantIdx,
mut handle_inactive_variant: impl FnMut(MovePathIndex),
mut handle_active_variant: Option<impl FnMut(MovePathIndex)>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it will be better, perf-wise to make this not take an option but just use a no-op for callers that do not want to handle it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's definitely possible, since I think in that case on_all_children_bits could be monomorphized into an empty function. I made the change and it does reduce stage2 librustc_driver.so size a bit.

@fee1-dead
Copy link
Member

this part of MIR is a little above my level of confidence for review, so

r? compiler

@rustbot rustbot assigned lcnr and unassigned fee1-dead Jun 22, 2025
@lcnr
Copy link
Contributor

lcnr commented Jun 23, 2025

r? wg-mir-opt

@rustbot rustbot assigned dianqk and unassigned lcnr Jun 23, 2025
@rustbot rustbot added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jun 29, 2025
@rustbot
Copy link
Collaborator

rustbot commented Jun 29, 2025

Reminder, once the PR becomes ready for a review, use @rustbot ready.

@ashivaram23
Copy link
Contributor Author

Sorry it took me a while to get to this! I added a test in otherwise_drops.rs.

As for your suggestion, I think it makes sense. It's more readable and while it does involve an additional allocation and another pass through the variants and move paths, it also saves a clone of the dataflow state, and the non-otherwise targets do the same quadratic loops anyway (could only be avoided if there was a guarantee that move paths are in the same order as discriminants).

I can implement the ActiveVariants enum suggestion as is, but I was thinking of combining it with a few other changes that may or may not be a good idea, and would like to hear your thoughts.

My idea was that it could be nicer to collect the VariantIdxs of each switch target while creating MaybePlacesSwitchIntData so they could be passed directly to on_all_inactive_variants without a new allocation, and the non-otherwise targets won't have to skip over variants with next_discr every time either. Nobody else uses MaybePlacesSwitchIntData and its main purpose seems to be mapping switch target u128 values to VariantIdx, so we can avoid the extra work and indirection with something like this:

pub struct MaybePlacesSwitchIntData<'tcx> {
    enum_place: mir::Place<'tcx>,
    
    // only the targets in the SwitchInt, not all discriminants
    targets: Vec<(VariantIdx, mir::BasicBlock)>,

    // discriminants: Vec<(VariantIdx, Discr<'tcx>)>,
    // index: usize,
}

Then SwitchTargetValue could store VariantIdx instead of u128, so the whole next_discr process and index state tracking would be unnecessary, and the otherwise pass could use this filtered targets list that contains the VariantIdxs without needing to allocate a new Vec for it.

This would probably require removing the call to BasicBlocks::switch_sources in backward apply_effects_in_block because it's tough to make switch_sources get VariantIdxs. Setting aside how all of the backward code is currently unused, I think that could make it more efficient. Since get_switch_int_data is being called on each predecessor* anyway, it's not necessary to loop through targets returned by switch_sources (which goes through every block in the CFG on the first call, though that gets cached). The only benefit of doing so at the moment is avoiding a check to filter for the correct successor, but the point of this idea is for get_switch_int_data to filter everything upfront.


* To me it seems like it should be called on the block's predecessor, but it's currently being passed the block itself. Again it's unused so it doesn't matter, but I wonder if this was actually a typo. @nnethercote because this line is from #133328.

@rust-log-analyzer

This comment has been minimized.

Copy link
Member

@dianqk dianqk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, but I think it should be an independent PR.


// Ensures there are no drops for the wildcard match arm.

// EMIT_MIR otherwise_drops.result_ok.ElaborateDrops.after.mir
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not EMIT_MIR otherwise_drops.result_ok.ElaborateDrops.diff

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I run ./x test --bless with diff output, the first runs generate a slightly different "before" than the later runs, which means running ./x test immediately afterwards fails. There's one line that starts out as unwind: continue in the first platform tests and then becomes unwind: bb9 later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you want to add //@ compile-flags: -Cpanic=abort or // EMIT_MIR_FOR_EACH_PANIC_STRATEGY. I prefer the first one here.

@@ -0,0 +1,16 @@
// skip-filecheck
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add what you want to check, such as // CHECK-NOT: drop(.

@dianqk
Copy link
Member

dianqk commented Jul 7, 2025

BTW, the PR title could probably be something like: "Applying effects to otherwise in dataflow"

@ashivaram23 ashivaram23 changed the title Update MaybeUninitializedPlaces and MaybeInitializedPlaces for otherwise Apply effects to otherwise edge in dataflow analysis Jul 8, 2025
@ashivaram23
Copy link
Contributor Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 8, 2025
@ashivaram23
Copy link
Contributor Author

In the last commit I fixed the test file and incorporated the suggestions. I used a SmallVec since that's what SwitchTargets and other structs do, and I switched it to InactiveVariants for the double negative.

@tmiasko
Copy link
Contributor

tmiasko commented Jul 8, 2025

To me it seems like it should be called on the block's predecessor, but it's currently being passed the block itself.

Yes, it should have been called on the predecessor which contains the switch.

I haven't followed all the discussion, but if removing unused switch int handling in the backward direction makes anything easier that seems perfectly fine.

Copy link
Member

@dianqk dianqk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


// Ensures there are no drops for the wildcard match arm.

// EMIT_MIR otherwise_drops.result_ok.ElaborateDrops.after.mir
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you want to add //@ compile-flags: -Cpanic=abort or // EMIT_MIR_FOR_EACH_PANIC_STRATEGY. I prefer the first one here.

fn result_ok(result: Result<String, ()>) -> Option<String> {
// CHECK-LABEL: fn result_ok(
// CHECK-NOT: drop
// CHECK: return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// CHECK: return

The return statement isn't the final line of this function.

Comment on lines 16 to 18
fn main() {
result_ok(Err(()));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn main() {
result_ok(Err(()));
}

Removing this function makes file check easier.

@dianqk
Copy link
Member

dianqk commented Jul 8, 2025

@bors2 try @rust-timer queue

@rust-timer

This comment has been minimized.

@rust-bors
Copy link

rust-bors bot commented Jul 8, 2025

⌛ Trying commit 881c62f with merge 237f435

To cancel the try build, run the command @bors2 try cancel.

rust-bors bot added a commit that referenced this pull request Jul 8, 2025
Apply effects to `otherwise` edge in dataflow analysis

<!-- homu-ignore:start -->
<!--
If this PR is related to an unstable feature or an otherwise tracked effort,
please link to the relevant tracking issue here. If you don't know of a related
tracking issue or there are none, feel free to ignore this.

This PR will get automatically assigned to a reviewer. In case you would like
a specific user to review your work, you can assign it to them by using

    r? <reviewer name>
-->
<!-- homu-ignore:end -->

This allows `ElaborateDrops` to remove drops when a `match` wildcard arm covers multiple no-Drop enum variants. It modifies dataflow analysis to update the `MaybeUninitializedPlaces` and `MaybeInitializedPlaces` data for a block reached through an `otherwise` edge.

This appears to fix #142705, but I don't know for sure if it's actually correct (are there cases where this would be wrong and break things?). I also haven't tested that it actually improves compile times, or machine code output outside of the examples in the issue.
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 8, 2025
@rust-bors
Copy link

rust-bors bot commented Jul 8, 2025

☀️ Try build successful (CI)
Build commit: 237f435 (237f435668e899ed46d943ff30ee6f097b6b03fb, parent: 45b80ac21a454d343833aad763ef604510c88375)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (237f435): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

mean range count
Regressions ❌
(primary)
0.7% [0.7%, 0.7%] 1
Regressions ❌
(secondary)
0.6% [0.2%, 0.9%] 13
Improvements ✅
(primary)
-0.4% [-1.1%, -0.1%] 15
Improvements ✅
(secondary)
-0.3% [-0.5%, -0.1%] 7
All ❌✅ (primary) -0.3% [-1.1%, 0.7%] 16

Max RSS (memory usage)

Results (primary 2.7%, secondary 3.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
4.8% [3.5%, 6.2%] 2
Regressions ❌
(secondary)
3.2% [3.2%, 3.2%] 1
Improvements ✅
(primary)
-1.7% [-1.7%, -1.7%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 2.7% [-1.7%, 6.2%] 3

Cycles

Results (primary 1.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
2.4% [2.4%, 2.5%] 2
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-2.0% [-2.0%, -2.0%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.0% [-2.0%, 2.5%] 3

Binary size

Results (primary -0.1%, secondary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
0.3% [0.0%, 1.4%] 10
Regressions ❌
(secondary)
0.0% [0.0%, 0.1%] 4
Improvements ✅
(primary)
-0.1% [-1.5%, -0.0%] 63
Improvements ✅
(secondary)
-0.1% [-0.2%, -0.0%] 51
All ❌✅ (primary) -0.1% [-1.5%, 1.4%] 73

Bootstrap: 467.247s -> 463.296s (-0.85%)
Artifact size: 372.36 MiB -> 372.30 MiB (-0.02%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 8, 2025
@ashivaram23
Copy link
Contributor Author

Squashed and force pushed, the only changes since the last commit are addressing the test file review comments.

Regarding the perf run results, there are 8 cases this could affect:

  1. MaybeInitializedPlaces in MIR type check liveness analysis
  2. MaybeUninitializedPlaces in MIR borrow check
  3. MaybeInitializedPlaces in SanityCheck
  4. MaybeUninitializedPlaces in SanityCheck
  5. MaybeInitializedPlaces in lint_tail_expr_drop_order
  6. MaybeInitializedPlaces in RemoveUninitDrops
  7. MaybeInitializedPlaces in ElaborateDrops
  8. MaybeUninitializedPlaces in ElaborateDrops

The first perf run handled otherwise edges everywhere, the second did it only for 6 and 7, and the third did it for 6-8.

The first has the strongest results, with a lot of improvements and a few regressions. (Also a runtime test regression, which seems to follow a step function where it fluctuates between either exactly 0 or exactly 1.9% relative to the baseline over the past 30 days and went back to 1.9% here, not sure what that means)

The second still shows an overall improvement in compile times but applies to much fewer tests. It also has a different runtime regression with the same binary/step pattern.

The third has more regressions and fewer improvements, and an overall regression in compile times for all benchmarks (mostly due to the secondary benchmarks especially tt-muncher, it's a small overall improvement for the primary ones only). But there are no runtime test regressions and bootstrap times are four seconds faster.

Should I just enable the change for all 8 cases again? Or make it as limited as possible (number 7 alone is enough to fix the wildcard problem), or try other combinations?

@dianqk
Copy link
Member

dianqk commented Jul 9, 2025

These can be subsequent PRs. Thanks!
@bors r+

@bors
Copy link
Collaborator

bors commented Jul 9, 2025

📌 Commit c7ef03a has been approved by dianqk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
perf-regression Performance regression. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Drop is called for match wildcards even when all matched variants have no Drop