-
Notifications
You must be signed in to change notification settings - Fork 193
merge_batcher: use Rust iterators and VecDeque
in place of VecQueue
#380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
frankmcsherry
merged 1 commit into
TimelyDataflow:master
from
petrosagg:remove-vecqueue
Feb 27, 2023
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This original code is UB since it is using
get_unchecked
to index out of bounds of the vector's contents. The vector this is indexing into has length 0 due to theset_len
below.Since rust-lang/rust#116915 (Rust 1.76+, nightly-2023-12-05+), this UB is going to cause pretty much anything calling this code in differential-dataflow 0.12.0 to miscompile. I found this by way of a miscompile of the cargo-tally crate.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the subtext here that it would help to ship 0.13.0? I think 0.12.0 is multiple years old at this point, and perhaps it's worth a new, utterly incompatible version that has less
unsafe
in it? (I/we mostly use the repo head rather than anything crates.io; you might be one of the few who do).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not blocked on an official 0.13.0 being cut since I grabbed the code from master yesterday and published it to https://crates.io/crates/differential-dataflow-master myself. Cargo-tally can use it like this: dtolnay/cargo-tally@24a1946. For my purposes, this is completely fine — it lets you folks keep developing against head and lets me use snapshots of head in my crate whenever I like.
If you do 0.13.0 it would mostly be for the benefit of anyone else using differential-dataflow from crates.io. Right now 0.12.0 is effectively unusable, I think, and that was worth calling your attention to (also because the PR description says "all uses of the API were correct" and did not identify that this code is really problematic). But it does not need to translate into urgency to publish 0.13.0.
I had originally looked into whether backporting this single fix onto 0.12.0 as a 0.12.1 release would fix the miscompiles. I first confirmed that the commit in which this PR merged (6ae61ad) does not have the miscompile whereas its parent commit (cc88469) still does, so this is definitely the relevant fix. I backported it onto 7bc5338 (the v0.12.0 tag) as dtolnay-contrib@46edd1f. However that still miscompiled. There must be some other invalid unsafe code elsewhere that got removed earlier than this PR — you mentioned overall there is less
unsafe
now compared to 3 years ago. I figured finding all the rest of what would need to be backported was not worthwhile so this is when I switched to the approach of publishing the code from master.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We merged #413 recently, which also removes some unsafe code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Erm, it seems less problematic to be totally honest; certainly at the time. Afaict it matches the PPYP pattern, and the main thing that has changed is what Rust views as UB (to be fair, afaiu little enough is defined in the first place). The Rust 1.76 change certainly seems to mess up the PPYP pattern; is there something else that Rust folks recommend instead now?
More generally, any recommendations for where to follow and perhaps litigate that certain "optimizations" around
unsafe
are potentially harmful? It's a recurring anxiety for anyone who wants solid performance, that Rust might just eventually deem any use ofunsafe
UB and misoptimize it.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, pretty much all use of
unsafe
is invalid, so this checks out. :D I think there are two uses ofunsafe
at the moment to work around LLVM not optimizingsplit_at_mut
very well (inconsolidation.rs
). Everything else has been removed.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I'm here because I was sifting through backlinks)
Out-of-bounds indexing via
get_unchecked
has been documented to be UB since Rust 1.38.0, which was published in late November 2019. Not that I really expect anyone to be aware that documentation has been updated, but there was no recent change here.The UB here has also been reliably detected by Miri for a long time, and if it is not possible to run Miri, the standard library has robust debug assertions that you can exercise today with cargo-careful or with
-Zbuild-std
. Some of those checks should soon be on by default without such tools.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the heads up! Unfortunately the referenced code predates 1.38, back when the method had no documented UB. =/
Fwiw, we've run Miri a fair amount, in particular to find a miscompilation induced by another crate, and it did not detect anything. I don't know if that is surprising.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, it's not. In the past, out-of-bounds
slice::get_unchecked
was detected almost by accident by Stacked Borrows, so if you pass-Zmiri-disable-stacked-borrows
or-Zmiri-tree-borrows
, as I suspect some number of codebases need to in order to execute much of anything, it's wasn't checked at all. Miri understands the optimization hint that landed in 1.76 though. So this isn't a problem anymore.The other problem is just the nature of dynamic checkers. If well-formed/non-malicious inputs never cause indexing out of bounds... well you'll never get a report from Miri. I've been meaning to write a helper to run Miri on a minimized fuzzing corpus to assist with this scenario. Still not a soundness proof though.