Skip to content

Commit a00fa96

Browse files
committed
Auto merge of #2479 - saethlin:tag-gc, r=oli-obk
Implement a garbage collector for tags The general approach here is to scan TLS, all locals, and the main memory map for all provenance, accumulating a `HashSet` of all pointer tags which are stored anywhere (we also have a special case for panic payloads). Then we iterate over every borrow stack and remove tags which are not in said `HashSet`, or which could be terminating a SRW block. Runtime of benchmarks decreases by between 17% and 81%. GC off: ``` Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/backtraces/Cargo.toml Time (mean ± σ): 7.080 s ± 0.249 s [User: 6.870 s, System: 0.202 s] Range (min … max): 6.933 s … 7.521 s 5 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options. Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/mse/Cargo.toml Time (mean ± σ): 1.875 s ± 0.031 s [User: 1.630 s, System: 0.245 s] Range (min … max): 1.825 s … 1.910 s 5 runs Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/serde1/Cargo.toml Time (mean ± σ): 2.785 s ± 0.075 s [User: 2.536 s, System: 0.168 s] Range (min … max): 2.698 s … 2.851 s 5 runs Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/serde2/Cargo.toml Time (mean ± σ): 6.267 s ± 0.066 s [User: 6.072 s, System: 0.190 s] Range (min … max): 6.152 s … 6.314 s 5 runs Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/slice-get-unchecked/Cargo.toml Time (mean ± σ): 4.733 s ± 0.080 s [User: 4.177 s, System: 0.513 s] Range (min … max): 4.681 s … 4.874 s 5 runs Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/unicode/Cargo.toml Time (mean ± σ): 3.770 s ± 0.034 s [User: 3.549 s, System: 0.211 s] Range (min … max): 3.724 s … 3.819 s 5 runs ``` GC on: ``` Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/backtraces/Cargo.toml Time (mean ± σ): 5.886 s ± 0.054 s [User: 5.696 s, System: 0.182 s] Range (min … max): 5.799 s … 5.937 s 5 runs Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/mse/Cargo.toml Time (mean ± σ): 936.4 ms ± 7.0 ms [User: 815.4 ms, System: 119.6 ms] Range (min … max): 925.7 ms … 945.0 ms 5 runs Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/serde1/Cargo.toml Time (mean ± σ): 2.126 s ± 0.022 s [User: 1.979 s, System: 0.146 s] Range (min … max): 2.089 s … 2.143 s 5 runs Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/serde2/Cargo.toml Time (mean ± σ): 4.242 s ± 0.066 s [User: 4.051 s, System: 0.160 s] Range (min … max): 4.196 s … 4.357 s 5 runs Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/slice-get-unchecked/Cargo.toml Time (mean ± σ): 907.4 ms ± 2.4 ms [User: 788.6 ms, System: 118.2 ms] Range (min … max): 903.5 ms … 909.4 ms 5 runs Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/unicode/Cargo.toml Time (mean ± σ): 1.821 s ± 0.011 s [User: 1.687 s, System: 0.133 s] Range (min … max): 1.802 s … 1.831 s 5 runs ``` But much more importantly for me this drops the peak memory usage of the first 1 minute of running `regex`'s tests from 103 GB to 1.7 GB. Thanks to `@oli-obk` for suggesting a while ago that this was possible and `@darksonn` for reminding me that we can just search through memory to find Provenance to locate pointers. Fixes #1367
2 parents 3886a63 + f59605c commit a00fa96

File tree

12 files changed

+243
-8
lines changed

12 files changed

+243
-8
lines changed

.github/workflows/ci.yml

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,10 @@ jobs:
3434
steps:
3535
- uses: actions/checkout@v3
3636

37+
- name: Set the tag GC interval to 1 on linux
38+
if: runner.os == 'macOS'
39+
run: echo "MIRIFLAGS=-Zmiri-tag-gc=1" >> $GITHUB_ENV
40+
3741
# We install gnu-tar because BSD tar is buggy on macOS builders of GHA.
3842
# See <https://github.com/actions/cache/issues/403>.
3943
- name: Install GNU tar

README.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -323,6 +323,10 @@ environment variable. We first document the most relevant and most commonly used
323323
ensure alignment. (The standard library `align_to` method works fine in both modes; under
324324
symbolic alignment it only fills the middle slice when the allocation guarantees sufficient
325325
alignment.)
326+
* `-Zmiri-tag-gc=<blocks>` configures how often the pointer tag garbage collector runs. The default
327+
is to search for and remove unreachable tags once every `10,000` basic blocks. Setting this to
328+
`0` disables the garbage collector, which causes some programs to have explosive memory usage
329+
and/or super-linear runtime.
326330

327331
The remaining flags are for advanced use only, and more likely to change or be removed.
328332
Some of these are **unsound**, which means they can lead

ci.sh

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -31,7 +31,7 @@ function run_tests {
3131
# optimizations up all the way).
3232
# Optimizations change diagnostics (mostly backtraces), so we don't check them
3333
#FIXME(#2155): we want to only run the pass and panic tests here, not the fail tests.
34-
MIRIFLAGS="-O -Zmir-opt-level=4" MIRI_SKIP_UI_CHECKS=1 ./miri test -- tests/{pass,panic}
34+
MIRIFLAGS="${MIRIFLAGS:-} -O -Zmir-opt-level=4" MIRI_SKIP_UI_CHECKS=1 ./miri test -- tests/{pass,panic}
3535
fi
3636

3737
## test-cargo-miri

src/bin/miri.rs

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -521,6 +521,12 @@ fn main() {
521521
Err(err) => show_error!("-Zmiri-report-progress requires a `u32`: {}", err),
522522
};
523523
miri_config.report_progress = Some(interval);
524+
} else if let Some(param) = arg.strip_prefix("-Zmiri-tag-gc=") {
525+
let interval = match param.parse::<u32>() {
526+
Ok(i) => i,
527+
Err(err) => show_error!("-Zmiri-tag-gc requires a `u32`: {}", err),
528+
};
529+
miri_config.gc_interval = interval;
524530
} else if let Some(param) = arg.strip_prefix("-Zmiri-measureme=") {
525531
miri_config.measureme_out = Some(param.to_string());
526532
} else if let Some(param) = arg.strip_prefix("-Zmiri-backtrace=") {

src/concurrency/thread.rs

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -289,6 +289,10 @@ impl<'mir, 'tcx: 'mir> ThreadManager<'mir, 'tcx> {
289289
&mut self.threads[self.active_thread].stack
290290
}
291291

292+
pub fn iter(&self) -> impl Iterator<Item = &Thread<'mir, 'tcx>> {
293+
self.threads.iter()
294+
}
295+
292296
pub fn all_stacks(
293297
&self,
294298
) -> impl Iterator<Item = &[Frame<'mir, 'tcx, Provenance, FrameData<'tcx>>]> {

src/eval.rs

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -132,6 +132,8 @@ pub struct MiriConfig {
132132
/// The location of a shared object file to load when calling external functions
133133
/// FIXME! consider allowing users to specify paths to multiple SO files, or to a directory
134134
pub external_so_file: Option<PathBuf>,
135+
/// Run a garbage collector for SbTags every N basic blocks.
136+
pub gc_interval: u32,
135137
}
136138

137139
impl Default for MiriConfig {
@@ -164,6 +166,7 @@ impl Default for MiriConfig {
164166
report_progress: None,
165167
retag_fields: false,
166168
external_so_file: None,
169+
gc_interval: 10_000,
167170
}
168171
}
169172
}

src/lib.rs

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -62,6 +62,7 @@ mod operator;
6262
mod range_map;
6363
mod shims;
6464
mod stacked_borrows;
65+
mod tag_gc;
6566

6667
// Establish a "crate-wide prelude": we often import `crate::*`.
6768

@@ -110,6 +111,7 @@ pub use crate::range_map::RangeMap;
110111
pub use crate::stacked_borrows::{
111112
CallId, EvalContextExt as StackedBorEvalContextExt, Item, Permission, SbTag, Stack, Stacks,
112113
};
114+
pub use crate::tag_gc::EvalContextExt as _;
113115

114116
/// Insert rustc arguments at the beginning of the argument list that Miri wants to be
115117
/// set per default, for maximal validation power.

src/machine.rs

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -394,6 +394,11 @@ pub struct Evaluator<'mir, 'tcx> {
394394

395395
/// Handle of the optional shared object file for external functions.
396396
pub external_so_lib: Option<(libloading::Library, std::path::PathBuf)>,
397+
398+
/// Run a garbage collector for SbTags every N basic blocks.
399+
pub(crate) gc_interval: u32,
400+
/// The number of blocks that passed since the last SbTag GC pass.
401+
pub(crate) since_gc: u32,
397402
}
398403

399404
impl<'mir, 'tcx> Evaluator<'mir, 'tcx> {
@@ -469,6 +474,8 @@ impl<'mir, 'tcx> Evaluator<'mir, 'tcx> {
469474
lib_file_path.clone(),
470475
)
471476
}),
477+
gc_interval: config.gc_interval,
478+
since_gc: 0,
472479
}
473480
}
474481

@@ -1008,6 +1015,7 @@ impl<'mir, 'tcx> Machine<'mir, 'tcx> for Evaluator<'mir, 'tcx> {
10081015

10091016
fn before_terminator(ecx: &mut InterpCx<'mir, 'tcx, Self>) -> InterpResult<'tcx> {
10101017
ecx.machine.basic_block_count += 1u64; // a u64 that is only incremented by 1 will "never" overflow
1018+
ecx.machine.since_gc += 1;
10111019
// Possibly report our progress.
10121020
if let Some(report_progress) = ecx.machine.report_progress {
10131021
if ecx.machine.basic_block_count % u64::from(report_progress) == 0 {
@@ -1016,6 +1024,16 @@ impl<'mir, 'tcx> Machine<'mir, 'tcx> for Evaluator<'mir, 'tcx> {
10161024
});
10171025
}
10181026
}
1027+
1028+
// Search for SbTags to find all live pointers, then remove all other tags from borrow
1029+
// stacks.
1030+
// When debug assertions are enabled, run the GC as often as possible so that any cases
1031+
// where it mistakenly removes an important tag become visible.
1032+
if ecx.machine.gc_interval > 0 && ecx.machine.since_gc >= ecx.machine.gc_interval {
1033+
ecx.machine.since_gc = 0;
1034+
ecx.garbage_collect_tags()?;
1035+
}
1036+
10191037
// These are our preemption points.
10201038
ecx.maybe_preempt_active_thread();
10211039
Ok(())

src/shims/tls.rs

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -233,6 +233,12 @@ impl<'tcx> TlsData<'tcx> {
233233
data.remove(&thread_id);
234234
}
235235
}
236+
237+
pub fn iter(&self, mut visitor: impl FnMut(&Scalar<Provenance>)) {
238+
for scalar in self.keys.values().flat_map(|v| v.data.values()) {
239+
visitor(scalar);
240+
}
241+
}
236242
}
237243

238244
impl<'mir, 'tcx: 'mir> EvalContextPrivExt<'mir, 'tcx> for crate::MiriEvalContext<'mir, 'tcx> {}

src/stacked_borrows/mod.rs

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -80,6 +80,8 @@ pub struct Stacks {
8080
history: AllocHistory,
8181
/// The set of tags that have been exposed inside this allocation.
8282
exposed_tags: FxHashSet<SbTag>,
83+
/// Whether this memory has been modified since the last time the tag GC ran
84+
modified_since_last_gc: bool,
8385
}
8486

8587
/// Extra global state, available to the memory access hooks.
@@ -422,6 +424,7 @@ impl<'tcx> Stack {
422424
let item = self.get(idx).unwrap();
423425
Stack::item_popped(&item, global, dcx)?;
424426
}
427+
425428
Ok(())
426429
}
427430

@@ -496,6 +499,20 @@ impl<'tcx> Stack {
496499
}
497500
// # Stacked Borrows Core End
498501

502+
/// Integration with the SbTag garbage collector
503+
impl Stacks {
504+
pub fn remove_unreachable_tags(&mut self, live_tags: &FxHashSet<SbTag>) {
505+
if self.modified_since_last_gc {
506+
for stack in self.stacks.iter_mut_all() {
507+
if stack.len() > 64 {
508+
stack.retain(live_tags);
509+
}
510+
}
511+
self.modified_since_last_gc = false;
512+
}
513+
}
514+
}
515+
499516
/// Map per-stack operations to higher-level per-location-range operations.
500517
impl<'tcx> Stacks {
501518
/// Creates a new stack with an initial tag. For diagnostic purposes, we also need to know
@@ -514,6 +531,7 @@ impl<'tcx> Stacks {
514531
stacks: RangeMap::new(size, stack),
515532
history: AllocHistory::new(id, item, current_span),
516533
exposed_tags: FxHashSet::default(),
534+
modified_since_last_gc: false,
517535
}
518536
}
519537

@@ -528,6 +546,7 @@ impl<'tcx> Stacks {
528546
&mut FxHashSet<SbTag>,
529547
) -> InterpResult<'tcx>,
530548
) -> InterpResult<'tcx> {
549+
self.modified_since_last_gc = true;
531550
for (offset, stack) in self.stacks.iter_mut(range.start, range.size) {
532551
let mut dcx = dcx_builder.build(&mut self.history, offset);
533552
f(stack, &mut dcx, &mut self.exposed_tags)?;

0 commit comments

Comments
 (0)