Skip to content

Commit 8b2ec77

Browse files
authored
Autoharness Subcommand (#3874)
This PR introduces an initial implementation for the `autoharness` subcommand and a book chapter describing the feature. I recommend reading the book chapter before reviewing the code (or proceeding further in this PR description). ## Callouts `--harness` is to manual harnesses what `--include-function` and `--exclude-function` are to autoharness; both allow the user to filter which harnesses/functions get verified. Their implementation is different, however--`--harness` is a driver-only flag, i.e., we still compile every harness, but then only call CBMC on the ones specified in `--harness`. `--include-function` and `--exclude-function` get passed to the compiler. I made this design choice to try to be more aggressive about improving compiler performance--if a user only asks to verify one function and we go try to codegen a thousand, the compiler will take much longer than it needs to. I thought this more aggressive optimization was warranted given that crates are likely to have far many more functions eligible for autoverification than manual Kani harnesses. (See also the limitations in the book chapter). ## Testing Besides the new tests in this PR, I also ran against [s2n-quic](https://github.com/aws/s2n-quic) to confirm that the subcommand works on larger crates. Towards #3832 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.
1 parent 4f78926 commit 8b2ec77

File tree

55 files changed

+1516
-101
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

55 files changed

+1516
-101
lines changed

docs/src/SUMMARY.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,7 @@
1818
- [Reference](./reference.md)
1919
- [Attributes](./reference/attributes.md)
2020
- [Experimental features](./reference/experimental/experimental-features.md)
21+
- [Automatic Harness Generation](./reference/experimental/autoharness.md)
2122
- [Coverage](./reference/experimental/coverage.md)
2223
- [Stubbing](./reference/experimental/stubbing.md)
2324
- [Contracts](./reference/experimental/contracts.md)
Lines changed: 84 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,84 @@
1+
# Automatic Harness Generation
2+
3+
Recall the harness for `estimate_size` that we wrote in [First Steps](../../tutorial-first-steps.md):
4+
```rust
5+
{{#include ../../tutorial/first-steps-v1/src/lib.rs:kani}}
6+
```
7+
8+
This harness first declares a local variable `x` using `kani::any()`, then calls `estimate_size` with argument `x`.
9+
Many proof harnesses follow this predictable format—to verify a function `foo`, we create arbitrary values for each of `foo`'s arguments, then call `foo` on those arguments.
10+
11+
The `autoharness` subcommand leverages this observation to automatically generate and run harnesses.
12+
Kani scans the crate for functions whose arguments all implement the `kani::Arbitrary` trait, generates harnesses for them, then runs them.
13+
These harnesses are internal to Kani--i.e., Kani does not make any changes to your source code.
14+
15+
## Usage
16+
Run either:
17+
```
18+
# cargo kani autoharness -Z unstable-options
19+
```
20+
or
21+
```
22+
# kani autoharness -Z unstable-options <FILE>
23+
```
24+
25+
If Kani detects that all of a function `foo`'s arguments implement `kani::Arbitrary`, it will generate and run a `#[kani::proof]` harness, which prints:
26+
27+
```
28+
Autoharness: Checking function foo against all possible inputs...
29+
<VERIFICATION RESULTS>
30+
```
31+
32+
However, if Kani detects that `foo` has a [contract](./contracts.md), it will instead generate a `#[kani::proof_for_contract]` harness and verify the contract:
33+
```
34+
Autoharness: Checking function foo's contract against all possible inputs...
35+
<VERIFICATION RESULTS>
36+
```
37+
38+
Kani generates and runs these harnesses internally—the user only sees the verification results.
39+
40+
The `autoharness` subcommand has options `--include-function` and `--exclude-function` to include and exclude particular functions.
41+
These flags look for partial matches against the fully qualified name of a function.
42+
43+
For example, if a module `my_module` has many functions, but we are only interested in `my_module::foo` and `my_module::bar`, we can run:
44+
```
45+
cargo run autoharness -Z unstable-options --include-function foo include-function bar
46+
```
47+
To exclude `my_module` entirely, run:
48+
```
49+
cargo run autoharness -Z unstable-options --exclude-function my_module
50+
```
51+
52+
## Example
53+
Using the `estimate_size` example from [First Steps](../../tutorial-first-steps.md) again:
54+
```rust
55+
{{#include ../../tutorial/first-steps-v1/src/lib.rs:code}}
56+
```
57+
58+
We get:
59+
60+
```
61+
# cargo kani autoharness -Z unstable-options
62+
Autoharness: Checking function estimate_size against all possible inputs...
63+
RESULTS:
64+
Check 3: estimate_size.assertion.1
65+
- Status: FAILURE
66+
- Description: "Oh no, a failing corner case!"
67+
[...]
68+
69+
Verification failed for - estimate_size
70+
Complete - 0 successfully verified functions, 1 failures, 1 total.
71+
```
72+
73+
## Request for comments
74+
This feature is experimental and is therefore subject to change.
75+
If you have ideas for improving the user experience of this feature,
76+
please add them to [this GitHub issue](https://github.com/model-checking/kani/issues/3832).
77+
78+
## Limitations
79+
Kani will only generate an automatic harness for a function if it can determine that all of the function's arguments implement Arbitrary.
80+
It does not attempt to derive/implement Arbitrary for any types, even if those types could implement Arbitrary.
81+
82+
If a function contains a loop with a loop contract, Kani will detect the presence of a loop contract and verify that contract.
83+
If, however, the loop does not have a contract, then there is currently no way to specify an unwinding bound for the function, meaning that Kani may hang as it tries to unwind the loop.
84+
We recommend using the `--exclude-function` option to exclude any functions that have this issue (or `--harness-timeout` to bail after attempting verification for some amount of time).

kani-compiler/src/args.rs

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -30,6 +30,9 @@ pub enum ReachabilityType {
3030
PubFns,
3131
/// Start the cross-crate reachability analysis from all *test* (i.e. `#[test]`) harnesses in the local crate.
3232
Tests,
33+
/// Start the cross-crate reachability analysis from all functions in the local crate.
34+
/// Currently, this mode is only used for automatic harness generation.
35+
AllFns,
3336
}
3437

3538
/// Command line arguments that this instance of the compiler run was called
@@ -88,6 +91,12 @@ pub struct Arguments {
8891
/// Print the final LLBC file to stdout.
8992
#[clap(long)]
9093
pub print_llbc: bool,
94+
/// If we are running the autoharness subcommand, the functions to autoharness
95+
#[arg(long = "autoharness-include-function", num_args(1))]
96+
pub autoharness_included_functions: Vec<String>,
97+
/// If we are running the autoharness subcommand, the functions to exclude from autoverification
98+
#[arg(long = "autoharness-exclude-function", num_args(1))]
99+
pub autoharness_excluded_functions: Vec<String>,
91100
}
92101

93102
#[derive(Debug, Clone, Copy, AsRefStr, EnumString, VariantNames, PartialEq, Eq)]

kani-compiler/src/codegen_aeneas_llbc/compiler_interface.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -253,7 +253,7 @@ impl CodegenBackend for LlbcCodegenBackend {
253253
units.store_modifies(&modifies_instances);
254254
units.write_metadata(&queries, tcx);
255255
}
256-
ReachabilityType::Tests => todo!(),
256+
ReachabilityType::Tests | ReachabilityType::AllFns => todo!(),
257257
ReachabilityType::None => {}
258258
ReachabilityType::PubFns => {
259259
let unit = CodegenUnit::default();

kani-compiler/src/codegen_cprover_gotoc/compiler_interface.rs

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -264,7 +264,7 @@ impl CodegenBackend for GotocCodegenBackend {
264264
let reachability = queries.args().reachability_analysis;
265265
let mut results = GotoCodegenResults::new(tcx, reachability);
266266
match reachability {
267-
ReachabilityType::Harnesses => {
267+
ReachabilityType::AllFns | ReachabilityType::Harnesses => {
268268
let mut units = CodegenUnits::new(&queries, tcx);
269269
let mut modifies_instances = vec![];
270270
let mut loop_contracts_instances = vec![];
@@ -375,7 +375,9 @@ impl CodegenBackend for GotocCodegenBackend {
375375
// Print compilation report.
376376
results.print_report(tcx);
377377

378-
if reachability != ReachabilityType::Harnesses {
378+
if reachability != ReachabilityType::Harnesses
379+
&& reachability != ReachabilityType::AllFns
380+
{
379381
// In a workspace, cargo seems to be using the same file prefix to build a crate that is
380382
// a package lib and also a dependency of another package.
381383
// To avoid overriding the metadata for its verification, we skip this step when

kani-compiler/src/kani_middle/codegen_units.rs

Lines changed: 161 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -8,8 +8,11 @@
88
//! according to their stub configuration.
99
1010
use crate::args::ReachabilityType;
11-
use crate::kani_middle::attributes::is_proof_harness;
12-
use crate::kani_middle::metadata::{gen_contracts_metadata, gen_proof_metadata};
11+
use crate::kani_middle::attributes::{KaniAttributes, is_proof_harness};
12+
use crate::kani_middle::kani_functions::{KaniIntrinsic, KaniModel};
13+
use crate::kani_middle::metadata::{
14+
gen_automatic_proof_metadata, gen_contracts_metadata, gen_proof_metadata,
15+
};
1316
use crate::kani_middle::reachability::filter_crate_items;
1417
use crate::kani_middle::resolve::expect_resolve_fn;
1518
use crate::kani_middle::stubbing::{check_compatibility, harness_stub_map};
@@ -20,8 +23,8 @@ use rustc_middle::ty::TyCtxt;
2023
use rustc_session::config::OutputType;
2124
use rustc_smir::rustc_internal;
2225
use stable_mir::CrateDef;
23-
use stable_mir::mir::mono::Instance;
24-
use stable_mir::ty::{FnDef, IndexedVal, RigidTy, TyKind};
26+
use stable_mir::mir::{TerminatorKind, mono::Instance};
27+
use stable_mir::ty::{FnDef, GenericArgKind, GenericArgs, IndexedVal, RigidTy, TyKind};
2528
use std::collections::{BTreeMap, BTreeSet, HashMap, HashSet};
2629
use std::fs::File;
2730
use std::io::BufWriter;
@@ -57,26 +60,66 @@ pub struct CodegenUnit {
5760
impl CodegenUnits {
5861
pub fn new(queries: &QueryDb, tcx: TyCtxt) -> Self {
5962
let crate_info = CrateInfo { name: stable_mir::local_crate().name.as_str().into() };
60-
if queries.args().reachability_analysis == ReachabilityType::Harnesses {
61-
let base_filepath = tcx.output_filenames(()).path(OutputType::Object);
62-
let base_filename = base_filepath.as_path();
63-
let harnesses = filter_crate_items(tcx, |_, instance| is_proof_harness(tcx, instance));
64-
let all_harnesses = harnesses
65-
.into_iter()
66-
.map(|harness| {
67-
let metadata = gen_proof_metadata(tcx, harness, &base_filename);
68-
(harness, metadata)
69-
})
70-
.collect::<HashMap<_, _>>();
63+
let base_filepath = tcx.output_filenames(()).path(OutputType::Object);
64+
let base_filename = base_filepath.as_path();
65+
let args = queries.args();
66+
match args.reachability_analysis {
67+
ReachabilityType::Harnesses => {
68+
let all_harnesses = get_all_manual_harnesses(tcx, base_filename);
69+
// Even if no_stubs is empty we still need to store rustc metadata.
70+
let units = group_by_stubs(tcx, &all_harnesses);
71+
validate_units(tcx, &units);
72+
debug!(?units, "CodegenUnits::new");
73+
CodegenUnits { units, harness_info: all_harnesses, crate_info }
74+
}
75+
ReachabilityType::AllFns => {
76+
let mut all_harnesses = get_all_manual_harnesses(tcx, base_filename);
77+
let mut units = group_by_stubs(tcx, &all_harnesses);
78+
validate_units(tcx, &units);
7179

72-
// Even if no_stubs is empty we still need to store rustc metadata.
73-
let units = group_by_stubs(tcx, &all_harnesses);
74-
validate_units(tcx, &units);
75-
debug!(?units, "CodegenUnits::new");
76-
CodegenUnits { units, harness_info: all_harnesses, crate_info }
77-
} else {
78-
// Leave other reachability type handling as is for now.
79-
CodegenUnits { units: vec![], harness_info: HashMap::default(), crate_info }
80+
let kani_fns = queries.kani_functions();
81+
let kani_harness_intrinsic =
82+
kani_fns.get(&KaniIntrinsic::AutomaticHarness.into()).unwrap();
83+
let kani_any_inst = kani_fns.get(&KaniModel::Any.into()).unwrap();
84+
85+
let verifiable_fns = filter_crate_items(tcx, |_, instance: Instance| -> bool {
86+
// If the user specified functions to include or exclude, only allow instances matching those filters.
87+
let user_included = if !args.autoharness_included_functions.is_empty() {
88+
fn_list_contains_instance(&instance, &args.autoharness_included_functions)
89+
} else if !args.autoharness_excluded_functions.is_empty() {
90+
!fn_list_contains_instance(&instance, &args.autoharness_excluded_functions)
91+
} else {
92+
true
93+
};
94+
user_included
95+
&& is_eligible_for_automatic_harness(tcx, instance, *kani_any_inst)
96+
});
97+
let automatic_harnesses = get_all_automatic_harnesses(
98+
tcx,
99+
verifiable_fns,
100+
*kani_harness_intrinsic,
101+
base_filename,
102+
);
103+
// We generate one contract harness per function under contract, so each harness is in its own unit,
104+
// and these harnesses have no stubs.
105+
units.extend(
106+
automatic_harnesses
107+
.keys()
108+
.map(|harness| CodegenUnit {
109+
harnesses: vec![*harness],
110+
stubs: HashMap::default(),
111+
})
112+
.collect::<Vec<_>>(),
113+
);
114+
all_harnesses.extend(automatic_harnesses);
115+
// No need to validate the units again because validation only checks stubs, and we haven't added any stubs.
116+
debug!(?units, "CodegenUnits::new");
117+
CodegenUnits { units, harness_info: all_harnesses, crate_info }
118+
}
119+
_ => {
120+
// Leave other reachability type handling as is for now.
121+
CodegenUnits { units: vec![], harness_info: HashMap::default(), crate_info }
122+
}
80123
}
81124
}
82125

@@ -94,7 +137,15 @@ impl CodegenUnits {
94137
/// We flag that the harness contains usage of loop contracts.
95138
pub fn store_loop_contracts(&mut self, harnesses: &[Harness]) {
96139
for harness in harnesses {
97-
self.harness_info.get_mut(harness).unwrap().has_loop_contracts = true;
140+
let metadata = self.harness_info.get_mut(harness).unwrap();
141+
metadata.has_loop_contracts = true;
142+
// If we're generating this harness automatically and we encounter a loop contract,
143+
// ensure that the HarnessKind is updated to be a contract harness
144+
// targeting the function to verify.
145+
if metadata.is_automatically_generated {
146+
metadata.attributes.kind =
147+
HarnessKind::ProofForContract { target_fn: metadata.pretty_name.clone() }
148+
}
98149
}
99150
}
100151

@@ -252,3 +303,89 @@ fn apply_transitivity(tcx: TyCtxt, harness: Harness, stubs: Stubs) -> Stubs {
252303
}
253304
new_stubs
254305
}
306+
307+
/// Fetch all manual harnesses (i.e., functions provided by the user) and generate their metadata
308+
fn get_all_manual_harnesses(
309+
tcx: TyCtxt,
310+
base_filename: &Path,
311+
) -> HashMap<Harness, HarnessMetadata> {
312+
let harnesses = filter_crate_items(tcx, |_, instance| is_proof_harness(tcx, instance));
313+
harnesses
314+
.into_iter()
315+
.map(|harness| {
316+
let metadata = gen_proof_metadata(tcx, harness, &base_filename);
317+
(harness, metadata)
318+
})
319+
.collect::<HashMap<_, _>>()
320+
}
321+
322+
/// For each function eligible for automatic verification,
323+
/// generate a harness Instance for it, then generate its metadata.
324+
/// Note that the body of each harness instance is still the dummy body of `kani_harness_intrinsic`;
325+
/// the AutomaticHarnessPass will later transform the bodies of these instances to actually verify the function.
326+
fn get_all_automatic_harnesses(
327+
tcx: TyCtxt,
328+
verifiable_fns: Vec<Instance>,
329+
kani_harness_intrinsic: FnDef,
330+
base_filename: &Path,
331+
) -> HashMap<Harness, HarnessMetadata> {
332+
verifiable_fns
333+
.into_iter()
334+
.map(|fn_to_verify| {
335+
// Set the generic arguments of the harness to be the function it is verifying
336+
// so that later, in AutomaticHarnessPass, we can retrieve the function to verify
337+
// and generate the harness body accordingly.
338+
let harness = Instance::resolve(
339+
kani_harness_intrinsic,
340+
&GenericArgs(vec![GenericArgKind::Type(fn_to_verify.ty())]),
341+
)
342+
.unwrap();
343+
let metadata = gen_automatic_proof_metadata(tcx, &base_filename, &fn_to_verify);
344+
(harness, metadata)
345+
})
346+
.collect::<HashMap<_, _>>()
347+
}
348+
349+
/// Determine whether `instance` is eligible for automatic verification.
350+
fn is_eligible_for_automatic_harness(tcx: TyCtxt, instance: Instance, any_inst: FnDef) -> bool {
351+
// `instance` is ineligble if it is a harness or has an nonexistent/empty body
352+
if is_proof_harness(tcx, instance) || !instance.has_body() {
353+
return false;
354+
}
355+
let body = instance.body().unwrap();
356+
357+
// `instance` is ineligble if it is an associated item of a Kani trait implementation,
358+
// or part of Kani contract instrumentation.
359+
// FIXME -- find a less hardcoded way of checking the former condition (perhaps upstream PR to StableMIR).
360+
if instance.name().contains("kani::Arbitrary")
361+
|| instance.name().contains("kani::Invariant")
362+
|| KaniAttributes::for_instance(tcx, instance)
363+
.fn_marker()
364+
.is_some_and(|m| m.as_str() == "kani_contract_mode")
365+
{
366+
return false;
367+
}
368+
369+
// Each non-generic argument of `instance`` must implement Arbitrary.
370+
body.arg_locals().iter().all(|arg| {
371+
let kani_any_body =
372+
Instance::resolve(any_inst, &GenericArgs(vec![GenericArgKind::Type(arg.ty)]))
373+
.unwrap()
374+
.body()
375+
.unwrap();
376+
if let TerminatorKind::Call { func, .. } = &kani_any_body.blocks[0].terminator.kind {
377+
if let Some((def, args)) = func.ty(body.arg_locals()).unwrap().kind().fn_def() {
378+
return Instance::resolve(def, &args).is_ok();
379+
}
380+
}
381+
false
382+
})
383+
}
384+
385+
/// Return whether the name of `instance` is included in `fn_list`.
386+
/// If `exact = true`, then `instance`'s exact, fully-qualified name must be in `fn_list`; otherwise, `fn_list`
387+
/// can have a substring of `instance`'s name.
388+
fn fn_list_contains_instance(instance: &Instance, fn_list: &[String]) -> bool {
389+
let pretty_name = instance.name();
390+
fn_list.contains(&pretty_name) || fn_list.iter().any(|fn_name| pretty_name.contains(fn_name))
391+
}

kani-compiler/src/kani_middle/kani_functions.rs

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -48,6 +48,8 @@ pub enum KaniIntrinsic {
4848
CheckedAlignOf,
4949
#[strum(serialize = "CheckedSizeOfIntrinsic")]
5050
CheckedSizeOf,
51+
#[strum(serialize = "AutomaticHarnessIntrinsic")]
52+
AutomaticHarness,
5153
#[strum(serialize = "IsInitializedIntrinsic")]
5254
IsInitialized,
5355
#[strum(serialize = "ValidValueIntrinsic")]

0 commit comments

Comments
 (0)