Commit 3d85abc

analyze: add scripts for computing pointwise metrics (#1074)
Adds scripts for computing "pointwise success rate" metrics. For each function, we run the static analysis and rewrite that function in isolation, producing a new `.rs` file where that function has been rewritten but all other code remains the same. Then we remove the `unsafe` qualifier from the target function and try to compile the code. The "pointwise success rate" is the number of functions on which this procedure succeeds.

The main entry point is `c2rust-analyze/scripts/run_pointwise_metrics_lighttpd.sh` (as the name suggests, this is designed to compute the success rate on lighttpd specifically). It uses a few helpers: `pointwise_try_build.sh` tries to remove `unsafe` and compile the rewritten code for a specific function, `pointwise_try_build_unmodified.sh` does the same but on the unmodified, non-rewritten code (used for computing a baseline success rate), and `pointwise_metrics.py` tallies up the results and prints overall counts.

Current output on lighttpd:

```
pointwise: 98/1008 functions passed
unmodified: 149/1008 functions passed
improved 20 functions
broke 71 functions
```

This PR depends on #1073, which implements the `pointwise` rewrite mode in `c2rust-analyze`.
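As a quick sanity check on the figures quoted above, the four counts should be mutually consistent: every function that passes after rewriting either already passed unmodified and was not broken, or was newly improved. The sketch below only re-derives that identity from the numbers in the commit message. The percentage denominators are an assumption taken from how `pointwise_metrics.py` in this commit defines them (improved over functions that failed unmodified, broke over functions that passed unmodified).

```python
# Figures copied from the commit message.
total = 1008
pointwise_passed = 98
unmodified_passed = 149
improved = 20   # pass after rewriting, fail unmodified
broke = 71      # pass unmodified, fail after rewriting

# Functions that pass after rewriting = (passed unmodified, not broken) + improved.
assert unmodified_passed - broke + improved == pointwise_passed

# Net effect of rewriting on the number of passing functions.
net_change = improved - broke


def pct(num, denom):
    """Return num/denom as a percentage."""
    return 100.0 * num / denom


unmodified_failed = total - unmodified_passed
print('net change: %+d functions' % net_change)
print('pointwise:  %.1f%% of all functions' % pct(pointwise_passed, total))
print('unmodified: %.1f%% of all functions' % pct(unmodified_passed, total))
print('improved:   %.1f%% of functions that failed unmodified' % pct(improved, unmodified_failed))
print('broke:      %.1f%% of functions that passed unmodified' % pct(broke, unmodified_passed))
```

The identity holds for the reported numbers, so the net change in passing functions equals `improved - broke`.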
2 parents d85b4d0 + 833e5f6 commit 3d85abc

3 files changed: +224 -0 lines changed

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
'''
Process logs to compute pointwise success rate metrics.

These metrics are measured as follows.  For each function, we run the static
analysis and rewrite that function in isolation, producing a new `.rs` file
where that function has been rewritten but all other code remains the same.
Then we remove the `unsafe` qualifier from the target function and try to
compile the code.  The "pointwise success rate" is the number of functions on
which this procedure succeeds.

As a performance optimization, instead of running analysis separately for each
function, we run `c2rust-analyze` with `--rewrite-mode pointwise`, which runs
the analysis part once and then rewrites each function in isolation using the
same analysis results.  This provides a significant speedup for large codebases
where the static analysis portion is very slow.

To provide a basis for comparison, in addition to attempting to compile all
pointwise rewrites, we also try removing `unsafe` and compiling each function
in the original, unmodified code.  This provides a baseline for how many
functions are "trivially safe" without rewriting.
'''

import re
import sys

# `pointwise_log_path` should be a log generated by running
# `pointwise_try_build.sh` on each output file of a pointwise rewrite
# (`foo.*.rs`, one per function).  The outputs for all files should be
# concatenated in a single log.  This gives the results of pointwise rewriting
# and compiling each function.
#
# `unmodified_log_path` should come from `pointwise_try_build_unmodified.sh`
# instead.  This gives results of pointwise compiling each function without
# rewriting.
pointwise_log_path, unmodified_log_path = sys.argv[1:]


# Matches the summary line that the build script prints for each function.
FUNC_ERRORS_RE = re.compile(r'^got ([0-9]+) errors for ([^ \n]+)$')

def read_func_errors(f):
    '''Parse log `f` into a dict mapping function name to error count.'''
    func_errors = {}
    for line in f:
        m = FUNC_ERRORS_RE.match(line)
        if m is None:
            continue
        func = m.group(2)
        errors = int(m.group(1))
        assert func not in func_errors, 'duplicate entry for %r' % func
        func_errors[func] = errors
    return func_errors

def percent(num, denom):
    '''Return `num / denom` as a percentage, or 0.0 when `denom` is zero.'''
    return num / denom * 100 if denom else 0.0

with open(pointwise_log_path) as f:
    pointwise_func_errors = read_func_errors(f)
pointwise_ok = set(func for func, errors in pointwise_func_errors.items() if errors == 0)
print('pointwise:  %5d/%d functions passed (%.1f%%)' % (
    len(pointwise_ok), len(pointwise_func_errors),
    percent(len(pointwise_ok), len(pointwise_func_errors))))

with open(unmodified_log_path) as f:
    unmodified_func_errors = read_func_errors(f)
unmodified_ok = set(func for func, errors in unmodified_func_errors.items() if errors == 0)
print('unmodified: %5d/%d functions passed (%.1f%%)' % (
    len(unmodified_ok), len(unmodified_func_errors),
    percent(len(unmodified_ok), len(unmodified_func_errors))))

assert len(pointwise_func_errors) == len(unmodified_func_errors)
num_total = len(pointwise_func_errors)
num_unmodified_ok = len(unmodified_ok)
num_unmodified_bad = num_total - num_unmodified_ok

# Functions that compile without `unsafe` only after rewriting.
improved = pointwise_ok - unmodified_ok
print('improved:   %5d/%d functions (%.1f%%)' % (
    len(improved), num_unmodified_bad, percent(len(improved), num_unmodified_bad)))
# Functions that compiled without `unsafe` before rewriting but no longer do.
broke = unmodified_ok - pointwise_ok
print('broke:      %5d/%d functions (%.1f%%)' % (
    len(broke), num_unmodified_ok, percent(len(broke), num_unmodified_ok)))
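To illustrate how the tallying above behaves, here is a minimal sketch that runs the same regex and set arithmetic over two tiny in-memory logs. The function names (`alpha`, `beta`, `gamma`) and the log contents are hypothetical, made up for illustration; only the `got N errors for <func>` line format comes from the scripts in this commit.

```python
import io
import re

# Same pattern as in the metrics script: one summary line per function.
FUNC_ERRORS_RE = re.compile(r'^got ([0-9]+) errors for ([^ \n]+)$')


def read_func_errors(f):
    """Map each function name to its error count, skipping unrelated lines."""
    func_errors = {}
    for line in f:
        m = FUNC_ERRORS_RE.match(line)
        if m is not None:
            func_errors[m.group(2)] = int(m.group(1))
    return func_errors


# Hypothetical logs; non-matching lines (echoed variables, error text) are ignored.
pointwise_log = io.StringIO(
    'f=src/main.alpha.rs\n'
    'got 0 errors for alpha\n'
    'got 2 errors for beta\n'
    'mismatched types\n'
    'got 0 errors for gamma\n'
)
unmodified_log = io.StringIO(
    'got 3 errors for alpha\n'
    'got 0 errors for beta\n'
    'got 0 errors for gamma\n'
)

pointwise_ok = {f for f, n in read_func_errors(pointwise_log).items() if n == 0}
unmodified_ok = {f for f, n in read_func_errors(unmodified_log).items() if n == 0}

improved = sorted(pointwise_ok - unmodified_ok)  # pass only after rewriting
broke = sorted(unmodified_ok - pointwise_ok)     # pass only before rewriting
print('improved:', improved)
print('broke:', broke)
```

In this toy input, `alpha` counts as improved and `beta` as broken, while `gamma` passes in both runs and appears in neither set.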
Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
#!/bin/bash
set -euo pipefail

echo

f=$1
mode=$2
shift 2
flags=( "$@" )
echo "f=$f"
echo "mode=$mode"

# Crate name: the file name up to its first `.` (`main` for `src/main.foo.rs`).
# Strip the directory first so that dots in directory names (such as `../`)
# are not mistaken for the start of the `.<func>.rs` suffix.
name=${f##*/}
name=${name%%.*}
echo "name=$name"

# Function name: the component between the last two dots of the path.
func=${f%.rs}
func=${func##*.}
echo "func=$func"

# Print the message of each error-level diagnostic in rustc's JSON output,
# dropping the "aborting" summary and errors about calls to unsafe functions.
filter_errors() {
    jq 'select(.level == "error") | .message' -r |
        { grep -v -e '^aborting due to ' -e '^call to unsafe function is unsafe ' || true; }
}

case "$mode" in
    pointwise)
        sed -i -e "/fn $func\\>/s/\\<unsafe //" "$f"
        ;;
    unmodified)
        d="$(dirname "$f")"
        f="$d/${name}_safe_${func}.rs"
        cp "$d/$name.rs" "$f"
        sed -i -e "/fn $func\\>/s/\\<unsafe //" "$f"
        ;;
    *)
        echo "unsupported mode $mode" 1>&2
        exit 1
        ;;
esac

rustc --error-format json --emit metadata --crate-name "$name" "$f" "${flags[@]}" 2>"rustc-$func.json" || true
num_lines="$(filter_errors <"rustc-$func.json" | wc -l)"
echo "got $num_lines errors for $func"
if [[ "$num_lines" -eq 0 ]]; then
    exit 0
else
    filter_errors <"rustc-$func.json"
    exit 1
fi
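The two less obvious steps in the script above are splitting a path such as `src/main.foo.rs` into a crate name and a function name, and counting only the diagnostics that matter. The following is a rough Python sketch of both, not a replacement for the shell code. It assumes one JSON object per line with `level` and `message` fields, which are the fields the `jq` filter reads. The helper name `split_rewrite_path` and the sample diagnostics are hypothetical.

```python
import json
import os

# Message prefixes that the shell `grep -v` drops.
IGNORED_PREFIXES = (
    'aborting due to ',
    'call to unsafe function is unsafe ',
)


def split_rewrite_path(path):
    """Split `dir/<name>.<func>.rs` into (name, func), using only the file name."""
    base = os.path.basename(path)
    if not base.endswith('.rs'):
        raise ValueError('expected a .rs file: %r' % path)
    stem = base[:-len('.rs')]
    name = stem.split('.', 1)[0]    # text before the first dot
    func = stem.rsplit('.', 1)[1]   # text after the last dot
    return name, func


def filter_errors(json_lines):
    """Return the message lines of error-level diagnostics that are not ignored."""
    kept = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        diag = json.loads(line)
        if diag.get('level') != 'error':
            continue
        # `grep` works per output line, so multi-line messages are split too.
        for msg_line in diag['message'].splitlines():
            if not msg_line.startswith(IGNORED_PREFIXES):
                kept.append(msg_line)
    return kept


# Hypothetical diagnostics, reduced to the two fields used here.
sample = [
    json.dumps({'level': 'error', 'message': 'mismatched types'}),
    json.dumps({'level': 'warning', 'message': 'unused variable: `x`'}),
    json.dumps({'level': 'error',
                'message': 'call to unsafe function is unsafe and requires unsafe function or block'}),
    json.dumps({'level': 'error', 'message': 'aborting due to 2 previous errors'}),
]

print(split_rewrite_path('../proj/src/main.foo.rs'))
print(filter_errors(sample))
```

Only the `mismatched types` entry survives the filter here, so this sample would be reported as one error for the function.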
Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
#!/bin/bash
set -euo pipefail

# Run pointwise metrics on lighttpd_rust_amalgamated.

if [[ $# -ne 1 ]]; then
    echo "Usage: $0 <path/to/lighttpd_rust_amalgamated/>" 1>&2
    exit 1
fi

SCRIPT_DIR="$(dirname "$0")"

# Get the path to lighttpd_rust_amalgamated
MODULE_DIR="$1"
shift 1

# Find the sysroot directory of rustc
SYSROOT="$(rustc --print sysroot)"

# Find the necessary rlibs
extern() {
    local name=$1
    local rlib=$(find "$MODULE_DIR/target/debug/deps" -name "lib${name}*.rlib" -print -quit)
    echo >&2 "found rlib for $name: $rlib"
    # Callers expand this unquoted so that it splits into two arguments.
    echo --extern "$name=$rlib"
}

now=$(date +%Y%m%d-%H%M%S)


# Set $rustc_flags and run the analysis as appropriate for the target project.
# $rustc_flags is also used below for `pointwise_try_build.sh`.
project="$(basename "$MODULE_DIR")"
case "$project" in
    lighttpd_*)
        rustc_flags=(
            --edition 2021
            --crate-type rlib
            #--sysroot "$SYSROOT"
            -L "dependency=$MODULE_DIR/target/debug/deps"
            $(extern c2rust_bitfields)
            $(extern libc)
            -A warnings
        )

        C2RUST_ANALYZE_NO_CARGO=1 \
        C2RUST_ANALYZE_REWRITE_MODE=pointwise \
        C2RUST_ANALYZE_USE_MANUAL_SHIMS=1 \
        cargo run --bin c2rust-analyze --release -- "$MODULE_DIR/src/main.rs" \
            --crate-name "$(basename "$MODULE_DIR")" \
            "${rustc_flags[@]}" \
            |& tee "pointwise-lighttpd-analyze-$now.log" \
            || true

        ;;

    cfs_*)
        # The leading `:` turns this command into a no-op, so the analysis is
        # not actually run for this project.
        : cargo run --bin c2rust-analyze --release -- \
            --rewrite-mode pointwise --use-manual-shims -- \
            build --manifest-path "$MODULE_DIR/Cargo.toml" \
            |& tee "pointwise-cfs-analyze-$now.log" \
            || true

        rustc_flags=(
            --edition 2021
            --crate-type rlib
            #--sysroot "$SYSROOT"
            -L "dependency=$MODULE_DIR/target/debug/deps"
            $(extern c2rust_bitfields)
            $(extern f128)
            $(extern libc)
            $(extern memoffset)
            -A warnings
        )

        ;;

    *)
        echo "unsupported project $project" 1>&2
        exit 1
esac


# Try to compile each function separately.

pointwise_log_file=pointwise-lighttpd-pointwise-$now.log
for f in "$MODULE_DIR"/src/main.*.rs; do
    "$SCRIPT_DIR/pointwise_try_build.sh" "$f" pointwise "${rustc_flags[@]}" || true
done |& tee "$pointwise_log_file"

unmodified_log_file=pointwise-lighttpd-unmodified-$now.log
for f in "$MODULE_DIR"/src/main.*.rs; do
    "$SCRIPT_DIR/pointwise_try_build.sh" "$f" unmodified "${rustc_flags[@]}" || true
done |& tee "$unmodified_log_file"

echo
echo

python3 "$SCRIPT_DIR/pointwise_metrics.py" "$pointwise_log_file" "$unmodified_log_file"
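The `extern` helper in the script above looks up a compiled dependency by file-name pattern and turns it into an `--extern name=path` pair for `rustc`. A rough Python equivalent is sketched below to make that lookup explicit. The function name `extern_flag` and the file name `liblibc-0123abcd.rlib` are hypothetical. Unlike the shell helper, which prints an empty path when nothing matches, this sketch raises an error, and it sorts the matches so the choice is deterministic (the shell version takes whichever match `find` reports first).

```python
import tempfile
from pathlib import Path


def extern_flag(deps_dir, name):
    """Return ['--extern', 'name=<path>'] for an rlib matching lib<name>*.rlib."""
    matches = sorted(Path(deps_dir).rglob('lib%s*.rlib' % name))
    if not matches:
        raise FileNotFoundError('no rlib for %r under %s' % (name, deps_dir))
    return ['--extern', '%s=%s' % (name, matches[0])]


# Demonstrate the lookup against a throwaway directory with a fake rlib.
with tempfile.TemporaryDirectory() as deps:
    (Path(deps) / 'liblibc-0123abcd.rlib').touch()  # hypothetical file name
    flags = extern_flag(deps, 'libc')
    try:
        extern_flag(deps, 'memoffset')
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True

print(flags)
```

Failing loudly on a missing rlib is a design choice of this sketch; it surfaces a dependency that has not been built before any per-function compile is attempted.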
