Commit ea98116

[dfsan] Track field/index-level shadow values in variables

*************
* The problem
*************
See motivation examples in compiler-rt/test/dfsan/pair.cpp.

The current DFSan always uses a 16-bit shadow value for a variable of
any type by combining the shadow values of all bytes of the variable.
So it cannot distinguish two fields of a struct: each field's shadow
value equals the combined shadow value of all fields. This introduces
an overtaint issue.

Consider a parsing function

   std::pair<char*, int> get_token(char* p);

where p points to a buffer to parse, and the returned pair includes the
next token and the pointer to the position in the buffer after the
token.

If the token is tainted, then both the returned pointer and int are
tainted. If the parser keeps on using get_token for the rest of the
parsing, all the following outputs are tainted because of the tainted
pointer.

The CL is the first change to address the issue.

**************************
* The proposed improvement
**************************
Eventually all fields and indices have their own shadow values in
variables and memory.

For example, variables with type {i1, i3}, [2 x i1], {[2 x i4], i8},
[2 x {i1, i1}] have shadow values with type {i16, i16}, [2 x i16],
{[2 x i16], i16}, [2 x {i16, i16}] correspondingly; variables with
primary type still have shadow values i16.

***********************************
* A potential implementation plan
***********************************
The idea is to adopt the change incrementally.

1) This CL
Support field-level accuracy at variables/args/ret in TLS mode;
load/store/alloca still use combined shadow values.

After the alloca promotion and SSA construction phases (>=-O1), we
assume alloca and memory operations are reduced. So if struct
variables do not relate to memory, their tracking is accurate at
field level.

2) Support field-level accuracy at alloca
3) Support field-level accuracy at load/store

These two should make O0 and real memory access work.

4) Support vector if necessary.
5) Support Args mode if necessary.
6) Support passing more accurate shadow values via custom functions if
necessary.

***************
* About this CL.
***************
The CL did the following:

1) extended TLS arg/ret to work with aggregate types. This is similar
to what MSan does.

2) implemented how to map between an original type/value/zero-const and
its shadow type/value/zero-const.

3) extended (insert|extract)value to use field/index-level
propagation.

4) for other instructions, the propagation rules combine inputs by or.
The CL converts between aggregate and primary shadow values in these
cases.

5) Custom function interfaces also need such a conversion because
all existing custom functions use i16. It is unclear whether custom
functions need more accurate shadow propagation yet.

6) Added test cases for aggregate-type-related cases.

Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D92261
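
Items 2 and 4 of the list above describe a mapping between aggregate and primary shadow values. The following is a minimal, standalone sketch of that idea, not the code in the DFSan pass: the names `Shadow`, `leaf`, `aggregate`, `collapse`, and `expand` are hypothetical, and labels are assumed to combine by bitwise or, as in the fast-16-labels mode exercised by the tests.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical model, not the LLVM implementation.
using Label = std::uint16_t;

// A shadow value mirrors the shape of the original type: a leaf holds one
// 16-bit label, an aggregate holds one shadow per field or array index.
// (An aggregate with zero fields is indistinguishable from a leaf here.)
struct Shadow {
  Label label = 0;             // meaningful only for leaves
  std::vector<Shadow> fields;  // empty for leaves
  bool is_leaf() const { return fields.empty(); }
};

Shadow leaf(Label l) {
  Shadow s;
  s.label = l;
  return s;
}

Shadow aggregate(std::vector<Shadow> fields) {
  Shadow s;
  s.fields = std::move(fields);
  return s;
}

// Aggregate shadow -> primary shadow: or together every leaf label.
Label collapse(const Shadow &s) {
  if (s.is_leaf()) return s.label;
  Label combined = 0;
  for (const Shadow &f : s.fields) combined |= collapse(f);
  return combined;
}

// Primary shadow -> aggregate shadow: copy the label into every leaf of
// a shadow with the same shape as `shape`.
Shadow expand(const Shadow &shape, Label l) {
  if (shape.is_leaf()) return leaf(l);
  Shadow out;
  for (const Shadow &f : shape.fields) out.fields.push_back(expand(f, l));
  return out;
}

void self_check() {
  // Shadow for a value of type {[2 x i4], i8}, i.e. {[2 x i16], i16}.
  Shadow s = aggregate({aggregate({leaf(2), leaf(0)}), leaf(8)});
  assert(collapse(s) == 10);  // 2 | 0 | 8
  Shadow e = expand(s, 4);
  assert(e.fields[0].fields[1].label == 4);
  assert(e.fields[1].label == 4);
}
```

Collapsing and then expanding loses the per-field distinction, which is why the field-level result only survives along paths that never go through a combined shadow value.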
1 parent c9bc414 commit ea98116


9 files changed, +1362 -62 lines changed

compiler-rt/test/dfsan/pair.cpp

Lines changed: 32 additions & 1 deletion
@@ -1,4 +1,5 @@
-// RUN: %clangxx_dfsan %s -mllvm -dfsan-fast-16-labels -mllvm -dfsan-track-select-control-flow=false -mllvm -dfsan-combine-pointer-labels-on-load=false -o %t && %run %t
+// RUN: %clangxx_dfsan %s -mllvm -dfsan-fast-16-labels -mllvm -dfsan-track-select-control-flow=false -mllvm -dfsan-combine-pointer-labels-on-load=false -O0 -DO0 -o %t && %run %t
+// RUN: %clangxx_dfsan %s -mllvm -dfsan-fast-16-labels -mllvm -dfsan-track-select-control-flow=false -mllvm -dfsan-combine-pointer-labels-on-load=false -O1 -o %t && %run %t
 
 #include <algorithm>
 #include <assert.h>
@@ -64,29 +65,49 @@ void test_simple_constructors() {
   int i1 = pair1.second;
   int *ptr1 = pair1.first;
 
+#ifdef O0
   assert(dfsan_read_label(&i1, sizeof(i1)) == 10);
   assert(dfsan_read_label(&ptr1, sizeof(ptr1)) == 10);
+#else
+  assert(dfsan_read_label(&i1, sizeof(i1)) == 8);
+  assert(dfsan_read_label(&ptr1, sizeof(ptr1)) == 2);
+#endif
 
   std::pair<int *, int> pair2 = copy_pair1(pair1);
   int i2 = pair2.second;
   int *ptr2 = pair2.first;
 
+#ifdef O0
   assert(dfsan_read_label(&i2, sizeof(i2)) == 10);
   assert(dfsan_read_label(&ptr2, sizeof(ptr2)) == 10);
+#else
+  assert(dfsan_read_label(&i2, sizeof(i2)) == 8);
+  assert(dfsan_read_label(&ptr2, sizeof(ptr2)) == 2);
+#endif
 
   std::pair<int *, int> pair3 = copy_pair2(&pair1);
   int i3 = pair3.second;
   int *ptr3 = pair3.first;
 
+#ifdef O0
   assert(dfsan_read_label(&i3, sizeof(i3)) == 10);
   assert(dfsan_read_label(&ptr3, sizeof(ptr3)) == 10);
+#else
+  assert(dfsan_read_label(&i3, sizeof(i3)) == 8);
+  assert(dfsan_read_label(&ptr3, sizeof(ptr3)) == 2);
+#endif
 
   std::pair<int *, int> pair4 = copy_pair3(std::move(pair1));
   int i4 = pair4.second;
   int *ptr4 = pair4.first;
 
+#ifdef O0
   assert(dfsan_read_label(&i4, sizeof(i4)) == 10);
   assert(dfsan_read_label(&ptr4, sizeof(ptr4)) == 10);
+#else
+  assert(dfsan_read_label(&i4, sizeof(i4)) == 8);
+  assert(dfsan_read_label(&ptr4, sizeof(ptr4)) == 2);
+#endif
 }
 
 void test_branches() {
@@ -118,14 +139,24 @@ void test_branches() {
 
     {
       std::pair<const char *, uint32_t> r = return_ptr_and_i32(q, res);
+#ifdef O0
       assert(dfsan_read_label(&r.first, sizeof(r.first)) == 10);
       assert(dfsan_read_label(&r.second, sizeof(r.second)) == 10);
+#else
+      assert(dfsan_read_label(&r.first, sizeof(r.first)) == 2);
+      assert(dfsan_read_label(&r.second, sizeof(r.second)) == 8);
+#endif
     }
 
     {
       std::pair<const char *, uint64_t> r = return_ptr_and_i64(q, res);
+#ifdef O0
       assert(dfsan_read_label(&r.first, sizeof(r.first)) == 10);
       assert(dfsan_read_label(&r.second, sizeof(r.second)) == 10);
+#else
+      assert(dfsan_read_label(&r.first, sizeof(r.first)) == 2);
+      assert(dfsan_read_label(&r.second, sizeof(r.second)) == 8);
+#endif
     }
   }
 }
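
The expected labels in pair.cpp can be read off a small model. This is a sketch under stated assumptions rather than DFSan code: it assumes, as the assertions suggest, that the pointer carries label 2 and the integer carries label 8, and that with -dfsan-fast-16-labels a union of labels is a bitwise or. The names `PairShadow`, `combined_pair_shadow`, and `field_level_pair_shadow` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the shadow for a std::pair<int *, int>.
using Label = std::uint16_t;

struct PairShadow {
  Label first;   // label observed on pair.first  (the pointer)
  Label second;  // label observed on pair.second (the int)
};

// Combined tracking: both fields observe the union of all field labels.
// The -O0 run still expects this, since the pair lives in memory there and
// load/store/alloca keep using combined shadow values in this CL.
PairShadow combined_pair_shadow(Label ptr_label, Label int_label) {
  const Label all = static_cast<Label>(ptr_label | int_label);
  return PairShadow{all, all};
}

// Field-level tracking: each field keeps its own label. The -O1 run
// expects this once the pair is held in SSA values rather than memory.
PairShadow field_level_pair_shadow(Label ptr_label, Label int_label) {
  return PairShadow{ptr_label, int_label};
}

void self_check() {
  assert(combined_pair_shadow(2, 8).first == 10);    // 2 | 8
  assert(combined_pair_shadow(2, 8).second == 10);   // 2 | 8
  assert(field_level_pair_shadow(2, 8).first == 2);
  assert(field_level_pair_shadow(2, 8).second == 8);
}
```

Under these assumptions the value 10 in the O0 branches is the overtainted result, and the values 2 and 8 in the other branches are the per-field labels the change is meant to preserve.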

compiler-rt/test/dfsan/struct.c

Lines changed: 35 additions & 2 deletions
@@ -1,4 +1,7 @@
-// RUN: %clang_dfsan %s -o %t && %run %t
+// RUN: %clang_dfsan %s -O1 -mllvm -dfsan-fast-16-labels=true -DFAST16_O1 -o %t && %run %t
+// RUN: %clang_dfsan %s -O1 -DO1 -o %t && %run %t
+// RUN: %clang_dfsan %s -O0 -mllvm -dfsan-fast-16-labels=true -DFAST16_O0 -o %t && %run %t
+// RUN: %clang_dfsan %s -O0 -DO0 -o %t && %run %t
 
 #include <assert.h>
 #include <sanitizer/dfsan_interface.h>
@@ -35,9 +38,14 @@ Pair copy_pair2(const Pair pair0) {
 int main(void) {
   int i = 1;
   char *ptr = NULL;
+#if defined(FAST16_O1) || defined(FAST16_O0)
+  dfsan_label i_label = 1;
+  dfsan_label ptr_label = 2;
+#else
   dfsan_label i_label = dfsan_create_label("i", 0);
-  dfsan_set_label(i_label, &i, sizeof(i));
   dfsan_label ptr_label = dfsan_create_label("ptr", 0);
+#endif
+  dfsan_set_label(i_label, &i, sizeof(i));
   dfsan_set_label(ptr_label, &ptr, sizeof(ptr));
 
   Pair pair1 = make_pair(i, ptr);
@@ -46,32 +54,57 @@ int main(void) {
 
   dfsan_label i1_label = dfsan_read_label(&i1, sizeof(i1));
   dfsan_label ptr1_label = dfsan_read_label(&ptr1, sizeof(ptr1));
+#if defined(O0) || defined(O1)
   assert(dfsan_has_label(i1_label, i_label));
   assert(dfsan_has_label(i1_label, ptr_label));
   assert(dfsan_has_label(ptr1_label, i_label));
   assert(dfsan_has_label(ptr1_label, ptr_label));
+#elif defined(FAST16_O0)
+  assert(i1_label == (i_label | ptr_label));
+  assert(ptr1_label == (i_label | ptr_label));
+#else
+  assert(i1_label == i_label);
+  assert(ptr1_label == ptr_label);
+#endif
 
   Pair pair2 = copy_pair1(&pair1);
   int i2 = pair2.i;
   char *ptr2 = pair2.ptr;
 
   dfsan_label i2_label = dfsan_read_label(&i2, sizeof(i2));
   dfsan_label ptr2_label = dfsan_read_label(&ptr2, sizeof(ptr2));
+#if defined(O0) || defined(O1)
   assert(dfsan_has_label(i2_label, i_label));
   assert(dfsan_has_label(i2_label, ptr_label));
   assert(dfsan_has_label(ptr2_label, i_label));
   assert(dfsan_has_label(ptr2_label, ptr_label));
+#elif defined(FAST16_O0)
+  assert(i2_label == (i_label | ptr_label));
+  assert(ptr2_label == (i_label | ptr_label));
+#else
+  assert(i2_label == i_label);
+  assert(ptr2_label == ptr_label);
+#endif
 
   Pair pair3 = copy_pair2(pair1);
   int i3 = pair3.i;
   char *ptr3 = pair3.ptr;
 
   dfsan_label i3_label = dfsan_read_label(&i3, sizeof(i3));
   dfsan_label ptr3_label = dfsan_read_label(&ptr3, sizeof(ptr3));
+#if defined(O0) || defined(O1)
   assert(dfsan_has_label(i3_label, i_label));
   assert(dfsan_has_label(i3_label, ptr_label));
   assert(dfsan_has_label(ptr3_label, i_label));
   assert(dfsan_has_label(ptr3_label, ptr_label));
+#elif defined(FAST16_O0)
+  assert(i3_label == (i_label | ptr_label));
+  assert(ptr3_label == (i_label | ptr_label));
+#else
+  assert(i3_label == i_label);
+  assert(ptr3_label == ptr_label);
+#endif
+
 
   return 0;
 }
