# Summary
[summary]: #summary

- This RFC introduces the `#[optimize]` attribute, specifically its `#[optimize(size)]` variant for
- controlling optimisation level on a per-item basis.
+ This RFC introduces the `#[optimize]` attribute for controlling optimisation level on a per-item
+ basis.

# Motivation
[motivation]: #motivation
@@ -17,24 +17,26 @@ crate. With LTO and RLIB-only crates these options become applicable to a whole-
reduces the ability to control optimisation even further.

For applications such as embedded, it is critical that they satisfy the size constraints. This
- means, that code must consciously pick one or the other optimisation level. However, since
- optimisation level is increasingly applied program-wide, options like `-Copt-level=3` or
- `-Copt-level=s` are less and less useful – it is no longer feasible (and never was feasible with
- cargo) to use the former one for code where performance matters and the latter everywhere else.
+ means that code must consciously pick one or the other optimisation level. The absence of a method
+ to selectively optimise different parts of a program in different ways precludes users from
+ utilising the hardware they have to the greatest degree.

- With a C toolchain this is fairly easy to achieve by compiling the relevant objects with different
- options. In Rust ecosystem, however, where this concept does not exist, an alternate solution is
- necessary.
+ With a C toolchain, selective optimisation is fairly easy to achieve by compiling the relevant
+ codegen units (objects) with different options. In the Rust ecosystem, where the concept of such
+ units does not exist, an alternative solution is necessary.

- With `#[optimize(size)]` it is possible to annotate separate functions, so that they are optimized
- for size in a project otherwise optimized for speed (which is the default for `cargo --release`).
+ With the `#[optimize]` attribute it is possible to annotate the optimisation level of separate
+ items, so that they are optimized differently from the global optimisation option.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

- Sometimes, optimisations are a tradeoff between execution time and the code size. Some
+ ## `#[optimize(size)]`
+
+ Sometimes, optimisations are a trade-off between execution time and the code size. Some
optimisations, such as loop unrolling, increase code size many times on average (compared to
- original function size).
+ original function size) for marginal performance benefits. In case such optimisation is not
+ desirable…

```rust
#[optimize(size)]
@@ -43,7 +45,7 @@ fn banana() {
}
```

- Will instruct rustc to consider this tradeoff more carefully and avoid optimising in a way that
+ …will instruct rustc to consider this trade-off more carefully and avoid optimising in a way that
would result in larger code rather than a smaller one. It may also have an effect on what instructions
are selected to appear in the final binary.

@@ -55,26 +57,66 @@ Using this attribute is recommended when inspection of generated code reveals un
function or functions, but use of `-O` is still preferable over `-C opt-level=s` or `-C
opt-level=z`.

+ ## `#[optimize(speed)]`
+
+ Conversely, when one of the global optimisation options for code size is used (`-Copt-level=s` or
+ `-Copt-level=z`), profiling might reveal some functions that are unnecessarily “hot”. In that case,
+ those functions may be annotated with `#[optimize(speed)]` so that the compiler makes its best
+ effort to produce faster code.
+
+ ```rust
+ #[optimize(speed)]
+ fn banana() {
+     // code
+ }
+ ```
+
+ Much like with `#[optimize(size)]`, the `speed` counterpart is also a hint and will likely not
+ yield the same results as using the global optimisation option for speed.
+
# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

- The `#[optimize(size)]` attribute applied to a function definition will instruct the optimisation
- engine to avoid applying optimisations that could result in a size increase and machine code
- generator to generate code that’s smaller rather than larger.
+ The `#[optimize(size)]` attribute applied to an item will instruct the optimisation pipeline to
+ avoid applying optimisations that could result in a size increase and the machine code generator
+ to generate code that’s smaller rather than faster.
+
+ The `#[optimize(speed)]` attribute applied to an item will instruct the optimisation pipeline to
+ apply optimisations that are likely to yield performance wins and the machine code generator to
+ generate code that’s faster rather than smaller.
+
+ The `#[optimize]` attributes are just a hint to the compiler and are not guaranteed to result in
+ any different code.
+
+ If an `#[optimize]` attribute is applied to some grouping item (such as `mod` or a crate), it
+ propagates transitively to all items defined within the grouping item.
+
+ It is an error to specify multiple incompatible `#[optimize]` options on a single item at once.
+ A more explicit `#[optimize]` attribute overrides a propagated attribute.
+
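The propagation and override rules can be modelled in a few lines. This is an illustrative sketch only, not how rustc implements the attribute: `Hint`, `resolve`, and the slice-of-scopes representation are hypothetical names chosen for this example. Each slice element stands for the attribute written on one nesting level, outermost first, and the innermost explicit attribute wins.

```rust
/// Hint carried by an `#[optimize(...)]` attribute (hypothetical model type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Hint {
    Size,
    Speed,
}

/// Resolves the hint that applies to an item (hypothetical helper).
///
/// `scopes` lists the attribute written on each enclosing grouping item,
/// outermost first (crate, then modules), ending with the item itself.
/// `None` means no `#[optimize]` attribute was written at that level.
fn resolve(scopes: &[Option<Hint>]) -> Option<Hint> {
    // Walk from the innermost level outwards: the first explicit attribute
    // found overrides anything propagated from further out.
    scopes.iter().rev().find_map(|hint| *hint)
}
```

Under this model, `resolve(&[Some(Hint::Size), None, Some(Hint::Speed)])` describes a crate marked `optimize(size)` containing an unannotated module with a function marked `optimize(speed)`; the function's own attribute takes precedence. With `None` in the last position, the crate-level `Size` hint would propagate down instead.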
+ `#[optimize(speed)]` is a no-op when a global optimisation for speed option is set (i.e. `-C
+ opt-level=1-3`). Similarly `#[optimize(size)]` is a no-op when a global optimisation for size
+ option is set (i.e. `-C opt-level=s/z`). `#[optimize]` attributes are no-ops when no optimizations
+ are done globally (i.e. `-C opt-level=0`). In all other cases the *exact* interaction of the
+ `#[optimize]` attribute with the global optimization level is not specified and is left up to
+ the implementation to decide.
+
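The no-op cases listed above form a small decision table. The sketch below encodes it under the assumption that the global level can be reduced to four classes; `OptLevel`, `Hint`, `Effect`, and `effect` are hypothetical names for this example and do not correspond to compiler internals.

```rust
/// Global `-C opt-level` setting, grouped into classes (hypothetical model type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum OptLevel {
    /// `-C opt-level=0`
    Zero,
    /// `-C opt-level=1`, `2` or `3`
    Speed,
    /// `-C opt-level=s`
    Size,
    /// `-C opt-level=z`
    MinSize,
}

/// Hint carried by an `#[optimize(...)]` attribute (hypothetical model type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Hint {
    Size,
    Speed,
}

/// What the text says about a given combination.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Effect {
    /// The attribute changes nothing.
    NoOp,
    /// The exact interaction is left to the implementation.
    ImplementationDefined,
}

/// Classifies the interaction of a hint with the global level.
fn effect(global: OptLevel, hint: Hint) -> Effect {
    match (global, hint) {
        // No optimisations are run at all.
        (OptLevel::Zero, _) => Effect::NoOp,
        // The hint asks for what the whole build already does.
        (OptLevel::Speed, Hint::Speed) => Effect::NoOp,
        (OptLevel::Size, Hint::Size) => Effect::NoOp,
        (OptLevel::MinSize, Hint::Size) => Effect::NoOp,
        // Every remaining combination is unspecified by the text.
        _ => Effect::ImplementationDefined,
    }
}
```

Only the mixed cases, a size hint in a speed build or a speed hint in a size build, fall through to `ImplementationDefined`; those are the cases the attribute exists for.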
+ # Implementation approach
+
+ For the LLVM backend, these attributes may be implemented in the following manner:

- Note that the `#[optimize(size)]` attribute is just a hint and is not guaranteed to result in any
- different or smaller code.
+ `#[optimize(size)]` – explicit function attributes exist at the LLVM level. Items with
+ `optimize(size)` would simply apply the LLVM attributes to the functions.

- Since `#[optimize(size)]` instructs optimisations to behave in a certain way, this means that this
- attribute has no effect when no optimisations are run (such as is the case when `-Copt-level=0`).
- Interaction of this attribute with the `-Copt-level=s` and `-Copt-level=z` flags is not specified
- and is left up to implementation to decide.
+ `#[optimize(speed)]` in conjunction with `-C opt-level=s/z` – use a global optimisation level of
+ `-C opt-level=2/3` and apply the equivalent LLVM function attribute (`optsize`, `minsize`) to all
+ items which do not have an `#[optimize(speed)]` attribute.

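The mapping described for the LLVM backend can be sketched as a pure function from the global level and the item's hint to the LLVM function attributes that would be attached. This is a model under stated assumptions, not backend code: `OptLevel`, `Hint`, and `llvm_fn_attrs` are hypothetical names, `optimize(size)` is assumed to map to `optsize` alone, and `s`/`z` are assumed to map to `optsize`/`minsize` respectively, following the order in the text. A real backend might emit a different combination.

```rust
/// Global `-C opt-level` setting, grouped into classes (hypothetical model type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum OptLevel {
    /// `-C opt-level=0`
    Zero,
    /// `-C opt-level=1`, `2` or `3`
    Speed,
    /// `-C opt-level=s`
    Size,
    /// `-C opt-level=z`
    MinSize,
}

/// Hint carried by an `#[optimize(...)]` attribute (hypothetical model type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Hint {
    Size,
    Speed,
}

/// Returns the LLVM function attributes attached to one function in this model.
fn llvm_fn_attrs(global: OptLevel, hint: Option<Hint>) -> Vec<&'static str> {
    match (global, hint) {
        // No optimisations run, so the hints are no-ops.
        (OptLevel::Zero, _) => vec![],
        // Speed build: only items that ask for size get a size attribute
        // (assumed here to be `optsize`).
        (OptLevel::Speed, Some(Hint::Size)) => vec!["optsize"],
        (OptLevel::Speed, _) => vec![],
        // Size build, compiled at a speed level internally: items marked
        // `optimize(speed)` are left without a size attribute...
        (OptLevel::Size, Some(Hint::Speed)) => vec![],
        (OptLevel::MinSize, Some(Hint::Speed)) => vec![],
        // ...and every other item receives the attribute equivalent to the
        // requested global level.
        (OptLevel::Size, _) => vec!["optsize"],
        (OptLevel::MinSize, _) => vec!["minsize"],
    }
}
```

The design point this illustrates is that a size build with speed hints is expressed by inverting the default: the module is compiled at a speed level and the size attributes are attached per function, so that leaving an attribute off is what makes a function fast.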
# Drawbacks
[drawbacks]: #drawbacks

* Not all of the alternative codegen backends may be able to express such a request, hence the
- “this is an optimisation hint” note on the `#[optimize(size)]` attribute.
+ “this is a hint” note on the `#[optimize]` attribute.
* As a fallback, this attribute may be implemented in terms of more specific optimisation hints
(such as `inline(never)`, the future `unroll(never)` etc).

@@ -85,32 +127,35 @@ Proposed is a very semantic solution (describes the desired result, instead of b
problem of needing to sometimes inhibit some of the trade-off optimisations such as loop unrolling.

An alternative, of course, would be to add attributes controlling such optimisations, such as
- `#[unroll(no)]` on top of a a loop statement. There’s already precedent for this in the `#[inline]`
+ `#[unroll(no)]` on top of a loop statement. There’s already precedent for this in the `#[inline]`
annotations.

- The author would like to argue that we should eventually have *both*, the `#[optimize(size)]` for
- people who look at generated code and decide that it is too large, and the targetted attributes for
- people who know *why* the code is too large.
+ The author would like to argue that we should eventually have *both*: the `#[optimize]` for
+ people who look at generated code but are not willing to dig for exact reasons, and the targeted
+ attributes for people who know *why* the code is not satisfactory.

- Furthermore, currently `optimize(size)` is able to do more than any possible combination of
- targetted attributes would be able to such as influencing the instruction selection or switch
- codegen strategy (jump table, if chain, etc.) This makes the attribute useful even in presence of
- all the targetted optimisation knobs we might have in the future.
+ Furthermore, currently `optimize` is able to do more than any possible combination of targeted
+ attributes would be able to, such as influencing the instruction selection or switch codegen
+ strategy (jump table, if chain, etc.). This makes the attribute useful even in the presence of all
+ the targeted optimisation knobs we might have in the future.

# Prior art
[prior-art]: #prior-art

* LLVM: `optsize`, `optnone`, `minsize` function attributes (exposed in Clang in some way);
* GCC: `__attribute__((optimize))` function attribute which allows setting the optimisation level
and using certain(?) `-f` flags for each function;
- * IAR: Optimisations have a checkbox for “No size constraints”, which allows compiler to go out of
- its way to optimize without considering the size tradeoff. Can only be applied on a
- per-compilation-unit basis. Enabled by default, as is appropriate for a compiler targetting
+ * IAR: Optimisations have a check box for “No size constraints”, which allows the compiler to go out of
+ its way to optimize without considering the size trade-off. Can only be applied on a
+ per-compilation-unit basis. Enabled by default, as is appropriate for a compiler targeting
embedded use-cases.

# Unresolved questions
[unresolved]: #unresolved-questions

- * Should we support such an attribute at module-level? Crate-level?
- * If yes, should we also implement `optimize(always)`? `optimize(level=x)`?
- * Left for future discussion, but should make sure such extension is possible.
+ * Should we also implement `optimize(always)`? `optimize(level=x)`?
+   * Left for future discussion, but should make sure such extension is possible.
+ * Should there be any way to specify what global optimisation for speed level is used in
+   conjunction with the optimisation for size option (e.g. `-Copt-level=s3` could be equivalent to
+   `-Copt-level=3` and `#[optimize(size)]` on the crate item)?
+   * This may matter for users of `#[optimize(speed)]`.