Skip to content

Commit e2b665f

Browse files
AlanSternpaulmckrcu
authored andcommitted
tools/memory-model: Explain syntactic and semantic dependencies
Paul Heidekrüger pointed out that the Linux Kernel Memory Model documentation doesn't mention the distinction between syntactic and semantic dependencies. This is an important difference, because the compiler can easily break dependencies that are only syntactic, not semantic. This patch adds a few paragraphs to the LKMM documentation explaining these issues and illustrating how they can matter. Suggested-by: Paul Heidekrüger <paul.heidekrueger@in.tum.de> Reviewed-by: Akira Yokosawa <akiyks@gmail.com> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
1 parent da12301 commit e2b665f

File tree

1 file changed

+51
-0
lines changed

1 file changed

+51
-0
lines changed

tools/memory-model/Documentation/explanation.txt

Lines changed: 51 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -485,6 +485,57 @@ have R ->po X. It wouldn't make sense for a computation to depend
485485
somehow on a value that doesn't get loaded from shared memory until
486486
later in the code!
487487

488+
Here's a trick question: When is a dependency not a dependency? Answer:
489+
When it is purely syntactic rather than semantic. We say a dependency
490+
between two accesses is purely syntactic if the second access doesn't
491+
actually depend on the result of the first. Here is a trivial example:
492+
493+
r1 = READ_ONCE(x);
494+
WRITE_ONCE(y, r1 * 0);
495+
496+
There appears to be a data dependency from the load of x to the store
497+
of y, since the value to be stored is computed from the value that was
498+
loaded. But in fact, the value stored does not really depend on
499+
anything since it will always be 0. Thus the data dependency is only
500+
syntactic (it appears to exist in the code) but not semantic (the
501+
second access will always be the same, regardless of the value of the
502+
first access). Given code like this, a compiler could simply discard
503+
the value returned by the load from x, which would certainly destroy
504+
any dependency. (The compiler is not permitted to eliminate entirely
505+
the load generated for a READ_ONCE() -- that's one of the nice
506+
properties of READ_ONCE() -- but it is allowed to ignore the load's
507+
value.)
508+
509+
It's natural to object that no one in their right mind would write
510+
code like the above. However, macro expansions can easily give rise
511+
to this sort of thing, in ways that often are not apparent to the
512+
programmer.
513+
514+
Another mechanism that can lead to purely syntactic dependencies is
515+
related to the notion of "undefined behavior". Certain program
516+
behaviors are called "undefined" in the C language specification,
517+
which means that when they occur there are no guarantees at all about
518+
the outcome. Consider the following example:
519+
520+
int a[1];
521+
int i;
522+
523+
r1 = READ_ONCE(i);
524+
r2 = READ_ONCE(a[r1]);
525+
526+
Access beyond the end or before the beginning of an array is one kind
527+
of undefined behavior. Therefore the compiler doesn't have to worry
528+
about what will happen if r1 is nonzero, and it can assume that r1
529+
will always be zero regardless of the value actually loaded from i.
530+
(If the assumption turns out to be wrong the resulting behavior will
531+
be undefined anyway, so the compiler doesn't care!) Thus the value
532+
from the load can be discarded, breaking the address dependency.
533+
534+
The LKMM is unaware that purely syntactic dependencies are different
535+
from semantic dependencies and therefore mistakenly predicts that the
536+
accesses in the two examples above will be ordered. This is another
537+
example of how the compiler can undermine the memory model. Be warned.
538+
488539

489540
THE READS-FROM RELATION: rf, rfi, and rfe
490541
-----------------------------------------

0 commit comments

Comments
 (0)