Description
With test code like this:
template <class Span>
__attribute__((noinline)) int testForLoop(Span span) {
    int sum = 0;
    for (auto it = span.begin() + 1; it < span.end(); ++it) {
        sum += *it - *(it - 1);
    }
    return sum;
}
Without bounded iterators, Clang vectorizes this loop. Once we enable bounded iterators, though, Clang is no longer able to vectorize it.
For information, the current representation of __bounded_iter is as follows:
_Iterator __current_; // current iterator
_Iterator __begin_, __end_; // valid range represented as [begin, end]
I played around with changing the representation of __bounded_iter to see if that could make any difference, and it looks like the following representation instead allows Clang to vectorize this code:
size_t __current_; // current index inside [begin, end]
_Iterator __begin_, __end_; // valid range represented as [begin, end]
Funnily enough, it seems that this representation leads to better code than the version that doesn't use bounded iterators!
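For anyone who wants to poke at this outside of libc++, here is a rough, self-contained sketch of what an index-based bounded iterator could look like. To be clear, `IndexBoundedIter` and `BoundedSpan` are hypothetical names made up for illustration; this is not the actual `__bounded_iter` implementation, the trap-on-dereference check is an assumption about how the bounds check would be expressed, and I'm not claiming this exact code reproduces the codegen from the Godbolt links.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for an index-based bounded iterator.
// NOT libc++'s __bounded_iter; names and checks are assumptions.
template <class Iterator>
struct IndexBoundedIter {
    std::size_t current;  // current index inside [begin, end]
    Iterator begin, end;  // valid range represented as [begin, end]

    // Dereferencing aborts unless the index names an element in [begin, end).
    auto& operator*() const {
        if (current >= static_cast<std::size_t>(end - begin)) std::abort();
        return begin[current];
    }

    // Arithmetic only touches the index; the bounds never change.
    IndexBoundedIter& operator++() { ++current; return *this; }
    friend IndexBoundedIter operator+(IndexBoundedIter it, std::size_t n) {
        it.current += n;
        return it;
    }
    friend IndexBoundedIter operator-(IndexBoundedIter it, std::size_t n) {
        it.current -= n;
        return it;
    }

    // Comparison assumes both iterators refer to the same range.
    friend bool operator<(const IndexBoundedIter& a, const IndexBoundedIter& b) {
        return a.current < b.current;
    }
};

// Hypothetical minimal span-like wrapper that hands out the iterator above.
template <class T>
struct BoundedSpan {
    T* data;
    std::size_t size;
    IndexBoundedIter<T*> begin() const { return {0, data, data + size}; }
    IndexBoundedIter<T*> end() const { return {size, data, data + size}; }
};

// Same loop as in the test code at the top of this issue.
template <class Span>
int testForLoop(Span span) {
    int sum = 0;
    for (auto it = span.begin() + 1; it < span.end(); ++it) {
        sum += *it - *(it - 1);
    }
    return sum;
}
```

With this shape, the loop condition compares two plain indices and the bounds check compares an index against a loop-invariant length, which may be part of why the optimizer has an easier time, but that is speculation on my part.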
Godbolt with a pointer (status quo): https://godbolt.org/z/efa77anqK
Godbolt with the index: https://godbolt.org/z/f9n5zs8sv