Skip to content

[clang] Avoid printing overly large integer/_BitInt numbers in static assertion failure diagnostics #71675 #145053

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletion clang/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -647,7 +647,11 @@ Improvements to Clang's diagnostics
#GH69470, #GH59391, #GH58172, #GH46215, #GH45915, #GH45891, #GH44490,
#GH36703, #GH32903, #GH23312, #GH69874.


- Improved the performance of static assertions involving large integers by
using hex format instead of string for particularly large values.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we still using hex?


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't do a newline here I think? If you do it separates the things during rendering.

Fixes #GH71675

Improvements to Clang's time-trace
----------------------------------

Expand Down
4 changes: 3 additions & 1 deletion clang/lib/Sema/SemaDeclCXX.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,7 @@
#include "clang/Sema/SemaOpenMP.h"
#include "clang/Sema/Template.h"
#include "clang/Sema/TemplateDeduction.h"
#include "llvm/ADT/APSInt.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringExtras.h"
Expand Down Expand Up @@ -17575,7 +17576,8 @@ static bool ConvertAPValueToString(const APValue &V, QualType T,
break;
}
}
V.getInt().toString(Str);

V.getInt().toStringTruncated(Str);
}

break;
Expand Down
8 changes: 4 additions & 4 deletions clang/test/AST/ByteCode/intap.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -87,12 +87,12 @@ typedef unsigned __int128 uint128_t;
static const __uint128_t UINT128_MAX =__uint128_t(__int128_t(-1L));
static_assert(UINT128_MAX == -1, "");
static_assert(UINT128_MAX == 1, ""); // both-error {{static assertion failed}} \
// both-note {{'340282366920938463463374607431768211455 == 1'}}
// both-note {{'FFFFFFFFFFFFFFFFFFFF...FFFFFFFFFFFFFFFFFFFF == 1'}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is making these print in hex?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was not fixed yet I was hitting a problem where the division was throwing a Floting point error that I was still looking at.


static const __int128_t INT128_MAX = UINT128_MAX >> (__int128_t)1;
static_assert(INT128_MAX != 0, "");
static_assert(INT128_MAX == 0, ""); // both-error {{failed}} \
// both-note {{evaluates to '170141183460469231731687303715884105727 == 0'}}
// both-note {{evaluates to '7FFFFFFFFFFFFFFFFFFF...FFFFFFFFFFFFFFFFFFFF == 0'}}
static const __int128_t INT128_MIN = -INT128_MAX - 1;


Expand All @@ -113,14 +113,14 @@ namespace i128 {
static const __uint128_t UINT128_MAX =__uint128_t(__int128_t(-1L));
static_assert(UINT128_MAX == -1, "");
static_assert(UINT128_MAX == 1, ""); // both-error {{static assertion failed}} \
// both-note {{'340282366920938463463374607431768211455 == 1'}}
// both-note {{'FFFFFFFFFFFFFFFFFFFF...FFFFFFFFFFFFFFFFFFFF == 1'}}

constexpr uint128_t TooMuch = UINT128_MAX * 2;

static const __int128_t INT128_MAX = UINT128_MAX >> (__int128_t)1;
static_assert(INT128_MAX != 0, "");
static_assert(INT128_MAX == 0, ""); // both-error {{failed}} \
// both-note {{evaluates to '170141183460469231731687303715884105727 == 0'}}
// both-note {{evaluates to '7FFFFFFFFFFFFFFFFFFF...FFFFFFFFFFFFFFFFFFFF == 0'}}

constexpr int128_t TooMuch2 = INT128_MAX * INT128_MAX; // both-error {{must be initialized by a constant expression}} \
// both-note {{value 28948022309329048855892746252171976962977213799489202546401021394546514198529 is outside the range of representable}}
Expand Down
10 changes: 10 additions & 0 deletions clang/test/SemaCXX/static-assert.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -255,6 +255,16 @@ int f() {
}
}

namespace GH71675 {
constexpr unsigned _BitInt(__BITINT_MAXWIDTH__ >> 6) F = ~0; // expected-warning {{Clang extension}}
static_assert(F == 1,""); // expected-error {{static assertion failed due to requirement 'F == 1'}} \
// expected-note {{expression evaluates to '40141321820360630391...65812318570934173695 == 1'}}

constexpr unsigned _BitInt(__BITINT_MAXWIDTH__) G = ~0; // expected-warning {{Clang extension}}
static_assert(G == 1,""); // expected-error {{static assertion failed due to requirement 'G == 1'}} \
// expected-note {{expression evaluates to '...85551374411818336255 == 1'}}
} // namespace GH71675

namespace Diagnostics {
/// No notes for literals.
static_assert(false, ""); // expected-error {{failed}}
Expand Down
11 changes: 11 additions & 0 deletions llvm/include/llvm/ADT/APInt.h
Original file line number Diff line number Diff line change
Expand Up @@ -1686,6 +1686,17 @@ class [[nodiscard]] APInt {
bool UpperCase = true,
bool InsertSeparators = false) const;

/// Converts an APInt to a string and append it to Str. Str is commonly a
/// SmallString. If Radix > 10, UpperCase determine the case of letter
/// digits.
/// Very large numbers ar truncated and only the first and last 20 digits will
/// be added with an elipsis separating them.
LLVM_ABI void toStringTruncated(SmallVectorImpl<char> &Str, unsigned Radix,
bool Signed, bool truncate = true,
bool formatAsCLiteral = false,
bool UpperCase = true,
bool InsertSeparators = false) const;

/// Considers the APInt to be unsigned and converts it into a string in the
/// radix given. The radix can be 2, 8, 10 16, or 36.
void toStringUnsigned(SmallVectorImpl<char> &Str, unsigned Radix = 10) const {
Expand Down
6 changes: 6 additions & 0 deletions llvm/include/llvm/ADT/APSInt.h
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,12 @@ class [[nodiscard]] APSInt : public APInt {
}
using APInt::toString;

void toStringTruncated(SmallVectorImpl<char> &Str,
unsigned Radix = 10) const {
APInt::toStringTruncated(Str, Radix, isSigned());
}
using APInt::toStringTruncated;

/// If this int is representable using an int64_t.
bool isRepresentableByInt64() const {
// For unsigned values with 64 active bits, they technically fit into a
Expand Down
27 changes: 27 additions & 0 deletions llvm/lib/Support/APInt.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2164,6 +2164,13 @@ void APInt::fromString(unsigned numbits, StringRef str, uint8_t radix) {
void APInt::toString(SmallVectorImpl<char> &Str, unsigned Radix, bool Signed,
bool formatAsCLiteral, bool UpperCase,
bool InsertSeparators) const {
toStringTruncated(Str, Radix, Signed, false, formatAsCLiteral, UpperCase,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IS there ANY perf hit whatsoever for doing this? We're doing quite a bit of work on what is perhaps a hot-ish path, can you run the compiler perf tester thing on this? @Endilll knows how if you don't.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never run them. @Endhill if you could give me some pointers, appreciate it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See "Pre-Commit Measurements" section at https://llvm-compile-time-tracker.com/about.php

InsertSeparators);
}

void APInt::toStringTruncated(SmallVectorImpl<char> &Str, unsigned Radix,
bool Signed, bool truncate, bool formatAsCLiteral,
bool UpperCase, bool InsertSeparators) const {
assert((Radix == 10 || Radix == 8 || Radix == 16 || Radix == 2 ||
Radix == 36) &&
"Radix should be 2, 8, 10, 16, or 36!");
Expand Down Expand Up @@ -2278,8 +2285,28 @@ void APInt::toString(SmallVectorImpl<char> &Str, unsigned Radix, bool Signed,
}
} else {
int Pos = 0;
// The value of cutOffSize is not special, it is just a number of
// characters that gives us enough info without losing readability.
constexpr int cutOffSize = 20;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be an argument?

while (Tmp.getBoolValue()) {
uint64_t Digit;
if (truncate && Pos == cutOffSize) {
unsigned numDigits = (int32_t)(Tmp.logBase2()/log2(Radix))+1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
unsigned numDigits = (int32_t)(Tmp.logBase2()/log2(Radix))+1;
unsigned numDigits = static_cast<int32_t>(Tmp.logBase2()/log2(Radix))+1);

ALSO: Why cast to int32, but then assign to an unsigned? Should we be doing math as unsigned? or as signed?

Also, a comment on what this is doing could be helpful.

Also... is there value to pre-calculating log2(Radix) and keeping it somewhere so we don't re-evaluate it every time? or is log2 pure?

if(numDigits-cutOffSize > 0) {
// Calculating pow of exponents over 300000 takes a long time.
// To keep note printing time short(under 3s), values with more digits
// will only return the last 20 digits.
if(numDigits < 300000) {
APInt divider = APIntOps::pow(APInt(Tmp.getBitWidth(),Radix),numDigits-cutOffSize);
Tmp = Tmp.udiv(divider);
Str.append(3,'.');
}
else {
Str.append(3,'.');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would there be any way to find the 'first' digit or something in this case? I find myself wondering if there is a 'better' way here...

IS there any special-casing we can do here for base-10? I DO find myself wondering if pow is the best way for Radix==10. I know that V<<3 + v << 1 is the same as *10, I wonder if someone better at math than I could come up with a way to apply that as a way to do the 10^N as a simple set of shifts/math?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not 100% happy with that else either so I have been thinking about this.

I could not find any way to do it with just shifts also because if we want to support what toString supports, which I think makes sense, the else branch which does divisions instead of shift is for base 10 and base 36.

I had one alternative idea but I was still checking to see if it would work, the property I am trying to check is that if we take a word worth of bites in one of these multiword APInt we can work with it in isolation I mean that there is no digit that is half in one word half in the other.

The idea was that instead of looking at the number of digits already printed we could take a word worth of bites and use the usual approach that uses division to get some number of digit one by one. After that if we were truncating we would shift log2(getActiveWords-2*wordlenght) (might be -1 at that point rather than -2 actually). The idea is this should put us at the first active word, there we would continue using divisions to get the digits. This would give with the default word size some 9/10 digits from the end, and some arbitrary number from the beginning but at most again 9/10. Does it sound like this could work?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that sounds like it could work? I'm not sure of the size implications...

One thing to note, 'exactly 20' numbers in the 1st number isn't important. What IS important is 'enough of the first few that a couple of patterns are obvious', so the numbers have to be accurate, but the number OF them isn't important.

I'm not sure I have a better idea, but I'm still thinking :) Have an attempt at the above and see if it works out.

break;
}
}
}
udivrem(Tmp, Radix, Tmp, Digit);
assert(Digit < Radix && "divide failed");
if (InsertSeparators && Pos % Grouping == 0 && Pos > 0)
Expand Down