Skip to content

[clang] Avoid printing overly large integer/_BitInt numbers in static assertion failure diagnostics #71675 #145053

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletion clang/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -647,7 +647,11 @@ Improvements to Clang's diagnostics
#GH69470, #GH59391, #GH58172, #GH46215, #GH45915, #GH45891, #GH44490,
#GH36703, #GH32903, #GH23312, #GH69874.


- Improved the performance of static assertions envolving large integers by
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Improved the performance of static assertions envolving large integers by
- Improved the performance of static assertions involving large integers by

using hex format instead of string.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
using hex format instead of string.
using hex format instead of string for particularly large values.


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't do a newline here I think? If you do it separates the things during rendering.

Fixes #GH71675

Improvements to Clang's time-trace
----------------------------------

Expand Down
17 changes: 16 additions & 1 deletion clang/lib/Sema/SemaDeclCXX.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,7 @@
#include "clang/Sema/SemaOpenMP.h"
#include "clang/Sema/Template.h"
#include "clang/Sema/TemplateDeduction.h"
#include "llvm/ADT/APSInt.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringExtras.h"
Expand Down Expand Up @@ -17575,7 +17576,21 @@ static bool ConvertAPValueToString(const APValue &V, QualType T,
break;
}
}
V.getInt().toString(Str);

llvm::APSInt vInt = V.getInt();
if (llvm::APSInt::compareValues(
vInt, llvm::APSInt::getUnsigned(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole bit here is equivilent to: isRepresentableByInt64

std::numeric_limits<uint64_t>::max())) >= 0 ||
vInt < std::numeric_limits<int64_t>::min()) {
// The value of cutSize is not special, it is just a number of
// characters that gives us enough info without losing readability.
const int cutSize = 20;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const int cutSize = 20;
constexpr int cutOffSize = 20;

vInt.toString(Str, 16);
Str.erase(Str.begin() + cutSize, Str.end() - cutSize);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This here doesn't really seem right. We're just erasing blindly in the middle, despite not knowing that we have enough room. I think what we REALLY want is to decide to keep the first N characters, and the last N characters (and still do the ...).

Additionally, I think instead of doing the whole to-string mechanism here (and since we're doing it in hex it is easy to do so), we could just do a numeric-trim of hte front/back instead and join them afterwards.

THAT SAID: I'm not sold on doing this as a hex-print, I think we can keep it decimal. We could have a new function in APInt that does a print only the first N and last N, with dots between kind of thing. If we have that, we don't actually have to do any branching here, because choosing a reasonable number there plus decent algorithm for it (basically, normal RHS processing until we reach 'N' digits, then figure out how many orders of magnitude to divide by using 'log' to figure out how to get the remaining 'N').

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of keeping the decimals. I think it avoids the weirdness happening in the enum test, which I think might have been caused by some characters getting cut in half. Working on it

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi just wanted to run something by you before I make further changes.
Before, we were using the value of the APInt to decide if we would truncate or not. I am tempted since we will be looking at the number of digits anyway to decide how many to skip to instead use that as the deciding factor, so something like, if the number of digits in the APInt value is larger than 40 (This is if we want to keep N=20) then we truncate keeping the first and last 20.
Were you thinking of some other way to decide what N should be by the way?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really know how this is supposed to work. We need to know what to display without printing the value to a string at all, because THAT is what's breaking our neck. The simplest solution would be to just check the value (or even getActiveBits()?) and if that is larger than some arbitrary value (you can just check that printing the note doesn't take longer than, say, 3 seconds), we simply skip the note altogether.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi just wanted to run something by you before I make further changes. Before, we were using the value of the APInt to decide if we would truncate or not. I am tempted since we will be looking at the number of digits anyway to decide how many to skip to instead use that as the deciding factor, so something like, if the number of digits in the APInt value is larger than 40 (This is if we want to keep N=20) then we truncate keeping the first and last 20. Were you thinking of some other way to decide what N should be by the way?

I had no problem with the "around 20 digits" on each side. My suggestion was to have the actual printing in APInt do the magic (via separate function). So algorithmically, it is something like:

unsigned end_digits_printed = 0;
// Print the 'end digits'.
while(value != 0 && end_digits_printed <20) {
  APInt temp;
  int64_t rem;
  APInt::udivrem(value, 10, temp, rem);//udiv becasue before this we figured out 'negative' and made it positive.
 
  //add-rem-to-buffer
  value = temp;
}
// print the '...'
if (value == 0) return;

// This gives more of a log-base-16 by dividing it by 4 I think (or should this be a multiply?), but gives us an approximation.
unsigned ApproxDigitsLeft = value.logBase2() / 4;

if (ApproxDigitsLeft > 20)
  value = value.udiv( 10 * (ApproxDigitsLeft - 20)); // removes the 'middle' approx all-but-20.


// print the leading part.
while (value != 0) {
   // same stuff as above.
}

So basically, we don't have to calculate the whole thing to figure out how many to remove, but we can be 'reasonably accurate' here. WDYT? ALSO, the base-16 to base-10 approximation might need some tweeking so that we always print some of the leading digits (though I think 'at least 20 hex digits' is... a lot to make sure of that.). We MIGHT want to make the top-half a little smaller since there is a hex->decimal relationship, so maybe make sure we're printing the top 16 hex-digit-worth?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, that is basically what I had in mind. I'm extending toString and there is a Pos variable as you have there counting how many digits were already inserted and we can get the number of remaing using value.logBase2/log2(Radix) +1 that actually gives the exact count. Then the division will be value.udiv( Radix ^ (ApproxDigitsLeft - 20)). I just need to get that pow working it's throwing a seg fault but I'll figure that out.

Copy link
Author

@maramatias maramatias Jun 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tbaederr I think the last comment from @erichkeane probably made it clearer but basically what we were discussing is creating a separate version of toString (tentatively naming it toStringTruncated) that does not print all of the number, the implementation of that function basically takes the number and does a huge number of divisions to get each digit in the resulting String, the approach suggested is that once we have printed 20 digits we do a division that jumps us to the first 20. I'm hoping I can have something to push in the next couple days time permitting.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can get the number of remaing using value.logBase2/log2(Radix) +1 that actually gives the exact count.

That sounds VAGUELY familiar :) Its been a few decades since I've been in a math class....

With the 'pow', you probably need SOME sort of guard to make sure you don't exceed the number we can store.

Str.insert(Str.begin() + cutSize, 3, '.');
} else {
vInt.toString(Str);
}
}

break;
Expand Down
8 changes: 4 additions & 4 deletions clang/test/AST/ByteCode/intap.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -87,12 +87,12 @@ typedef unsigned __int128 uint128_t;
static const __uint128_t UINT128_MAX =__uint128_t(__int128_t(-1L));
static_assert(UINT128_MAX == -1, "");
static_assert(UINT128_MAX == 1, ""); // both-error {{static assertion failed}} \
// both-note {{'340282366920938463463374607431768211455 == 1'}}
// both-note {{'FFFFFFFFFFFFFFFFFFFF...FFFFFFFFFFFFFFFFFFFF == 1'}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is making these print in hex?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was not fixed yet I was hitting a problem where the division was throwing a Floting point error that I was still looking at.


static const __int128_t INT128_MAX = UINT128_MAX >> (__int128_t)1;
static_assert(INT128_MAX != 0, "");
static_assert(INT128_MAX == 0, ""); // both-error {{failed}} \
// both-note {{evaluates to '170141183460469231731687303715884105727 == 0'}}
// both-note {{evaluates to '7FFFFFFFFFFFFFFFFFFF...FFFFFFFFFFFFFFFFFFFF == 0'}}
static const __int128_t INT128_MIN = -INT128_MAX - 1;


Expand All @@ -113,14 +113,14 @@ namespace i128 {
static const __uint128_t UINT128_MAX =__uint128_t(__int128_t(-1L));
static_assert(UINT128_MAX == -1, "");
static_assert(UINT128_MAX == 1, ""); // both-error {{static assertion failed}} \
// both-note {{'340282366920938463463374607431768211455 == 1'}}
// both-note {{'FFFFFFFFFFFFFFFFFFFF...FFFFFFFFFFFFFFFFFFFF == 1'}}

constexpr uint128_t TooMuch = UINT128_MAX * 2;

static const __int128_t INT128_MAX = UINT128_MAX >> (__int128_t)1;
static_assert(INT128_MAX != 0, "");
static_assert(INT128_MAX == 0, ""); // both-error {{failed}} \
// both-note {{evaluates to '170141183460469231731687303715884105727 == 0'}}
// both-note {{evaluates to '7FFFFFFFFFFFFFFFFFFF...FFFFFFFFFFFFFFFFFFFF == 0'}}

constexpr int128_t TooMuch2 = INT128_MAX * INT128_MAX; // both-error {{must be initialized by a constant expression}} \
// both-note {{value 28948022309329048855892746252171976962977213799489202546401021394546514198529 is outside the range of representable}}
Expand Down
2 changes: 1 addition & 1 deletion clang/test/Sema/enum.c
Original file line number Diff line number Diff line change
Expand Up @@ -196,7 +196,7 @@ enum GH59352 { // expected-warning {{enumeration values exceed range of largest
BigVal = 66666666666666666666wb
};
_Static_assert(BigVal == 66666666666666666666wb); /* expected-error {{static assertion failed due to requirement 'BigVal == 66666666666666666666wb'}}
expected-note {{expression evaluates to '11326434445538011818 == 66666666666666666666'}}
expected-note {{expression evaluates to '11326434445538011818 == 39D2F941E420AAAAA<U+0000><U+0000><U+0000>...<U+0000><U+0000><U+0000>39D2F941E420AAAAA'}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems VERY wrong here...

*/
_Static_assert(
_Generic(BigVal, // expected-error {{static assertion failed}}
Expand Down
6 changes: 6 additions & 0 deletions clang/test/SemaCXX/static-assert.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -255,6 +255,12 @@ int f() {
}
}

namespace GH71675 {
constexpr unsigned _BitInt(__BITINT_MAXWIDTH__ >> 6) F = ~0; // expected-warning {{Clang extension}}
static_assert(F == 1,""); // expected-error {{static assertion failed due to requirement 'F == 1'}} \
// expected-note {{expression evaluates to 'FFFFFFFFFFFFFFFFFFFF...FFFFFFFFFFFFFFFFFFFF == 1'}}
} // namespace GH71675

namespace Diagnostics {
/// No notes for literals.
static_assert(false, ""); // expected-error {{failed}}
Expand Down