-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
Make more types jl_static_show readably #58512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
For #40652, can we thread something through to not show the field name in precompile printing contexts? Seems unnecessary to have the correctness of the printing be bike shredded on how people want it to print in the debugger. |
Please can you add tests for |
It might be worth printing these floating-point numbers as hexadecimal literals (specifically in a type context): julia> 0x1p0
1.0
julia> 0x1.8p3
12.0
julia> x = 0x.4p-1
0.125 These can be easy to parse / print bit-accurately, if we have trouble finding floating point <-> decimal conversion routines that we trust to be digit-/bit-accurate. Of course, if we trust the decimal conversion on all platforms, that's just as good and more human-readable. |
n backslashes followed by " => 2n+1 backslashes and " n backslahes at the end => 2n backslashes
Unfortunately we have no hex Float32 literals, and I would like to preserve readability. That's also why I prefer |
Co-authored-by: Jameson Nash <vtjnash@gmail.com>
Co-authored-by: Jameson Nash <vtjnash@gmail.com>
No, but it losslessly casts to Float32 in the same way that Int64 -> Int32 does That said, I'm all in favor of the readability |
@@ -985,11 +1069,14 @@ static size_t jl_static_show_x_(JL_STREAM *out, jl_value_t *v, jl_datatype_t *vt | |||
n += jl_printf(out, "0x%08" PRIx32, *(uint32_t*)v); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you fix-up Pointer types as well?
julia> ccall(:jl_, Cvoid, (Any,), Val(Ptr{Cvoid}(1)))
Base.Val{0x0000000000000001}()
It's a little awkward, because their repr
doesn't round-trip either:
julia> Val(Ptr{Cvoid}(1))
Val{Ptr{Nothing} @0x0000000000000001}()
julia> Val{Ptr{Nothing} @0x0000000000000001}()
ERROR: ParseError:
# Error @ REPL[10]:1:18
Val{Ptr{Nothing} @0x0000000000000001}()
# └─────────────────┘ ── Expected `}`
but regardless of whether it parses properly, I don't think we should be throwing out the type information
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also consider fixing the jl_tvar_type
case also. Right now it only prints the name after checking that the parent UnionAll is present, but we should also check that that name is unique, and if not, to find some way to make it unique in printing.
else if (vt == jl_long_type) { | ||
// Avoid unnecessary Int64(x)/Int32(x) | ||
n += jl_printf(out, "%" PRIdPTR, *(intptr_t*)v); | ||
} | ||
else if (vt == jl_int64_type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like negative Int128
values are still problematic:
julia> ccall(:jl_, Cvoid, (Any,), typemin(Int128))
Int128(0x80000000000000000000000000000000)
julia> ccall(:jl_, Cvoid, (Any,), Int128(-1))
Int128(0xffffffffffffffffffffffffffffffff)
And UInt128
prints with an explicit conversion it doesn't need:
julia> ccall(:jl_, Cvoid, (Any,), UInt128(0))
UInt128(0x00000000000000000000000000000000)
julia> UInt128(0x00000000000000000000000000000000) === 0x00000000000000000000000000000000
true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't problematic, since that Int128()
printing means a bitcast, not a function call
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure - we can fix this as part of the bit-casting fix-ups, but regardless it doesn't round-trip:
julia> Int128(0x80000000000000000000000000000000)
ERROR: InexactError: convert(Int128, 0x80000000000000000000000000000000)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It works just fine: you're just making the mistake of calling eval
, when the semantics of this printer have never been to be usable with eval, but rather is a representation for reinterpret.
julia> reinterpret(Int128, 0x80000000000000000000000000000000)
-170141183460469231731687303715884105728
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we interested in an "almost" eval-able human-readable serialization format?
The only trade-off that I can understand is that it's a bit more compact than a proper repr
, but more importantly it needs a custom de-serializer which afaict isn't implemented
Since we need a programmatic round-trip, it's not enough IMO to appeal to a de-serializer that "should" be used if it doesn't actually exist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it is human readable, and thus a useful debugging tool. Someone should implement a de-serializer if they want to use this for generating precompile statements. Just blinding executing code will never give the correct answers anyways, so these little gotcha are just reminders that using eval
is a dead-end approach and not fixable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a disingenuous argument @vtjnash -- nobody reads this output for debugging (and we've tried); on the contrary, it is actively used for generating precompile statements. It would be better to remove all the "little gotchas".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just blinding executing code will never give the correct answers anyways, so these little gotcha are just reminders that using
eval
is a dead-end approach and not fixable.
I feel like this is lacking justification - all of the recent serialization problems we've enumerated so far (floating point, integer types, bfloat, restricted inner constructors, symbol escaping, etc.) are basic problems around making sure that we print the correct literal (or the correct literal + a direct, loss-less conversion to a built-in datatype). Of course, the other major issue (still unsolved) is around custom constructors making a reinterpret
required to round-trip, but the point stands that most of those are direct fixes to the de-parse (print "1.0" not "1") and not problems inherent to eval
.
I'm very open to more fundamental arguments about why eval
is broken for this, but the only caveat that I'm aware of is that eval
may not map type names (Vector
, String
, Foo
) correctly depending on the evaluation context, or that these may have a different meaning (esp. anonymized typenames like #foo##2
) in different executions.
It is true that Serialization does this better by using the PkgId / UUID
, but is that the only fundamental thing that we gain by moving away from eval
? Is there a more fundamental reason why eval
cannot be made to work here? Do we trust Serialization to do all of this correctly, or does that have outstanding correct-ness issues too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whenever there is an alternative serialization format to describe a set of methods to compile that is better I'm sure we will all happily switch to that. Until that exists it seems reasonable to do incremental improvements to the existing one, even if it is doomed to never be 100% correct in the end.
Makes more types survive `jl_static_show` unambiguously: - Symbols - Symbols printed in the `:var"foo"` form use raw string escaping, fixing `:var"a\b"`, `:var"a\\"`, `:var"$a"`, etc. - Symbols that require parens use parens (`:(=)`, ...) - Signed integers: Except for `Int`, signed integers print like `Int8(1)`. - Floats: floats are printed in a naive but reversible (TODO: double check) way. `Inf(16|32|)` and `NaN(16|32|)` are printed, and `Float16`/`Float32` print the type (`Float32(1.5)`). `Float64`s are printed with a trailing `.0` if it is necessary to disambiguate from `Int`. Fixes JuliaLang#52677, JuliaLang#58484 (comment), JuliaLang#58484 (comment), and the specific case mentioned in JuliaLang#58484. Improves the situation for round-trip (inexhaustive list): - Non-canonical NaNs - BFloat16 - User-defined primitive types. This one is tricky, because they can have a size different from any type we have literals for.
Makes more types survive `jl_static_show` unambiguously: - Symbols - Symbols printed in the `:var"foo"` form use raw string escaping, fixing `:var"a\b"`, `:var"a\\"`, `:var"$a"`, etc. - Symbols that require parens use parens (`:(=)`, ...) - Signed integers: Except for `Int`, signed integers print like `Int8(1)`. - Floats: floats are printed in a naive but reversible (TODO: double check) way. `Inf(16|32|)` and `NaN(16|32|)` are printed, and `Float16`/`Float32` print the type (`Float32(1.5)`). `Float64`s are printed with a trailing `.0` if it is necessary to disambiguate from `Int`. Fixes JuliaLang#52677, JuliaLang#58484 (comment), JuliaLang#58484 (comment), and the specific case mentioned in JuliaLang#58484. Improves the situation for round-trip (inexhaustive list): - Non-canonical NaNs - BFloat16 - User-defined primitive types. This one is tricky, because they can have a size different from any type we have literals for.
Makes more types survive `jl_static_show` unambiguously: - Symbols - Symbols printed in the `:var"foo"` form use raw string escaping, fixing `:var"a\b"`, `:var"a\\"`, `:var"$a"`, etc. - Symbols that require parens use parens (`:(=)`, ...) - Signed integers: Except for `Int`, signed integers print like `Int8(1)`. - Floats: floats are printed in a naive but reversible (TODO: double check) way. `Inf(16|32|)` and `NaN(16|32|)` are printed, and `Float16`/`Float32` print the type (`Float32(1.5)`). `Float64`s are printed with a trailing `.0` if it is necessary to disambiguate from `Int`. Fixes JuliaLang#52677, JuliaLang#58484 (comment), JuliaLang#58484 (comment), and the specific case mentioned in JuliaLang#58484. Improves the situation for round-trip (inexhaustive list): - Non-canonical NaNs - BFloat16 - User-defined primitive types. This one is tricky, because they can have a size different from any type we have literals for.
Makes more types survive
jl_static_show
::var"foo"
form use raw string escaping, fixing:var"a\b"
,:var"a\\"
,:var"$a"
, etc.:(=)
, ...)Int
, signed integers print likeInt8(1)
.Inf(16|32|)
andNaN(16|32|)
are printed, andFloat16
/Float32
print the type (Float32(1.5)
).Float64
s are printed with a trailing.0
if it is necessary to disambiguate fromInt
.Fixes #52677, #58484 (comment), #58484 (comment), and the specific case mentioned in #58484. Improves the situation for #38902 but does not close it, because a few cases still do not round-trip (inexhaustive list):