-- Python and Javascript do it differently: `\xff` mean `\u{ff}`, because their strings behave like UTF-32 or UTF-16 rather than UTF-8.
-(Also, Python's byte strings "accept" `\u` escape codes as just `'\\', 'u'`, without any warning or error.)
+- Python and Javascript do it differently: `\xff` means `\u{ff}`, because their strings behave like UTF-32 or UTF-16 rather than UTF-8.
+(Also, Python's byte strings "accept" `\u` as just `'\\', 'u'`, without any warning or error.)

 # Unresolved questions
 [unresolved-questions]: #unresolved-questions
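For contrast with the Python/Javascript behaviour described in the changed lines, here is a minimal Rust sketch of how `\u{ff}` and `\xff` differ at the byte level. The function names are hypothetical and only serve the illustration. It assumes current Rust behaviour, where `"\xff"` in a regular string literal is rejected at compile time, so the `\xff` case uses a byte string literal instead.

```rust
/// UTF-8 bytes of the regular string literal "\u{ff}" (U+00FF).
/// The code point is encoded as two bytes, not as a single 0xFF byte.
fn unicode_escape_bytes() -> &'static [u8] {
    "\u{ff}".as_bytes()
}

/// The single raw byte produced by `\xff` in a byte string literal.
fn byte_escape_bytes() -> &'static [u8] {
    b"\xff"
}

/// Whether a lone 0xFF byte is valid UTF-8.
/// It is not, which is why `\xff` cannot appear in a regular `str` literal.
fn lone_ff_is_valid_utf8() -> bool {
    std::str::from_utf8(b"\xff").is_ok()
}
```

Calling `unicode_escape_bytes()` yields the two-byte UTF-8 encoding of U+00FF, while `byte_escape_bytes()` yields exactly one byte. Treating `\xff` as `\u{ff}`, as Python and Javascript do, would therefore change the byte content of a UTF-8 string.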
@@ -113,8 +98,15 @@ However, for regular string literals that will result in an error in nearly all
 Probably not, since a `char` is not UTF-8 encoded; it's a single UTF-32 codepoint.
 _Decoding_ UTF-8 from `\x` escape codes back into UTF-32 would be a bit surprising.

+(But note that `'\x41'` already works, for single byte UTF-8 characters, aka ASCII.)
+
 # Future possibilities
 [future-possibilities]: #future-possibilities

+- Postpone the UTF-8 validation to a later stage, such that macros can accept literals with invalid UTF-8. E.g. `cstr!("\xff")`.
+
+- If we do that, we could also decide to accept _all_ escape codes, even unknown ones, to allow things like `some_macro!("\a\b\c")`.
+(The tokenizer would only need to know about `\"`.)
+
 - Update the `concat!()` macro to accept `b""` strings and also not implicitly convert integers to strings, such that `concat!(b"", $x, b"\0")` becomes usable.
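Two points from this hunk can be checked with today's Rust. The sketch below uses hypothetical function names. It shows that `'\x41'` is already a valid `char` literal, and that `concat!` currently turns integer literals into text, which is the behaviour the last bullet proposes to change.

```rust
/// `'\x41'` is the `char` 'A'. In `char` and `str` literals, `\x` escapes
/// are limited to the ASCII range (`\x00` to `\x7f`).
fn ascii_escape_char() -> char {
    '\x41'
}

/// `concat!` stringifies the integer literal `1` into the text "1",
/// so the result is the three-character string "x1\0".
fn concat_with_integer() -> &'static str {
    concat!("x", 1, "\0")
}
```

Because the integer is converted to its decimal text rather than to a byte, the current `concat!()` cannot be used to assemble raw byte sequences in the way `concat!(b"", $x, b"\0")` would need.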