Commit 3409a49

Update.
1 parent 4038665 commit 3409a49

1 file changed: +12 -1 lines changed

text/3349-mixed-utf8-literals.md

Lines changed: 12 additions & 1 deletion
@@ -11,7 +11,7 @@ Allow the exact same characters and escape codes in `"…"` and `b"…"` literal
 That is:
 
 - Allow unicode characters, including `\u{…}` escape codes, in byte string literals. E.g. `b"hello\xff我叫\u{1F980}"`
-- Allow `\x…` escape codes in regular string literals, as long as they are valid UTF-8. E.g. `"\xf0\x9f\xa6\x80"`
+- Also allow non-ASCII `\x…` escape codes in regular string literals, as long as they are valid UTF-8. E.g. `"\xf0\x9f\xa6\x80"`
 
 # Motivation
 [motivation]: #motivation
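
As a rough illustration of what the two bullet points in this hunk would mean in terms of bytes, the sketch below builds the same values from pieces that current Rust already accepts. The helper names `mixed_bytes` and `crab_from_hex_escapes` are hypothetical and not part of the RFC, and the assumption that the proposed literals denote exactly these bytes is ours.

```rust
/// Bytes that the proposed literal `b"hello\xff我叫\u{1F980}"` would presumably
/// denote, assembled here from literals that are accepted today.
/// (Hypothetical helper, for illustration only.)
fn mixed_bytes() -> Vec<u8> {
    let mut bytes = Vec::new();
    bytes.extend_from_slice(b"hello\xff"); // ASCII plus one raw byte
    bytes.extend_from_slice("我叫".as_bytes()); // two characters, 3 bytes each in UTF-8
    bytes.extend_from_slice("\u{1F980}".as_bytes()); // U+1F980, 4 bytes in UTF-8
    bytes
}

/// String that the proposed literal `"\xf0\x9f\xa6\x80"` would presumably
/// denote: the four escapes together form the UTF-8 encoding of U+1F980.
/// (Hypothetical helper, for illustration only.)
fn crab_from_hex_escapes() -> String {
    String::from_utf8(b"\xf0\x9f\xa6\x80".to_vec()).expect("valid UTF-8")
}

fn main() {
    let bytes = mixed_bytes();
    assert_eq!(bytes.len(), 16);
    // The lone 0xff byte means the byte string as a whole is not valid UTF-8.
    assert!(std::str::from_utf8(&bytes).is_err());
    // The hex-escaped form and the unicode-escaped form name the same string.
    assert_eq!(crab_from_hex_escapes(), "\u{1F980}");
    println!("{:02x?}", bytes);
}
```

The byte string stays an arbitrary `[u8]` sequence, while the regular string must still be valid UTF-8 as a whole, which matches the "as long as they are valid UTF-8" condition in the second bullet.
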
@@ -102,6 +102,17 @@ However, for regular string literals that will result in an error in nearly all
 
 (I don't care. I guess we should do whatever is easiest to implement.)
 
+- How about single byte and character literals?
+
+- Should `b'\u{30}'` work? (It's a unicode escape code, but it's still just one byte in UTF-8.)
+
+I think yes. I see no reason to disallow it.
+
+- Should `'\xf0\x9f\xa6\x80'` work? (It's multiple escape codes, but it's still just one character in UTF-8.)
+
+Probably not, since a `char` is not UTF-8 encoded; it's a single UTF-32 codepoint.
+_Decoding_ UTF-8 from `\x` escape codes back into UTF-32 would be a bit surprising.
+
 # Future possibilities
 [future-possibilities]: #future-possibilities
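
The two questions added in this hunk can be checked against values that current Rust already accepts. The following is a minimal sketch, not the RFC's implementation: `single_char_from_utf8` is a hypothetical helper that performs the UTF-8 decoding step the second answer calls surprising for a `char` literal.

```rust
/// Decode a UTF-8 byte sequence into a `char` if it encodes exactly one
/// character. (Hypothetical helper, for illustration only.)
fn single_char_from_utf8(bytes: &[u8]) -> Option<char> {
    let s = std::str::from_utf8(bytes).ok()?;
    let mut chars = s.chars();
    let first = chars.next()?;
    // Reject input that contains more than one character.
    if chars.next().is_none() {
        Some(first)
    } else {
        None
    }
}

fn main() {
    // U+0030 is ASCII '0'; it occupies a single byte in UTF-8,
    // so a byte literal written as `b'\u{30}'` would have an obvious value.
    assert_eq!('\u{30}' as u32, 0x30);
    assert_eq!('\u{30}'.len_utf8(), 1);
    assert_eq!(b'0', 0x30);

    // The four bytes f0 9f a6 80 are the UTF-8 encoding of U+1F980,
    // but a `char` stores the code point itself, not those bytes.
    let crab = single_char_from_utf8(b"\xf0\x9f\xa6\x80");
    assert_eq!(crab, Some('\u{1F980}'));
    assert_eq!('\u{1F980}' as u32, 0x1F980);
    assert_eq!('\u{1F980}'.len_utf8(), 4);
}
```

Accepting `'\xf0\x9f\xa6\x80'` would require the compiler to run a decoding step like this helper on the escapes, whereas `b'\u{30}'` only needs a check that the code point fits in one byte.
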
