Skip to content

Hint on unknown escape of Unicode quotation marks in string literal #128906

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

lolbinarycat
Copy link
Contributor

Fixes #128858

I opted not to produce a suggestion, since it's not obvious what the user meant to do.

Fixes rust-lang#128858

I opted not to produce a suggestion, since it's not obvious what the user meant to do.
@rustbot
Copy link
Collaborator

rustbot commented Aug 9, 2024

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @estebank (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 9, 2024
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer
Copy link
Collaborator

The job x86_64-gnu-llvm-17 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
------
 > importing cache manifest from ghcr.io/rust-lang/rust-ci-cache:3aacb9c90579defe09351ac5e8ee504359f8054da6326ff19038f1b7c90e3cb2aafe33685c6d9b76ee8d2ccbd187ca80c46ab5380485abdd8c0ce7d69cd8d8fd:
------
##[endgroup]
Setting extra environment values for docker:  --env ENABLE_GCC_CODEGEN=1 --env GCC_EXEC_PREFIX=/usr/lib/gcc/
[CI_JOB_NAME=x86_64-gnu-llvm-17]
---
sccache: Starting the server...
##[group]Configure the build
configure: processing command line
configure: 
configure: build.configure-args := ['--build=x86_64-unknown-linux-gnu', '--llvm-root=/usr/lib/llvm-17', '--enable-llvm-link-shared', '--set', 'rust.thin-lto-import-instr-limit=10', '--set', 'change-id=99999999', '--enable-verbose-configure', '--enable-sccache', '--disable-manage-submodules', '--enable-locked-deps', '--enable-cargo-native-static', '--set', 'rust.codegen-units-std=1', '--set', 'dist.compression-profile=balanced', '--dist-compression-formats=xz', '--set', 'rust.lld=false', '--disable-dist-src', '--release-channel=nightly', '--enable-debug-assertions', '--enable-overflow-checks', '--enable-llvm-assertions', '--set', 'rust.verify-llvm-ir', '--set', 'rust.codegen-backends=llvm,cranelift,gcc', '--set', 'llvm.static-libstdcpp', '--enable-new-symbol-mangling']
configure: target.x86_64-unknown-linux-gnu.llvm-config := /usr/lib/llvm-17/bin/llvm-config
configure: llvm.link-shared     := True
configure: rust.thin-lto-import-instr-limit := 10
configure: change-id            := 99999999
---
failures:

---- [ui] tests/ui/unicode-quote.rs stdout ----

error: /checkout/tests/ui/unicode-quote.rs:2: unexpected error: '2:26: 2:27: unknown character escape: `\u{201c}`'

error: /checkout/tests/ui/unicode-quote.rs:2: unexpected error: '2:32: 2:33: unknown character escape: `\u{201d}`'
error: 2 unexpected errors found, 0 expected errors not found
status: exit status: 1
status: exit status: 1
command: env -u RUSTC_LOG_COLOR RUSTC_ICE="0" RUST_BACKTRACE="short" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/tests/ui/unicode-quote.rs" "-Zthreads=1" "-Zsimulate-remapped-rust-src-base=/rustc/FAKE_PREFIX" "-Ztranslate-remapped-path-to-local-path=no" "-Z" "ignore-directory-in-diagnostics-source-blocks=/cargo" "-Z" "ignore-directory-in-diagnostics-source-blocks=/checkout/vendor" "--sysroot" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2" "--target=x86_64-unknown-linux-gnu" "--check-cfg" "cfg(FALSE)" "--error-format" "json" "--json" "future-incompat" "-Ccodegen-units=1" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-Zwrite-long-types-to-disk=no" "-Cstrip=debuginfo" "--emit" "metadata" "-C" "prefer-dynamic" "--out-dir" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/unicode-quote" "-A" "unused" "-A" "internal_features" "-Crpath" "-Cdebuginfo=0" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/unicode-quote/auxiliary"
--- unexpected errors (from JSON output) ---
--- unexpected errors (from JSON output) ---
ERROR     line   2: 2:26: 2:27: unknown character escape: `\u{201c}`
ERROR     line   2: 2:32: 2:33: unknown character escape: `\u{201d}`
thread '[ui] tests/ui/unicode-quote.rs' panicked at src/tools/compiletest/src/runtest.rs:1495:13:
errors differ from expected


Comment on lines +163 to +166
"{ec} is not an ascii quote, \
but may look like one in some fonts.\n\
consider writing it in its \
escaped form for clarity."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the spacing is a bit awkward here. Can we use a semicolon to turn this into one line?

Suggested change
"{ec} is not an ascii quote, \
but may look like one in some fonts.\n\
consider writing it in its \
escaped form for clarity."
"{ec} is not an ascii quote but may look like one in some fonts; consider escaping it to avoid confusion"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm pretty sure that would put it over the 100 char line length limit.

i'm also unsure if you have a problem with the formatting of the output or the code (the code is 4 lines, but the actual output is only 2)

Copy link
Member

@compiler-errors compiler-errors Aug 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output -- we don't typically have multi-line diagnostics (unless formatting a list or something), and we try to avoid periods in diagnostic outputs as a matter of style. I personally find multi-sentence notes to be a bit wordy.

Yeah, you'll need to re-\ the string literal.

@compiler-errors
Copy link
Member

Also please squash this into one commit, no need for an additional fmt commit

@PatchMixolydic
Copy link
Contributor

I opted not to produce a suggestion, since it's not obvious what the user meant to do.

You could emit a MaybeIncorrect suggestion to replace the quote character with its escaped form. If I remember correctly, this would make the suggestion available to tools that support user intervention (rust-analyzer) while allowing fully automated tools (rustfix) to ignore it. (This would also make the suggested fix super obvious just from glancing at the diagnostic.)

Comment on lines +7 to +8
= help: \u{201c} is not an ascii quote, but may look like one in some fonts.
consider writing it in its escaped form for clarity.
Copy link
Member

@jieyouxu jieyouxu Aug 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: the help message itself could look like

help: U+201C Left Double Quotation Mark (“) looks like U+0022 Quotation Mark (") (ASCII quotation mark) but are different characters

and a MaybeIncorrect suggestion as mentioned could look something like

help: consider writing the Unicode escape
  |
2 |     dbg!("since when is \u{201c}THIS\” not allowed in a string literal");
  |                         ++++++++

(exact wording may vary)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does rustc have a database of unicode character names?

@estebank estebank added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 2, 2024
@estebank
Copy link
Contributor

estebank commented Sep 2, 2024

I'm not going to be around for a few weeks. r? @compiler-errors as you were already looking at it :)

@rustbot rustbot assigned compiler-errors and unassigned estebank Sep 2, 2024
@alex-semenyuk alex-semenyuk added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 25, 2024
@Dylan-DPC
Copy link
Member

@lolbinarycat any updates on this? thanks

@lolbinarycat
Copy link
Contributor Author

@Dylan-DPC I've mostly moved on to working on rustdoc, since the codebase is smaller and i feel like i can make a more meaningful impact on it. I may come back to this at some point, but if someone else wants to take over this PR, I wouldn't mind.

@apiraino
Copy link
Contributor

superseded by #137067

@apiraino apiraino closed this Mar 20, 2025
@apiraino apiraino removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Mar 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hint on unknown escape of Unicode quotation marks in string literal
10 participants