From 451fabee105c1c81e1cc73d33cd4d859e42393ae Mon Sep 17 00:00:00 2001
From: dengziming
Date: Sat, 5 Jul 2025 17:07:38 +0800
Subject: [PATCH] [SPARK-52545][SQL][DOCS]Update string literal docs for quote escaping rules

---
 docs/sql-ref-literals.md | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/docs/sql-ref-literals.md b/docs/sql-ref-literals.md
index 7a10676cce237..97da462e67e11 100644
--- a/docs/sql-ref-literals.md
+++ b/docs/sql-ref-literals.md
@@ -43,7 +43,7 @@ A string literal is used to specify a character string value.
 
 * **char**
 
-    One character from the character set. Use `\` to escape special characters (e.g., `'` or `\`).
+    One character from the character set. Use `\` to escape special characters (e.g., `'` or `\`), additionally, consecutive quotes can be used for escaping (e.g., `'S''park'` equals `'S\'park'`, `"S""park"` equals `"S\"park"`).
     To represent unicode characters, use 16-bit or 32-bit unicode escape of the form `\uxxxx` or `\Uxxxxxxxx`,
     where xxxx and xxxxxxxx are 16-bit and 32-bit code points in hexadecimal respectively (e.g., `\u3042` for `あ` and `\U0001F44D` for `👍`).
     An ASCII character can also be represented as an octal number preceded by `\` like `\101`, which represents `A`.
@@ -62,9 +62,13 @@ The following escape sequences are recognized in regular string literals (withou
 - `\%` -> `\%`;
 - `\_` -> `\_`;
 - `\<other char>` -> `<other char>`, skip the slash and leave the character as is.
+- `""` -> `"`, skip first `"` in double-quoted string.
+- `''` -> `'`, skip first `'` in single-quoted string.
 
 The unescaping rules above can be turned off by setting the SQL config `spark.sql.parser.escapedStringLiterals` to `true`.
 
+When consecutive quotes conflict with string concatenation, escaping takes precedence (e.g., `'a''b'` → `a'b` not `a`+`b`). To force string concatenation behavior instead, set `spark.sql.legacy.consecutiveStringLiterals.enabled` to `true`.
+
 #### Examples
 
 ```sql
@@ -95,6 +99,13 @@ SELECT r"'\n' represents newline character." AS col;
 +----------------------------------+
 |'\n' represents newline character.|
 +----------------------------------+
+
+SELECT "S""park" AS f1, 'S''park' AS f2;
++--------+--------+
+|      f1|      f2|
++--------+--------+
+| S"park | S'park |
++--------+--------+
 ```
 
 ### Binary Literal
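
Reviewer note: the doubled-quote rule documented in this patch can be modeled with a few lines of Python. This is a minimal sketch for checking the documented examples, not Spark's parser code. The helper name `unescape_doubled_quotes` is hypothetical, and the sketch assumes a single, already-delimited literal. Backslash escapes, raw (`r`-prefixed) literals, and the legacy concatenation mode behind `spark.sql.legacy.consecutiveStringLiterals.enabled` are deliberately left out.

```python
def unescape_doubled_quotes(literal: str) -> str:
    """Strip the outer quotes and collapse each doubled delimiter into one.

    Hypothetical helper for illustration only; it models just the
    `''` -> `'` and `""` -> `"` rules described in the patch.
    """
    if len(literal) < 2 or literal[0] not in "'\"" or literal[-1] != literal[0]:
        raise ValueError(f"not a quoted string literal: {literal!r}")

    quote = literal[0]    # the delimiter this literal uses
    body = literal[1:-1]  # everything between the outer quotes
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == quote:
            # Inside the body, the delimiter may only appear doubled;
            # the pair stands for one literal quote character.
            if i + 1 < len(body) and body[i + 1] == quote:
                out.append(quote)
                i += 2
                continue
            raise ValueError(f"unescaped {quote} inside literal: {literal!r}")
        # Any other character, including the other quote kind, is kept as is.
        out.append(ch)
        i += 1
    return "".join(out)


if __name__ == "__main__":
    for lit in ("'S''park'", '"S""park"', "'a''b'"):
        print(lit, "->", unescape_doubled_quotes(lit))
```

Under the default behavior described above, `'a''b'` is treated as one literal containing an escaped quote, which is what the sketch reproduces. With the legacy config enabled, the same input would instead be read as two adjacent literals, a case this helper does not cover.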