Skip to content

[BUG] JSQLParser 5.1: PostgreSQL: fail to parse dollar-quoted string constants with tags #2233

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
E1izabeth opened this issue Apr 29, 2025 · 18 comments
Assignees
Labels

Comments

@E1izabeth
Copy link

The parser fails to parse dollar-quoted string constants with tags (see https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-DOLLAR-QUOTING)
SQL example:

select
$json$
[
{"some":"json","with":"properties"},
{"other":"json","with":"properties"}
]
$json$::jsonb

The error I get:

net.sf.jsqlparser.parser.ParseException: Encountered unexpected token: "::" "::"
    at line 7, column 7.

Was expecting one of:

    <EOF>
    <ST_SEMICOLON>

However, replacing dollar-quotes $json$ with single quotes ' or removing the tag (to leave only double dollar $$) works.

Software Information:

  • JSqlParser 5.1
  • PostgreSQL

Additional information

It used to be supported in JSQLParser 4.5
Connected issue in DBeaver dbeaver/dbeaver#37028

@manticore-projects
Copy link
Contributor

@E1izabeth: thank you for reporting, I will look into this asap.

@manticore-projects
Copy link
Contributor

Greetings.

"Dollar quoting" is actually supported and the following works:

select
$$
[
{"some":"json","with":"properties"},
{"other":"json","with":"properties"}
]
$$::jsonb

What is missing is the support for the "optional tag" and I do wonder if this ever has worked (not arguing though).

So I will have to look for a solution to extend this token.

@E1izabeth
Copy link
Author

Thanks

@manticore-projects
Copy link
Contributor

@E1izabeth: I dived deep into this issue and unfortunately I can not offer much hope here:

  1. I am quite sure, that this was never supported before. It just failed faster when not even "Dollar quotes w/o Tags" were supported
  2. while I can offer some remedy for parsing "Dollar quotes w/ Tags" it will never work perfectly because of white space:
-- this works
select $tag$this is a test $tag$
-- this works but alters the content
select $tag$this 
is 
a 
test 
$tag$

So the statement will be parsed and return the correct AST. But content of the String Value can't be guaranteed.
Unfortunately there is nothing I can do, because JavaCC mangles the white spaces and returns only token streams.

This gets us back to item 1) It never worked but failed faster.
It looks to me like you are not calling the parser supervised (obeying the timeout) and I would like to encourage you again to look into this. We will always have some Freezing Statements and the supervised execution terminates those reliably and efficiently without disturbing the user experience.

Best and cheers!

@E1izabeth
Copy link
Author

@manticore-projects thank you so much for your effort and responsiveness

@manticore-projects
Copy link
Contributor

I am still fighting to get this white space right ...

@manticore-projects
Copy link
Contributor

How cool is this one:

    @Test
    void testDollarQuotedText() throws JSQLParserException {
        String sqlStr = "SELECT $tag$This\nis\na\ntest\n$tag$ from dual where a=b";
        PlainSelect st = (PlainSelect) CCJSqlParserUtil.parse(sqlStr);

        StringValue stringValue = st.getSelectItem(0).getExpression(StringValue.class);

        Assertions.assertEquals("This\nis\na\ntest\n", stringValue.getValue());
    }

Now, unfortunately I am battling with some collisions with other valid syntax, mainly $parameter. But I am confident getting this solved somehow soon.

@manticore-projects
Copy link
Contributor

manticore-projects commented May 2, 2025

@E1izabeth:

Unfortunately I have only bad news:

  1. $tag$this is a test$tag$ collides with (currently) allowed identifiers $tag$this (because we allow identifiers to start with and contain a $)

  2. the solution would be to tighten identifiers so they either start with a $ (and then don't contain any following $) or they contain a $ only when they don't start with a $

  3. though adding such an additional rule breaks the JavaCC7 parser generator: error: code too large for try statement

We are using too many tokens and are pretty much at the end of JavaCC 7 here and I will have to test/decide if switching to JavaCC 8 or CongoCC can solve this challenge. That would be a "next version" though, after releasing the current JSQLParser 5.2.

So what are our possible instant options here:

a) this has never worked before, but just failed instantly while now it hangs (based on support of $$...$$ blocks)

--> I can help you by showing how to run JSQLParser supervised so it will fail within definable timeout.

This should be done anyway since also other complex statements can freeze.

b) or I can provide a DBeaver branch temporarily, where we switch off identifiers containing $ (which is not SQL:2016 complaint)

If you have any better idea, please do let me know.
Best and cheers!

@E1izabeth
Copy link
Author

E1izabeth commented May 2, 2025

I see. Thank you.
For option a), you're proposing to add .withTimeOut(..) parameter for the parser, right?
We have configured this way right now:

        CCJSqlParser parser = new CCJSqlParser(new StringProvider(sqlWithoutComments));
        parser.withAllowComplexParsing(false);
        if (...) {
            parser.withSquareBracketQuotation(true);
        }

For option b), I have to discuss it with a team.

@manticore-projects
Copy link
Contributor

On 1)

ExecutorService executorService = Executors.newSingleThreadExecutor();

final CCJSqlParser parser = new CCJSqlParser(sqlStr)
                    .withSquareBracketQuotation(false)
                    .withAllowComplexParsing(true)
                    .withBackslashEscapeCharacter(false);

Future<Statements> future = executorService.submit(new Callable<Statements>() {
    @Override
    public Statements call() throws ParseException {
        return parser.Statements();
    }
});

try {
    future.get(6000, TimeUnit.MILLISECONDS);
    long endMillis = System.currentTimeMillis();

    System.out.println("Time to parse [ms]: " + (endMillis - startMillis) / i);
} catch (TimeoutException | InterruptedException ex2) {
    parser.interrupted = true;
    future.cancel(true);
    throw new JSQLParserException("Failed to within reasonable time ", ex2);
}

Or simply:

CCJSqlParserUtil.parse(String sql, ExecutorService executorService, Consumer<CCJSqlParser> consumer);

@E1izabeth
Copy link
Author

Thank you! We'll modify according to your suggestions.

@manticore-projects
Copy link
Contributor

Greetings @E1izabeth

I have started a JavaCC 8 based development branch and I can confirm that JavaCC 8 lift the restrictions on defining tokens and allows us to expand the grammar.

This will allow us to incorporate the Dollar-quoted tags above, tests have been positive.

Unfortunately, some of the Token internals have changed and right now rewriting of Tokens (used for T-SQL bracket-quoting and advanced escaping) does not work.

So all of this will need extra work and will take some time, but I won't give up on it.

@E1izabeth
Copy link
Author

Hi, @manticore-projects!
Thanks for the update.

- b) or I can provide a DBeaver branch temporarily, where we switch off identifiers containing $ (which is not SQL:2016 complaint)
- For option b), I have to discuss it with a team.

It would be great if you could provide us with any branch or snapshot when it becomes possible.

@manticore-projects
Copy link
Contributor

Sure, I am down to 54 failing tests out of 2,900 tests in total.
I am eliminating some of the exotic stuff as we speak and then will provide you with a drop-in replacement asap.

JavaCC 8 based branch: https://github.com/JSQLParser/JSqlParser/tree/javacc8

@E1izabeth
Copy link
Author

Thanks

@manticore-projects
Copy link
Contributor

Thanks

DBeaver licenses are welcome :-)

@manticore-projects
Copy link
Contributor

@E1izabeth Compliments!

I have fixed a massive performance regression in released JSQLParser-5.2 and have added proper JHM benchmarks for avoiding just mischief in the future.

Please see https://github.com/JSQLParser/JSqlParser#performance

You can use JSQLParser-5.3 Snapshot as a drop-in hotfix. It does not change anything except performance and is fully compatible.

I will work in parallel on releasing a hotfix asap but its NOT in my control.

Any questions or concerns, please just ask.

@E1izabeth
Copy link
Author

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants