-
Notifications
You must be signed in to change notification settings - Fork 28.7k
[SPARK-48353][FOLLOW UP][SQL] Exception handling improvements #51034
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
davidm-db
wants to merge
3
commits into
apache:master
from
davidm-db:exception_handling_improvements
Closed
[SPARK-48353][FOLLOW UP][SQL] Exception handling improvements #51034
davidm-db
wants to merge
3
commits into
apache:master
from
davidm-db:exception_handling_improvements
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
cc @cloud-fan @dejankrak-db @miland-db @dusantism-db please review |
dejankrak-db
approved these changes
May 27, 2025
cloud-fan
approved these changes
May 28, 2025
I'm fixing the mima checks at #51038 , please rebase after the fix is merged |
73da2bb
to
15b8b9d
Compare
cc @cloud-fan this can be merged now |
thanks, merging to master/4.0 (as it fixes bugs in the released scripting feature)! |
cloud-fan
pushed a commit
that referenced
this pull request
May 29, 2025
### What changes were proposed in this pull request? This pull request proposes an improvement in how conditions are matched with exception handlers. Previously, condition would match with an exception handler only if the match was full/complete - i.e. condition `<MAIN>.<SUBCLASS>` would match only to the handler defined for `<MAIN>.<SUBCLASS>`. With this improvement, `<MAIN>.<SUBCLASS>` condition would match to the handlers defined for `<MAIN>` condition as well, with correct precedence. This pull requests also proposes a number of fixes for different missing things: - `CompoundBodyExec.reset()` is not resetting the `scopeStatus` field. - Exception handler body is never reset (this includes internal iterator reset). This causes issues if the handler is defined in a loop and gets matched multiple times. - When searching for a place to inject `LEAVE` statement, after the `EXIT` handler has been executed, the logic considered only `CompoundBodyExec` nodes, whereas it should have included all non-leaf statements. - Exception handling for exceptions that happen in conditions (for each statement type - if/else, while, case, etc) is not working properly because the injected `LEAVE` statement is not recognized properly. - Exception handling for exceptions that happen in the last statement of the body of if/else and searched case statement is not working properly because the injected `LEAVE` statement is not recognized properly. - `hasNext()` in `ForStatementExec` is executing the query. This causes the issues in cases when `FOR` is the first statement in the compound and the query fails. It means that exception would happen during the `hasNext()` checks instead of the actual iteration. In such case, the parent compound (of the `FOR` statement) is not entered (because that happens during the iteration) and call stack is not properly setup for exception handling. Changes corresponding to this problems, in the same order: - Reset the `scopeStatus` field in `CompoundBodyExec.reset()`. - Call `reset()` on handler before starting its execution in `SqlScriptingExecution.handleException()`. - Move `curr` pointer to `NonLeafStatementExec` and use that in `SqlScriptingExecution. getNextStatement()` in case when `EXIT` handler is finished and removed from the stack. - Add special case handling for `LeaveStatementExec` in `*.Condition` cases in all of the relevant statement types. When exception happens during the condition execution, the `LeaveStatementExec` is injected into `curr` field, but the state hasn't changed. This means that when the `LEAVE` statement is to be executed, the state would correspond to the condition. **I don't know how to do this better, so any suggestions are more than welcome!** - Add special case handling for `LeaveStatementExec` in `*Body` cases of if/else and searched case statement - equivalent to the previous bullet. - Reorder conditions in `ForStatementExec.hasNext()`. ### Why are the changes needed? These changes are fixing wrong logic and improving some of the already existing exception handling mechanisms. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? New unit tests are added for to test/guard all of the introduced improvements. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #51034 from davidm-db/exception_handling_improvements. Authored-by: David Milicevic <david.milicevic@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit 80cd867) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
yhuang-db
pushed a commit
to yhuang-db/spark
that referenced
this pull request
Jun 9, 2025
### What changes were proposed in this pull request? This pull request proposes an improvement in how conditions are matched with exception handlers. Previously, condition would match with an exception handler only if the match was full/complete - i.e. condition `<MAIN>.<SUBCLASS>` would match only to the handler defined for `<MAIN>.<SUBCLASS>`. With this improvement, `<MAIN>.<SUBCLASS>` condition would match to the handlers defined for `<MAIN>` condition as well, with correct precedence. This pull requests also proposes a number of fixes for different missing things: - `CompoundBodyExec.reset()` is not resetting the `scopeStatus` field. - Exception handler body is never reset (this includes internal iterator reset). This causes issues if the handler is defined in a loop and gets matched multiple times. - When searching for a place to inject `LEAVE` statement, after the `EXIT` handler has been executed, the logic considered only `CompoundBodyExec` nodes, whereas it should have included all non-leaf statements. - Exception handling for exceptions that happen in conditions (for each statement type - if/else, while, case, etc) is not working properly because the injected `LEAVE` statement is not recognized properly. - Exception handling for exceptions that happen in the last statement of the body of if/else and searched case statement is not working properly because the injected `LEAVE` statement is not recognized properly. - `hasNext()` in `ForStatementExec` is executing the query. This causes the issues in cases when `FOR` is the first statement in the compound and the query fails. It means that exception would happen during the `hasNext()` checks instead of the actual iteration. In such case, the parent compound (of the `FOR` statement) is not entered (because that happens during the iteration) and call stack is not properly setup for exception handling. Changes corresponding to this problems, in the same order: - Reset the `scopeStatus` field in `CompoundBodyExec.reset()`. - Call `reset()` on handler before starting its execution in `SqlScriptingExecution.handleException()`. - Move `curr` pointer to `NonLeafStatementExec` and use that in `SqlScriptingExecution. getNextStatement()` in case when `EXIT` handler is finished and removed from the stack. - Add special case handling for `LeaveStatementExec` in `*.Condition` cases in all of the relevant statement types. When exception happens during the condition execution, the `LeaveStatementExec` is injected into `curr` field, but the state hasn't changed. This means that when the `LEAVE` statement is to be executed, the state would correspond to the condition. **I don't know how to do this better, so any suggestions are more than welcome!** - Add special case handling for `LeaveStatementExec` in `*Body` cases of if/else and searched case statement - equivalent to the previous bullet. - Reorder conditions in `ForStatementExec.hasNext()`. ### Why are the changes needed? These changes are fixing wrong logic and improving some of the already existing exception handling mechanisms. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? New unit tests are added for to test/guard all of the introduced improvements. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#51034 from davidm-db/exception_handling_improvements. Authored-by: David Milicevic <david.milicevic@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this pull request?
This pull request proposes an improvement in how conditions are matched with exception handlers. Previously, condition would match with an exception handler only if the match was full/complete - i.e. condition
<MAIN>.<SUBCLASS>
would match only to the handler defined for<MAIN>.<SUBCLASS>
. With this improvement,<MAIN>.<SUBCLASS>
condition would match to the handlers defined for<MAIN>
condition as well, with correct precedence.This pull requests also proposes a number of fixes for different missing things:
CompoundBodyExec.reset()
is not resetting thescopeStatus
field.LEAVE
statement, after theEXIT
handler has been executed, the logic considered onlyCompoundBodyExec
nodes, whereas it should have included all non-leaf statements.LEAVE
statement is not recognized properly.LEAVE
statement is not recognized properly.hasNext()
inForStatementExec
is executing the query. This causes the issues in cases whenFOR
is the first statement in the compound and the query fails. It means that exception would happen during thehasNext()
checks instead of the actual iteration. In such case, the parent compound (of theFOR
statement) is not entered (because that happens during the iteration) and call stack is not properly setup for exception handling.Changes corresponding to this problems, in the same order:
scopeStatus
field inCompoundBodyExec.reset()
.reset()
on handler before starting its execution inSqlScriptingExecution.handleException()
.curr
pointer toNonLeafStatementExec
and use that inSqlScriptingExecution. getNextStatement()
in case whenEXIT
handler is finished and removed from the stack.LeaveStatementExec
in*.Condition
cases in all of the relevant statement types. When exception happens during the condition execution, theLeaveStatementExec
is injected intocurr
field, but the state hasn't changed. This means that when theLEAVE
statement is to be executed, the state would correspond to the condition. I don't know how to do this better, so any suggestions are more than welcome!LeaveStatementExec
in*Body
cases of if/else and searched case statement - equivalent to the previous bullet.ForStatementExec.hasNext()
.Why are the changes needed?
These changes are fixing wrong logic and improving some of the already existing exception handling mechanisms.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
New unit tests are added for to test/guard all of the introduced improvements.
Was this patch authored or co-authored using generative AI tooling?
No.