Skip to content

Followups to the new subquery guard #32994

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

ggevay
Copy link
Contributor

@ggevay ggevay commented Jul 11, 2025

These are some minor follow-ups to #32978

See individual commits.

Motivation

  • This PR refactors existing code.

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@ggevay ggevay requested a review from a team as a code owner July 11, 2025 17:21
@ggevay ggevay added the A-optimization Area: query optimization and transformation label Jul 11, 2025
@ggevay ggevay requested a review from frankmcsherry July 11, 2025 17:22
Comment on lines 3946 to 3955
// Note that we can't get a 0 count here, since the `Reduce count` below us is not
// an SQL `count`, but an MIR `count`, which has an empty output on an empty input.
// If we get a negative count, then we have a negative accumulation error somewhere.
soft_assert_or_log!(
count >= 1,
"GuardSubquerySize encountered invalid count: {}",
count
);
if count != 1 {
// TODO(ggevay): Add count to error msg after the old subquery code is gone.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the value of this, or of extending the error message. Especially extending the error message (do you mean error variant?) as this will inflate the resource requirements of maintaining the errors, even in non-faulty operation (each distinct count will need to be maintained separately). This is a net-new test, untested in the version of the code prior to the new subquery guard, already tested in the operator that produces the count in the first place, and seems to be largely unactionable (other than pinging you to ask what it might mean). Is it very important to add?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the TODO comment:

Especially extending the error message (do you mean error variant?)

Yes, I meant adding it to the EvalError variant that we are creating here, and then also showing it when we print the eval error as a message to the user.

as this will inflate the resource requirements of maintaining the errors

Well, it's a tradeoff: It could help users debug their queries, but it would increase the resource requirements a bit when there are many distinct counts.

even in non-faulty operation (each distinct count will need to be maintained separately)

Well, in the happy path the only distinct count should be 1.

But it's not too important. I've removed the todo.


But the soft_assert_or_log still seems good to me:

This is a net-new test, untested in the version of the code prior to the new subquery guard,

It would have been inconvenient/impossible to test this in the old subquery guard, but it's easy and cheap with the new subquery guard. I consider this to be an advantage of the new subquery guard.

already tested in the operator that produces the count in the first place,

You mean the Reduce aggregates=[count(*)]? It seems this Reduce variant doesn't test for negative accumulations. (I'm not sure whether this was just an oversight, or it would be difficult to add the test.) For example, in the following query the count(*) has a negative input, but the query doesn't soft_panic, or log any errors, or show a user-level error:

create table t(x int);
insert into t values (7);

select count(*) from (select * from t), repeat_row(-3);

However, in a scalar subquery situation, there is another operator somewhat nearby that does test for negative accumulations:

create table t(x int);
insert into t values (7);

select 5, (select 1 from (select * from t), repeat_row(-3));

This produces ERROR: Evaluation error: internal error: Non-positive multiplicity in DistinctBy. This is the plan:

                Optimized Plan                
----------------------------------------------
 Explained Query:                            +
   With                                      +
     cte l0 =                                +
       CrossJoin type=differential           +
         ArrangeBy keys=[[]]                 +
           Project ()                        +
             ReadStorage materialize.public.t+
         ArrangeBy keys=[[]]                 +
           Constant                          +
             - (() x -3)                     +
     cte l1 =                                +
       Union                                 +
         Map (1)                             +
           Get l0                            +
         Project (#1)                        +
           FlatMap guard_subquery_size(#0)   +
             Reduce aggregates=[count(*)]    +
               Get l0                        +
   Return                                    +
     Project (#1, #0)                        +
       Map (5)                               +
         Union                               +
           Get l1                            +
           Map (null)                        +
             Union                           +
               Negate                        +
                 Distinct project=[]         +  <---- Comes from this
                   Project ()                +
                     Get l1                  +
               Constant                      +
                 - ()                        +
                                             +
 Source materialize.public.t                 +

But maybe in a more complicated situation the Distinct would be further away or would be completely absent.

and seems to be largely unactionable (other than pinging you to ask what it might mean). Is it very important to add?

I mean, it gives slightly more info on what is going on when we see it: even when something else also catches the negative accumulation, the new soft_assert says that a subquery guard also saw the negative accumulations. I think we should have as many negative accumulation checks as possible (without badly impacting performance) throughout the codebase, to help with localizing where a negative accumulation is originating from. The proposed soft_assert doesn't give us any groundbreaking debugging capabilities, but it also seems very cheap, so why not add it?

Btw. I've now modified it from a soft_assert to just do tracing::error!, because this is what we do in other negative accumulation checks.

@ggevay ggevay force-pushed the scalar-subquery-flatmap-followups-1 branch from 7bd97d7 to 9558efa Compare July 12, 2025 09:47
@ggevay ggevay force-pushed the scalar-subquery-flatmap-followups-1 branch from 9558efa to 0f60cdb Compare July 12, 2025 09:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-optimization Area: query optimization and transformation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants