Commit f0d615f

fix(query): fix incorrect selectivity estimation under NOT column. (#18331)
1 parent 428d686 commit f0d615f

2 files changed: +74 -18 lines changed

src/query/sql/src/planner/optimizer/ir/stats/selectivity.rs

Lines changed: 16 additions & 2 deletions
@@ -104,8 +104,22 @@ impl<'a> SelectivityEstimator<'a> {
             }

             ScalarExpr::FunctionCall(func) if func.func_name == "not" => {
-                let argument_selectivity = self.compute_selectivity(&func.arguments[0], false)?;
-                1.0 - argument_selectivity
+                match &func.arguments[0] {
+                    ScalarExpr::BoundColumnRef(_) => {
+                        // Not column e.g.
+                        // `SELECT * FROM t WHERE not c1`, the selectivity is 1.
+                        1.0
+                    }
+                    ScalarExpr::FunctionCall(func) if func.func_name == "not" => {
+                        // (NOT (NOT predicate))
+                        self.compute_selectivity(&func.arguments[0], false)?
+                    }
+                    _ => {
+                        let argument_selectivity =
+                            self.compute_selectivity(&func.arguments[0], false)?;
+                        1.0 - argument_selectivity
+                    }
+                }
             }

             ScalarExpr::FunctionCall(func) => {
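
The three cases this hunk distinguishes for `NOT` can be sketched in isolation. The following is a minimal, self-contained illustration, not the planner's code: `Expr`, `selectivity`, and `estimated_rows` are hypothetical stand-ins for `ScalarExpr`, `compute_selectivity`, and the cardinality estimate, and the fallback of 1.0 for a bare column is an assumption made for the sketch.

```rust
/// Hypothetical stand-in for the planner's `ScalarExpr`; it models only the
/// shapes that the hunk above matches on.
#[derive(Debug)]
enum Expr {
    /// A bare column reference, e.g. `c1`.
    Column(String),
    /// `NOT <expr>`.
    Not(Box<Expr>),
    /// Any other predicate, carrying an already-known selectivity.
    Predicate(f64),
}

/// Returns an estimated selectivity in [0, 1].
fn selectivity(expr: &Expr) -> f64 {
    match expr {
        // Assumption for this sketch: a bare column has no usable estimate,
        // so it falls back to 1.0.
        Expr::Column(_) => 1.0,
        Expr::Predicate(s) => *s,
        Expr::Not(inner) => match inner.as_ref() {
            // NOT column: keep 1.0 instead of computing 1.0 - 1.0 = 0.0.
            Expr::Column(_) => 1.0,
            // NOT (NOT p): the two negations cancel, so estimate p directly.
            Expr::Not(innermost) => selectivity(innermost),
            // NOT p for any other predicate: the complement of p.
            other => 1.0 - selectivity(other),
        },
    }
}

/// Estimated output rows of a filter: input cardinality times selectivity.
fn estimated_rows(input_rows: f64, filter: &Expr) -> f64 {
    input_rows * selectivity(filter)
}
```

Under these assumptions, `estimated_rows(3.0, &Expr::Not(Box::new(Expr::Column("c1".to_string()))))` yields 3.0, whereas the pre-change rule `1.0 - selectivity(column)` would yield 0.0. That is consistent with the `estimated rows: 0.00` to `3.00` change in the test expectations, although the commit does not state the input cardinality.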

tests/sqllogictests/suites/mode/standalone/explain/explain.test

Lines changed: 58 additions & 16 deletions
@@ -1023,7 +1023,7 @@ explain select * from t1 where a not in (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 1
 Filter
 ├── output columns: [t1.a (#0), t1.b (#1)]
 ├── filters: [is_true(NOT 3 (#3))]
-├── estimated rows: 0.00
+├── estimated rows: 3.00
 └── HashJoin
     ├── output columns: [t1.a (#0), t1.b (#1), marker (#3)]
     ├── join type: LEFT MARK
@@ -1260,23 +1260,23 @@ HashJoin: RIGHT OUTER
 query T
 explain join SELECT c.customer_name FROM customers c WHERE NOT EXISTS ( SELECT category FROM products WHERE category NOT IN ( SELECT p.category FROM sales s JOIN products p ON s.product_id = p.product_id WHERE s.customer_id = c.customer_id ) ) ORDER BY c.customer_name;
 ----
-HashJoin: RIGHT MARK
+HashJoin: LEFT MARK
 ├── Build
-│   └── HashJoin: RIGHT MARK
-│       ├── Build
-│       │   └── HashJoin: INNER
-│       │       ├── Build
-│       │       │   └── Scan: default.default.products (#3) (read rows: 10)
-│       │       └── Probe
-│       │           └── Scan: default.default.sales (#2) (read rows: 500)
-│       └── Probe
-│           └── HashJoin: CROSS
-│               ├── Build
-│               │   └── Scan: default.default.products (#1) (read rows: 10)
-│               └── Probe
-│                   └── Scan: default.default.customers (#0) (read rows: 100)
+│   └── Scan: default.default.customers (#0) (read rows: 100)
 └── Probe
-    └── Scan: default.default.customers (#0) (read rows: 100)
+    └── HashJoin: RIGHT MARK
+        ├── Build
+        │   └── HashJoin: INNER
+        │       ├── Build
+        │       │   └── Scan: default.default.products (#3) (read rows: 10)
+        │       └── Probe
+        │           └── Scan: default.default.sales (#2) (read rows: 500)
+        └── Probe
+            └── HashJoin: CROSS
+                ├── Build
+                │   └── Scan: default.default.products (#1) (read rows: 10)
+                └── Probe
+                    └── Scan: default.default.customers (#0) (read rows: 100)

 statement ok
 drop table customers;
@@ -1664,3 +1664,45 @@ Filter

 statement ok
 DROP TABLE IF EXISTS t;
+
+query T
+EXPLAIN SELECT a.number FROM numbers(10) AS a INNER JOIN (SELECT * FROM numbers(10) WHERE NOT number) AS b ON a.number = b.number
+----
+HashJoin
+├── output columns: [a.number (#0)]
+├── join type: INNER
+├── build keys: [numbers.number (#1)]
+├── probe keys: [a.number (#0)]
+├── keys is null equal: [false]
+├── filters: []
+├── build join filters:
+│   └── filter id:0, build key:numbers.number (#1), probe key:a.number (#0), filter type:inlist,min_max
+├── estimated rows: 100.00
+├── Filter(Build)
+│   ├── output columns: [numbers.number (#1)]
+│   ├── filters: [NOT CAST(numbers.number (#1) AS Boolean)]
+│   ├── estimated rows: 10.00
+│   └── TableScan
+│       ├── table: default.system.numbers
+│       ├── output columns: [number (#1)]
+│       ├── read rows: 10
+│       ├── read size: < 1 KiB
+│       ├── partitions total: 1
+│       ├── partitions scanned: 1
+│       ├── push downs: [filters: [NOT CAST(numbers.number (#1) AS Boolean)], limit: NONE]
+│       └── estimated rows: 10.00
+└── Filter(Probe)
+    ├── output columns: [a.number (#0)]
+    ├── filters: [NOT CAST(a.number (#0) AS Boolean)]
+    ├── estimated rows: 10.00
+    └── TableScan
+        ├── table: default.system.numbers
+        ├── output columns: [number (#0)]
+        ├── read rows: 10
+        ├── read size: < 1 KiB
+        ├── partitions total: 1
+        ├── partitions scanned: 1
+        ├── push downs: [filters: [NOT CAST(numbers.number (#0) AS Boolean)], limit: NONE]
+        ├── apply join filters: [#0]
+        └── estimated rows: 10.00
+