datafusion 48 #3560
        log::debug!("DataFusion expression can't be pushed down: {expr:?}");
        false
    }
fn can_be_pushed_down(expr: &PhysicalExprRef, schema: &Schema) -> bool {
I was just thinking last night: do we support new things here? We have In now too, right? (@joseph-isaacs?)
But it's pretty slow right now; what type would it run over?
Whatever is available, I guess? If it's not a perf win right now, we can add that later.
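To make the question in this thread concrete, here is a minimal sketch of a recursive pushdown check that also accepts an In-list. It is not the code from this PR: the `Expr` and `Op` types, the `Unsupported` variant, and the slice-of-names schema are hypothetical stand-ins for `PhysicalExprRef` and `Schema`, and the rule that an In-list must contain only literals is an assumption made for illustration.

```rust
/// Toy expression tree; a hypothetical stand-in for a physical expression.
#[allow(dead_code)]
#[derive(Debug)]
enum Expr {
    Column(String),
    Literal(i64),
    Binary { left: Box<Expr>, op: Op, right: Box<Expr> },
    InList { expr: Box<Expr>, list: Vec<Expr> },
    /// Anything the file format cannot evaluate itself.
    Unsupported(String),
}

#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
enum Op {
    Eq,
    Lt,
    Gt,
    And,
    Or,
}

/// Returns true only if every node in the tree is supported and every
/// referenced column exists in the (toy) schema.
fn can_be_pushed_down(expr: &Expr, schema: &[&str]) -> bool {
    match expr {
        Expr::Column(name) => schema.contains(&name.as_str()),
        Expr::Literal(_) => true,
        Expr::Binary { left, right, .. } => {
            can_be_pushed_down(left, schema) && can_be_pushed_down(right, schema)
        }
        // Assumption: only In-lists made of literals are pushed down.
        Expr::InList { expr, list } => {
            can_be_pushed_down(expr, schema)
                && list.iter().all(|e| matches!(e, Expr::Literal(_)))
        }
        Expr::Unsupported(_) => false,
    }
}

fn main() {
    let schema = ["a", "b"];
    let in_list = Expr::InList {
        expr: Box::new(Expr::Column("a".to_string())),
        list: vec![Expr::Literal(1), Expr::Literal(2)],
    };
    println!("In-list pushdown: {}", can_be_pushed_down(&in_list, &schema));
}
```

Whether accepting In here is a performance win is a separate question, as noted above; the sketch only shows where such a case would slot into the check.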
I think the tests fail because DataFusion used to default to
Benchmarks: TPC-H on NVMe (table of results)
Benchmarks: TPC-H on S3 (table of results)
Benchmarks: ClickBench on NVMe (table of results)
I don't know what happened to query 6, but we're 20x worse and Parquet is 25x worse. There were a bunch of changes to statistics and repartitioning recently, and I wonder if that's the issue there. The query is: SELECT MIN("EventDate"), MAX("EventDate") FROM hits;
I wonder if they no longer correctly prune on file-level statistics since they added partition group statistics.
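For context on why statistics matter so much for this query, here is a minimal sketch of answering a MIN/MAX aggregate from per-file statistics alone, with a fallback signal when any file lacks them. The `FileStats` type and `min_max_from_stats` function are hypothetical and are not DataFusion or Vortex APIs; the values in `main` are made up.

```rust
/// Hypothetical per-file statistics for a single column.
#[derive(Debug, Clone, Copy)]
struct FileStats {
    min: Option<i32>,
    max: Option<i32>,
}

/// Try to answer `SELECT MIN(col), MAX(col)` from file-level statistics.
/// Returns `None` if there are no files or any file is missing a bound,
/// in which case the engine would have to scan the data instead.
fn min_max_from_stats(files: &[FileStats]) -> Option<(i32, i32)> {
    let mut acc: Option<(i32, i32)> = None;
    for f in files {
        // `?` bails out as soon as one file has no usable statistics.
        let (lo, hi) = (f.min?, f.max?);
        acc = Some(match acc {
            None => (lo, hi),
            Some((a, b)) => (a.min(lo), b.max(hi)),
        });
    }
    acc
}

fn main() {
    let with_stats = [
        FileStats { min: Some(10), max: Some(20) },
        FileStats { min: Some(5), max: Some(15) },
    ];
    println!("{:?}", min_max_from_stats(&with_stats));

    // One file without statistics forces a full scan.
    let missing = [
        FileStats { min: Some(10), max: Some(20) },
        FileStats { min: None, max: None },
    ];
    println!("{:?}", min_max_from_stats(&missing));
}
```

If statistics are not collected or not propagated, the second path is taken for every file, which would be consistent with a large slowdown on a query that otherwise needs no data scan.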
@robert3005 that's my suspicion; parts of that were already in v47, but v48 has at least one more big change around that.
I am also curious and will have a look; query 0 is also 10x worse on Parquet and 3x worse on Vortex.
I filed an issue upstream: apache/datafusion#16444
I think this is fine to merge; we can re-enable the DataFusion config (+ #3569), and once apache/datafusion#16447 is released it'll be the default behavior again.
Signed-off-by: Onur Satici <onur@spiraldb.com>
This is required for apache/datafusion#16447 to take effect (once it's merged). By default, DataFusion will take care of that if you use the SQL interface (and do all the session state setup).
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
No description provided.