Add contains function, and support in datafusion substrait consumer #10879
) -> Result<ArrayRef, DataFusionError> {
    let mod_str = as_generic_string_array::<T>(&args[0])?;
    let match_str = as_generic_string_array::<T>(&args[1])?;
    let res = arrow::compute::kernels::comparison::regexp_is_match_utf8(
Do we need to escape the search string, since it's used as a regexp? I'm wondering what the result of contains("abcdefg", ".*") is.
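To make the concern concrete: as a regular expression, `.*` matches any string, so a regexp-based `contains("abcdefg", ".*")` would be expected to return true, while a literal substring search returns false. The following is a minimal standard-library sketch of the two possible fixes, assuming literal semantics are what is wanted. Both `escape_regex_literal` and `contains_literal` are hypothetical helpers written for illustration; they are not part of DataFusion or arrow, and the metacharacter list is an assumption covering common regex syntax.

```rust
/// Escape common regex metacharacters so that `pattern` would match literally
/// if handed to a regex engine. Hypothetical helper, not a DataFusion/arrow API.
fn escape_regex_literal(pattern: &str) -> String {
    // Assumed set of characters with special meaning in common regex syntaxes.
    const META: &[char] = &[
        '\\', '.', '+', '*', '?', '(', ')', '|', '[', ']', '{', '}', '^', '$',
    ];
    let mut escaped = String::with_capacity(pattern.len());
    for c in pattern.chars() {
        if META.contains(&c) {
            escaped.push('\\');
        }
        escaped.push(c);
    }
    escaped
}

/// Literal substring check for one row, with SQL-style null propagation:
/// if either input is NULL (None), the result is NULL (None).
/// Hypothetical helper, not a DataFusion/arrow API.
fn contains_literal(haystack: Option<&str>, needle: Option<&str>) -> Option<bool> {
    match (haystack, needle) {
        (Some(h), Some(n)) => Some(h.contains(n)),
        _ => None,
    }
}
```

With these helpers, `contains_literal(Some("abcdefg"), Some(".*"))` evaluates to `Some(false)`, and `escape_regex_literal(".*")` yields the pattern `\.\*`, which a regex engine would treat as the two literal characters. Skipping the regex engine entirely, as in `contains_literal`, avoids the escaping question altogether.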
@Lordworms, can you possibly respond to @waynexia's comment?
To keep the review queue clear, I filed #10929 to follow up on this item and will merge this PR.
Thanks @Lordworms, this looks good to me in general
Can we also please add an entry to the documentation about this new function? https://datafusion.apache.org/user-guide/sql/scalar_functions.html#scalar-functions
@@ -149,6 +149,9 @@ pub mod expr_fn {
),(
uuid,
"returns uuid v4 as a string value",
), (
contains,
"Return true if search_string is found within string.",
I think we should also mention here that search_string is treated as a regex, if that is the intention.
sure
Sure, sorry for the ignorance
Co-authored-by: Alex Huang <huangweijun1001@gmail.com>
Please add an entry here: datafusion/docs/source/user-guide/sql/scalar_functions.md, lines 645 to 646 in 1a2a1bf
The CI failure seems unrelated to the code in this PR: https://github.com/apache/datafusion/actions/runs/9510703358?pr=10879
Thanks again @waynexia, @Lordworms, and @Weijun-H!
…pache#10879) * adding new function contains * adding substrait test * adding doc * adding doc * Update docs/source/user-guide/sql/scalar_functions.md Co-authored-by: Alex Huang <huangweijun1001@gmail.com> * adding entry --------- Co-authored-by: Alex Huang <huangweijun1001@gmail.com>
…11060) * wip create and register ext file types with session * Add contains function, and support in datafusion substrait consumer (#10879) * adding new function contains * adding substrait test * adding doc * adding doc * Update docs/source/user-guide/sql/scalar_functions.md Co-authored-by: Alex Huang <huangweijun1001@gmail.com> * adding entry --------- Co-authored-by: Alex Huang <huangweijun1001@gmail.com> * logical planning updated * compiling * removing filetype enum * compiling * working on tests * fix some tests * test fixes * cli fix * cli fmt * Update datafusion/core/src/datasource/file_format/mod.rs Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * Update datafusion/core/src/execution/session_state.rs Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * review comments * review comments * review comments * typo fix * fmt * fix err log style * fmt --------- Co-authored-by: Lordworms <48054792+Lordworms@users.noreply.github.com> Co-authored-by: Alex Huang <huangweijun1001@gmail.com> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Which issue does this PR close?
Closes #10861
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?