Add a dataframe call matcher for linting tables #4664
✅ 115/115 passed, 10 skipped, 1h26m34s total
Running from acceptance #8923
This is an improvement, thanks. I'm happy to merge this.
Something to think about: the detection currently misses a lot of common cases, because it assumes there will be only a single `mode`/`format`/`option`/`partitionBy`/`bucketBy` call between `.write` and `.saveAsTable()`, whereas in practice a few of them are normally chained.
(Random thought: I think for `.saveAsTable()` only `mode` and `option` can be used?)
Anyway, once you have found the `.write` node, you can solve that with recursion by checking the next node:
- If the node is `.saveAsTable()`, return `True`.
- If the node is one of `{"mode", "format", "option", "partitionBy", "bucketBy"}`, return what these rules say about the next node (via recursion).
- Otherwise, return `False`.
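The rules above can be sketched roughly as follows. This is only an illustration using the standard-library `ast` module, not the matcher added in this PR; the function names (`is_save_as_table_call`, `find_save_as_table_calls`) and the `CHAINABLE` set are hypothetical. Because Python's AST nests a method chain with the last call outermost, the sketch starts at `.saveAsTable()` and recurses inward towards `.write`, which applies the same rules in the opposite direction.

```python
import ast

# Intermediate writer methods named in the review comment (assumed set for this sketch).
CHAINABLE = {"mode", "format", "option", "partitionBy", "bucketBy"}


def _reaches_write(node: ast.expr) -> bool:
    """Return True if `node` is `<expr>.write` followed only by chainable calls."""
    # Base case: the plain attribute access `<expr>.write`.
    if isinstance(node, ast.Attribute) and node.attr == "write":
        return True
    # Recursive case: `<inner>.<method>(...)` where <method> is an allowed writer method.
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
        if node.func.attr in CHAINABLE:
            return _reaches_write(node.func.value)
    # Anything else breaks the chain.
    return False


def is_save_as_table_call(node: ast.AST) -> bool:
    """Match `<expr>.write[.<chainable>(...)]*.saveAsTable(...)`."""
    if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
        return False
    if node.func.attr != "saveAsTable":
        return False
    return _reaches_write(node.func.value)


def find_save_as_table_calls(source: str) -> list[ast.Call]:
    """Collect every matching call node in a piece of Python source."""
    tree = ast.parse(source)
    return [n for n in ast.walk(tree) if is_save_as_table_call(n)]
```

With this shape, `df.write.mode("append").format("delta").saveAsTable("t")` matches regardless of how many allowed calls are chained, while a chain containing any other method, or one that never passes through `.write`, does not.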
I have added a few more test cases, and it seems multiple chained calls are also being handled.
Changes
Current linting does not collect dataframe calls like `df.write[.mode()].saveAsTable()`. This PR adds a method to fix this issue. It also includes a small fix for the recent blueprint change for prompts.