Introduce Backend Interface (DatabricksClient) #573
Thanks for your contribution! To satisfy the DCO policy in our contributing guide, every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase.
a251d75 to 1df7f02
7da3b63 to 49af2ea
207ae8c to 169716c
Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com>
Thanks for the changes and, most importantly, the changes to the tests. Some minor nit comments; the rest LGTM.
Please make a note of the failing integration test in the commit message when pushing, if it is unrelated to these changes.
In the future, let's try to raise smaller PRs, typically within 700 lines of code changes including tests. This will help you get more reviews and, more importantly, more ...
Hey @samikshya-db, it's tricky to scope this down further. Otherwise, it was adding too much overhead.
@jayantsing-db I understand this would be harder for the initial set of SEA PRs. Even then, it is good to keep this in mind and try to break it down. Happy to brainstorm on this too.
yes
Heads up: this and a couple more are refactoring PRs to prepare for the SEA changes. Refactoring PRs are usually high in LOC because we have to make sure the current set of tests does not break, so the related changes have to go in together. Even then, @varun-edachali-dbx's first session refactoring PR is ~500 LOC. For the SEA PRs, LOC per PR will be low, as expected.
NOTE: the `test_complex_types` e2e test was not working at the time of this merge. The test must be triggered when the test is back up and running as intended.

* remove excess logs, assertions, instantiations (large merge artifacts)
* formatting (black) + remove excess log (merge artifact)
* fix typing
* remove un-necessary check
* remove un-necessary replace call
* introduce __str__ methods for CommandId and SessionId
* docstrings for DatabricksClient interface
* stronger typing of Cursor and ExecuteResponse
* remove utility functions from backend interface, fix circular import
* rename info to properties
* newline for cleanliness
* fix circular import
* formatting (black)
* to_hex_id -> get_hex_id
* better comment on protocol version getter
* formatting (black)
* move guid to hex id to new utils module
* formatting (black)
* move staging allowed local path to connection props
* add strong return type for execute_command
* skip auth, error handling in databricksclient interface
* chore: docstring + line width
* get_id -> get_guid
* chore: docstring
* fix: to_hex_id -> to_hex_guid

---------

Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com>
* Separate Session related functionality from Connection class (#571)
  * decouple session class from existing Connection
    ensure maintenance of current APIs of Connection while delegating responsibility
  * add open property to Connection to ensure maintenance of existing API
  * update unit tests to address ThriftBackend through session instead of through Connection
  * chore: move session specific tests from test_client to test_session
  * formatting (black) as in CONTRIBUTING.md
  * use connection open property instead of long chain through session
  * trigger integration workflow
  * fix: ensure open attribute of Connection never fails
    In case the openSession takes long, the initialisation of the session will not complete immediately. This could make the session attribute inaccessible. If the Connection is deleted in this time, the open() check will throw because the session attribute does not exist. Thus, we default to the Connection being closed in this case. This was not an issue before because open was a direct attribute of the Connection class. Caught in the integration tests.
  * fix: de-complicate earlier connection open logic
    Earlier, one of the integration tests was failing because 'session was not an attribute of Connection'. This is likely tied to a local configuration issue related to unittest that was causing an error in the test suite itself. The tests are now passing without checking for the session attribute. c676f9b
  * Revert "fix: de-complicate earlier connection open logic"
    This reverts commit d6b1b19.
  * [empty commit] attempt to trigger ci e2e workflow
  * Update CODEOWNERS (#562)
    new codeowners
  * Enhance Cursor close handling and context manager exception management to prevent server side resource leaks (#554)
    * Enhance Cursor close handling and context manager exception management
    * tests
    * fmt
    * Fix Cursor.close() to properly handle CursorAlreadyClosedError
    * Remove specific test message from Cursor.close() error handling
    * Improve error handling in connection and cursor context managers to ensure proper closure during exceptions, including KeyboardInterrupt. Add tests for nested cursor management and verify operation closure on server-side errors.
    * add
    * add
  * PECOBLR-86 improve logging on python driver (#556)
    * PECOBLR-86 Improve logging for debug level
    * PECOBLR-86 Improve logging for debug level
    * fixed format
    * used lazy logging
    * changed debug to error logs
    * used lazy logging

    ---------

    Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com>
    Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com>
  * Revert "Merge remote-tracking branch 'upstream/sea-migration' into decouple-session"
    This reverts commit dbb2ec5, reversing changes made to 7192f11.
  * Reapply "Merge remote-tracking branch 'upstream/sea-migration' into decouple-session"
    This reverts commit bdb8381.
  * fix: separate session opening logic from instantiation
    ensures correctness of self.session.open call in Connection
  * fix: use is_open attribute to denote session availability
  * fix: access thrift backend through session
  * chore: use get_handle() instead of private session attribute in client
  * formatting (black)
  * fix: remove accidentally removed assertions

  ---------

  Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com>
  Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com>
  Co-authored-by: Jothi Prakash <jothi.prakash@databricks.com>
  Co-authored-by: Madhav Sainanee <madhav.sainanee@databricks.com>
  Co-authored-by: Sai Shree Pradhan <saishree.pradhan@databricks.com>
* Introduce Backend Interface (DatabricksClient) (#573)
  NOTE: the `test_complex_types` e2e test was not working at the time of this merge. The test must be triggered when the test is back up and running as intended.
  * remove excess logs, assertions, instantiations (large merge artifacts)
  * formatting (black) + remove excess log (merge artifact)
  * fix typing
  * remove un-necessary check
  * remove un-necessary replace call
  * introduce __str__ methods for CommandId and SessionId
  * docstrings for DatabricksClient interface
  * stronger typing of Cursor and ExecuteResponse
  * remove utility functions from backend interface, fix circular import
  * rename info to properties
  * newline for cleanliness
  * fix circular import
  * formatting (black)
  * to_hex_id -> get_hex_id
  * better comment on protocol version getter
  * formatting (black)
  * move guid to hex id to new utils module
  * formatting (black)
  * move staging allowed local path to connection props
  * add strong return type for execute_command
  * skip auth, error handling in databricksclient interface
  * chore: docstring + line width
  * get_id -> get_guid
  * chore: docstring
  * fix: to_hex_id -> to_hex_guid

  ---------

  Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com>
* Implement ResultSet Abstraction (backend interfaces for fetch phase) (#574)
  * ensure backend client returns a ResultSet type in backend tests
  * formatting (black)
  * newline for cleanliness
  * fix circular import
  * formatting (black)
  * to_hex_id -> get_hex_id
  * better comment on protocol version getter
  * formatting (black)
  * stricter typing for cursor
  * correct typing
  * correct tests and merge artifacts
  * remove accidentally modified workflow files
    remnants of old merge
  * chore: remove accidentally modified workflow files
  * add back accidentally removed docstrings
  * clean up docstrings
  * log hex
  * remove unnecessary _replace call
  * add __str__ for CommandId
  * take TOpenSessionResp in get_protocol_version to maintain existing interface
  * active_op_handle -> active_command_id
  * ensure None returned for close_command
  * account for ResultSet return in new pydocs
  * pydoc for types
  * move common state to ResultSet parent
  * stronger typing in resultSet behaviour
  * remove redundant patch in test
  * add has_been_closed_server_side assertion
  * remove redundancies in tests
  * more robust close check
  * use normalised state in e2e test
  * simplify corrected test
  * add line gaps after multi-line pydocs for consistency
  * use normalised CommandState type in ExecuteResponse

  ---------

  Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com>
* remove un-necessary initialisation assertions
* remove un-necessary line breaks
* more un-necessary line breaks
* constrain diff of test_closing_connection_closes_commands
* reduce diff of test_closing_connection_closes_commands
* use pytest-like assertions for test_closing_connection_closes_commands
* ensure command_id is not None
* line breaks after multi-line pydocs
* ensure non null operationHandle for commandId creation
* use command_id methods instead of explicit guid_to_hex_id conversion
* remove un-necessary artifacts in test_session, add back assertion
* add from __future__ import annotations to remove string literals around forward refs, remove some unused imports
* move docstring of DatabricksClient within class
* move ThriftResultSet import to top of file
* make backend/utils __init__ file empty
* use from __future__ import annotations to remove string literals around Cursor
* use lazy logging
* replace getters with property tag
* set active_command_id to None, not active_op_handle
* align test_session with pytest instead of unittest
* formatting (black)
* remove repetition from Session.__init__
* mention that if catalog / schema name is None, we fetch across all
* mention fetching across all tables if null table name
* remove lazy import of ThriftResultSet
* remove unused import
* better docstrings
* clarified role of cursor in docstring

---------

Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com>
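For readers following the "ensure open attribute of Connection never fails" fix in the #571 commit messages above, here is a minimal sketch of the idea: treat a Connection whose session attribute was never set as closed. The class bodies are illustrative assumptions (only the names `Connection`, `Session`, `open`, and `is_open` come from the commit messages); this is not the connector's actual code.

```python
class Session:
    """Hypothetical stand-in for the session object; tracks only an open flag."""

    def __init__(self) -> None:
        self.is_open = False

    def open(self) -> None:
        self.is_open = True

    def close(self) -> None:
        self.is_open = False


class Connection:
    """Hypothetical Connection that delegates its open state to a Session."""

    def __init__(self) -> None:
        self.session = Session()
        self.session.open()

    @property
    def open(self) -> bool:
        # If initialisation never reached the point of assigning `session`,
        # report the connection as closed instead of raising AttributeError.
        session = getattr(self, "session", None)
        return session is not None and session.is_open

    def close(self) -> None:
        if self.open:
            self.session.close()


# Simulate a partially initialised Connection (no `session` attribute yet).
half_built = Connection.__new__(Connection)
print(half_built.open)  # False, rather than an AttributeError
```

The `getattr` default keeps cleanup paths such as `__del__` or a context-manager exit safe to run even when construction was interrupted.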
What type of PR is this?
Description
* Introduce a DatabricksClient interface and make the existing Thrift backend implement it. This lets the cursor remain unaware of which type of backend is instantiated. Currently, we still have to include some assertions in the ResultSet to ensure we have a ThriftDatabricksClient type, because the fetch-phase abstractions have not been implemented yet.
* Introduce SessionId and CommandId interfaces to create a consistent adapter for representing sessions and commands, instead of relying on Thrift-specific (or, eventually, SEA-specific) types.

How is this tested?
Related Tickets & Documents
https://docs.google.com/document/d/1Y-eXLhNqqhrMVGnOlG8sdFrCxBTN1GdQvuKG4IfHmo0/edit?usp=sharing
https://databricks.atlassian.net/browse/PECOBLR-440?atlOrigin=eyJpIjoiMTgzNGNiMDVkMGQ3NDM2Njg5OTRhZWQ1MGQ4Mjg1OWIiLCJwIjoiaiJ9
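
As a rough illustration of the design in the Description, the sketch below shows an abstract backend interface plus backend-agnostic ID wrappers. Only the names `DatabricksClient`, `SessionId`, `CommandId`, `execute_command`, and `to_hex_guid` appear in this PR; the method signatures, the `BackendType` enum, the `FakeThriftClient` stand-in, and `run_query` are simplified assumptions made for the example and do not reflect the connector's real API.

```python
from __future__ import annotations

import uuid
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum


class BackendType(Enum):
    """Hypothetical tag recording which backend produced an ID."""
    THRIFT = "thrift"
    SEA = "sea"


@dataclass(frozen=True)
class SessionId:
    backend_type: BackendType
    guid: bytes

    def to_hex_guid(self) -> str:
        return self.guid.hex()

    def __str__(self) -> str:
        return self.to_hex_guid()


@dataclass(frozen=True)
class CommandId:
    backend_type: BackendType
    guid: bytes

    def to_hex_guid(self) -> str:
        return self.guid.hex()

    def __str__(self) -> str:
        return self.to_hex_guid()


class DatabricksClient(ABC):
    """Simplified backend interface; signatures are assumptions."""

    @abstractmethod
    def open_session(self) -> SessionId: ...

    @abstractmethod
    def close_session(self, session_id: SessionId) -> None: ...

    @abstractmethod
    def execute_command(self, operation: str, session_id: SessionId) -> CommandId: ...


class FakeThriftClient(DatabricksClient):
    """In-memory stand-in for a Thrift-backed implementation."""

    def __init__(self) -> None:
        self._open_sessions: set[SessionId] = set()

    def open_session(self) -> SessionId:
        session_id = SessionId(BackendType.THRIFT, uuid.uuid4().bytes)
        self._open_sessions.add(session_id)
        return session_id

    def close_session(self, session_id: SessionId) -> None:
        self._open_sessions.discard(session_id)

    def execute_command(self, operation: str, session_id: SessionId) -> CommandId:
        if session_id not in self._open_sessions:
            raise ValueError(f"session {session_id} is not open")
        return CommandId(BackendType.THRIFT, uuid.uuid4().bytes)


def run_query(client: DatabricksClient, operation: str) -> CommandId:
    """Cursor-like caller that depends only on the abstract interface."""
    session_id = client.open_session()
    try:
        return client.execute_command(operation, session_id)
    finally:
        client.close_session(session_id)


command_id = run_query(FakeThriftClient(), "SELECT 1")
print(command_id.backend_type.value, len(str(command_id)))
```

Because `run_query` accepts any `DatabricksClient`, a second implementation (for example, an SEA-backed one) could be swapped in without changing the caller, which is the decoupling the description aims for.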