
Add optional telemetry support to the python connector #628

Open
wants to merge 252 commits into
base: main
70a2393
Set up github actions + remove databricks-specific test infra (#3)
Jun 24, 2022
891f744
Note about versions pre-2.0 (#5)
Jun 24, 2022
07c709e
Update licence with year and company name (#6)
Jun 27, 2022
a283faf
README updates (#9)
Jun 28, 2022
71defd5
Add CODEOWNERS (#10)
Jun 28, 2022
187cbe8
Reformat changelog (#11)
Jun 29, 2022
e5a5001
Add e2e tests (#12)
susodapop Jul 15, 2022
86969b1
Indicate that Python 3.10 is not supported (#27)
Aug 1, 2022
abaa45a
Add Developer Certificate of Origin requirement (#13)
Aug 1, 2022
870787b
Retry attempts that fail due to a connection timeout (#24)
Aug 5, 2022
10d765b
Bump to v2.0.3 (#28)
Aug 5, 2022
34397db
Bump version to 2.0.4-dev (#29)
Aug 10, 2022
5fd729a
[PECO-197] Support Python 3.10 (#31)
dbaxa Aug 17, 2022
5612e5f
Update changelog and bump to v2.0.4 (#34)
Aug 17, 2022
f1241ba
Bump to 2.0.5-dev on main (#35)
Aug 19, 2022
6bafb9d
On Pypi, display the "Project Links" sidebar. (#36)
Aug 19, 2022
afcb0f0
[ES-402013] Close cursors before closing connection (#38)
Aug 23, 2022
af945aa
Bump version to 2.0.5 and improve CHANGELOG (#40)
Aug 23, 2022
441a6ae
fix dco issue
moderakh Aug 25, 2022
29fe6b4
fix dco issue
moderakh Aug 25, 2022
06d9df8
Merge pull request #42 from moderakh/fix-dco-issue
moderakh Aug 25, 2022
cf3130e
dco tunning
moderakh Aug 25, 2022
4387f93
dco tunning
moderakh Aug 25, 2022
285e516
Merge pull request #43 from moderakh/dco-tunning
moderakh Aug 25, 2022
ea0f076
Github workflows: run checks on pull requests from forks (#47)
Aug 26, 2022
616a5c8
OAuth implementation (#15)
moderakh Sep 14, 2022
e39d294
Automate deploys to Pypi (#48)
Sep 22, 2022
1ea2fe0
[PECO-205] Add functional examples (#52)
Sep 30, 2022
3638fa2
Bump version to 2.1.0 (#54)
Oct 1, 2022
1a4cf4b
[SC-110400] Enabling compression in Python SQL Connector (#49)
mohitsingla-db Oct 13, 2022
8d6d47f
Add tests for parameter sanitisation / escaping (#46)
Oct 14, 2022
3d3c692
Bump thrift dependency to 0.16.0 (#65)
Nov 8, 2022
5cbfcac
Bump version to 2.2.0 (#66)
Nov 17, 2022
c6e573c
Support Python 3.11 (#60)
Nov 28, 2022
7c53b76
Bump version to 2.2.1 (#70)
Nov 28, 2022
4f221b3
Add none check on _oauth_persistence in DatabricksOAuthProvider (#71)
jackyhu-db Dec 29, 2022
cfa38a1
Support custom oauth client id and redirect port (#75)
jackyhu-db Dec 29, 2022
2f2a761
Bump version to 2.2.2 (#76)
jackyhu-db Jan 3, 2023
def5e0e
Merge staging ingestion into main (#78)
Jan 10, 2023
3cc9393
Bump version to 2.3.0 and update changelog (#80)
Jan 10, 2023
aa55a6e
Add pkgutil-style for the package (#84)
lu-wang-dl Jan 27, 2023
ce158cb
Add SQLAlchemy Dialect (#57)
Feb 17, 2023
0ed7e53
Bump to version 2.4.0(#89)
Feb 21, 2023
9a06d6c
Fix syntax in examples in root readme. (#92)
shea-parkes Feb 27, 2023
20e789f
Less strict numpy and pyarrow dependencies (#90)
Mar 7, 2023
3a60599
Update example in docstring so query output is valid Spark SQL (#95)
Mar 21, 2023
e627649
Bump version to 2.4.1 (#96)
Mar 21, 2023
c43eaf8
Update CODEOWNERS (#97)
moderakh Mar 24, 2023
b0b6abd
Add Andre to CODEOWNERS (#98)
yunbodeng-db Mar 29, 2023
f440791
Add external auth provider + example (#101)
andrefurlan-db Apr 12, 2023
5f247e5
Retry on connection timeout (#103)
andrefurlan-db Apr 13, 2023
c1d9510
[PECO-244] Make http proxies work (#81)
Apr 14, 2023
c5731d8
Bump to version 2.5.0 (#104)
Apr 15, 2023
7087236
Fix changelog release date for version 2.5.0
Apr 15, 2023
61b6911
Relax sqlalchemy requirement (#113)
Apr 28, 2023
b5ab608
Update to version 2.5.1 (#114)
Apr 28, 2023
ad6fbd9
Fix SQLAlchemy timestamp converter + docs (#117)
May 9, 2023
73108e2
Relax pandas and alembic requirements (#119)
May 9, 2023
7d85814
Bump to version 2.5.2 (#118)
May 9, 2023
4077c7f
Use urllib3 for thrift transport + reuse http connections (#131)
Jun 7, 2023
cdf1857
Default socket timeout to 15 min (#137)
mattdeekay Jun 7, 2023
5539b26
Bump version to 2.6.0 (#139)
Jun 7, 2023
728e2b1
Fix: some thrift RPCs failed with BadStatusLine (#141)
Jun 8, 2023
eada549
Bump version to 2.6.1 (#142)
Jun 8, 2023
cdc50d2
[ES-706907] Retry GetOperationStatus for http errors (#145)
Jun 14, 2023
2904788
Bump version to 2.6.2 (#147)
Jun 14, 2023
782ebb6
[PECO-626] Support OAuth flow for Databricks Azure (#86)
jackyhu-db Jun 20, 2023
b7ada62
Use a separate logger for unsafe thrift responses (#153)
Jun 23, 2023
c6cf88f
Improve e2e test development ergonomics (#155)
Jun 23, 2023
95cf95b
Don't raise exception when closing a stale Thrift session (#159)
Jun 26, 2023
3680a0f
Bump to version 2.7.0 (#161)
Jun 26, 2023
ba2cd84
Cloud Fetch download handler (#127)
mattdeekay Jun 27, 2023
061c763
Cloud Fetch download manager (#146)
mattdeekay Jul 3, 2023
e8fc63b
Cloud fetch queue and integration (#151)
mattdeekay Jul 5, 2023
813c73c
Cloud Fetch e2e tests (#154)
mattdeekay Jul 7, 2023
d3f0513
Update changelog for cloudfetch (#172)
mattdeekay Jul 10, 2023
6786933
Improve sqlalchemy backward compatibility with 1.3.24 (#173)
Jul 11, 2023
203735f
OAuth: don't override auth headers with contents of .netrc file (#122)
Jul 12, 2023
bd08f58
Fix proxy connection pool creation (#158)
sebbegg Jul 12, 2023
9508c4f
Relax pandas dependency constraint to allow ^2.0.0 (#164)
itsdani Jul 12, 2023
8140be9
Use hex string version of operation ID instead of bytes (#170)
Jul 12, 2023
850235c
SQLAlchemy: fix has_table so it honours schema= argument (#174)
Jul 12, 2023
4c766ef
Fix socket timeout test (#144)
mattdeekay Jul 12, 2023
7fe5ddf
Disable non_native_boolean_check_constraint (#120)
bkyryliuk Jul 12, 2023
50dfd93
Remove unused import for SQLAlchemy 2 compatibility (#128)
WilliamGentry Jul 12, 2023
4b0b8bd
Bump version to 2.8.0 (#178)
Jul 21, 2023
f07df30
Fix typo in python README quick start example (#186)
dbarrundia-tiger Aug 9, 2023
683e03c
Configure autospec for mocked Client objects (#188)
Aug 9, 2023
d168598
Use urllib3 for retries (#182)
Aug 9, 2023
fcfe8f4
Bump version to 2.9.0 (#189)
Aug 10, 2023
972f7cc
Explicitly add urllib3 dependency (#191)
jacobus-herman Aug 10, 2023
1c3ce1e
Bump to 2.9.1 (#195)
Aug 11, 2023
667f719
Make backwards compatible with urllib3~=1.0 (#197)
Aug 16, 2023
ddf8a5f
Convenience improvements to v3 retry logic (#199)
Aug 17, 2023
56c7d41
Bump version to 2.9.2 (#201)
Aug 18, 2023
312c7b9
Github Actions Fix: poetry install fails for python 3.7 tests (#208)
Aug 24, 2023
9bc0d3e
Make backwards compatible with urllib3~=1.0 [Follow up #197] (#206)
Aug 24, 2023
33390db
Bump version to 2.9.3 (#209)
Aug 24, 2023
e176f65
Add note to sqlalchemy example: IDENTITY isn't supported yet (#212)
Aug 31, 2023
854c56f
[PECO-1029] Updated thrift compiler version (#216)
nithinkdb Sep 9, 2023
0d1d7d8
[PECO-1055] Updated thrift defs to allow Tsparkparameters (#220)
nithinkdb Sep 11, 2023
c32b71a
Update changelog to indicate that 2.9.1 and 2.9.2 have been yanked. (…
Sep 13, 2023
4588ff3
Fix changelog typo: _enable_v3_retries (#225)
Sep 18, 2023
b9bd2a1
Introduce SQLAlchemy reusable dialog tests (#125)
unj1m Sep 20, 2023
329b7ee
[PECO-1026] Add Parameterized Query support to Python (#217)
nithinkdb Sep 22, 2023
9489087
Parameterized queries: Add e2e tests for inference (#227)
Sep 25, 2023
b94f59e
[PECO-1109] Parameterized Query: add suport for inferring decimal typ…
Sep 26, 2023
9592098
SQLAlchemy 2: reorganise dialect files into a single directory (#231)
Sep 26, 2023
84a6cbc
[PECO-1083] Updated thrift files and added check for protocol version…
nithinkdb Sep 29, 2023
9d93e1b
[PECO-840] Port staging ingestion behaviour to new UC Volumes (#235)
Sep 30, 2023
ef5fbda
Query parameters: implement support for binding NoneType parameters (…
Sep 30, 2023
f138703
SQLAlchemy 2: Bump dependency version and update e2e tests for existi…
Oct 2, 2023
04c99e4
Revert "[PECO-1083] Updated thrift files and added check for protocol…
Oct 2, 2023
cbe21e5
SQLAlchemy 2: add type compilation for all CamelCase types (#238)
Oct 2, 2023
77a8886
SQLAlchemy 2: add type compilation for uppercase types (#240)
Oct 2, 2023
4a70379
SQLAlchemy 2: Stop skipping all type tests (#242)
Oct 10, 2023
0e791ba
[PECO-1134] v3 Retries: allow users to bound the number of redirects …
Oct 10, 2023
f198a25
Parameters: Add type inference for BIGINT and TINYINT types (#246)
Oct 11, 2023
d975611
SQLAlchemy 2: Stop skipping some non-type tests (#247)
Oct 13, 2023
a596776
SQLAlchemy 2: implement and refactor schema reflection methods (#249)
Oct 13, 2023
16a5106
Add GovCloud domain into AWS domains (#252)
jackyhu-db Oct 17, 2023
ca84f1a
SQLAlchemy 2: Refactor __init__.py into base.py (#250)
Oct 18, 2023
45c6073
SQLAlchemy 2: Finish implementing all of ComponentReflectionTest (#251)
Oct 18, 2023
3a8b4ea
SQLAlchemy 2: Finish marking all tests in the suite (#253)
Oct 18, 2023
8a0ec56
SQLAlchemy 2: Finish organising compliance test suite (#256)
Oct 23, 2023
4905952
SQLAlchemy 2: Fix failing mypy checks from development (#257)
Oct 23, 2023
7444425
Enable cloud fetch by default (#258)
Oct 25, 2023
9a8ac88
[PECO-1137] Reintroduce protocol checking to Python test fw (#248)
nithinkdb Oct 25, 2023
6bc7413
sqla2 clean-up: make sqlalchemy optional and don't mangle the user-ag…
Oct 28, 2023
95e5595
SQLAlchemy 2: Add support for TINYINT (#265)
Oct 31, 2023
c69d886
Add OAuth M2M example (#266)
jackyhu-db Oct 31, 2023
012f6ed
Native Parameters: reintroduce INLINE approach with tests (#267)
Nov 1, 2023
b09ff05
Document behaviour of executemany (#213)
martinitus Nov 1, 2023
fd4336e
SQLAlchemy 2: Expose TIMESTAMP and TIMESTAMP_NTZ types to users (#268)
Nov 1, 2023
f3081a5
Drop Python 3.7 as a supported version (#270)
Nov 1, 2023
ff51bfb
GH Workflows: remove Python 3.7 from the matrix for _all_ workflows (…
Nov 9, 2023
ca000db
Add README and updated example for SQLAlchemy usage (#273)
Nov 16, 2023
6aa7890
Rewrite native parameter implementation with docs and tests (#281)
Nov 16, 2023
bf084fe
Enable v3 retries by default (#282)
Nov 17, 2023
23b51c9
security: bump pyarrow dependency to 14.0.1 (#284)
Nov 17, 2023
5a1acdc
Bump package version to 3.0.0 (#285)
Nov 17, 2023
e768d48
Fix docstring about default parameter approach (#287)
Falydoor Nov 21, 2023
505a522
[PECO-1286] Add tests for complex types in query results (#293)
Nov 29, 2023
5c01874
sqlalchemy: fix deprecation warning for dbapi classmethod (#294)
Nov 29, 2023
2027145
[PECO-1297] sqlalchemy: fix: can't read columns for tables containing…
Nov 30, 2023
9e963a0
Prepared 3.0.1 release (#297)
Dec 1, 2023
f703d81
Make contents of `__init__.py` equal across projects (#304)
pietern Dec 26, 2023
bdd2cb6
Fix URI construction in ThriftBackend (#303)
NodeJSmith Jan 23, 2024
00b8d3e
[sqlalchemy] Add table and column comment support (#329)
Jan 25, 2024
a6e81ed
Pin pandas and urllib3 versions to fix runtime issues in dbt-databric…
benc-db Jan 25, 2024
c89da23
SQLAlchemy: TINYINT types didn't reflect properly (#315)
TimTheinAtTabs Jan 25, 2024
6482c76
[PECO-1435] Restore `tests.py` to the test suite (#331)
Jan 26, 2024
d20d931
Bump to version 3.0.2 (#335)
Jan 26, 2024
e3e0f49
Update some outdated OAuth comments (#339)
jackyhu-db Jan 30, 2024
456fec5
Redact the URL query parameters from the urllib3.connectionpool logs …
mkazia-db Feb 2, 2024
01cfc66
Bump to version 3.0.3 (#344)
jackyhu-db Feb 2, 2024
9ff99b8
[PECO-1411] Support Databricks OAuth on GCP (#338)
jackyhu-db Feb 5, 2024
072ef2c
[PECO-1414] Support Databricks native OAuth in Azure (#351)
jackyhu-db Feb 13, 2024
f52c658
Prep for Test Automation (#352)
benc-db Feb 14, 2024
b1bd792
Update code owners (#345)
yunbodeng-db Feb 14, 2024
70f3738
Reverting retry behavior on 429s/503s to how it worked in 2.9.3 (#349)
benc-db Feb 15, 2024
912127c
Bump to version 3.1.0 (#358)
jackyhu-db Feb 16, 2024
1ed5c9d
[PECO-1440] Expose current query id on cursor object (#364)
kravets-levko Mar 4, 2024
1577506
Add a default for retry after (#371)
benc-db Mar 14, 2024
e01ef74
Fix boolean literals (#357)
aholyoke Mar 14, 2024
7cfd6f6
Don't retry network requests that fail with code 403 (#373)
Mar 15, 2024
6cf12fb
Bump to 3.1.1 (#374)
benc-db Mar 19, 2024
02d08d6
Fix cookie setting (#379)
benc-db Mar 27, 2024
4122597
Fixing a couple type problems: how I would address most of #381 (#382)
wyattscarpenter Apr 2, 2024
3631e55
fix the return types of the classes' __enter__ functions (#384)
wyattscarpenter Apr 2, 2024
4b1b7ad
Add Kravets Levko to codeowners (#386)
kravets-levko Apr 15, 2024
f2d927b
Prepare for 3.1.2 (#387)
benc-db Apr 18, 2024
2d2f8f7
Update the proxy authentication (#354)
amir-haroun May 23, 2024
d9802a8
Fix failing tests (#392)
kravets-levko May 28, 2024
9c158d9
Relax `pyarrow` pin (#389)
dhirschfeld May 29, 2024
0400bdb
Fix log error in oauth.py (#269)
susodapop May 29, 2024
683a033
Enable `delta.feature.allowColumnDefaults` for all tables (#343)
dhirschfeld May 30, 2024
3a68fa8
Fix SQLAlchemy tests (#393)
kravets-levko May 30, 2024
94a2597
Add more debug logging for CloudFetch (#395)
kravets-levko Jun 6, 2024
3a50d70
Update Thrift package (#397)
m1n0 Jun 12, 2024
37d8a7b
Prepare release 3.2.0 (#396)
kravets-levko Jun 13, 2024
0017b0c
move py.typed to correct places (#403)
wyattscarpenter Jul 2, 2024
2a1875a
Upgrade mypy (#406)
wyattscarpenter Jul 3, 2024
9fd4a25
Do not retry failing requests with status code 401 (#408)
Hodnebo Jul 3, 2024
74bcc86
[PECO-1715] Remove username/password (BasicAuth) auth option (#409)
jackyhu-db Jul 4, 2024
e7c0c06
[PECO-1751] Refactor CloudFetch downloader: handle files sequentially…
kravets-levko Jul 11, 2024
677483d
Fix CloudFetch retry policy to be compatible with all `urllib3` versi…
kravets-levko Jul 11, 2024
512efca
Disable SSL verification for CloudFetch links (#414)
kravets-levko Jul 16, 2024
1a1497b
Prepare relese 3.3.0 (#415)
kravets-levko Jul 17, 2024
b751088
Fix pandas 2.2.2 support (#416)
kfollesdal Jul 26, 2024
4959197
[PECO-1801] Make OAuth as the default authenticator if no authenticat…
jackyhu-db Aug 1, 2024
7467860
[PECO-1857] Use SSL options with HTTPS connection pool (#425)
kravets-levko Aug 22, 2024
2de70ec
Prepare release v3.4.0 (#430)
kravets-levko Aug 27, 2024
2675099
[PECO-1926] Create a non pyarrow flow to handle small results for the…
jprakash-db Oct 3, 2024
c755ecc
[PECO-1961] On non-retryable error, ensure PySQL includes useful info…
shivam2680 Oct 3, 2024
1a44d91
Reformatted all the files using black (#448)
jprakash-db Oct 3, 2024
92dff6c
Prepare release v3.5.0 (#457)
jackyhu-db Oct 18, 2024
b4bcf8a
[PECO-2051] Add custom auth headers into cloud fetch request (#460)
jackyhu-db Oct 25, 2024
28a0fe6
Prepare release 3.6.0 (#461)
jackyhu-db Oct 25, 2024
82efe73
[ PECO - 1768 ] PySQL: adjust HTTP retry logic to align with Go and N…
jprakash-db Nov 20, 2024
5e11582
[ PECO-2065 ] Create the async execution flow for the PySQL Connector…
jprakash-db Nov 26, 2024
a9ae775
Fix for check_types github action failing (#472)
jprakash-db Nov 26, 2024
c251d91
Remove upper caps on dependencies (#452)
arredond Dec 5, 2024
8a63786
Updated the doc to specify native parameters in PUT operation is not …
jprakash-db Dec 6, 2024
9d6813b
Incorrect rows in inline fetch result (#479)
jprakash-db Dec 22, 2024
8468a2b
Bumped up to version 3.7.0 (#482)
jprakash-db Dec 23, 2024
aa673ac
PySQL Connector split into connector and sqlalchemy (#444)
jprakash-db Dec 27, 2024
fd7f85c
Removed CI CD for python3.8 (#490)
jprakash-db Jan 17, 2025
b20c55b
Added CI CD upto python 3.12 (#491)
jprakash-db Jan 18, 2025
d61a964
Merging changes from v3.7.1 release (#488)
jprakash-db Jan 18, 2025
efd82fb
Bumped up to version 4.0.0 (#493)
jprakash-db Jan 22, 2025
ed19388
Updated action's version (#455)
newwingbird Feb 27, 2025
f8f9f4e
Support Python 3.13 and update deps (#510)
dhirschfeld Feb 27, 2025
0e51281
Improve debugging + fix PR review template (#514)
samikshya-db Mar 2, 2025
9665a74
Forward porting all changes into 4.x.x. uptil v3.7.3 (#529)
jprakash-db Mar 7, 2025
b24ddd7
Updated the CODEOWNERS (#531)
jprakash-db Mar 7, 2025
0013ba4
Add version check for urllib3 in backoff calculation (#526)
shivam2680 Mar 11, 2025
851d23b
[ES-1372353] make user_agent_header part of public API (#530)
shivam2680 Mar 12, 2025
f321b49
Updates runner used to run DCO check to use databricks-protected-runn…
madhav-db Mar 12, 2025
b000892
Support multiple timestamp formats in non arrow flow (#533)
jprakash-db Mar 18, 2025
2553bcf
prepare release for v4.0.1 (#534)
shivam2680 Mar 19, 2025
078f41b
Relaxed bound for python-dateutil (#538)
jprakash-db Apr 1, 2025
adc2c86
Bumped up the version for 4.0.2 (#539)
jprakash-db Apr 1, 2025
f9fe172
Added example for async execute query (#537)
jprakash-db Apr 1, 2025
6f99449
Added urllib3 version check (#547)
jprakash-db Apr 21, 2025
6790dca
Bump version to 4.0.3 (#549)
jprakash-db Apr 22, 2025
3a4d6d3
Cleanup fields as they might be deprecated/removed/change in the futu…
vikrantpuppala May 9, 2025
557bb68
Refactor decimal conversion in PyArrow tables to use direct casting (…
jayantsing-db May 12, 2025
9a3f946
[PECOBLR-361] convert column table to arrow if arrow present (#551)
shivam2680 May 16, 2025
7233e4e
Update CODEOWNERS (#562)
jprakash-db May 21, 2025
b88eba0
Enhance Cursor close handling and context manager exception managemen…
madhav-db May 21, 2025
14c8a7e
PECOBLR-86 improve logging on python driver (#556)
saishreeeee May 22, 2025
8013a0d
Update github actions run conditions (#569)
jprakash-db May 26, 2025
fdd385f
Added classes required for telemetry (#572)
saishreeeee May 30, 2025
9dc7d52
E2E POC for python telemetry for connect logs (#581)
saishreeeee Jun 10, 2025
ce2cc1a
Merge branch 'main' into HEAD
saishreeeee Jun 17, 2025
99ec875
Merge branch 'main' into telemetry
saishreeeee Jun 17, 2025
cf89ce3
Added functionality for export of failure logs (#591)
saishreeeee Jun 19, 2025
380b0b9
bugfix: stalling test issue (close in TelemetryClientFactory) (#609)
saishreeeee Jun 23, 2025
23d8881
Updated tests (#614)
jprakash-db Jun 24, 2025
350e745
Add test to check thrift field IDs (#602)
vikrantpuppala Jun 24, 2025
4a2356d
Revert "Enhance Cursor close handling and context manager exception m…
madhav-db Jun 24, 2025
97df72e
Bump version to 4.0.5 (#615)
madhav-db Jun 24, 2025
6748c2c
Merge branch 'main' into telemetry
saishreeeee Jun 25, 2025
0dfe0f4
Add functionality for export of latency logs via telemetry (#608)
saishreeeee Jul 3, 2025
10375a8
Merge branch 'main' into telemetry
saishreeeee Jul 7, 2025
8c0f474
Revert "Merge branch 'main' into telemetry"
saishreeeee Jul 7, 2025
13ebfb4
Revert "Revert "Merge branch 'main' into telemetry""
saishreeeee Jul 7, 2025
79db09f
workflows
saishreeeee Jul 7, 2025
5005561
-
saishreeeee Jul 7, 2025
154 changes: 127 additions & 27 deletions src/databricks/sql/client.py

Large diffs are not rendered by default.

15 changes: 13 additions & 2 deletions src/databricks/sql/exc.py
@@ -1,21 +1,32 @@
 import json
 import logging

-logger = logging.getLogger(__name__)
+from databricks.sql.telemetry.telemetry_client import TelemetryClientFactory
+
+logger = logging.getLogger(__name__)
Contributor

don't think we're using this?


 ### PEP-249 Mandated ###
 # https://peps.python.org/pep-0249/#exceptions
 class Error(Exception):
     """Base class for DB-API2.0 exceptions.
     `message`: An optional user-friendly error message. It should be short, actionable and stable
     `context`: Optional extra context about the error. MUST be JSON serializable
     """

-    def __init__(self, message=None, context=None, *args, **kwargs):
+    def __init__(
+        self, message=None, context=None, session_id_hex=None, *args, **kwargs
+    ):
         super().__init__(message, *args, **kwargs)
         self.message = message
         self.context = context or {}
+
+        error_name = self.__class__.__name__
+        if session_id_hex:
+            telemetry_client = TelemetryClientFactory.get_telemetry_client(
+                session_id_hex
+            )
+            telemetry_client.export_failure_log(error_name, self.message)

     def __str__(self):
         return self.message
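
For readers following the change to `Error.__init__`, here is a minimal, self-contained sketch of the behaviour the diff appears to introduce: constructing an exception with a `session_id_hex` reports the failure through a telemetry client, while omitting it leaves telemetry untouched. `StubTelemetryClient` and `StubFactory` are hypothetical stand-ins for the PR's `TelemetryClientFactory`, whose real implementation is not shown on this page.

```python
class StubTelemetryClient:
    """Hypothetical stand-in for the PR's telemetry client; it only records calls."""

    def __init__(self):
        self.failures = []

    def export_failure_log(self, error_name, error_message):
        self.failures.append((error_name, error_message))


class StubFactory:
    """Hypothetical stand-in for TelemetryClientFactory (one client per session)."""

    clients = {}

    @classmethod
    def get_telemetry_client(cls, session_id_hex):
        return cls.clients.setdefault(session_id_hex, StubTelemetryClient())


class Error(Exception):
    """Mirrors the constructor from the diff, wired to the stub factory."""

    def __init__(
        self, message=None, context=None, session_id_hex=None, *args, **kwargs
    ):
        super().__init__(message, *args, **kwargs)
        self.message = message
        self.context = context or {}

        error_name = self.__class__.__name__
        if session_id_hex:
            client = StubFactory.get_telemetry_client(session_id_hex)
            client.export_failure_log(error_name, self.message)


class OperationalError(Error):
    pass


# With a session id, the failure is exported under the subclass name.
OperationalError("boom", session_id_hex="abc123")

# Without a session id, no telemetry client is looked up at all.
Error("quiet")
```

Because the export runs inside the constructor in this sketch, an exception raised by the telemetry client would surface while the error object is being built, which may be worth guarding against.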

231 changes: 231 additions & 0 deletions src/databricks/sql/telemetry/latency_logger.py
@@ -0,0 +1,231 @@
+import time
+import functools
+from typing import Optional
+from databricks.sql.telemetry.telemetry_client import TelemetryClientFactory
+from databricks.sql.telemetry.models.event import (
+    SqlExecutionEvent,
+)
+from databricks.sql.telemetry.models.enums import ExecutionResultFormat, StatementType
+from databricks.sql.utils import ColumnQueue, CloudFetchQueue, ArrowQueue
+from uuid import UUID
+
+
+class TelemetryExtractor:
+    """
+    Base class for extracting telemetry information from various object types.
+    This class serves as a proxy that delegates attribute access to the wrapped object
+    while providing a common interface for extracting telemetry-related data.
+    """
+
+    def __init__(self, obj):
+        """
+        Initialize the extractor with an object to wrap.
+        Args:
+            obj: The object to extract telemetry information from.
+        """
+        self._obj = obj
+
+    def __getattr__(self, name):
+        """
+        Delegate attribute access to the wrapped object.
+        Args:
+            name (str): The name of the attribute to access.
+        Returns:
+            The attribute value from the wrapped object.
+        """
+        return getattr(self._obj, name)
+
+    def get_session_id_hex(self):
+        pass
+
+    def get_statement_id(self):
+        pass
+
+    def get_is_compressed(self):
+        pass
+
+    def get_execution_result(self):
+        pass
+
+    def get_retry_count(self):
+        pass
+
+
+class CursorExtractor(TelemetryExtractor):
+    """
+    Telemetry extractor specialized for Cursor objects.
+    Extracts telemetry information from database cursor objects, including
+    statement IDs, session information, compression settings, and result formats.
+    """
+
+    def get_statement_id(self) -> Optional[str]:
+        return self.query_id
+
+    def get_session_id_hex(self) -> Optional[str]:
+        return self.connection.get_session_id_hex()
+
+    def get_is_compressed(self) -> bool:
+        return self.connection.lz4_compression
+
+    def get_execution_result(self) -> ExecutionResultFormat:
+        if self.active_result_set is None:
+            return ExecutionResultFormat.FORMAT_UNSPECIFIED
+
+        if isinstance(self.active_result_set.results, ColumnQueue):
+            return ExecutionResultFormat.COLUMNAR_INLINE
+        elif isinstance(self.active_result_set.results, CloudFetchQueue):
+            return ExecutionResultFormat.EXTERNAL_LINKS
+        elif isinstance(self.active_result_set.results, ArrowQueue):
+            return ExecutionResultFormat.INLINE_ARROW
+        return ExecutionResultFormat.FORMAT_UNSPECIFIED
+
+    def get_retry_count(self) -> int:
+        if (
+            hasattr(self.thrift_backend, "retry_policy")
+            and self.thrift_backend.retry_policy
+        ):
+            return len(self.thrift_backend.retry_policy.history)
+        return 0
+
+
+class ResultSetExtractor(TelemetryExtractor):
+    """
+    Telemetry extractor specialized for ResultSet objects.
+    Extracts telemetry information from database result set objects, including
+    operation IDs, session information, compression settings, and result formats.
+    """
+
+    def get_statement_id(self) -> Optional[str]:
+        if self.command_id:
+            return str(UUID(bytes=self.command_id.operationId.guid))
+        return None
+
+    def get_session_id_hex(self) -> Optional[str]:
+        return self.connection.get_session_id_hex()
+
+    def get_is_compressed(self) -> bool:
+        return self.lz4_compressed
+
+    def get_execution_result(self) -> ExecutionResultFormat:
+        if isinstance(self.results, ColumnQueue):
+            return ExecutionResultFormat.COLUMNAR_INLINE
+        elif isinstance(self.results, CloudFetchQueue):
+            return ExecutionResultFormat.EXTERNAL_LINKS
+        elif isinstance(self.results, ArrowQueue):
+            return ExecutionResultFormat.INLINE_ARROW
+        return ExecutionResultFormat.FORMAT_UNSPECIFIED
+
+    def get_retry_count(self) -> int:
+        if (
+            hasattr(self.thrift_backend, "retry_policy")
+            and self.thrift_backend.retry_policy
+        ):
+            return len(self.thrift_backend.retry_policy.history)
+        return 0
+
+
+def get_extractor(obj):
+    """
+    Factory function to create the appropriate telemetry extractor for an object.
+    Determines the object type and returns the corresponding specialized extractor
+    that can extract telemetry information from that object type.
+    Args:
+        obj: The object to create an extractor for. Can be a Cursor, ResultSet,
+            or any other object.
+    Returns:
+        TelemetryExtractor: A specialized extractor instance:
+            - CursorExtractor for Cursor objects
+            - ResultSetExtractor for ResultSet objects
+            - Throws an NotImplementedError for all other objects
+    """
+    if obj.__class__.__name__ == "Cursor":
+        return CursorExtractor(obj)
+    elif obj.__class__.__name__ == "ResultSet":
+        return ResultSetExtractor(obj)
+    else:
+        raise NotImplementedError(f"No extractor found for {obj.__class__.__name__}")
Contributor

will this error break anything or just be printed in debug mode?
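
On the question above: in the `log_latency` decorator further down in this file, `get_extractor(self)` is called inside the `finally` block without the `_safe_call` guard. Going only by the diff as shown, the `NotImplementedError` would therefore reach the caller rather than just being logged. The sketch below illustrates the general Python rule with simplified stand-ins (`run_query` is hypothetical): an exception raised inside `finally` discards a pending return value.

```python
def get_extractor(obj):
    # Same shape as the PR's factory, reduced to the failing branch.
    raise NotImplementedError(f"No extractor found for {obj.__class__.__name__}")


def run_query():
    # Hypothetical stand-in for a method wrapped by log_latency.
    try:
        return "rows"
    finally:
        # Unguarded call, as in the wrapper's finally block.
        get_extractor(object())


try:
    outcome = run_query()
except NotImplementedError as exc:
    # The pending return value "rows" is lost; the caller sees the exception.
    outcome = f"raised: {exc}"

print(outcome)
```

In practice the decorator is only applied to `Cursor` and `ResultSet` methods, so this branch may never be hit, but a subclass with a different class name would take it.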



+def log_latency(statement_type: StatementType = StatementType.NONE):
+    """
+    Decorator for logging execution latency and telemetry information.
+    This decorator measures the execution time of a method and sends telemetry
+    data about the operation, including latency, statement information, and
+    execution context.
+    The decorator automatically:
+    - Measures execution time using high-precision performance counters
+    - Extracts telemetry information from the method's object (self)
+    - Creates a SqlExecutionEvent with execution details
+    - Sends the telemetry data asynchronously via TelemetryClient
+    Args:
+        statement_type (StatementType): The type of SQL statement being executed.
+    Usage:
+        @log_latency(StatementType.SQL)
+        def execute(self, query):
+            # Method implementation
+            pass
+    Returns:
+        function: A decorator that wraps methods to add latency logging.
+    Note:
+        The wrapped method's object (self) must be compatible with the
+        telemetry extractor system (e.g., Cursor or ResultSet objects).
+    """
+
+    def decorator(func):
+        @functools.wraps(func)
+        def wrapper(self, *args, **kwargs):
+            start_time = time.perf_counter()
+            result = None
+            try:
+                result = func(self, *args, **kwargs)
+                return result
+            finally:
Contributor

what if anything in this finally block fails?


+                def _safe_call(func_to_call):
+                    """Calls a function and returns a default value on any exception."""
+                    try:
+                        return func_to_call()
+                    except Exception:
+                        return None
+
+                end_time = time.perf_counter()
+                duration_ms = int((end_time - start_time) * 1000)
+
+                extractor = get_extractor(self)
+                session_id_hex = _safe_call(extractor.get_session_id_hex)
+                statement_id = _safe_call(extractor.get_statement_id)
+
+                sql_exec_event = SqlExecutionEvent(
+                    statement_type=statement_type,
+                    is_compressed=_safe_call(extractor.get_is_compressed),
+                    execution_result=_safe_call(extractor.get_execution_result),
+                    retry_count=_safe_call(extractor.get_retry_count),
+                )
+
+                telemetry_client = TelemetryClientFactory.get_telemetry_client(
+                    session_id_hex
+                )
+                telemetry_client.export_latency_log(
+                    latency_ms=duration_ms,
+                    sql_execution_event=sql_exec_event,
+                    sql_statement_id=statement_id,
+                )
+
+        return wrapper
+
+    return decorator
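
One possible answer to the review question about failures inside the `finally` block is to wrap the whole telemetry step in its own `try`/`except`, so that neither the wrapped method's return value nor its original exception can be replaced by a telemetry error. The following is only a sketch of that idea, not the PR's implementation: `log_latency_guarded`, `emit`, `failing_emit`, and `FakeCursor` are hypothetical names introduced for illustration.

```python
import functools
import logging
import time

logger = logging.getLogger(__name__)


def log_latency_guarded(emit):
    """Time a method and call emit(self, duration_ms); never let emit fail the call.

    `emit` is a hypothetical callable standing in for the telemetry export step.
    """

    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            start_time = time.perf_counter()
            try:
                return func(self, *args, **kwargs)
            finally:
                try:
                    duration_ms = int((time.perf_counter() - start_time) * 1000)
                    emit(self, duration_ms)
                except Exception:
                    # Telemetry problems are logged at debug level and swallowed.
                    logger.debug("Telemetry export failed", exc_info=True)

        return wrapper

    return decorator


def failing_emit(obj, duration_ms):
    # Simulates a telemetry step that always fails.
    raise RuntimeError("telemetry backend unavailable")


class FakeCursor:
    @log_latency_guarded(failing_emit)
    def execute(self, query):
        return f"ran {query}"

    @log_latency_guarded(failing_emit)
    def fail(self):
        raise ValueError("bad query")


# The telemetry failure does not affect the result of the wrapped method.
result = FakeCursor().execute("SELECT 1")
```

With this shape, a call to `FakeCursor().fail()` still raises the original `ValueError`, because the inner `except` absorbs the telemetry error before the `finally` block ends.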
43 changes: 43 additions & 0 deletions src/databricks/sql/telemetry/models/endpoint_models.py
@@ -0,0 +1,43 @@
+import json
+from dataclasses import dataclass, asdict
+from typing import List, Optional
+
+
+@dataclass
+class TelemetryRequest:
+    """
+    Represents a request to send telemetry data to the server side.
+    Contains the telemetry items to be uploaded and optional protocol buffer logs.
+    Attributes:
+        uploadTime (int): Unix timestamp in milliseconds when the request is made
+        items (List[str]): List of telemetry event items to be uploaded
+        protoLogs (Optional[List[str]]): Optional list of protocol buffer formatted logs
+    """
+
+    uploadTime: int
+    items: List[str]
+    protoLogs: Optional[List[str]]
+
+    def to_json(self):
+        return json.dumps(asdict(self))
+
+
+@dataclass
+class TelemetryResponse:
+    """
+    Represents the response from the telemetry backend after processing a request.
+    Contains information about the success or failure of the telemetry upload.
+    Attributes:
+        errors (List[str]): List of error messages if any occurred during processing
+        numSuccess (int): Number of successfully processed telemetry items
+        numProtoSuccess (int): Number of successfully processed protocol buffer logs
+    """
+
+    errors: List[str]
+    numSuccess: int
+    numProtoSuccess: int
+
+    def to_json(self):
+        return json.dumps(asdict(self))
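
To make the wire format of these two models concrete, here is a small sketch that copies the dataclasses from the diff (docstrings omitted) and round-trips them through JSON. The timestamp, the log payload, and the server reply are made-up example values, and building a `TelemetryResponse` from the parsed reply is an assumption about how a caller might use it, not something shown in the PR.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class TelemetryRequest:
    uploadTime: int
    items: List[str]
    protoLogs: Optional[List[str]]

    def to_json(self):
        return json.dumps(asdict(self))


@dataclass
class TelemetryResponse:
    errors: List[str]
    numSuccess: int
    numProtoSuccess: int

    def to_json(self):
        return json.dumps(asdict(self))


# Each entry in protoLogs is itself a string (here, a JSON-encoded example event).
request = TelemetryRequest(
    uploadTime=1700000000000,
    items=[],
    protoLogs=[json.dumps({"event": "example"})],
)
body = request.to_json()

# Hypothetical reply body; the field names match the dataclass attributes.
reply = '{"errors": [], "numSuccess": 0, "numProtoSuccess": 1}'
response = TelemetryResponse(**json.loads(reply))
```

Since the dataclass field names are used verbatim as JSON keys, the camelCase attribute names presumably mirror the backend's schema.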
42 changes: 42 additions & 0 deletions src/databricks/sql/telemetry/models/enums.py
@@ -0,0 +1,42 @@
+from enum import Enum
+
+
+class AuthFlow(Enum):
+    TOKEN_PASSTHROUGH = "token_passthrough"
+    BROWSER_BASED_AUTHENTICATION = "browser_based_authentication"
+
+
+class AuthMech(Enum):
+    CLIENT_CERT = "CLIENT_CERT"  # ssl certificate authentication
+    PAT = "PAT"  # Personal Access Token authentication
+    DATABRICKS_OAUTH = "DATABRICKS_OAUTH"  # Databricks-managed OAuth flow
+    EXTERNAL_AUTH = "EXTERNAL_AUTH"  # External identity provider (AWS, Azure, etc.)
Contributor

I think @jprakash-db is adding some enums in his latest PR #621, is there scope to re-use?



+class DatabricksClientType(Enum):
+    SEA = "SEA"
+    THRIFT = "THRIFT"
+
+
+class DriverVolumeOperationType(Enum):
+    TYPE_UNSPECIFIED = "type_unspecified"
+    PUT = "put"
+    GET = "get"
+    DELETE = "delete"
+    LIST = "list"
+    QUERY = "query"
+
+
+class ExecutionResultFormat(Enum):
+    FORMAT_UNSPECIFIED = "format_unspecified"
+    INLINE_ARROW = "inline_arrow"
+    EXTERNAL_LINKS = "external_links"
+    COLUMNAR_INLINE = "columnar_inline"
+
+
+class StatementType(Enum):
+    NONE = "none"
+    QUERY = "query"
+    SQL = "sql"
Contributor

query type is used when sql command is expected to return results, in jdbc it makes sense due to the nature of interface, but do we need both query and sql in python connector?

+    UPDATE = "update"
+    METADATA = "metadata"
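
A practical note on these enums: plain `Enum` members are not JSON serializable by default, so any event that carries them needs a conversion step before `json.dumps`. The event models (`models/event.py`) are not part of the diff visible here, so the sketch below is an assumption about one way to do this. `ExampleEvent` and `enum_default` are hypothetical names, and only a subset of `StatementType` is copied.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class StatementType(Enum):
    NONE = "none"
    QUERY = "query"
    UPDATE = "update"


@dataclass
class ExampleEvent:
    # Hypothetical event shape, for illustration only.
    statement_type: StatementType
    retry_count: int


def enum_default(obj):
    """Fallback for json.dumps: emit an Enum member as its value."""
    if isinstance(obj, Enum):
        return obj.value
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


event = ExampleEvent(statement_type=StatementType.UPDATE, retry_count=0)

# Without a fallback, json.dumps rejects the Enum member.
try:
    json.dumps(asdict(event))
    plain_ok = True
except TypeError:
    plain_ok = False

# With the fallback, the member is written as its string value.
payload = json.dumps(asdict(event), default=enum_default)
```

Declaring the enums as `class StatementType(str, Enum)` would be an alternative that serializes without a fallback, at the cost of members comparing equal to plain strings.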