Add debug latency logs for all-purpose compute #862


Open: wants to merge 2 commits into main

Conversation

jayantsing-db (Collaborator) commented Jun 24, 2025

NO_CHANGELOG=true

Description

Until telemetry is rolled out, add debug latency logs to analyze the Thrift server HTTP calls.

Lower-level HTTP transport logs

2025-06-24 13:54:09 FINE com.databricks.jdbc.dbclient.impl.thrift.DatabricksHttpTTransport#flush - Connection [26021ed8-74eb-47fe-9c9f-89c4310daa9b] Header refresh latency: 1537ms

2025-06-24 13:54:10 FINE com.databricks.jdbc.dbclient.impl.thrift.DatabricksHttpTTransport#flush - Connection [900a5c8d-332c-45a8-b9c7-962bfbb89d45] HTTP request latency: 1346ms

Thrift Accessor logs

2025-06-24 13:54:10 FINE com.databricks.jdbc.dbclient.impl.thrift.DatabricksThriftAccessor#getThriftResponse - Connection [900a5c8d-332c-45a8-b9c7-962bfbb89d45] Thrift request latency (TOpenSessionReq): 2967ms

2025-06-24 14:07:25 FINE com.databricks.jdbc.dbclient.impl.thrift.DatabricksThriftAccessor#execute - Connection [303aebdd-ccf7-4152-b003-945aaf2bebcb] Statement [49294d43-56d0-4170-bb70-3707733e07ed|0ee7cdf5-d339-4e04-9408-c67c73440830] Session [5acd47b3-dc29-490c-5acd-47b3dc29490c (AllPurpose cluster with clusterId {0613-231111-59eitsdx} and orgId {6051921418418893})] Thrift polling latency: 1ms

2025-06-24 14:07:35 FINE com.databricks.jdbc.dbclient.impl.thrift.DatabricksThriftAccessor#execute - Connection [1949a7a5-17d2-4b89-9761-091d0edc14d0] Statement [9fc42a20-7ddc-4ddc-8933-15c6983946c5|4553e64a-45bc-4295-8934-5954ebff4780] Session [f4159b0e-883f-4136-f415-9b0e883f4136 (AllPurpose cluster with clusterId {0613-231111-59eitsdx} and orgId {6051921418418893})] Thrift fetch latency: 342ms

2025-06-24 14:07:25 FINE com.databricks.jdbc.dbclient.impl.thrift.DatabricksThriftAccessor#execute - Connection [303aebdd-ccf7-4152-b003-945aaf2bebcb] Statement [49294d43-56d0-4170-bb70-3707733e07ed|0ee7cdf5-d339-4e04-9408-c67c73440830] Session [5acd47b3-dc29-490c-5acd-47b3dc29490c (AllPurpose cluster with clusterId {0613-231111-59eitsdx} and orgId {6051921418418893})] Thrift execute latency (SQL): 725ms
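The log lines above all follow the same shape: a timestamp, a FINE-level logger, the connection (and optionally statement/session) identifiers, and an elapsed time in milliseconds. A minimal, hypothetical sketch of the pattern that could produce such lines (the helper name and signature are illustrative, not the driver's actual API):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch: capture a start timestamp, run the call, then emit a
// FINE log with the elapsed milliseconds and the connection identifier,
// matching the format of the sample log lines above.
public class LatencyLogSketch {
  private static final Logger LOGGER = Logger.getLogger(LatencyLogSketch.class.getName());

  static String timedCall(String connectionId, String operation, Runnable call) {
    long start = System.currentTimeMillis();
    call.run();
    long elapsedMs = System.currentTimeMillis() - start;
    String message =
        String.format("Connection [%s] %s latency: %dms", connectionId, operation, elapsedMs);
    LOGGER.log(Level.FINE, message);
    return message;
  }

  public static void main(String[] args) {
    // Example: time a no-op standing in for an HTTP request.
    System.out.println(timedCall("900a5c8d", "HTTP request", () -> {}));
  }
}
```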

Testing

Manual testing

Additional Notes to the Reviewer


Please ensure that the NEXT_CHANGELOG.md file is updated with any relevant changes.
If this is not necessary for your PR, please include the following in your PR description:
NO_CHANGELOG=true
and rerun the job.


github-actions bot commented Jun 24, 2025

📊 Code Coverage Report

Overall project: 86.18% (-0.07%) 🟢
Files changed:   90.98% 🟢

File                           Coverage
DatabricksThriftAccessor.java  85.11% (-1.33%) 🟢
DatabricksHttpTTransport.java  80.40% (-5.20%) 🔴

@@ -312,6 +401,20 @@ DatabricksResultSet executeAsync(
throw new DatabricksSQLException(response.status.errorMessage, response.status.sqlState);
}
} catch (DatabricksSQLException | TException e) {
long executeAsyncEndTime = System.currentTimeMillis();
Reviewer (Collaborator): Each of these is laced with DatabricksMetricsTimedProcessor; can we maybe use that to log the timings?

jayantsing-db (Author): Yes, I would ideally have used that, but that annotation accesses the connection context, statement ID, etc. through the context holder, which is broken. Many of these logs need that info.

Reviewer (Collaborator):

> context holder which is broken

The aim is just to print the time these functions take, not to exportLatencyLog(executionTime); maybe we just log the time and the function name there?
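The reviewer's suggestion of reusing DatabricksMetricsTimedProcessor to log only a method name and elapsed time could, in spirit, look like a timing proxy. The sketch below is hypothetical (the class, interface, and wiring are illustrative, not the driver's actual implementation); it shows why such a wrapper cannot see statement or session IDs unless they are passed in from outside:

```java
import java.lang.reflect.Proxy;

// Hypothetical sketch of an annotation-style timed processor: a dynamic proxy
// that logs each invoked method's name and elapsed time. Note it only sees the
// Method object, which is why statement/session IDs would have to come from a
// context holder (the broken piece discussed in this thread).
public class TimedProxySketch {
  @SuppressWarnings("unchecked")
  static <T> T wrap(T target, Class<T> iface) {
    return (T)
        Proxy.newProxyInstance(
            iface.getClassLoader(),
            new Class<?>[] {iface},
            (proxy, method, args) -> {
              long start = System.currentTimeMillis();
              try {
                return method.invoke(target, args);
              } finally {
                long elapsedMs = System.currentTimeMillis() - start;
                System.out.println(method.getName() + " latency: " + elapsedMs + "ms");
              }
            });
  }

  // Illustrative interface standing in for a Thrift accessor.
  interface Accessor {
    String fetch();
  }

  public static void main(String[] args) {
    Accessor timed = wrap((Accessor) () -> "rows", Accessor.class);
    System.out.println(timed.fetch());
  }
}
```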

jayantsing-db (Author): I think the statement ID, session ID, etc. are important for further analysis, for example correlating with the cluster logs. Just the method name would give us client-side timings with no way to relate them to the cluster for detailed analysis.

Reviewer (Collaborator): The complete logs shared by the customer will anyway have the initial connection log, context, etc. Is that not the case?

jayantsing-db (Author): No, they won't have statement IDs and session IDs for concurrent connections. And even if they did, without attaching IDs to the latency logs there is no guaranteed way, under concurrent execution, to map a latency log for a method name to the corresponding session ID and statement ID.

Reviewer (Collaborator): The connectionContext bug is a one-off case; we can trade that off against polluting our implementation, I guess. If we are not OK with that, can you create a TODO to remove all of this once the task has been taken up? https://databricks.atlassian.net/browse/PECOBLR-389

jayantsing-db (Author): Added a TODO to remove these once the timed processor annotation has access to the session and operation handles (or the latency issue is resolved).
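The diff hunk reviewed above captures the end time inside the catch block, so the latency is recorded even when executeAsync throws. A condensed, hypothetical sketch of that pattern (class and method names are illustrative, not the driver's code):

```java
// Hypothetical sketch of timing a call on both the success and the failure
// path, as the reviewed diff does for executeAsync: the elapsed time is
// computed even when the underlying Thrift call throws.
public class ExecuteAsyncTimingSketch {
  static long timeCall(Runnable thriftCall) {
    long start = System.currentTimeMillis();
    try {
      thriftCall.run();
      return System.currentTimeMillis() - start; // success path
    } catch (RuntimeException e) {
      // Failure path: capture the end time before rethrowing, mirroring the
      // executeAsyncEndTime capture inside the catch block in the diff.
      long elapsedMs = System.currentTimeMillis() - start;
      System.out.println("executeAsync failed after " + elapsedMs + "ms");
      throw e;
    }
  }

  public static void main(String[] args) {
    System.out.println("latency: " + timeCall(() -> {}) + "ms");
  }
}
```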
