Skip to content

Conversation

@pulpdrew
Copy link
Contributor

@pulpdrew pulpdrew commented Oct 21, 2025

Closes HDX-2623

Summary

This change improves the performance of getKeyValues when getting values of a JSON key.

Generally, columns that are not referenced outside of a CTE will be pruned by the query planner. For JSON however, if the outer select references one field in a JSON column, then the inner select will read (it seems) the entire JSON object.

This PR also adds integration tests for getKeyValues to ensure that the function generates queries that work as expected in ClickHouse.

Performance impact (on single JSON Dashboard Filter)

  • Original: 15.03s
Screenshot 2025-10-21 at 3 28 07 PM
  • Optimized: 0.443s
Screenshot 2025-10-21 at 3 25 47 PM

@changeset-bot
Copy link

changeset-bot bot commented Oct 21, 2025

🦋 Changeset detected

Latest commit: d417159

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@hyperdx/common-utils Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@vercel
Copy link

vercel bot commented Oct 21, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
hyperdx-v2-oss-app Ready Ready Preview Comment Oct 25, 2025 10:21am

@claude
Copy link

claude bot commented Oct 21, 2025

PR Review - Performance Optimization for getKeyValues

No critical issues found.

This is a solid performance optimization that reduces query time from 15.03s to 0.443s by only selecting the specific columns needed instead of all columns when querying JSON fields.

Strengths:

  • ✅ Excellent performance improvement with clear benchmarks
  • ✅ Comprehensive integration tests added for getKeyValues
  • ✅ Proper test cleanup (closes client in afterAll)
  • ✅ Tests cover both disableRowLimit modes and various column types (regular, materialized, JSON)
  • ✅ Proper CI configuration added for common-utils integration tests

Minor Observations:

  • The hdxClient instance creates an internal ClickHouse client but doesn't expose a close() method. This appears to be by design in the existing BaseClickhouseClient implementation (not specific to this PR).

Recommendation: Approve and merge

@github-actions
Copy link
Contributor

github-actions bot commented Oct 21, 2025

E2E Test Results

All tests passed • 26 passed • 3 skipped • 229s

Status Count
✅ Passed 26
❌ Failed 0
⚠️ Flaky 0
⏭️ Skipped 3

View full report →

const selectClause = keys
.map((k, i) => `groupUniqArray(${limit})(${k}) AS param${i}`)
.join(', ');
if (keys.length === 0) return [];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the functional changes are in this file.

This check was added because previously, the query would generate an empty select clause when no keys were provided, resulting in a query error. (eg. SELECT FROM table...)

@pulpdrew pulpdrew force-pushed the drew/optimize-filter-sampling branch from 3297abd to 23f37db Compare October 23, 2025 09:16
@pulpdrew pulpdrew marked this pull request as ready for review October 23, 2025 09:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant