-
Notifications
You must be signed in to change notification settings - Fork 82
feat(webui): Store only 1000 search results in the results cache for Presto queries. #1209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughImplements per-chunk result counting and a storage cap in the Presto /query route. Inserts up to MAX_PRESTO_SEARCH_RESULTS rows into MongoDB, updates metadata with total results after each chunk, and enhances logging. Adds MAX_PRESTO_SEARCH_RESULTS constant in typings. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Server (/query)
participant Presto
participant MongoDB
participant Metadata
Client->>Server (/query): HTTP request
Server (/query)->>Presto: Execute query stream
loop For each data chunk
Presto-->>Server (/query): data[]
Server (/query)->>Server (/query): totalResultsCount += data.length
alt Remaining storage slots > 0
Server (/query)->>Server (/query): dataToInsert = data.slice(0, remainingSlots)
Server (/query)->>MongoDB: insertPrestoRowsToMongo(dataToInsert)
MongoDB-->>Server (/query): ack/error
Server (/query)->>Server (/query): Update storedResultsCount on success
else No remaining slots
Server (/query)->>Server (/query): Skip insertion
end
Server (/query)->>Metadata: updateOne({ numTotalResults: totalResultsCount })
Metadata-->>Server (/query): ack/error
end
Server (/query)-->>Client: Streamed response/completion
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Possibly related PRs
Suggested reviewers
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
components/webui/server/src/routes/api/presto-search/index.ts
(3 hunks)components/webui/server/src/routes/api/presto-search/typings.ts
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit Configuration File
- Prefer
false == <expression>
rather than!<expression>
.
Files:
components/webui/server/src/routes/api/presto-search/typings.ts
components/webui/server/src/routes/api/presto-search/index.ts
🧬 Code Graph Analysis (1)
components/webui/server/src/routes/api/presto-search/index.ts (2)
components/webui/server/src/routes/api/presto-search/typings.ts (1)
MAX_PRESTO_SEARCH_RESULTS
(4-4)components/webui/server/src/routes/api/presto-search/utils.ts (1)
insertPrestoRowsToMongo
(50-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build (macos-15)
- GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (3)
components/webui/server/src/routes/api/presto-search/typings.ts (1)
1-4
: LGTM: Clear constant and docstring align with the PR goal.The constant name and documentation are self-explanatory and the value matches the UI cap.
components/webui/server/src/routes/api/presto-search/index.ts (2)
18-18
: Import looks correct and consistent with NodeNext/ESM.js
extensions.Matches existing import style in this module. No issues spotted.
65-67
: Counters are well-scoped for tracking totals vs stored counts.The separation between
totalResultsCount
andstoredResultsCount
is clear and fits the PR intent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clean and nice change.
Validations:
- run sql queries that return more than 1000 rows
- only the first 1000 rows displayed in the table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry I didn't think it carefully, but there is one nitpick:
Can we remove storedResultsCount and only use totalResultsCount? I guess this logic is cleaner?
if (storedResultsCount < MAX_PRESTO_SEARCH_RESULTS) { | ||
const remainingSlots = | ||
MAX_PRESTO_SEARCH_RESULTS - storedResultsCount; | ||
const dataToInsert = data.slice(0, remainingSlots); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (storedResultsCount < MAX_PRESTO_SEARCH_RESULTS) { | |
const remainingSlots = | |
MAX_PRESTO_SEARCH_RESULTS - storedResultsCount; | |
const dataToInsert = data.slice(0, remainingSlots); | |
if (totalResultsCount < MAX_PRESTO_SEARCH_RESULTS) { | |
const remainingSlots = | |
MAX_PRESTO_SEARCH_RESULTS - totalResultsCount; | |
const dataToInsert = data.slice(0, remainingSlots); | |
// TODO: Error, and success handlers are dummy implementations | ||
// and will be replaced with proper implementations. | ||
data: (_, data, columns) => { | ||
totalResultsCount += data.length; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this after the insertion of data and before the update of metadata
|
||
let searchJobId: string; | ||
let totalResultsCount = 0; | ||
let storedResultsCount = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove storedResultsCount
@hoophalab - I see ur point here but i am slightly worried about moving totalResultsCount += data.length; after the insertion of data. If we ever add an await into InsertPrestoRows, its possible the data callback gets called again and we might have a double insert. |
sure let's keep it as-is then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deferring to @hoophalab's review.
Description
PR limits the amount results actually stored in mongoDB since 1) ui only shows 1000 results max, 2) could take up lots of space for long running queries.
I tried a few design but i think this inline design is the simplest.
The total num results is updated in metadata, but the UI does not show this yet. This will be another PR.
Checklist
breaking change.
Validation performed
Ran long query with lots of results. Checked results in mongo was only 1000. checked total in results metadata was > 1000
Summary by CodeRabbit