Truncate action log to avoid json string decoding overflow #8757
Conversation
Actionable comments posted: 1
📒 Files selected for processing (4)

- frontend/javascripts/admin/rest_api.ts (1 hunks)
- webknossos-tracingstore/app/com/scalableminds/webknossos/tracingstore/annotation/TSAnnotationService.scala (3 hunks)
- webknossos-tracingstore/app/com/scalableminds/webknossos/tracingstore/controllers/TSAnnotationController.scala (1 hunks)
- webknossos-tracingstore/conf/tracingstore.latest.routes (1 hunks)
🔇 Additional comments (5)

**frontend/javascripts/admin/rest_api.ts (1)**

Line 782: LGTM! Clean implementation of the truncate parameter. The hardcoded `"true"` value ensures the frontend always requests truncated results to prevent JSON decoding overflow, which aligns with the PR objectives. Consider making this configurable in the future if there is a need to conditionally disable truncation:

```diff
- const params = new URLSearchParams([["truncate", "true"]]);
+ const params = new URLSearchParams([["truncate", truncate?.toString() ?? "true"]]);
```

However, for the current use case where preventing overflow is critical, the hardcoded approach is appropriate.
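As a hedged sketch of the configurability idea above (the `buildUpdateActionLogParams` helper name is illustrative, not the actual rest_api.ts code), an optional flag could default to truncation so the overflow protection stays on unless a caller explicitly opts out:

```typescript
// Illustrative helper (hypothetical name, not the real rest_api.ts API):
// defaults the truncate query param to "true" when the caller passes nothing.
function buildUpdateActionLogParams(truncate?: boolean): URLSearchParams {
  return new URLSearchParams([["truncate", (truncate ?? true).toString()]]);
}
```

Callers that need the full, untruncated log could then pass `false` explicitly, while existing call sites keep the safe default.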
**webknossos-tracingstore/conf/tracingstore.latest.routes (1)**

Line 10: LGTM! Correct route parameter addition. The `truncate: Option[Boolean]` parameter is properly added to the route definition, maintaining backward compatibility via the optional type.

**webknossos-tracingstore/app/com/scalableminds/webknossos/tracingstore/annotation/TSAnnotationService.scala (2)**

Line 63: Good addition of a configuration constant. The `maxUpdateActionLogBatchSize` constant provides a clear, configurable limit for truncation and improves code maintainability.

Lines 245-246: Clean method signature update. The added `truncate` parameter is well named and appropriately typed for controlling the truncation behavior.

**webknossos-tracingstore/app/com/scalableminds/webknossos/tracingstore/controllers/TSAnnotationController.scala (1)**

Lines 79-94: Well-implemented parameter addition with backward compatibility. The optional `truncate` parameter is handled with sensible defaults and follows the established pattern for optional parameters in this method.
```scala
truncatedUpdateActionBatches = if (truncate) updateActionBatches.map(_.take(maxUpdateActionLogBatchSize))
  else updateActionBatches
} yield Json.toJson(truncatedUpdateActionBatches.flatten.map(versionedTupleToJson))
```
🛠️ Refactor suggestion

**Consider total response size in addition to per-batch truncation.**

The current implementation truncates each batch to 1000 items but then flattens all batches together. This could still result in very large JSON responses if there are many batches, potentially not fully addressing the JSON decoding overflow issue. Consider also limiting the total number of batches, or implementing a global response size limit, to ensure the final JSON stays within frontend decoding limits:

```diff
  truncatedUpdateActionBatches = if (truncate) updateActionBatches.map(_.take(maxUpdateActionLogBatchSize))
  else updateActionBatches
+ // Consider also limiting total response size:
+ finalBatches = if (truncate) truncatedUpdateActionBatches.take(maxBatchCount) else truncatedUpdateActionBatches
- } yield Json.toJson(truncatedUpdateActionBatches.flatten.map(versionedTupleToJson))
+ } yield Json.toJson(finalBatches.flatten.map(versionedTupleToJson))
```
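The two-level idea suggested here can be sketched in TypeScript (the names `truncateBatches`, `maxPerBatch`, and `maxTotal` are illustrative, not the Scala service's identifiers): first cap each batch, then cap the total number of actions across all batches before serializing.

```typescript
// Sketch of two-level truncation: per-batch cap plus a global cap on the
// flattened result, so the serialized response cannot grow without bound
// even when there are many batches.
function truncateBatches<T>(
  batches: T[][],
  maxPerBatch: number,
  maxTotal: number,
): T[] {
  const result: T[] = [];
  for (const batch of batches) {
    // Per-batch cap, analogous to _.take(maxUpdateActionLogBatchSize).
    for (const action of batch.slice(0, maxPerBatch)) {
      // Global cap: stop as soon as the flattened total is reached.
      if (result.length >= maxTotal) {
        return result;
      }
      result.push(action);
    }
  }
  return result;
}
```

For example, `truncateBatches([[1, 2, 3], [4, 5, 6]], 2, 3)` keeps at most two items per batch and three items in total. A byte-based limit on the serialized JSON would be an alternative design if action sizes vary widely.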
Ok, I am questioning right now whether what I implemented is actually what was suggested in the issue / a solution to the problem:

I am not sure whether the concept really exists in the backend route offering the update action log. The backend loads something like

My question / thought now: what happens in the case of having a lot of update action groups, each with 1000 update actions? Doesn't this maybe also blow up the JSON-serialized string so that it doesn't fit into the frontend's max buffer? So shouldn't we keep the limit on the total sum of update actions? Or is it too unrealistic to exceed the max string length with "only" 1000 update actions per update action group?

Edit: CodeRabbit also mentioned my thoughts: #8757 (comment)
Maybe the outer array size is limited by the pagination parameters?
Yeah, right, it seems like 2500 is the current max of versions per request 🤔 This should be an outer-level restriction, limiting the total number of update actions per request. The code says:
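A rough back-of-envelope check of this concern (the 2500 and 1000 figures come from the discussion above; the per-action byte size is an assumption):

```typescript
// Worst-case estimate: 2500 versions per request (the pagination cap
// mentioned above) x 1000 actions per group (maxUpdateActionLogBatchSize)
// x an assumed average serialized size per action.
const maxVersionsPerRequest = 2500;
const maxActionsPerGroup = 1000;
const assumedBytesPerAction = 200; // assumption; real action sizes vary widely
const worstCaseBytes =
  maxVersionsPerRequest * maxActionsPerGroup * assumedBytesPerAction;
// 500_000_000 bytes, i.e. on the order of V8's maximum string length
// (roughly 0.5-1 GB of characters depending on the engine version).
```

So with pessimistic but not absurd assumptions, the combined response can approach the engine's string limit, which supports adding a cap on the total number of update actions per request rather than only per group.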
Should I restore the old limit, @philippotto 🤔?
This PR introduces a new `truncate` URL param to the update actions log route. If set to true (the default should be false), the backend sends a max of 1000 update actions per update group. The goal is to reduce the maximum possible string length of the JSON-encoded update actions so it does not hit the frontend's limit. See #8594 for more details.
URL of deployed dev instance (used for testing):
Steps to test:
- Set `maxUpdateActionLogBatchSize = 3` or so.
- Only `maxUpdateActionLogBatchSize` consecutive update actions should be listed.

TODOs:

Issues:

(Please delete unneeded items, merge only when none are left open)

- Added changelog entry (create a `$PR_NUMBER.md` file in `unreleased_changes` or use `./tools/create-changelog-entry.py`)