Skip to content

Improve mutation processing performance #1652

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 26 commits into
base: master
Choose a base branch
from

Conversation

eoghanmurray
Copy link
Contributor

Refactor record time mutation processing to improve performance

  • smarter 'iteration' through the this.addedSet so that we start with nodes which already have the required nextId and parentId
  • take advantage of siblings sharing the same parent, so only do parent related calculations once instead of repeatedly for every child node
  • inline pushAdd (and getNextId) as we only need a single run over them

This solves pathological cases where nodes from the addedSet were pushed onto the secondary addList, possibly multiple times as pushAdd was called again each time the nextId/parentId requirements weren't met.

Previous efforts in this direction were

Also related is #1277 "refactor recursive procedure to iterative" which we could also incorporate on top of this as it should provide orthogonal performance gains — in order to 'grasp the nettle' all at once in terms of possible breakage.

Copy link

changeset-bot bot commented Feb 11, 2025

🦋 Changeset detected

Latest commit: 9a7a47e

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 19 packages
Name Type
rrweb Patch
rrweb-snapshot Patch
rrdom Patch
rrdom-nodejs Patch
rrweb-player Patch
@rrweb/all Patch
@rrweb/replay Patch
@rrweb/record Patch
@rrweb/types Patch
@rrweb/packer Patch
@rrweb/utils Patch
@rrweb/web-extension Patch
rrvideo Patch
@rrweb/rrweb-plugin-console-record Patch
@rrweb/rrweb-plugin-console-replay Patch
@rrweb/rrweb-plugin-sequential-id-record Patch
@rrweb/rrweb-plugin-sequential-id-replay Patch
@rrweb/rrweb-plugin-canvas-webrtc-record Patch
@rrweb/rrweb-plugin-canvas-webrtc-replay Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@eoghanmurray
Copy link
Contributor Author

I've cloned the plunker from #1302 from @mdellanoce
https://plnkr.co/edit/QgOXGFAFOguN2Zpa

The microtasks run after the jstree work is done, and I'm clocking 3.2s for those before this PR vs. 2.2s after.

@eoghanmurray
Copy link
Contributor Author

Here's the benchmark output (please replicate on other machines):

                                                                                                      
title                                                   before     after   improvement   html                                             
                                                                                                      
1000x 1 DOM nodes with deeply nested children       10  2107.5    1974.1      7.0%       benchmark-dom-mutation-deep-nested.html        
1000x10 DOM nodes                                   10   384.9     370.7      3.7%       benchmark-dom-mutation.html                    
1000x10x2 DOM nodes and remove a bunch of them      10   526.6     331.9     36.0%       benchmark-dom-mutation-add-and-remove.html     
1000 DOM nodes and append into its previous looped   5   132.2      79.6     39.8%       benchmark-dom-mutation-multiple-descendant-add.h
10000 DOM nodes and move it to new container         5   586       301       48.6%       benchmark-dom-mutation-add-and-move.html       
modify attributes on 10000 DOM nodes                10    96.7      87.5      9.5%       benchmark-dom-mutation-attributes.html

eoghanmurray and others added 23 commits March 7, 2025 12:49
…acktrack; pushAdd requires that each new node has a parentId and a nextId
…hat if a possibly null node is in e.g. this.addedSet, then it is indeed not null. Similarly if `this.addedSet.size` is non zero, then we can pop confidently
… confusing initialization to `IGNORED_NODE`
… equivalent but might have implications in terms of how replay should iterate over them
…was preventing benchmark test from running
… could also use `checkLoops: allExceptWhileTrue` after eslint upgrade
… could be applied across all snapshots. Also, only mutation events are relevant here (reduce burden when manually expecting test output). Also (commented out) showing how to reuse expected output between tests, i.e. assert that two tests produce same output
…esent on mutation processing before they could be added, but couldn't (deliberately) break anything. Adding test anyway as it might throw up an interesting scenario
…a which could be applied across all snapshots. Also, only mutation events are relevant here (reduce burden when manually expecting test output). Also (commented out) showing how to reuse expected output between tests, i.e. assert that two tests produce same output
Copy link
Contributor

@pauldambra pauldambra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh, my youngest is awake...

I promised i'd do a review but i've not looked at mutation code much before so this first pass is much more to check my understanding than anything else

i have a customer site that i know is causing us problems with missing nodes at playback, will try and figure out if there's a way i can test this code on that site to compare the output (i don't think i can without customer changing their code)

that site is a stock ticker so maybe i can make a minimal example myself quickly

(maybe it goes without saying but feel free to tell me where i'm being a fool, i'm keen to understand this all more)

@@ -1506,7 +1506,7 @@ export class Replayer {
// is newly added document, maybe the document node of an iframe
return this.newDocumentQueue.push(mutation);
}
return queue.push(mutation);
return legacy_queue.push(mutation);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fly by that the other item marked legacy here only made me wonder why it was legacy

so there's a signal that maybe it's not current
but I don't know what the consequences are
e.g. is the code only going to run on data recorded prior to version X

i don't know where the best place for that info is
but i worry it's in someone's head and could be lost :)

while (true) {
parentNode = dom.parentNode(n);
if (this.addedSet.has(parentNode as Node)) {
// start at top of added tree so as not to serialize children before their parents (parentId requirement)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// start at top of added tree so as not to serialize children before their parents (parentId requirement)
// keep searching for the top of added tree so as not to serialize children before their parents (parentId requirement)

nit... "start at the top" threw me here and made me think 🙈

}
}

if (this.addedSet.has(parentNode.lastChild as Node)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess the casts to Node are to avoid !!parentNode.lastChild && x.has(parentNode.lastChild)

i think safe here but as X is such a source of bugs that it always scares me 🙈

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants