Improve mutation processing performance #1652

eoghanmurray · 2025-02-11T21:29:49Z

Refactor record-time mutation processing to improve performance:

- smarter 'iteration' through `this.addedSet`, so that we start with nodes which already have the required nextId and parentId
- take advantage of siblings sharing the same parent, so parent-related calculations are done once instead of repeatedly for every child node
- inline pushAdd (and getNextId), as we only need a single run over them

This solves pathological cases where nodes from the addedSet were pushed onto the secondary addList, possibly multiple times as pushAdd was called again each time the nextId/parentId requirements weren't met.
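The ordering constraint described above can be sketched roughly as follows. This is only an illustration under a simplified node model: `SketchNode`, `makeNode` and `orderAdds` are hypothetical names and not rrweb's actual API. The idea is to find the top of each added subtree, then walk that parent's children from last to first, so every node is emitted after its parent (parentId) and after its next sibling (nextId), and no node needs to be revisited.

```typescript
// Hypothetical, simplified node model; not rrweb's real types.
interface SketchNode {
  id: number;
  parent: SketchNode | null;
  children: SketchNode[];
}

// Builds a node and appends it to its parent's child list (document order).
function makeNode(id: number, parent: SketchNode | null): SketchNode {
  const node: SketchNode = { id, parent, children: [] };
  if (parent !== null) parent.children.push(node);
  return node;
}

// Returns the added nodes in an order where each node comes after its
// parent and after its next sibling, using a single pass per parent.
function orderAdds(addedSet: Set<SketchNode>): SketchNode[] {
  const ordered: SketchNode[] = [];
  const emitted = new Set<SketchNode>();

  // Walk one parent's children from last to first, so a node's next
  // sibling is always emitted before the node itself.
  function emitAddedChildren(parent: SketchNode): void {
    for (let i = parent.children.length - 1; i >= 0; i--) {
      const child = parent.children[i];
      if (!addedSet.has(child) || emitted.has(child)) continue;
      emitted.add(child);
      ordered.push(child);
      emitAddedChildren(child); // children only after their parent
    }
  }

  addedSet.forEach((node) => {
    if (emitted.has(node)) return;
    // Climb to the top of the added subtree this node belongs to.
    let top = node;
    let parent = top.parent;
    while (parent !== null && addedSet.has(parent)) {
      top = parent;
      parent = top.parent;
    }
    if (parent !== null) {
      // Handle all added siblings under this (pre-existing) parent at once.
      emitAddedChildren(parent);
    } else {
      // Detached root with no parent: emit it, then its subtree.
      emitted.add(top);
      ordered.push(top);
      emitAddedChildren(top);
    }
  });
  return ordered;
}
```

With a pre-existing root holding added children `a` (id 1) and `b` (id 2), where `a` has added children with ids 3 and 4, the sketch yields the ids in the order 2, 1, 4, 3 regardless of the order in which the nodes were inserted into the set.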

Previous efforts in this direction were:

- #1398 "perf(genadds) traverse children in reverse order"
- #1302 "perf(rrweb): optimize random shuffled addList"

Also related is #1277 "refactor recursive procedure to iterative", which should provide orthogonal performance gains. We could incorporate it on top of this PR in order to 'grasp the nettle' and take on the possible breakage all at once.

changeset-bot · 2025-02-11T21:29:53Z

🦋 Changeset detected

Latest commit: 9a7a47e

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 19 packages

Name	Type
rrweb	Patch
rrweb-snapshot	Patch
rrdom	Patch
rrdom-nodejs	Patch
rrweb-player	Patch
@rrweb/all	Patch
@rrweb/replay	Patch
@rrweb/record	Patch
@rrweb/types	Patch
@rrweb/packer	Patch
@rrweb/utils	Patch
@rrweb/web-extension	Patch
rrvideo	Patch
@rrweb/rrweb-plugin-console-record	Patch
@rrweb/rrweb-plugin-console-replay	Patch
@rrweb/rrweb-plugin-sequential-id-record	Patch
@rrweb/rrweb-plugin-sequential-id-replay	Patch
@rrweb/rrweb-plugin-canvas-webrtc-record	Patch
@rrweb/rrweb-plugin-canvas-webrtc-replay	Patch

eoghanmurray · 2025-02-11T22:21:01Z

I've cloned the plunker from #1302 from @mdellanoce
https://plnkr.co/edit/QgOXGFAFOguN2Zpa

The microtasks run after the jstree work is done, and I'm clocking 3.2s for those before this PR vs. 2.2s after.

eoghanmurray · 2025-02-13T17:15:26Z

Here's the benchmark output (please replicate on other machines):

                                                                                                      
title                                                     before     after   improvement   html
1000x 1 DOM nodes with deeply nested children       10    2107.5    1974.1       6.3%      benchmark-dom-mutation-deep-nested.html
1000x10 DOM nodes                                   10     384.9     370.7       3.7%      benchmark-dom-mutation.html
1000x10x2 DOM nodes and remove a bunch of them      10     526.6     331.9      37.0%      benchmark-dom-mutation-add-and-remove.html
1000 DOM nodes and append into its previous looped   5     132.2      79.6      39.8%      benchmark-dom-mutation-multiple-descendant-add.h
10000 DOM nodes and move it to new container         5     586       301        48.6%      benchmark-dom-mutation-add-and-move.html
modify attributes on 10000 DOM nodes                10      96.7      87.5       9.5%      benchmark-dom-mutation-attributes.html
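For anyone replicating the numbers, here is a small sketch of how the improvement column can be recomputed. It assumes the column is the relative reduction from `before` to `after`; `improvementPercent` and `formatImprovement` are hypothetical helper names and not part of the benchmark script.

```typescript
// Relative reduction from `before` to `after`, as a percentage.
// Assumption: this is how the "improvement" column is derived.
function improvementPercent(before: number, after: number): number {
  return ((before - after) / before) * 100;
}

// Formats the improvement with one decimal place, e.g. "48.6%".
function formatImprovement(before: number, after: number): string {
  return improvementPercent(before, after).toFixed(1) + "%";
}

// The "add and move" row: 586 before, 301 after.
console.log(formatImprovement(586, 301)); // prints "48.6%"
```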

… could be applied across all snapshots. Also, only mutation events are relevant here (reduce burden when manually expecting test output). Also (commented out) showing how to reuse expected output between tests, i.e. assert that two tests produce same output

…a which could be applied across all snapshots. Also, only mutation events are relevant here (reduce burden when manually expecting test output). Also (commented out) showing how to reuse expected output between tests, i.e. assert that two tests produce same output

pauldambra

ugh, my youngest is awake...

I promised i'd do a review but i've not looked at mutation code much before so this first pass is much more to check my understanding than anything else

i have a customer site that i know is causing us problems with missing nodes at playback, will try and figure out if there's a way i can test this code on that site to compare the output (i don't think i can without customer changing their code)

that site is a stock ticker so maybe i can make a minimal example myself quickly

(maybe it goes without saying but feel free to tell me where i'm being a fool, i'm keen to understand this all more)

pauldambra · 2025-07-07T20:55:07Z

packages/rrweb/src/replay/index.ts

@@ -1506,7 +1506,7 @@ export class Replayer {
          // is newly added document, maybe the document node of an iframe
          return this.newDocumentQueue.push(mutation);
        }
-        return queue.push(mutation);
+        return legacy_queue.push(mutation);


fly by that the other item marked legacy here only made me wonder why it was legacy

so there's a signal that maybe it's not current
but I don't know what the consequences are
e.g. is the code only going to run on data recorded prior to version X

i don't know where the best place for that info is
but i worry it's in someone's head and could be lost :)

pauldambra · 2025-07-08T07:14:59Z

packages/rrweb/src/record/mutation.ts

+        while (true) {
+          parentNode = dom.parentNode(n);
+          if (this.addedSet.has(parentNode as Node)) {
+            // start at top of added tree so as not to serialize children before their parents (parentId requirement)


Suggested change

-            // start at top of added tree so as not to serialize children before their parents (parentId requirement)
+            // keep searching for the top of added tree so as not to serialize children before their parents (parentId requirement)

nit... "start at the top" threw me here and made me think 🙈
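To make the intent of that loop concrete, here is a minimal sketch of "keep searching for the top of the added tree". It uses a hypothetical `SketchEl` type rather than real DOM nodes, and `findTopOfAddedTree` is an illustrative name that does not appear in the PR.

```typescript
// Hypothetical stand-in for a DOM node; only the parent link matters here.
interface SketchEl {
  name: string;
  parentNode: SketchEl | null;
}

// Climbs from `start` while the parent is itself a newly added node, and
// returns the highest added ancestor (or `start` if its parent is not added).
function findTopOfAddedTree(start: SketchEl, added: Set<SketchEl>): SketchEl {
  let top = start;
  for (;;) {
    const parent = top.parentNode;
    if (parent === null || !added.has(parent)) {
      return top; // parent already exists (or is absent), so stop here
    }
    top = parent; // parent was also added: keep searching upwards
  }
}
```

Serializing from the returned node downwards means a child is never serialized before its parent has an id.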

pauldambra · 2025-07-08T07:22:05Z

packages/rrweb/src/record/mutation.ts

+            }
+          }
+
+          if (this.addedSet.has(parentNode.lastChild as Node)) {


i guess the casts to Node are to avoid !!parentNode.lastChild && x.has(parentNode.lastChild)

i think it's safe here, but `as X` is such a source of bugs that it always scares me 🙈
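For what it's worth, `Set.prototype.has` simply returns `false` for `null` when `null` was never added, so the cast is harmless at runtime. If the cast itself is the worry, one possible alternative is a small type guard. This is a sketch only: `hasNode` is a hypothetical helper and not something in this PR.

```typescript
// Returns true only when `candidate` is non-null and present in `set`.
// The type predicate lets TypeScript treat `candidate` as T in the true
// branch, so no `as` cast is needed at the call site.
function hasNode<T extends object>(
  set: Set<T>,
  candidate: T | null | undefined,
): candidate is T {
  return candidate != null && set.has(candidate);
}

// Usage with a stand-in type instead of a DOM Node:
interface SketchChild {
  label: string;
}
const sketchAdded = new Set<SketchChild>();
const sketchLast: SketchChild = { label: "last" };
sketchAdded.add(sketchLast);

function labelIfAdded(n: SketchChild | null): string {
  // In the true branch `n` is narrowed to SketchChild.
  return hasNode(sketchAdded, n) ? n.label : "(not added)";
}
```

One caveat with this design: when the guard returns `false`, TypeScript narrows the candidate to `null | undefined` in the else branch, even though it may be a real node that is just not in the set. Only the true branch should be relied on.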

eoghanmurray added a commit to eoghanmurray/rrweb that referenced this pull request Feb 13, 2025

Performance refactor to avoid 2 function calls (bridging between rrweb-io#1543 and rrweb-io#1652)

814c18f

eoghanmurray mentioned this pull request Feb 13, 2025

Performance refactor to avoid 2 function calls #1653

Open

eoghanmurray requested review from Juice10 and YunFeng0817 February 18, 2025 11:12

eoghanmurray and others added 23 commits March 7, 2025 12:49

Iterate over the added nodes in 'one pass' so that we don't need to backtrack; pushAdd requires that each new node has a parentId and a nextId

43c5d72

Test changes, rearrangement of mutations

ed757b3

Add some ids as I'm interested in tracing these nodes through pushAdd (console)

1d8b37b

Do away with the second pass as we can handle shadow DOM in the first pass

99da1e4

Performance oriented refactor focusing on scenario where a large number of child nodes share the same parent

490aea9

Satisfy typescript which could be smarter here ... we can guarantee that if a possibly null node is in e.g. this.addedSet, then it is indeed not null. Similarly if `this.addedSet.size` is non zero, then we can pop confidently

d9587d4

Utilize lastChild to avoid possibly crawling through hundreds of nodes

2a0eedd

We've already got nextSibling here so can skip a step and avoid the confusing initialization to `IGNORED_NODE`

96ea20d

Test rearrangements in the adds array due to new algorithm; should be equivalent but might have implications in terms of how replay should iterate over them

2aa8597

We were calling inDom in all cases, so don't do the other ancestor checks if it goes bad early

8d4e766

Don't think we're explicitly looking at the slimdom stuff in relation to mutation processing

3b33611

Add changeset

f260c0d

Don't think main subfolder was ever used as an output target; this was preventing benchmark test from running

87b091a

Placate eslint (while(true) is a Pythonism rather than do..while) - could also use `checkLoops: allExceptWhileTrue` after eslint upgrade

1441ef3

Forgot to add the mutation.html file - also add doctype

b5b04e2

Simplify the parentId check, doesn't need to ever by null

720e174

Apply formatting changes

4a751ee

Some inconsequential tests to cover blocking scenarios

7734331

Was trying to 'catch out' the mutation handling by having siblings present on mutation processing before they could be added, but couldn't (deliberately) break anything. Adding test anyway as it might throw up an interesting scenario

4c2af79

I can't recreate a scenario for this case in testing, so add a warning and a speculative mitigation

453beb0

Put each snap file in it's own folder and shorten names

03f2146

eoghanmurray added 3 commits March 7, 2025 12:49

Satisfy eslint

9666d32

Repeat the mutation tests but with the blocking/ignored nodes already in the DOM/mirror prior to mutation processing

e90b18b

Indicate that replay no longer needs the queue, as added nodes should already be in the correct order for a single pass

9a7a47e

eoghanmurray force-pushed the pushAddOrder branch from 7fde788 to 9a7a47e Compare March 7, 2025 13:03

pauldambra reviewed Jul 8, 2025
