Skip to content

Do Not Merge: Memory Leak Investigation Summary #8167

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

pkukielka
Copy link
Contributor

@pkukielka pkukielka commented Jul 11, 2025

Memory Leak Investigation Summary

Root Cause Discovery

The Problem

We discovered that the memory leak in Cody's webview is caused by React's inability to efficiently handle complex DOM structures generated by syntax highlighting, rather than the highlighting libraries themselves.

How We Reached This Conclusion

1. Simple testing framework for reproducible results

We slightly modified chat component to always stream the same mocked LLM in response to any input.
Mock data contains a 288-line TypeScript response with 6 complex code blocks, streamed character-by-character.
This implementation was essential for the memory leak investigation because it provided consistent test conditions for comparing memory usage between different highlighting implementations and retries. It is also a little bit faster than querying real LLM.

Without any changes to the highlighter running it once (one human question, one assistant response) leads to around 900-1000 MB usage (from initial ~100MB) clearly demonstrating a memory leak.

▶️ To launch the demo run pnpm dev:standalone command in the web directory and open http://localhost:5777/ in the browser.

2. Initial Hypothesis Testing

  • Started with: Suspected highlight.js library was causing memory leaks
  • Method: Created two interchangeable syntax highlighting components to A/B test
  • Components tested:
    • CustomHTMLHighlighter - Simple regex-based highlighting
    • CustomHJSHighlighter - Full highlight.js-powered highlighting

3. Key Discovery

When examining the generated HTML output, we found that highlight.js produces highly complex nested DOM structures:

<code class="language-typescript" data-language="typescript">
  <span class="hljs-keyword">import</span> { 
  <span class="hljs-title class_">ComplexFeature</span> } 
  <span class="hljs-keyword">from</span> 
  <span class="hljs-string">'./ComplexFeature'</span>
  <!-- hundreds more nested spans for a typical code block -->
</code>

3. Elimination Process

❌ Ruled Out: highlight.js Library

  • Library itself doesn't retain references
  • Memory usage is consistent in non-React contexts
  • Other applications using highlight.js don't show similar leaks
  • We tested other highlighting libraries like prism.js with the same result
  • Switching useHighlightedCode flag in the CustomHJSHighlighter.tsx to false runs all the highlight.js logic, but does not apply created code at the end. This does not cause a memory leak.

❌ Ruled Out: Component Implementation

  • Both components showed similar memory patterns when generating complex DOM
  • Simple DOM structures (regex-based) had minimal memory impact
  • Complex DOM structures (highlight.js) caused significant memory accumulation

✅ Identified: React DOM Reconciliation Issue

  • React struggles with reconciling complex nested span structures
  • Virtual DOM diffing becomes inefficient with hundreds of nested elements
  • Memory accumulates as React maintains references to complex DOM trees
  • The more complex the syntax highlighting, the worse the memory leak

Key Takeaway: The memory leak is not caused by the syntax highlighting library itself, but by React's inability to efficiently manage the complex DOM structures that advanced syntax highlighting generates. This finding shifts the focus from library replacement to DOM structure optimization.

Other Approaches Tested & Results

1. Caching Optimization Attempts

  • What we tried: React.memo, stable keys, dependency array optimization
  • Result: Minimal improvement
  • Memory usage: Still ~900MB
  • Why it didn't help: We already proved that leak happens because React components or React itself cannot properly clean complicated DOM created by highlighter. Every new character streamed causes React to reconstruct Markdown object. Even if we cache code blocks so we don't re-create them, that does not solve cleanup issue.

2. Conditional Highlighting During Streaming

  • What we tried: Only highlight complete code blocks, show plain text during streaming
  • Implementation: Added isMessageLoading prop, data-is-code-complete detection
  • Result: Small improvement but memory leak persisted
  • Memory usage: ~800MB (100MB improvement)
  • Why it didn't fully help: While this reduced the complexity of individual code blocks during streaming, react-markdown was still creating React component instances for every code block on every character update. The component creation/destruction cycle continued even with simpler content for the current code block, but was affected by all existing and completed code blocks. That approach would probably work nice for single, large code blocks, but not for multiple smaller ones.

3. Complete Architecture Change: markdown-it

  • What we tried: Replaced react-markdown with markdown-it for static HTML generation
  • Implementation:
  • Eliminated React component re-rendering entirely
  • Direct HTML generation with dangerouslySetInnerHTML
  • Preserved syntax highlighting via highlight.js
  • Result: Some improvement, but not dramatic
  • Why it didn't help: Most likely because it created very similar if not identical DOM structures, which React struggles with. That is important experiment though, because it proves problem is somewhere deep in core React memory management.

**4. Moving Highlighting To WebWorker **

  • What we tried: Move highlight.js calling and html generation to a separate webworker
  • Result: No significant improvement
  • Memory usage: ~900MB
  • Why it didn't help: For the same reason as all other solutions, really. Problem is not at the generation side, but on the cleanup side which is managed by React.

Interesting observation: When testing last 4 points on Linux I noticed sometimes memory usage was around ~470MB instead of ~900MB. Generally looking at the devtools Memory inspector when app is working is is clear that memory is allocated in batches, roughly doubling every time. That pattern looks more like heap allocation rather than DOM memory management issue, which is a bit confusing. Take form it your own conclusions ;)

@pkukielka pkukielka changed the title Pkukielka/web memory leak testing 2 Cody Web Memory Leak Investigation Summary Jul 11, 2025
@pkukielka pkukielka changed the title Cody Web Memory Leak Investigation Summary Memory Leak Investigation Summary - Cody Web Jul 11, 2025
@pkukielka pkukielka changed the title Memory Leak Investigation Summary - Cody Web Memory Leak Investigation Summary Jul 11, 2025
Copy link

‼️ Hey @sourcegraph/cody-security, please review this PR carefully as it introduces the usage of an unsafe_ function or abuses PromptString.

@pkukielka pkukielka changed the title Memory Leak Investigation Summary Do Not Merge: Memory Leak Investigation Summary Jul 12, 2025
@vovakulikov
Copy link
Contributor

After some investigation, I think I know what is happening here.

Indeed, @pkukielka is right about React reconciliation struggles with our code blocks, but I guess this is only a symptom of our mismemoization.

On every message update from LLM backend, React creates a completely new DOM element for the whole code block in the message. You can imagine how many elements will be created and removed during one question/response session (it is a big number that browser can free up memory space fast enough).

The more details

  • Here is where our highlighters live https://sourcegraph.sourcegraph.com/github.com/sourcegraph/cody@pkukielka/web-memory-leak-testing-2/-/blob/vscode/webviews/components/MarkdownFromCody.tsx?L164
  • You see that this works via components props (plag and play API of Markdown component).
  • In the MarkdownFromCody component these markdownComponents wrapped with memo and the code function is a pure function so it should be no problems here. (The CustomHJSHighlighter will be the same between different render calls)
  • The components objects is created a level higher here and it's non-memoised, so the new instance of components will be passed to MarkdownFromCody and will trigger its component memoisation re-calculation, which will create a new function that creates CustomHJSHighlighter, which will lead to the new element creation.

What I have tried already

  • The problem is within one message text, so virtualisation for chat messages doesn't solve the original problem (however, it can help with other perf problems in long chats, which seems to be the case in our customers; they start to have long chats, so virtualisation won't hurt, I made first run on this here, it works pretty well with dynamic size measures Add virtualization to message sections #8182)

  • Adding memoisation to the component didn't help, at least on this branch, so I guess we misrender something in the components tree, Needs further investigation in how markdown component works with components API

cc @pkukielka @fkling

pkukielka added a commit that referenced this pull request Jul 22, 2025
This PR is not fixing a memory leak in the DOM highlighter code but
significantly reduces amount of re-computation which is done. With
mocked LLM response of length of 7528 characters I get:

**Before my changes:**

Execution time: 1 min 25 s ec
Peak memory used: over 2GB
Stable memory after final GC: 833MB 

**After my changes:**

Execution time: **45 sec**
Peak memory used: over 1GB
Stable memory after final GC: 719MB 

Sop while memory usage drop was not significant there is very visible
speedup in the execution speed.
Also memory usage swings were subjectively less significant.

## Test plan

I do not have good testing procedure on `main`, but I di testing using
my testing PR, with some custom logging:
#8167
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants