Description
PO @planger
This feature enables advanced DOM inspection and custom script execution for the AI-powered application testing agent through a hybrid architecture: Puppeteer launches and manages the browser process, while Playwright MCP connects to the same browser instance through its Chrome DevTools Protocol (CDP) endpoint.
Playwright MCP does not support extending its protocol with custom tools and offers no direct DOM access; it only provides static snapshots or screenshots. Using Puppeteer for browser process control and DOM access gives us full flexibility to execute custom logic while still benefiting from Playwright MCP's structured interface for interaction and snapshotting.
Goal
- Launch a browser process using Puppeteer
- Expose the DevTools protocol endpoint via `--remote-debugging-port`
- Connect the Playwright MCP server to the browser instance using `--cdp-endpoint`
- Enable:
  - Full DOM access through Puppeteer (not possible with Playwright MCP)
  - Programmatic control of browser lifecycle
  - Seamless integration with Playwright MCP for vision or snapshot features
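The wiring described in the goals above can be sketched as two small helpers: one builds the Chromium flag that Puppeteer would pass at launch, the other builds a command line for the Playwright MCP server. This is only a minimal sketch under assumptions: the helper names `chromeDebugArgs` and `mcpServerCommand`, the port `9222`, the host `127.0.0.1`, and starting the server through `npx @playwright/mcp@latest` are illustrative choices, not something this issue prescribes.

```typescript
// Hypothetical helpers; names, port and host are illustrative assumptions.

/** Chromium flag that exposes the CDP endpoint on a fixed port. */
function chromeDebugArgs(port: number): string[] {
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`invalid debugging port: ${port}`);
  }
  return [`--remote-debugging-port=${port}`];
}

/** Command line that points Playwright MCP at the already running browser. */
function mcpServerCommand(
  port: number,
  host: string = "127.0.0.1",
): { command: string; args: string[] } {
  return {
    command: "npx",
    args: ["@playwright/mcp@latest", "--cdp-endpoint", `http://${host}:${port}`],
  };
}

// Intended use on the backend (assumes the `puppeteer` package is installed):
//   const browser = await puppeteer.launch({ args: chromeDebugArgs(9222) });
//   const { command, args } = mcpServerCommand(9222);
//   const mcp = child_process.spawn(command, args);
```

Keeping the port in one place means Puppeteer and the MCP server cannot drift apart on which endpoint they use. How the MCP server is actually started from code is still open, so the `spawn` call in the comment is a placeholder.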
Example Use Case
Feature: DOM Validation via Puppeteer and Playwright MCP
  As a developer
  I want the AI agent to access and inspect the live DOM tree
  So that I can validate structure and behavior of my application in detail
  Scenario: Validating token value DOM state
    Given I have a running application
    When I ask in the chat "@AppTester Connect to localhost:8000 and read the innerText of the token value element with selector '.token-value'"
    Then the AI agent should:
      | Start the browser with Puppeteer |
      | Expose the remote debugging port |
      | Connect Playwright MCP to the running browser |
      | Use Puppeteer to query the DOM using selector |
    And the AI should respond with:
      | The innerText of the requested element |
      | DOM snippet if requested |
Hints and Suggested Architecture
- Browser Launching: Puppeteer will be responsible for launching the browser process on the backend and exposing the CDP endpoint (`--remote-debugging-port`).
- Playwright MCP CDP Connection: Connect the MCP instance to the Puppeteer-launched browser via `--cdp-endpoint`. Note: Programmatic MCP launch is TBD.
- Browser Lifecycle Management: Puppeteer manages starting and closing the browser process. Ensure integration into the agent lifecycle to avoid orphaned processes.
- Webpack Compatibility: Avoid importing `playwright-core` directly into frontend or Webpack-bundled environments as it includes SVG and HTML resources that are incompatible with Webpack.
- Backend Integration: All logic must be implemented on the backend. Tools should invoke service classes for modularity and reuse.
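One possible shape for the lifecycle and service-class hints above is a small backend service that owns the browser and makes start and stop idempotent. This is a sketch under assumptions, not a proposed final design: `BrowserService` and `BrowserLike` are hypothetical names, and the launch function is injected so that the real Puppeteer call stays out of the class.

```typescript
// Narrow, hypothetical view of a launched browser.
// Puppeteer's `Browser` offers `close()`, which is all this sketch relies on.
interface BrowserLike {
  close(): Promise<void>;
}

class BrowserService {
  private readonly launch: () => Promise<BrowserLike>;
  private pending: Promise<BrowserLike> | undefined;

  constructor(launch: () => Promise<BrowserLike>) {
    this.launch = launch;
  }

  /** Launches the browser once; concurrent callers share the same instance. */
  start(): Promise<BrowserLike> {
    if (!this.pending) {
      this.pending = this.launch().catch((err) => {
        this.pending = undefined; // allow a retry after a failed launch
        throw err;
      });
    }
    return this.pending;
  }

  /** Closes the browser if one was started; safe to call repeatedly. */
  async stop(): Promise<void> {
    const pending = this.pending;
    this.pending = undefined;
    if (pending) {
      const browser = await pending;
      await browser.close();
    }
  }
}
```

A tool handler would call `start()` before touching the DOM, and the agent's shutdown or dispose hook would call `stop()`, so a browser process is not left behind when the agent goes away. In real use the injected function could wrap `puppeteer.launch(...)`.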
Tool API (Initial Proposal)
queryDom(selector?: string): string
→ returns the DOM sub-tree from the matching selector as HTML
Additional tool APIs can be added to support future requirements.
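A minimal sketch of how `queryDom` might look on the backend follows. It deviates from the proposed signature in two ways that are assumptions of this sketch: the function takes the page as an explicit parameter, and it returns a `Promise<string>` because Puppeteer calls are asynchronous. `PageLike` and `HtmlNode` are hypothetical, narrowed stand-ins for the `content()` and `$eval()` methods of Puppeteer's `Page`. Returning the whole document when no selector is given is also an assumption, since the proposal does not say what an omitted selector means.

```typescript
// Hypothetical, narrowed view of the Puppeteer `Page` methods used below.
interface HtmlNode {
  outerHTML: string;
}

interface PageLike {
  content(): Promise<string>;
  $eval<T>(selector: string, fn: (el: HtmlNode) => T): Promise<T>;
}

/**
 * Returns the HTML of the first element matching `selector`,
 * or the full document HTML when no selector is given.
 */
async function queryDom(page: PageLike, selector?: string): Promise<string> {
  if (selector === undefined || selector.trim() === "") {
    return page.content();
  }
  // The callback runs inside the browser page, so it must not
  // reference variables from the surrounding Node.js scope.
  return page.$eval(selector, (el) => el.outerHTML);
}
```

With Puppeteer, `$eval` rejects when no element matches the selector, so the tool layer would need to decide whether to surface that as an error or as an empty result. A sibling helper returning `innerText` instead of `outerHTML` would cover the token value scenario above.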
Related Issues and Dependencies
- AppTester Agent via Browser Automation #196: AI agent foundation and Playwright MCP integration