AppTester Agent with DOM Access

PO @planger 

### Description

This feature adds advanced DOM inspection and custom script execution to the AI-powered application testing agent through a hybrid architecture: **Puppeteer** launches and manages the browser process, while **Playwright MCP** connects to that browser instance through its Chrome DevTools Protocol (CDP) endpoint.

Playwright MCP cannot be extended with custom tools and offers no direct DOM access; it only provides static snapshots or screenshots. Using Puppeteer for browser process control and DOM access gives us full flexibility to execute custom logic while still benefiting from Playwright MCP's structured interface for interaction and snapshotting.

### Goal

- Launch a browser process using Puppeteer
- Expose the DevTools protocol endpoint via `--remote-debugging-port`
- Connect the Playwright MCP server to the browser instance using `--cdp-endpoint`
- Enable:
  - Full DOM access through Puppeteer (not possible with Playwright MCP)
  - Programmatic control of browser lifecycle
  - Seamless integration with Playwright MCP for vision or snapshot features
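
The wiring described by these goals could look roughly like the sketch below. It only computes the pieces that have to agree with each other: the Chrome argument passed to Puppeteer, the resulting CDP endpoint, and the command line for the Playwright MCP server. The helper name `planHybridLaunch`, the default host, and the `npx @playwright/mcp@latest` invocation are assumptions for illustration; only the two flags come from this issue.

```typescript
// Hypothetical helper: derives everything that depends on the debugging port
// in one place, so Puppeteer and Playwright MCP cannot drift apart.
interface HybridLaunchPlan {
  chromeArgs: string[];   // extra args for the Puppeteer-launched browser
  cdpEndpoint: string;    // endpoint the MCP server should connect to
  mcpCommand: string[];   // argv for starting the Playwright MCP server
}

function planHybridLaunch(debugPort: number, host: string = "127.0.0.1"): HybridLaunchPlan {
  if (!Number.isInteger(debugPort) || debugPort < 1 || debugPort > 65535) {
    throw new RangeError(`invalid debugging port: ${debugPort}`);
  }
  const cdpEndpoint = `http://${host}:${debugPort}`;
  return {
    chromeArgs: [`--remote-debugging-port=${debugPort}`],
    cdpEndpoint,
    // Assumption: the MCP server is started through npx; adjust to however
    // the server ends up being launched (programmatic launch is still TBD).
    mcpCommand: ["npx", "@playwright/mcp@latest", "--cdp-endpoint", cdpEndpoint],
  };
}

// Intended use on the backend (not executed here, requires puppeteer):
//   const plan = planHybridLaunch(9222);
//   const browser = await puppeteer.launch({ args: plan.chromeArgs });
//   spawn(plan.mcpCommand[0], plan.mcpCommand.slice(1));
```

Keeping the port in a single plan object is one way to guarantee that the endpoint handed to `--cdp-endpoint` always matches the port exposed by `--remote-debugging-port`.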

### Example Use Case

```gherkin
Feature: DOM Validation via Puppeteer and Playwright MCP
  As a developer
  I want the AI agent to access and inspect the live DOM tree
  So that I can validate structure and behavior of my application in detail

  Scenario: Validating token value DOM state
    Given I have a running application
    When I ask in the chat "@AppTester Connect to localhost:8000 and read the innerText of the token value element with selector '.token-value'"
    Then the AI agent should:
      | Start the browser with Puppeteer                |
      | Expose the remote debugging port                |
      | Connect Playwright MCP to the running browser   |
      | Use Puppeteer to query the DOM using selector   |
    
    And the AI should respond with:
      | The innerText of the requested element          |
      | DOM snippet if requested                        |
```

### Hints and Suggested Architecture

* **Browser Launching**: Puppeteer will be responsible for launching the browser process on the backend and exposing the CDP endpoint (`--remote-debugging-port`).
* **Playwright MCP CDP Connection**: Connect the MCP instance to the Puppeteer-launched browser via `--cdp-endpoint`. Note: Programmatic MCP launch is TBD.
* **Browser Lifecycle Management**: Puppeteer manages starting and closing the browser process. Ensure integration into the agent lifecycle to avoid orphaned processes.
* **Webpack Compatibility**: Avoid importing `playwright-core` directly into frontend or Webpack-bundled environments, as it includes SVG and HTML resources that are incompatible with Webpack.
* **Backend Integration**: All logic must be implemented on the backend. Tools should invoke service classes for modularity and reuse.
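
For the lifecycle hint, a backend service class along the following lines might help avoid orphaned browser processes. This is a minimal sketch, not a proposed final design: `BrowserLifecycleService`, `ClosableBrowser`, and the injected launcher are hypothetical names, and the service only relies on the browser object having an async `close()` method, which Puppeteer's `Browser` provides.

```typescript
// Minimal shape the service needs from a browser handle.
interface ClosableBrowser {
  close(): Promise<void>;
}

// Hypothetical service: launches lazily, shares one browser between
// concurrent callers, and closes it exactly once on dispose.
class BrowserLifecycleService<B extends ClosableBrowser> {
  private launch: () => Promise<B>;
  private browser: B | undefined;
  private starting: Promise<B> | undefined;

  constructor(launch: () => Promise<B>) {
    this.launch = launch;
  }

  async acquire(): Promise<B> {
    if (this.browser) {
      return this.browser;
    }
    let pending = this.starting;
    if (!pending) {
      // Remember the in-flight launch so parallel tool calls reuse it.
      pending = this.launch()
        .then((b) => {
          this.browser = b;
          return b;
        })
        .finally(() => {
          this.starting = undefined;
        });
      this.starting = pending;
    }
    return pending;
  }

  async dispose(): Promise<void> {
    // Wait for a launch that is still in flight so it cannot leak.
    const pending = this.starting;
    if (pending) {
      await pending.catch(() => undefined);
    }
    const b = this.browser;
    this.browser = undefined;
    if (b) {
      await b.close();
    }
  }
}
```

The launcher would typically be something like `() => puppeteer.launch({ args: [...] })`, and `dispose()` would be called from whatever hook ends the agent session. Injecting the launcher keeps the class free of a direct Puppeteer import, so tools can share it and tests can substitute a fake browser.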

### Tool API (Initial Proposal)

* `queryDom(selector?: string): string`
  → returns the DOM subtree rooted at the element matching the selector, serialized as HTML

> Additional tool APIs can be added to support future requirements.
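
A possible shape for the tool's backend logic is sketched below. Several points are assumptions rather than part of the proposal: the `DomSource` abstraction is hypothetical, an omitted selector is interpreted as "return the whole document", and the result is a `Promise<string>` instead of `string` because DOM access through Puppeteer is asynchronous.

```typescript
// Hypothetical abstraction over the page, so the tool logic does not depend
// on Puppeteer types. A Puppeteer-backed adapter could implement fullHtml()
// with page.content() and outerHtml() with page.$(selector) followed by
// handle.evaluate(el => el.outerHTML).
interface DomSource {
  fullHtml(): Promise<string>;
  outerHtml(selector: string): Promise<string | null>; // null when nothing matches
}

async function queryDom(source: DomSource, selector?: string): Promise<string> {
  const trimmed = selector?.trim();
  if (!trimmed) {
    // Assumption: no selector means the whole document is requested.
    return source.fullHtml();
  }
  const html = await source.outerHtml(trimmed);
  if (html === null) {
    // Surface a clear message the agent can relay in the chat.
    throw new Error(`No element matches selector "${trimmed}"`);
  }
  return html;
}
```

For the use case above, `queryDom(source, '.token-value')` would return the HTML of the token value element, from which the agent can report the text content.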

### Related Issues and Dependencies

* #196: AI agent foundation and Playwright MCP integration

