
AppTester Agent via Browser Automation #196

Open
planger opened this issue May 1, 2025 · 0 comments
planger commented May 1, 2025

Description

This feature enhances the AI-powered application testing agent by enabling it to autonomously build, launch, interact with, and validate scenarios in a running web application using natural language commands provided through the chat.

In particular, we want to develop an AI agent, named @AppTester, which handles the complete validation flow, from building and launching the application to inspecting it for a user-specified test scenario through the Playwright MCP server.

In this first iteration, the user needs to specify the scenario to be tested explicitly, e.g. "Open the AI Configuration View, navigate to the Token Usage tab, and click reset", or it must be available in the context (e.g. via #191). If the user or context doesn't specify the scenario in enough detail to run it, the agent should tell the user what's missing.

The AI agent will support the following:

  • Build the application via tasks (see PR #15504 for more details)
  • Launch the application using launch configurations (to be developed in the context of this task)
  • Connect to a running application using Playwright MCP
  • Perform UI interactions (Playwright MCP should already support this)
    • Navigate through the UI
    • Invoke UI elements (clicking buttons, entering text)
    • Inspect styling and layout
    • Execute sequences of interactions to evaluate behavior
  • Shut down the application after testing (to be developed in the context of this task)
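
To make the scope above concrete, the capabilities could be surfaced to the agent as a flat list of tool descriptors. The following TypeScript sketch is purely illustrative: the `ToolDescriptor` shape and all tool names are assumptions for this issue, not existing Theia or Playwright MCP APIs.

```typescript
// Illustrative sketch only: ToolDescriptor and the tool names below are
// assumptions, not existing Theia or Playwright MCP APIs.
interface ToolDescriptor {
  name: string;
  description: string;
  parameters: Record<string, string>; // minimal stand-in for a JSON schema
}

const appTesterTools: ToolDescriptor[] = [
  { name: 'listTasks', description: 'List the available build tasks', parameters: {} },
  { name: 'runTask', description: 'Run a build task by label', parameters: { label: 'string' } },
  { name: 'launchApp', description: 'Start the app via a launch configuration', parameters: { config: 'string' } },
  { name: 'navigate', description: 'Navigate the controlled browser to a URL', parameters: { url: 'string' } },
  { name: 'click', description: 'Click a UI element', parameters: { selector: 'string' } },
  { name: 'stopApp', description: 'Shut down the launched application', parameters: {} },
];

// The agent framework would pick from this list when resolving a user request.
console.log(appTesterTools.map(t => t.name).join(', '));
// → listTasks, runTask, launchApp, navigate, click, stopApp
```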

This will enable developers to quickly validate UI changes and interaction behaviors without manual testing, and can help other AI coding agents iterate on UI improvements with real-time validation.

Example Use Case

Feature: End-to-End Testing with Build and Launch
  As a developer
  I want the AI to handle building and launching my application
  So that I can validate changes without manual setup

  Scenario: Testing after code changes
    Given I have made changes to application code
    When I ask in the chat "@AppTester Can you check if the reset button in the Token Usage tab of the AI Configuration view now works?"
    Then the AI agent should:
      | Identify the appropriate build task             |
      | Run the build process                           |
      | Launch the application                          |
      | Navigate to the view and test the reset button  |
    
    And the AI should respond with:
      | Confirmation of successful build     |
      | Analysis of the tested functionality |
      | Details of any issues encountered    |
      
Feature: Behavior Check After UI Interaction
  As a developer
  I want the AI to perform UI interactions and verify behavior
  So that I can confirm functionality without manual testing

  Scenario: Validating button functionality
    Given I have a running application
    When I ask in the chat "@AppTester Connect to localhost:8000 and verify that if I open the AI configuration view, go to Token Usage and click the 'Reset' button, that the values become 0."
    Then the AI agent should:
      | Connect to the application              |
      | Find and open the AI configuration view |
      | Navigate to the Token Usage tab         |
      | Click the Reset button                  |
      | Check the resulting state               |
    
    And the AI should respond with:
      | Confirmation of the interaction outcome |
      | Details about the changed state         |

Hints and Suggested Architecture

The implementation should follow an extensible design with the following components:

  • Agent Implementation: All functionality should be encapsulated in a new agent that can be addressed in the chat, e.g., @AppTester. The functionality below should be implemented as tool calls available to this agent (build tasks, launch, MCP Playwright, stop earlier launch).

  • Playwright MCP Server Integration: Install and register the Playwright MCP server and add its tool functions to the agent's system prompt.
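
One simple way to add the server's tool functions to the system prompt is to render their names and descriptions as a bulleted section. A minimal sketch, assuming a simplified `McpTool` shape (the actual MCP tool metadata is richer, and the tool names shown are hypothetical placeholders):

```typescript
// Illustrative sketch: rendering MCP tool metadata into a system prompt section.
// McpTool is a simplification of the actual MCP tool schema.
interface McpTool {
  name: string;
  description: string;
}

function renderToolSection(tools: McpTool[]): string {
  const lines = tools.map(t => `- ${t.name}: ${t.description}`);
  return ['You can control the running application with these browser tools:', ...lines].join('\n');
}

// Hypothetical subset of tools a Playwright MCP server might advertise.
const playwrightTools: McpTool[] = [
  { name: 'browser_navigate', description: 'Open a URL in the controlled browser' },
  { name: 'browser_click', description: 'Click the element matching a selector' },
];

const promptSection = renderToolSection(playwrightTools);
console.log(promptSection);
```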

  • Build Task Integration: Add the existing tool functions for listing and running tasks (PR #15504) to the agent's system prompt so that the agent can manage the application's build process.

  • Launch Configuration Support: Implement a dedicated set of tool functions, similar to those in PR #15504, for launching the application under development. Add these tool functions to the @AppTester agent to enable it to start the application in the appropriate context.

  • Application Lifecycle Management: Consider storing state about the application lifecycle in the chat model so it can be retrieved in subsequent tool calls. This would enable stopping the application after the test via another tool function to be created in the context of this story. The chat model is available as a parameter of the tool call handler.
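
The steps above can be sketched as follows. Everything here is hypothetical: `ChatModelState` stands in for whatever chat-model storage is available to tool handlers, and the process handling is stubbed out.

```typescript
// Hypothetical sketch: keeping launch state in a chat-session-scoped store so a
// later stopApp tool call can find the process started by launchApp.
// ChatModelState is illustrative, not an existing Theia API.
class ChatModelState {
  private data = new Map<string, unknown>();
  set(key: string, value: unknown): void { this.data.set(key, value); }
  get<T>(key: string): T | undefined { return this.data.get(key) as T | undefined; }
  delete(key: string): void { this.data.delete(key); }
}

interface LaunchHandle { pid: number; configName: string; }

function launchApp(state: ChatModelState, configName: string): LaunchHandle {
  // A real implementation would spawn the process for the launch configuration.
  const handle: LaunchHandle = { pid: 4242 /* placeholder */, configName };
  state.set('appTester.launch', handle);
  return handle;
}

function stopApp(state: ChatModelState): boolean {
  const handle = state.get<LaunchHandle>('appTester.launch');
  if (!handle) return false; // nothing to stop
  // A real implementation would terminate the process behind handle.pid here.
  state.delete('appTester.launch');
  return true;
}

const state = new ChatModelState();
launchApp(state, 'Launch Browser Backend');
console.log(stopApp(state)); // → true
console.log(stopApp(state)); // → false (already stopped)
```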

Dependencies

  • PR #15504: AI tools for listing and running tasks (merged)
  • Issue #191: Task management system for tracking requirements and user-facing scenarios

Follow-up Stories:

@planger planger self-assigned this May 1, 2025
@planger planger changed the title from "Agent for evaluating E2E scenarios (see use case)" to "AI Agent for Application Interaction and UI Inspection via Browser Automation" May 1, 2025
@planger planger changed the title from "AI Agent for Application Interaction and UI Inspection via Browser Automation" to "AppTester Agent via Browser Automation" May 7, 2025