Twitter/X Demo Post
https://x.com/ShivamKhatri_/status/1965906282675843116
https://x.com/ShivamKhatri_/status/1966023163059310781
GitHub Repository
https://github.com/shivamkhatri/browser-use-test-recorder
Project Name
FlowForge - A hybrid recorder to supercharge browser-use for QA
demo.mp4
What does it do?
Problem Statement
Browser-use is a promising framework for AI-driven browser automation, but from my extensive use of it for QA automation I've observed several challenges:
- Instruction following issues: LLMs like GPT-4.1 often skip essential actions in multi-step tests.
- Model tradeoffs: Claude 4 Sonnet handles instruction following better, but it’s slow and expensive (~$2.50 for a 25-step test).
- Wrong element interactions: When browser-use acts on the wrong element, subsequent steps keep executing instead of failing early, which hides the real point of failure.
- Custom UI elements: Many enterprise apps use in-house icons unfamiliar to LLMs. Agents misidentify them, especially when elements lack useful attributes.
- No relative selectors: Unlike Cypress/Selenium, identifying elements relative to nearby elements is hard with browser-use.
- Debugging difficulty: Without step-by-step execution during test creation, users write long tasks (20–30 steps) only to discover a failure buried deep inside, wasting significant debugging time.
My Idea
I propose a Flow Recorder that combines manual recording (like workflow-use) with browser-use agent interactions, solving the above pain points.
Users can record manual interactions and agent-driven steps side by side.
The recorded steps export as JSON, which can:
- Feed into Cursor/VSCode for generating robust test automation code (Cypress, Selenium, Playwright).
- Be reused as initial/final/manual actions for browser-use.
Together, this bridges the gap between human precision and AI-driven automation.
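To make the export concrete, here is a minimal sketch of what a recorded-step JSON and a code-generation pass could look like. The schema (field names like `source`, `action`, `selector`) and the step data are illustrative assumptions, not part of browser-use or workflow-use; the generator emits Playwright-style Python for manual steps and TODO comments for agent steps, since agent steps have no deterministic selector.

```python
import json

# Hypothetical recorded-step schema -- field names are illustrative,
# not an actual browser-use / workflow-use format.
steps = [
    {"source": "manual", "action": "click", "selector": "#login-btn"},
    {"source": "manual", "action": "type",
     "selector": "input[name='email']", "value": "qa@example.com"},
    {"source": "agent", "action": "task",
     "instruction": "Dismiss the cookie banner if present"},
]

def to_playwright(step):
    """Translate one recorded step into a Playwright (Python) line.

    Agent-driven steps are emitted as TODO comments for the engineer
    (or an LLM in Cursor/VSCode) to turn into deterministic code."""
    if step["source"] == "agent":
        return f"# TODO (agent step): {step['instruction']}"
    if step["action"] == "click":
        return f"page.click({step['selector']!r})"
    if step["action"] == "type":
        return f"page.fill({step['selector']!r}, {step['value']!r})"
    raise ValueError(f"unknown action: {step['action']}")

exported = json.dumps(steps, indent=2)   # what the recorder would export
script = "\n".join(to_playwright(s) for s in steps)
print(script)
```

The same JSON could equally target Cypress or Selenium; the point is that deterministic manual steps become exact selectors in code, while agent steps stay flagged for review.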
The long-term vision is to integrate this tool into workflow-use and QA-use, delivering a best-in-class test authoring platform that enables high-quality no-code and low-code automation.
Canvas prototyping: https://g.co/gemini/share/8859e4072909


Why This Matters
- Practicality: QA engineers are already accustomed to manual recorders (LambdaTest, KaneAI, etc.), which validates demand for this approach.
- Flexibility: Manual + Agent interaction recording reduces reliance on LLM correctness alone.
- Debuggability: Step-by-step creation and export make tests easier to validate and maintain.
- Adoption: Enterprises can gradually adopt browser-use by combining familiar workflows with AI assistance.