This repository provides an out-of-the-box tool for utilizing OTA's Browser Agent Model (BAM), with enhanced features built on top of existing frameworks. It is intended as a supplementary repo to the model, enabling seamless interaction with web environments through a browser-based agent system.
This repo is forked from:
- Browser-Use (last commit Mar 16 2025) – provides the core browser action framework.
- WebVoyager – contributes the concurrency design and result-saving mechanism.
The Browser Action Model (BAM) is a lightweight, non-generative model designed by OTA Technology Inc. for intelligent browser-based automation. This repository makes it easy to plug BAM into a fully functional browser action loop with minimal setup.
conda create -n BAM python=3.12
conda activate BAM
Make sure you have more than 20 GB of dedicated GPU memory.
Please refer to OTA-v1 for detailed model information.
ollama pull hf.co/OTA-AI/OTA-v1
Set up your virtual environment using pip:
pip install -r requirements.txt
To create your own tasks, follow the format used in the test files under the testcases/ directory. For example, a task in OTA_testdataset_mini.jsonl looks like this:
{"web_name": "Allrecipes", "id": "Allrecipes--4", "ques": "Find a recipe for Baked Salmon that takes less than 30 minutes to prepare and has at least a 4 star rating based on user reviews.", "web": "https://www.allrecipes.com/"}
- web_name: the name of the website to visit in this task
- id: a unique ID for the task
- ques: what you want browser-use to do
- web: link to the website
Please refer to WebVoyager for more information on the task format.
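As a minimal sketch of preparing a custom task file in the format described above, the snippet below checks each task for the four fields and serializes the tasks as one JSON object per line. The helper names (`validate_task`, `dump_tasks`), the URL-scheme check, and the output file name are illustrative assumptions and are not part of this repository.

```python
import json

# The four fields shown in the example task above.
REQUIRED_FIELDS = ("web_name", "id", "ques", "web")


def validate_task(task: dict) -> None:
    """Raise ValueError if a task lacks a required field (hypothetical helper)."""
    missing = [field for field in REQUIRED_FIELDS if not task.get(field)]
    if missing:
        raise ValueError(f"task is missing fields: {missing}")
    # Assumption: "web" should be an absolute http(s) URL, as in the example.
    if not task["web"].startswith(("http://", "https://")):
        raise ValueError(f"'web' is not an http(s) URL: {task['web']}")


def dump_tasks(tasks: list[dict]) -> str:
    """Serialize tasks as JSON Lines, rejecting duplicate ids (hypothetical helper)."""
    seen_ids = set()
    lines = []
    for task in tasks:
        validate_task(task)
        if task["id"] in seen_ids:
            raise ValueError(f"duplicate task id: {task['id']}")
        seen_ids.add(task["id"])
        lines.append(json.dumps(task, ensure_ascii=False))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    tasks = [
        {
            "web_name": "Allrecipes",
            "id": "Allrecipes--4",
            "ques": "Find a recipe for Baked Salmon that takes less than 30 "
                    "minutes to prepare and has at least a 4 star rating "
                    "based on user reviews.",
            "web": "https://www.allrecipes.com/",
        }
    ]
    # "my_tasks.jsonl" is a placeholder file name; adjust the path as needed.
    with open("my_tasks.jsonl", "w", encoding="utf-8") as f:
        f.write(dump_tasks(tasks))
```

The resulting file can then be passed to run_tasks.py through the --task_jsonl_path argument.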
Run the following command to start the task:
python run_tasks.py --model-provider ollama --max-concurrent 1 --task_jsonl_path testcases/OTA_testdataset_mini.jsonl
We have extended and improved the browser-use framework with the following key features:
- Similarity-Based Element Selection
  We integrate similarity search into the web page content symbol space to select only the top-K relevant interactive elements. These are chosen based on their relevance to the agent's next sub-goal, improving both efficiency and model performance.
- Action History Limiting
  To manage token usage and avoid overwhelming the LLM, we limit the number of previous actions included in the prompt. This helps maintain a concise and effective context for decision-making.
This project inherits licensing terms from its upstream forks. Refer to each respective repository for license details.
Maintained by OTA Technologies Inc.