Commit dc8b6b3

final touches for v2 (#622)
* v2 readme * minor * line break * getting started * gif on top * rm hr * v2 readme * minor * line break * getting started * gif on top * rm hr * gif + cleanup * revert example * rm zod * gif first * gif first * copy * copy * copy * height * pls * prod ready * features * headings * copy * unpredictable in prod * grammar * spacing * final? * nl vs code * cua * cua * change slack url * new gif * v2 readme * minor * line break * getting started * gif on top * rm hr * revert example * rm zod * gif first * gif first * copy * copy * copy * height * pls * prod ready * features * headings * copy * unpredictable in prod * grammar * spacing * final? * nl vs code * cua * cua * change slack url * new gif * config file * cursorrules + config * changelog
1 parent 2a27e1c commit dc8b6b3

7 files changed: 229 additions and 58 deletions

.changeset/wise-worlds-pull.md

Lines changed: 21 additions & 1 deletion
@@ -2,4 +2,24 @@
 "@browserbasehq/stagehand": major
 ---

-temporary placeholder
+Announcing **Stagehand 2.0**! 🎉
+
+We're thrilled to announce the release of Stagehand 2.0, bringing significant improvements to make browser automation more powerful, faster, and easier to use than ever before.
+
+### 🚀 New Features
+
+- **Introducing `stagehand.agent`**: A powerful new way to integrate SOTA Computer use models or Browserbase's [Open Operator](https://operator.browserbase.com) into Stagehand with one line of code! Perfect for multi-step workflows and complex interactions. [Learn more](https://docs.stagehand.dev/concepts/agent)
+- **Lightning-fast `act` and `extract`**: Major performance improvements to make your automations run significantly faster.
+- **Enhanced Logging**: Better visibility into what's happening during automation with improved logging and debugging capabilities.
+- **Comprehensive Documentation**: A completely revamped documentation site with better examples, guides, and best practices.
+- **Improved Error Handling**: More descriptive errors and better error recovery to help you debug issues faster.
+
+### 🛠️ Developer Experience
+
+- **Better TypeScript Support**: Enhanced type definitions and better IDE integration
+- **Better Error Messages**: Clearer, more actionable error messages to help you debug faster
+- **Improved Caching**: More reliable action caching for better performance
+
+We're excited to see what you build with Stagehand 2.0! For questions or support, join our [Slack community](https://stagehand.dev/slack).
+
+For more details, check out our [documentation](https://docs.stagehand.dev).

.cursorrules

Lines changed: 140 additions & 0 deletions
@@ -0,0 +1,140 @@
+# Stagehand Project
+
+This is a project that uses Stagehand, which amplifies Playwright with `act`, `extract`, and `observe` added to the Page class.
+
+`Stagehand` is a class that provides config, a `StagehandPage` object via `stagehand.page`, and a `StagehandContext` object via `stagehand.context`.
+
+`Page` is a class that extends the Playwright `Page` class and adds `act`, `extract`, and `observe` methods.
+`Context` is a class that extends the Playwright `BrowserContext` class.
+
+Use the following rules to write code for this project.
+
+- To take an action on the page like "click the sign in button", use Stagehand `act` like this:
+
+```typescript
+await page.act("Click the sign in button");
+```
+
+- To plan an instruction before taking an action, use Stagehand `observe` to get the action to execute.
+
+```typescript
+const [action] = await page.observe("Click the sign in button");
+```
+
+- The result of `observe` is an array of `ObserveResult` objects that can directly be used as params for `act` like this:
+
+```typescript
+const [action] = await page.observe("Click the sign in button");
+await page.act(action);
+```
+
+- When writing code that needs to extract data from the page, use Stagehand `extract`. Explicitly pass the following params by default:
+
+```typescript
+const { someValue } = await page.extract({
+  instruction: "the instruction to execute",
+  schema: z.object({
+    someValue: z.string(),
+  }), // The schema to extract
+});
+```
+
+## Initialize
+
+```typescript
+import { Stagehand } from "@browserbasehq/stagehand";
+import StagehandConfig from "./stagehand.config";
+
+const stagehand = new Stagehand(StagehandConfig);
+await stagehand.init();
+
+const page = stagehand.page; // Playwright Page with act, extract, and observe methods
+const context = stagehand.context; // Playwright BrowserContext
+```
+
+## Act
+
+You can cache the results of `observe` and use them as params for `act` like this:
+
+```typescript
+const instruction = "Click the sign in button";
+const cachedAction = await getCache(instruction);
+
+if (cachedAction) {
+  await page.act(cachedAction);
+} else {
+  try {
+    const results = await page.observe(instruction);
+    await setCache(instruction, results[0]); // Cache the single action that `act` will replay
+    await page.act(results[0]);
+  } catch (error) {
+    await page.act(instruction); // If observing or replaying fails, fall back to executing the instruction directly
+  }
+}
+```
+
+Be sure to cache the results of `observe` and use them as params for `act` to avoid unexpected DOM changes. Using `act` without caching will result in more unpredictable behavior.
+
+Act `action` should be as atomic and specific as possible, e.g. "Click the sign in button" or "Type 'hello' into the search input".
+AVOID actions that are more than one step, e.g. "Order me pizza" or "Type in the search bar and hit enter".
+
+## Extract
+
+If you are writing code that needs to extract data from the page, use Stagehand `extract`.
+
+```typescript
+const signInButtonText = await page.extract("extract the sign in button text");
+```
+
+You can also pass in params like an output schema in Zod, and a flag to use text extraction:
+
+```typescript
+const data = await page.extract({
+  instruction: "extract the sign in button text",
+  schema: z.object({
+    text: z.string(),
+  }),
+});
+```
+
+`schema` is a Zod schema that describes the data you want to extract. To extract an array, make sure to pass in a single object that contains the array, as follows:
+
+```typescript
+const data = await page.extract({
+  instruction: "extract the text inside all buttons",
+  schema: z.object({
+    text: z.array(z.string()),
+  }),
+  useTextExtract: true, // Set true for larger-scale extractions (multiple paragraphs), or set false for small extractions (name, birthday, etc)
+});
+```
+
+## Agent
+
+Use the `agent` method to autonomously execute larger tasks like "Get the stock price of NVDA"
+
+```typescript
+// Navigate to a website
+await stagehand.page.goto("https://www.google.com");
+
+const agent = stagehand.agent({
+  // You can use either OpenAI or Anthropic
+  provider: "openai",
+  // The model to use (claude-3-7-sonnet-20250219 or claude-3-5-sonnet-20240620 for Anthropic)
+  model: "computer-use-preview",
+
+  // Customize the system prompt
+  instructions: `You are a helpful assistant that can use a web browser.
+  Do not ask follow up questions, the user will trust your judgement.`,
+
+  // Customize the API key
+  options: {
+    apiKey: process.env.OPENAI_API_KEY,
+  },
+});
+
+// Execute the agent
+await agent.execute(
+  "Apply for a library card at the San Francisco Public Library"
+);
+```
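
Note on the Act example in `.cursorrules`: it calls `getCache` and `setCache` without defining them. Below is a minimal sketch of what such helpers could look like, assuming an in-memory `Map` keyed by the instruction text and one cached action per instruction. The `CachedAction` shape and the `ActionCache` class are hypothetical stand-ins for illustration, not Stagehand types or APIs.

```typescript
// Hypothetical shape of a cached observe result; a stand-in, not Stagehand's own type.
interface CachedAction {
  selector: string;
  description: string;
  method?: string;
  arguments?: string[];
}

// In-memory cache keyed by the natural-language instruction (illustrative only).
class ActionCache {
  private readonly entries = new Map<string, CachedAction>();

  get(instruction: string): CachedAction | undefined {
    return this.entries.get(instruction);
  }

  set(instruction: string, action: CachedAction): void {
    this.entries.set(instruction, action);
  }
}

const actionCache = new ActionCache();

// Async wrappers matching the call style of the Act example, so a file- or
// database-backed store could replace the Map without changing callers.
async function getCache(instruction: string): Promise<CachedAction | undefined> {
  return actionCache.get(instruction);
}

async function setCache(instruction: string, action: CachedAction): Promise<void> {
  actionCache.set(instruction, action);
}
```

An in-memory map only lasts for one process, so repeated script runs would still call `observe` once each; persisting the entries would be the natural next step if the goal is to save tokens across runs.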

README.md

Lines changed: 50 additions & 46 deletions
@@ -10,7 +10,7 @@
 </div>

 <p align="center">
-An AI web browsing framework focused on simplicity and extensibility.<br>
+The production-ready framework for AI browser automations.<br>
 <a href="https://docs.stagehand.dev">Read the Docs</a>
 </p>

@@ -33,45 +33,63 @@
 <a href="https://trendshift.io/repositories/12122" target="_blank"><img src="https://trendshift.io/api/badge/repositories/12122" alt="browserbase%2Fstagehand | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
 </p>

----
+## Why Stagehand?

-Stagehand is the easiest way to build browser automations. It is fully compatible with [Playwright](https://playwright.dev/), offering three simple AI APIs (`act`, `extract`, and `observe`) on top of the base Playwright `Page` class that provide the building blocks for web automation via natural language.
+Most existing browser automation tools either require you to write low-level code in a framework like Selenium, Playwright, or Puppeteer, or use high-level agents that can be unpredictable in production. By letting developers choose what to write in code vs. natural language, Stagehand is the natural choice for browser automations in production.

-Here's a sample of what you can do with Stagehand:
+1. **Choose when to write code vs. natural language**: use AI when you want to navigate unfamiliar pages, and use code ([Playwright](https://playwright.dev/)) when you know exactly what you want to do.
+
+2. **Preview and cache actions**: Stagehand lets you preview AI actions before running them, and also helps you easily cache repeatable actions to save time and tokens.
+
+3. **Computer use models with one line of code**: Stagehand lets you integrate SOTA computer use models from OpenAI and Anthropic into the browser with one line of code.
+
+## Example
+
+Here's how to build a sample browser automation with Stagehand:
+
+<div align="center">
+<div style="max-width:300px;">
+<img src="/media/github_demo.gif" alt="See Stagehand in Action">
+</div>
+</div>

 ```typescript
-// Keep your existing Playwright code unchanged
-await page.goto("https://docs.stagehand.dev");
+// Use Playwright functions on the page object
+const page = stagehand.page;
+await page.goto("https://github.com/browserbase");
+
+// Use act() to execute individual actions
+await page.act("click on the stagehand repo");

-// Stagehand AI: Act on the page
-await page.act("click on the 'Quickstart'");
+// Use Computer Use agents for larger actions
+const agent = stagehand.agent({
+  provider: "openai",
+  model: "computer-use-preview",
+});
+await agent.execute("Get to the latest PR");

-// Stagehand AI: Extract data from the page
-const { description } = await page.extract({
-  instruction: "extract the description of the page",
+// Use extract() to read data from the page
+const { author, title } = await page.extract({
+  instruction: "extract the author and title of the PR",
   schema: z.object({
-    description: z.string(),
+    author: z.string().describe("The username of the PR author"),
+    title: z.string().describe("The title of the PR"),
   }),
 });
 ```

-> [!WARNING]
-> We highly recommend using the Node.js runtime environment to run Stagehand scripts, as opposed to newer alternatives like Bun. This is solely due to the fact that [Bun's runtime is not yet fully compatible with Playwright](https://github.com/microsoft/playwright/issues/27139).
-
-## Why?
-**Stagehand adds determinism to otherwise unpredictable agents.**
-
-While there's no limit to what you could instruct Stagehand to do, our primitives allow you to control how much you want to leave to an AI. It works best when your code is a sequence of atomic actions. Instead of writing a single script for a single website, Stagehand allows you to write durable, self-healing, and repeatable web automation workflows that actually work.
-
-> [!NOTE]
-> `Stagehand` is currently available as an early release, and we're actively seeking feedback from the community. Please join our [Slack community](https://stagehand.dev/slack) to stay updated on the latest developments and provide feedback.
-
 ## Documentation

 Visit [docs.stagehand.dev](https://docs.stagehand.dev) to view the full documentation.

 ## Getting Started

+Start with Stagehand with one line of code, or check out our [Quickstart Guide](https://docs.stagehand.dev/get_started/quickstart) for more information:
+
+```bash
+npx create-browser-app
+```
+
 <div align="center">
 <a href="https://www.loom.com/share/f5107f86d8c94fa0a8b4b1e89740f7a7">
 <p>Watch Anirudh demo create-browser-app to create a Stagehand project!</p>
@@ -81,23 +99,6 @@ Visit [docs.stagehand.dev](https://docs.stagehand.dev) to view the full document
 </a>
 </div>

-### Quickstart
-
-To create a new Stagehand project configured to our default settings, run:
-
-```bash
-npx create-browser-app --example quickstart
-```
-
-Read our [Quickstart Guide](https://docs.stagehand.dev/get_started/quickstart) in the docs for more information.
-
-You can also add Stagehand to an existing Typescript project by running:
-
-```bash
-npm install @browserbasehq/stagehand zod
-npx playwright install # if running locally
-```
-
 ### Build and Run from Source

 ```bash
@@ -129,12 +130,15 @@ For more information, please see our [Contributing Guide](https://docs.stagehand

 This project heavily relies on [Playwright](https://playwright.dev/) as a resilient backbone to automate the web. It also would not be possible without the awesome techniques and discoveries made by [tarsier](https://github.com/reworkd/tarsier), and [fuji-web](https://github.com/normal-computing/fuji-web).

-We'd like to thank the following people for their contributions to Stagehand:
-- [Jeremy Press](https://x.com/jeremypress) wrote the original MVP of Stagehand and continues to be an ally to the project.
-- [Navid Pour](https://github.com/navidpour) is heavily responsible for the current architecture of Stagehand and the `act` API.
-- [Sean McGuire](https://github.com/seanmcguire12) is a major contributor to the project and has been a great help with improving the `extract` API and getting evals to a high level.
-- [Filip Michalsky](https://github.com/filip-michalsky) has been doing a lot of work on building out integrations like [Langchain](https://js.langchain.com/docs/integrations/tools/stagehand/) and [Claude MCP](https://github.com/browserbase/mcp-server-browserbase), generally improving the repository, and unblocking users.
-- [Sameel Arif](https://github.com/sameelarif) is a major contributor to the project, especially around improving the developer experience.
+We'd like to thank the following people for their major contributions to Stagehand:
+- [Paul Klein](https://github.com/pkiv)
+- [Anirudh Kamath](https://github.com/kamath)
+- [Sean McGuire](https://github.com/seanmcguire12)
+- [Miguel Gonzalez](https://github.com/miguelg719)
+- [Sameel Arif](https://github.com/sameelarif)
+- [Filip Michalsky](https://github.com/filip-michalsky)
+- [Jeremy Press](https://x.com/jeremypress)
+- [Navid Pour](https://github.com/navidpour)

 ## License

media/create-browser-app.gif

3.62 MB

media/github_demo.gif

3.93 MB

stagehand.config.ts

Lines changed: 17 additions & 10 deletions
@@ -3,14 +3,24 @@ import dotenv from "dotenv";
 dotenv.config();

 const StagehandConfig: ConstructorParams = {
-  verbose: 1,
+  verbose: 1 /* Verbosity level for logging: 0 = silent, 1 = info, 2 = all */,
+  domSettleTimeoutMs: 30_000 /* Timeout for DOM to settle in milliseconds */,
+
+  // LLM configuration
+  modelName: "gpt-4o" /* Name of the model to use */,
+  modelClientOptions: {
+    apiKey: process.env.OPENAI_API_KEY,
+  } /* Configuration options for the model client */,
+
+  // Browser configuration
   env:
     process.env.BROWSERBASE_API_KEY && process.env.BROWSERBASE_PROJECT_ID
       ? "BROWSERBASE"
       : "LOCAL",
   apiKey: process.env.BROWSERBASE_API_KEY /* API key for authentication */,
   projectId: process.env.BROWSERBASE_PROJECT_ID /* Project identifier */,
-  domSettleTimeoutMs: 30_000 /* Timeout for DOM to settle in milliseconds */,
+  browserbaseSessionID:
+    undefined /* Session ID for resuming Browserbase sessions */,
   browserbaseSessionCreateParams: {
     projectId: process.env.BROWSERBASE_PROJECT_ID!,
     browserSettings: {
@@ -21,15 +31,12 @@ const StagehandConfig: ConstructorParams = {
       },
     },
   },
-  enableCaching: false /* Enable caching functionality */,
-  browserbaseSessionID:
-    undefined /* Session ID for resuming Browserbase sessions */,
-  modelName: "gpt-4o" /* Name of the model to use */,
-  modelClientOptions: {
-    apiKey: process.env.OPENAI_API_KEY,
-  } /* Configuration options for the model client */,
   localBrowserLaunchOptions: {
     headless: false,
-  },
+    viewport: {
+      width: 1024,
+      height: 768,
+    },
+  } /* Configuration options for the local browser */,
 };
 export default StagehandConfig;
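
Note on the `env` field in `stagehand.config.ts`: it selects `"BROWSERBASE"` only when both Browserbase credentials are set, and falls back to `"LOCAL"` otherwise. The same rule can be sketched as a small standalone function. `resolveEnv`, `StagehandEnv`, and `EnvVars` are hypothetical names used for illustration and are not part of Stagehand.

```typescript
// Hypothetical names for illustration; only the two string values come from the config.
type StagehandEnv = "BROWSERBASE" | "LOCAL";
type EnvVars = Record<string, string | undefined>;

// Mirrors the ternary in the config: both variables must be non-empty
// for the Browserbase environment to be chosen.
function resolveEnv(vars: EnvVars): StagehandEnv {
  return vars["BROWSERBASE_API_KEY"] && vars["BROWSERBASE_PROJECT_ID"]
    ? "BROWSERBASE"
    : "LOCAL";
}
```

In a Node.js config file this would be called as `resolveEnv(process.env)`. Because the check relies on truthiness, an empty string counts as unset, which matches the behavior of the inline expression.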

types/stagehandErrors.ts

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ export class StagehandError extends Error {
 export class StagehandDefaultError extends StagehandError {
   constructor() {
     super(
-      `\nHey! We're sorry you ran into an error. \nIf you need help, please open a Github issue or send us a Slack message: https://stagehand-dev.slack\n`,
+      `\nHey! We're sorry you ran into an error. \nIf you need help, please open a Github issue or reach out to us on Slack: https://stagehand.dev/slack\n`,
     );
   }
 }
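
Note on `types/stagehandErrors.ts`: since `StagehandDefaultError` extends `StagehandError`, callers can tell library errors apart from other failures with `instanceof`. The sketch below shows one way that might look. The `StagehandError` body here is an assumed stand-in (the real one is not shown in this diff), and `describeError` is a hypothetical helper.

```typescript
// Stand-in for the base class; the real StagehandError body is not shown in this diff.
class StagehandError extends Error {
  constructor(message: string) {
    super(message);
    this.name = new.target.name;
    // Keeps instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Same message as the updated line in the diff.
class StagehandDefaultError extends StagehandError {
  constructor() {
    super(
      `\nHey! We're sorry you ran into an error. \nIf you need help, please open a Github issue or reach out to us on Slack: https://stagehand.dev/slack\n`,
    );
  }
}

// Hypothetical helper: turn any thrown value into a single loggable line.
function describeError(error: unknown): string {
  if (error instanceof StagehandError) {
    return `[${error.name}] ${error.message.trim()}`;
  }
  return `[unknown] ${String(error)}`;
}
```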
