final touches for v2 (#622)

kamath · web-flow · commit dc8b6b3bd6a1 · 2025-03-31T02:10:47.000-07:00
* v2 readme

* minor

* line break

* getting started

* gif on top

* rm hr

* v2 readme

* minor

* line break

* getting started

* gif on top

* rm hr

* gif + cleanup

* revert example

* rm zod

* gif first

* gif first

* copy

* copy

* copy

* height

* pls

* prod ready

* features

* headings

* copy

* unpredictable in prod

* grammar

* spacing

* final?

* nl vs code

* cua

* cua

* change slack url

* new gif

* v2 readme

* minor

* line break

* getting started

* gif on top

* rm hr

* revert example

* rm zod

* gif first

* gif first

* copy

* copy

* copy

* height

* pls

* prod ready

* features

* headings

* copy

* unpredictable in prod

* grammar

* spacing

* final?

* nl vs code

* cua

* cua

* change slack url

* new gif

* config file

* cursorrules + config

* changelog
diff --git a/.changeset/wise-worlds-pull.md b/.changeset/wise-worlds-pull.md
@@ -2,4 +2,24 @@
 "@browserbasehq/stagehand": major
 ---
 
-temporary placeholder
+Announcing **Stagehand 2.0**! 🎉
+
+We're thrilled to announce the release of Stagehand 2.0, bringing significant improvements to make browser automation more powerful, faster, and easier to use than ever before.
+
+### 🚀 New Features
+
+- **Introducing `stagehand.agent`**: A powerful new way to integrate SOTA computer use models or Browserbase's [Open Operator](https://operator.browserbase.com) into Stagehand with one line of code! Perfect for multi-step workflows and complex interactions. [Learn more](https://docs.stagehand.dev/concepts/agent)
+- **Lightning-fast `act` and `extract`**: Major performance improvements to make your automations run significantly faster.
+- **Enhanced Logging**: Better visibility into what's happening during automation with improved logging and debugging capabilities.
+- **Comprehensive Documentation**: A completely revamped documentation site with better examples, guides, and best practices.
+- **Improved Error Handling**: More descriptive errors and better error recovery to help you debug issues faster.
+
+### 🛠️ Developer Experience
+
+- **Better TypeScript Support**: Enhanced type definitions and better IDE integration
+- **Better Error Messages**: Clearer, more actionable error messages to help you debug faster
+- **Improved Caching**: More reliable action caching for better performance
+
+We're excited to see what you build with Stagehand 2.0! For questions or support, join our [Slack community](https://stagehand.dev/slack).
+
+For more details, check out our [documentation](https://docs.stagehand.dev).
diff --git a/.cursorrules b/.cursorrules
@@ -0,0 +1,140 @@
+# Stagehand Project
+
+This is a project that uses Stagehand, which amplifies Playwright with `act`, `extract`, and `observe` added to the Page class.
+
+`Stagehand` is a class that provides config, a `StagehandPage` object via `stagehand.page`, and a `StagehandContext` object via `stagehand.context`.
+
+`Page` is a class that extends the Playwright `Page` class and adds `act`, `extract`, and `observe` methods.
+`Context` is a class that extends the Playwright `BrowserContext` class.
+
+Use the following rules to write code for this project.
+
+- To take an action on the page like "click the sign in button", use Stagehand `act` like this:
+
+```typescript
+await page.act("Click the sign in button");
+```
+
+- To plan an instruction before taking an action, use Stagehand `observe` to get the action to execute.
+
+```typescript
+const [action] = await page.observe("Click the sign in button");
+```
+
+- The result of `observe` is an array of `ObserveResult` objects that can directly be used as params for `act` like this:
+
+  ```typescript
+  const [action] = await page.observe("Click the sign in button");
+  await page.act(action);
+  ```
+
+- When writing code that needs to extract data from the page, use Stagehand `extract`. Explicitly pass the following params by default:
+
+```typescript
+const { someValue } = await page.extract({
+  instruction: "the instruction to execute",
+  schema: z.object({
+    someValue: z.string(),
+  }), // The schema to extract
+});
+```
+
+## Initialize
+
+```typescript
+import { Stagehand } from "@browserbasehq/stagehand";
+import StagehandConfig from "./stagehand.config";
+
+const stagehand = new Stagehand(StagehandConfig);
+await stagehand.init();
+
+const page = stagehand.page; // Playwright Page with act, extract, and observe methods
+const context = stagehand.context; // Playwright BrowserContext
+```
+
+## Act
+
+You can cache the results of `observe` and use them as params for `act` like this:
+
+```typescript
+const instruction = "Click the sign in button";
+const cachedAction = await getCache(instruction);
+
+if (cachedAction) {
+  await page.act(cachedAction);
+} else {
+  try {
+    const results = await page.observe(instruction);
+    await setCache(instruction, results);
+    await page.act(results[0]);
+  } catch (error) {
+    await page.act(instruction); // If observe or caching fails, fall back to executing the instruction directly
+  }
+}
+```
+
+Be sure to cache the results of `observe` and use them as params for `act` to avoid unexpected DOM changes. Using `act` without caching will result in more unpredictable behavior.
+
+Act `action` should be as atomic and specific as possible, e.g. "Click the sign in button" or "Type 'hello' into the search input".
+AVOID actions that are more than one step, e.g. "Order me pizza" or "Type in the search bar and hit enter".
+
+## Extract
+
+If you are writing code that needs to extract data from the page, use Stagehand `extract`.
+
+```typescript
+const signInButtonText = await page.extract("extract the sign in button text");
+```
+
+You can also pass in params like an output schema in Zod, and a flag to use text extraction:
+
+```typescript
+const data = await page.extract({
+  instruction: "extract the sign in button text",
+  schema: z.object({
+    text: z.string(),
+  }),
+});
+```
+
+`schema` is a Zod schema that describes the data you want to extract. To extract an array, make sure to pass in a single object that contains the array, as follows:
+
+```typescript
+const data = await page.extract({
+  instruction: "extract the text inside all buttons",
+  schema: z.object({
+    text: z.array(z.string()),
+  }),
+  useTextExtract: true, // Set true for larger-scale extractions (multiple paragraphs), or set false for small extractions (name, birthday, etc)
+});
+```
+
+## Agent
+
+Use the `agent` method to autonomously execute larger tasks like "Get the stock price of NVDA".
+
+```typescript
+// Navigate to a website
+await stagehand.page.goto("https://www.google.com");
+
+const agent = stagehand.agent({
+  // You can use either OpenAI or Anthropic
+  provider: "openai",
+  // The model to use (claude-3-7-sonnet-20250219 or claude-3-5-sonnet-20240620 for Anthropic)
+  model: "computer-use-preview",
+
+  // Customize the system prompt
+  instructions: `You are a helpful assistant that can use a web browser.
+	Do not ask follow up questions, the user will trust your judgement.`,
+
+  // Customize the API key
+  options: {
+    apiKey: process.env.OPENAI_API_KEY,
+  },
+});
+
+// Execute the agent
+await agent.execute(
+  "Apply for a library card at the San Francisco Public Library"
+);
+```
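Note on the `## Act` caching example in `.cursorrules` above: it calls `getCache` and `setCache` without defining them, and it stores the whole `observe` result array while later passing the cached value straight to `act`. The following is a minimal sketch of what such helpers might look like, assuming an in-memory store that keeps only the first observed action. The names `CachedAction` and `actionCache` are hypothetical, and `CachedAction` is only an illustrative stand-in for Stagehand's `ObserveResult`, not its actual definition.

```typescript
// Hypothetical stand-in for Stagehand's ObserveResult; the real type may carry more fields.
interface CachedAction {
  selector: string;
  description: string;
}

// Assumption: a process-local Map is enough; swap in a file or database for persistence.
const actionCache = new Map<string, CachedAction>();

// Returns the cached action for an instruction, or undefined on a cache miss.
async function getCache(instruction: string): Promise<CachedAction | undefined> {
  return actionCache.get(instruction);
}

// Stores the first observed action so the cached value can be passed directly to act().
async function setCache(instruction: string, results: CachedAction[]): Promise<void> {
  if (results.length > 0) {
    actionCache.set(instruction, results[0]);
  }
}
```

Storing a single action rather than the array keeps the cache-hit branch (`page.act(cachedAction)`) and the cache-miss branch (`page.act(results[0])`) acting on the same shape of value.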
diff --git a/README.md b/README.md
@@ -10,7 +10,7 @@
 </div>
 
 <p align="center">
-  An AI web browsing framework focused on simplicity and extensibility.<br>
+  The production-ready framework for AI browser automations.<br>
   <a href="https://docs.stagehand.dev">Read the Docs</a>
 </p>
 
@@ -33,45 +33,63 @@
 	<a href="https://trendshift.io/repositories/12122" target="_blank"><img src="https://trendshift.io/api/badge/repositories/12122" alt="browserbase%2Fstagehand | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
 </p>
 
----
+## Why Stagehand?
 
-Stagehand is the easiest way to build browser automations. It is fully compatible with [Playwright](https://playwright.dev/), offering three simple AI APIs (`act`, `extract`, and `observe`) on top of the base Playwright `Page` class that provide the building blocks for web automation via natural language. 
+Most existing browser automation tools either require you to write low-level code in a framework like Selenium, Playwright, or Puppeteer, or use high-level agents that can be unpredictable in production. By letting developers choose what to write in code vs. natural language, Stagehand is the natural choice for browser automations in production.
 
-Here's a sample of what you can do with Stagehand:
+1. **Choose when to write code vs. natural language**: use AI when you want to navigate unfamiliar pages, and use code ([Playwright](https://playwright.dev/)) when you know exactly what you want to do.
+
+2. **Preview and cache actions**: Stagehand lets you preview AI actions before running them, and also helps you easily cache repeatable actions to save time and tokens.
+
+3. **Computer use models with one line of code**: Stagehand lets you integrate SOTA computer use models from OpenAI and Anthropic into the browser with one line of code.
+
+## Example
+
+Here's how to build a sample browser automation with Stagehand:
+
+<div align="center">
+  <div style="max-width:300px;">
+    <img src="/media/github_demo.gif" alt="See Stagehand in Action">
+  </div>
+</div>
 
 ```typescript
-// Keep your existing Playwright code unchanged
-await page.goto("https://docs.stagehand.dev");
+// Use Playwright functions on the page object
+const page = stagehand.page;
+await page.goto("https://github.com/browserbase");
+
+// Use act() to execute individual actions
+await page.act("click on the stagehand repo");
 
-// Stagehand AI: Act on the page
-await page.act("click on the 'Quickstart'");
+// Use Computer Use agents for larger actions
+const agent = stagehand.agent({
+    provider: "openai",
+    model: "computer-use-preview",
+});
+await agent.execute("Get to the latest PR");
 
-// Stagehand AI: Extract data from the page
-const { description } = await page.extract({
-  instruction: "extract the description of the page",
+// Use extract() to read data from the page
+const { author, title } = await page.extract({
+  instruction: "extract the author and title of the PR",
   schema: z.object({
-    description: z.string(),
+    author: z.string().describe("The username of the PR author"),
+    title: z.string().describe("The title of the PR"),
   }),
 });
 ```
 
-> [!WARNING]  
-> We highly recommend using the Node.js runtime environment to run Stagehand scripts, as opposed to newer alternatives like Bun. This is solely due to the fact that [Bun's runtime is not yet fully compatible with Playwright](https://github.com/microsoft/playwright/issues/27139).
-
-## Why?
-**Stagehand adds determinism to otherwise unpredictable agents.**
-
-While there's no limit to what you could instruct Stagehand to do, our primitives allow you to control how much you want to leave to an AI. It works best when your code is a sequence of atomic actions. Instead of writing a single script for a single website, Stagehand allows you to write durable, self-healing, and repeatable web automation workflows that actually work.
-
-> [!NOTE] 
-> `Stagehand` is currently available as an early release, and we're actively seeking feedback from the community. Please join our [Slack community](https://stagehand.dev/slack) to stay updated on the latest developments and provide feedback.
-
 ## Documentation
 
 Visit [docs.stagehand.dev](https://docs.stagehand.dev) to view the full documentation.
 
 ## Getting Started
 
+Get started with Stagehand with one line of code, or check out our [Quickstart Guide](https://docs.stagehand.dev/get_started/quickstart) for more information:
+
+```bash
+npx create-browser-app
+```
+
 <div align="center">
     <a href="https://www.loom.com/share/f5107f86d8c94fa0a8b4b1e89740f7a7">
       <p>Watch Anirudh demo create-browser-app to create a Stagehand project!</p>
@@ -81,23 +99,6 @@ Visit [docs.stagehand.dev](https://docs.stagehand.dev) to view the full document
     </a>
   </div>
 
-### Quickstart
-
-To create a new Stagehand project configured to our default settings, run:
-
-```bash
-npx create-browser-app --example quickstart
-```
-
-Read our [Quickstart Guide](https://docs.stagehand.dev/get_started/quickstart) in the docs for more information.
-
-You can also add Stagehand to an existing Typescript project by running:
-
-```bash
-npm install @browserbasehq/stagehand zod
-npx playwright install # if running locally
-```
-
 ### Build and Run from Source
 
 ```bash
@@ -129,12 +130,15 @@ For more information, please see our [Contributing Guide](https://docs.stagehand
 
 This project heavily relies on [Playwright](https://playwright.dev/) as a resilient backbone to automate the web. It also would not be possible without the awesome techniques and discoveries made by [tarsier](https://github.com/reworkd/tarsier), and [fuji-web](https://github.com/normal-computing/fuji-web).
 
-We'd like to thank the following people for their contributions to Stagehand:
-- [Jeremy Press](https://x.com/jeremypress) wrote the original MVP of Stagehand and continues to be an ally to the project.
-- [Navid Pour](https://github.com/navidpour) is heavily responsible for the current architecture of Stagehand and the `act` API.
-- [Sean McGuire](https://github.com/seanmcguire12) is a major contributor to the project and has been a great help with improving the `extract` API and getting evals to a high level.
-- [Filip Michalsky](https://github.com/filip-michalsky) has been doing a lot of work on building out integrations like [Langchain](https://js.langchain.com/docs/integrations/tools/stagehand/) and [Claude MCP](https://github.com/browserbase/mcp-server-browserbase), generally improving the repository, and unblocking users.
-- [Sameel Arif](https://github.com/sameelarif) is a major contributor to the project, especially around improving the developer experience.
+We'd like to thank the following people for their major contributions to Stagehand:
+- [Paul Klein](https://github.com/pkiv)
+- [Anirudh Kamath](https://github.com/kamath)
+- [Sean McGuire](https://github.com/seanmcguire12)
+- [Miguel Gonzalez](https://github.com/miguelg719)
+- [Sameel Arif](https://github.com/sameelarif)
+- [Filip Michalsky](https://github.com/filip-michalsky)
+- [Jeremy Press](https://x.com/jeremypress)
+- [Navid Pour](https://github.com/navidpour)
 
 ## License
 
diff --git a/media/create-browser-app.gif b/media/create-browser-app.gif
diff --git a/media/github_demo.gif b/media/github_demo.gif
diff --git a/stagehand.config.ts b/stagehand.config.ts
@@ -3,14 +3,24 @@ import dotenv from "dotenv";
 dotenv.config();
 
 const StagehandConfig: ConstructorParams = {
-  verbose: 1,
+  verbose: 1 /* Verbosity level for logging: 0 = silent, 1 = info, 2 = all */,
+  domSettleTimeoutMs: 30_000 /* Timeout for DOM to settle in milliseconds */,
+
+  //   LLM configuration
+  modelName: "gpt-4o" /* Name of the model to use */,
+  modelClientOptions: {
+    apiKey: process.env.OPENAI_API_KEY,
+  } /* Configuration options for the model client */,
+
+  // Browser configuration
   env:
     process.env.BROWSERBASE_API_KEY && process.env.BROWSERBASE_PROJECT_ID
       ? "BROWSERBASE"
       : "LOCAL",
   apiKey: process.env.BROWSERBASE_API_KEY /* API key for authentication */,
   projectId: process.env.BROWSERBASE_PROJECT_ID /* Project identifier */,
-  domSettleTimeoutMs: 30_000 /* Timeout for DOM to settle in milliseconds */,
+  browserbaseSessionID:
+    undefined /* Session ID for resuming Browserbase sessions */,
   browserbaseSessionCreateParams: {
     projectId: process.env.BROWSERBASE_PROJECT_ID!,
     browserSettings: {
@@ -21,15 +31,12 @@ const StagehandConfig: ConstructorParams = {
       },
     },
   },
-  enableCaching: false /* Enable caching functionality */,
-  browserbaseSessionID:
-    undefined /* Session ID for resuming Browserbase sessions */,
-  modelName: "gpt-4o" /* Name of the model to use */,
-  modelClientOptions: {
-    apiKey: process.env.OPENAI_API_KEY,
-  } /* Configuration options for the model client */,
   localBrowserLaunchOptions: {
     headless: false,
-  },
+    viewport: {
+      width: 1024,
+      height: 768,
+    },
+  } /* Configuration options for the local browser */,
 };
 export default StagehandConfig;
diff --git a/types/stagehandErrors.ts b/types/stagehandErrors.ts
@@ -8,7 +8,7 @@ export class StagehandError extends Error {
 export class StagehandDefaultError extends StagehandError {
   constructor() {
     super(
-      `\nHey! We're sorry you ran into an error. \nIf you need help, please open a Github issue or send us a Slack message: https://stagehand-dev.slack\n`,
+      `\nHey! We're sorry you ran into an error. \nIf you need help, please open a Github issue or reach out to us on Slack: https://stagehand.dev/slack\n`,
     );
   }
 }
