docs: adding agent pattern example notebooks (Langgraph) #7510
Merged

Commits (11):
- a126560 check author (PriyanJindal)
- 641095e adding langgraph agent pattern notebooks (PriyanJindal)
- c8e84a1 cleaning style (PriyanJindal)
- a3d8524 using Phoenix 9.0.1 (PriyanJindal)
- 7167b0f using Phoenix 9.0.1 (PriyanJindal)
- 4bd547e synthesiser typo (PriyanJindal)
- 3a10a26 evals for router (PriyanJindal)
- df37956 evals for parallel (PriyanJindal)
- 08f1b8d evals for prompt chaining (PriyanJindal)
- 2a97ae5 adding evals for orchestrator (PriyanJindal)
- 0d73e58 clean style (PriyanJindal)
@@ -0,0 +1,360 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "jXCU375BitSy"
},
"source": [
"<center>\n",
" <p style=\"text-align:center\">\n",
" <img alt=\"phoenix logo\" src=\"https://storage.googleapis.com/arize-phoenix-assets/assets/phoenix-logo-light.svg\" width=\"200\"/>\n",
" <br>\n",
" <a href=\"https://docs.arize.com/phoenix/\">Docs</a>\n",
" |\n",
" <a href=\"https://github.com/Arize-ai/phoenix\">GitHub</a>\n",
" |\n",
" <a href=\"https://join.slack.com/t/arize-ai/shared_invite/zt-1px8dcmlf-fmThhDFD_V_48oU7ALan4Q\">Community</a>\n",
" </p>\n",
"</center>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "bB9zTru-iiQS"
},
"source": [
"# LangGraph: Evaluator–Optimizer Loop\n",
"In this tutorial, we’ll build a code generation feedback loop using LangGraph — where a generator LLM writes code and an evaluator LLM provides structured reviews. This iterative pattern is useful for refining outputs over multiple steps until they meet a defined success criterion.\n",
"\n",
"The workflow consists of:\n",
"\n",
"A **generator LLM** that produces or revises code based on feedback.\n",
"\n",
"An **evaluator LLM** that assigns a grade (pass or fail) and gives feedback if needed.\n",
"\n",
"A **LangGraph state machine** that loops the generator until the evaluator approves the result.\n",
"\n",
"To make this fully observable and production-grade, we’ve instrumented the graph with Phoenix tracing. This enables you to inspect each generation and evaluation step, see what the model produced, and understand why it did (or didn’t) pass."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install langgraph langchain langchain_community \"arize-phoenix==9.0.1\" arize-phoenix-otel openinference-instrumentation-langchain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install langchain_openai"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"from langgraph.graph import END, START, StateGraph"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kE0uzv1ei0Nc"
},
"source": [
"# Configure Phoenix Tracing\n",
"\n",
"Make sure you go to https://app.phoenix.arize.com/ and generate an API key. This will allow you to trace your Langgraph application with Phoenix." | ||
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"PHOENIX_API_KEY = getpass.getpass(\"Phoenix API Key:\")\n",
"os.environ[\"PHOENIX_CLIENT_HEADERS\"] = f\"api_key={PHOENIX_API_KEY}\"\n",
"os.environ[\"PHOENIX_COLLECTOR_ENDPOINT\"] = \"https://app.phoenix.arize.com\""
]
},
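{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you prefer not to use the hosted app, Phoenix can also run locally. The cell below is an optional sketch (assuming the default local port) for pointing the collector at a self-hosted instance instead; skip it if you are using app.phoenix.arize.com."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional alternative: self-host Phoenix instead of using app.phoenix.arize.com.\n",
"# import phoenix as px\n",
"# px.launch_app()  # serves the Phoenix UI on http://localhost:6006 by default\n",
"# os.environ[\"PHOENIX_COLLECTOR_ENDPOINT\"] = \"http://localhost:6006\""
]
},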
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from phoenix.otel import register\n",
"\n",
"tracer_provider = register(project_name=\"Evaluator-Optimizer\", auto_instrument=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uCHt5qQzjQ84"
},
"source": [
"# Evaluator‑Optimizer • Code‑Writing Loop\n",
"---------------------------------------\n",
"Input: problem_spec (natural‑language description of the function/program)\n",
"\n",
"Output: refined, accepted code (Python string)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from typing import Literal, TypedDict\n",
"\n",
"from langchain_core.messages import HumanMessage, SystemMessage\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n", | ||
"from langchain_openai import ChatOpenAI" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "Kfy5KexRjeKF" | ||
}, | ||
"source": [ | ||
"LLMs\n", | ||
"----\n", | ||
"• generator_llm : writes / rewrites the code\n", | ||
"\n", | ||
"• evaluator_llm : grades the code via structured output" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"generator_llm = ChatOpenAI(model=\"gpt-3.5-turbo\", temperature=0.3)\n", | ||
"evaluator_llm = ChatOpenAI(model=\"gpt-3.5-turbo\", temperature=0)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "I3v2xmGCjlSB" | ||
}, | ||
"source": [ | ||
"To enable structured, reliable feedback from the evaluator LLM, we define a Pydantic schema called Review. This schema ensures that all evaluations include both a binary grade (pass or fail) and feedback if the code needs improvement.\n", | ||
"\n", | ||
"By binding this schema to the evaluator LLM, we guarantee consistent output formatting and make it easier to route logic in the graph. This step is essential for closing the feedback loop and driving iterative optimization." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"class Review(BaseModel):\n", | ||
" grade: Literal[\"pass\", \"fail\"] = Field(description=\"Did the code fully solve the problem?\")\n", | ||
" feedback: str = Field(description=\"If grade=='fail', give concrete, actionable feedback.\")\n", | ||
"\n", | ||
"\n", | ||
"evaluator = evaluator_llm.with_structured_output(Review)" | ||
] | ||
}, | ||
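{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an optional sanity check, you can invoke the structured evaluator directly before wiring it into the graph. The toy problem and candidate code below are illustrative only; the call returns a parsed `Review` object, so `grade` and `feedback` are plain attributes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sanity check: the structured evaluator returns a parsed Review object.\n",
"sample_review = evaluator.invoke(\n",
"    \"### Problem\\nReturn the sum of a list of numbers.\\n\\n\"\n",
"    \"### Candidate Code\\ndef total(xs):\\n    return sum(xs)\"\n",
")\n",
"print(sample_review.grade)  # \"pass\" or \"fail\"\n",
"print(sample_review.feedback)"
]
},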
{
"cell_type": "markdown",
"metadata": {
"id": "fdfvi2tTjuTu"
},
"source": [
"# Langgraph Shared State\n", | ||
"Defines the shared state for the evaluator-optimizer loop, tracking the problem description, generated code, feedback, and evaluation grade." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"class State(TypedDict):\n", | ||
" problem_spec: str\n", | ||
" code: str\n", | ||
" feedback: str\n", | ||
" grade: str # pass / fail" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "RHpxqUrwj1PL" | ||
}, | ||
"source": [ | ||
"# Node Functions: Generator & Evaluator\n", | ||
"These nodes power the evaluator–optimizer loop. The code_generator node uses the generator LLM to produce or revise Python code based on the task and prior feedback. The code_evaluator node uses a structured evaluator LLM to simulate a code review — mentally testing the code against the spec and returning a binary grade along with constructive feedback if needed. This feedback is then looped back to the generator until the code passes." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"def code_generator(state: State):\n", | ||
" \"\"\"Write or refine code based on feedback.\"\"\"\n", | ||
" prompt = (\n", | ||
" \"You are an expert Python developer.\\n\"\n", | ||
" \"Write clear, efficient, PEP‑8 compliant code that solves the task below.\\n\"\n", | ||
" \"If feedback is provided, revise the previous code accordingly.\\n\\n\"\n", | ||
" f\"### Task\\n{state['problem_spec']}\\n\\n\"\n", | ||
" )\n", | ||
" if state.get(\"feedback\"):\n", | ||
" prompt += f\"### Previous Reviewer Feedback\\n{state['feedback']}\\n\"\n", | ||
"\n", | ||
" msg = generator_llm.invoke(prompt)\n", | ||
" return {\"code\": msg.content}\n", | ||
"\n", | ||
"\n", | ||
"def code_evaluator(state: State):\n", | ||
" \"\"\"LLM reviews the code solution.\"\"\"\n", | ||
" review = evaluator.invoke(\n", | ||
" [\n", | ||
" SystemMessage(\n", | ||
" content=(\n", | ||
" \"You are a strict code reviewer. \"\n", | ||
" \"Run mental tests / reasoning to decide whether the code meets the spec. \"\n", | ||
" \"If it fails, give concise actionable feedback.\"\n", | ||
" )\n", | ||
" ),\n", | ||
" HumanMessage(\n", | ||
" content=(\n", | ||
" f\"### Problem\\n{state['problem_spec']}\\n\\n\"\n", | ||
" f\"### Candidate Code\\n{state['code']}\"\n", | ||
" )\n", | ||
" ),\n", | ||
" ]\n", | ||
" )\n", | ||
" return {\"grade\": review.grade, \"feedback\": review.feedback}" | ||
] | ||
}, | ||
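{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because each node is a plain function over the shared `State`, you can also exercise one in isolation before assembling the graph. The tiny spec below is illustrative only."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Nodes are plain functions over State, so they can be tested in isolation.\n",
"draft = code_generator(\n",
"    {\"problem_spec\": \"Return the factorial of n.\", \"code\": \"\", \"feedback\": \"\", \"grade\": \"\"}\n",
")\n",
"print(draft[\"code\"][:300])  # first few hundred characters of the draft solution"
]
},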
{
"cell_type": "markdown",
"metadata": {
"id": "azF5t0TYkFAi"
},
"source": [
"# Routing Logic\n",
"To support iterative refinement, we define a simple routing function called route. After the code is evaluated, this function checks the grade returned by the evaluator: if the code passes, the process ends; if it fails, control loops back to the generator for another revision. This logic ensures the LLM can continuously improve its output based on feedback until it meets the quality bar."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def route(state: State):\n",
"    return \"Accept\" if state[\"grade\"] == \"pass\" else \"Revise\""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "AALHddZ5kJLP"
},
"source": [
"# Building the LangGraph\n",
"We now define our LangGraph-based workflow. Each component — the generator and evaluator — is added as a node. Directed edges specify the flow: the graph starts with generation, then moves to evaluation. Conditional edges use the routing logic to either terminate the loop (if passed) or revisit the generator (if failed). This structure enables self-correcting behavior with every iteration traceable via Phoenix."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"builder = StateGraph(State)\n",
"\n",
"builder.add_node(\"code_generator\", code_generator)\n",
"builder.add_node(\"code_evaluator\", code_evaluator)\n",
"\n",
"builder.add_edge(START, \"code_generator\")\n",
"builder.add_edge(\"code_generator\", \"code_evaluator\")\n",
"builder.add_conditional_edges(\n",
"    \"code_evaluator\",\n",
"    route,\n",
"    {\n",
"        \"Accept\": END,\n",
"        \"Revise\": \"code_generator\",\n",
"    },\n",
")\n",
"\n",
"workflow = builder.compile()"
]
},
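{
"cell_type": "markdown",
"metadata": {},
"source": [
"One practical caveat: if the evaluator never returns \"pass\", the Accept/Revise cycle runs until LangGraph's built-in step budget (`recursion_limit`, 25 by default) is exhausted and a `GraphRecursionError` is raised. The optional sketch below caps the loop explicitly; the toy spec is illustrative."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional: bound the generate/evaluate loop with LangGraph's recursion_limit.\n",
"from langgraph.errors import GraphRecursionError\n",
"\n",
"try:\n",
"    capped = workflow.invoke(\n",
"        {\"problem_spec\": \"Reverse a string in Python.\"},\n",
"        config={\"recursion_limit\": 10},  # roughly five generate/evaluate rounds\n",
"    )\n",
"    print(capped[\"grade\"])\n",
"except GraphRecursionError:\n",
"    print(\"Evaluator did not accept the code within the step budget.\")"
]
},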
{
"cell_type": "markdown",
"metadata": {
"id": "9JAMsA_8kMAe"
},
"source": [
"# Example Usage"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"problem = \"\"\"\n", | ||
"Write code for a complicated website with javascript features.\n", | ||
"\"\"\"\n", | ||
"\n", | ||
"result_state = workflow.invoke({\"problem_spec\": problem})\n", | ||
"print(\"===== FINAL CODE =====\\n\")\n", | ||
"print(result_state[\"code\"])" | ||
] | ||
}, | ||
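{
"cell_type": "markdown",
"metadata": {},
"source": [
"The final state also carries the evaluator's verdict, which is handy for confirming how the loop exited."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Inspect the evaluator's final verdict alongside the accepted code.\n",
"print(result_state[\"grade\"])  # \"pass\" once the loop exits via Accept\n",
"print(result_state[\"feedback\"])"
]
},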
{
"cell_type": "markdown",
"metadata": {
"id": "Dg72TE-ykNlN"
},
"source": [
"# Make sure to check our your traces in Phoenix!" | ||
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 0
}