Extracting a structured (pydantic) output from documents #28371
Example Code
from typing import Optional
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.prompts import ChatPromptTemplate
from langchain_core.documents import Document
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field
class Person(BaseModel):
    """Information about a person."""

    name: Optional[str] = Field(default=None, description="The name of the person")
    hair_color: Optional[str] = Field(
        default=None, description="The color of the person's hair if known"
    )
    height_in_meters: Optional[str] = Field(
        default=None, description="Height measured in meters"
    )


class People(BaseModel):
    """A collection of people."""

    people: list[Person] = Field(
        default_factory=list, description="A list of people"
    )


llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an expert model excelling at extracting human information\n{context}"),
])
structured_llm = llm.with_structured_output(schema=People)
chain = create_stuff_documents_chain(structured_llm, prompt)
docs = [
    Document(
        metadata={"team": "dev"},
        page_content=(
            "The dev team corresponds of Bob and Alice. Bob is 5'8 with a black hair. "
            "Alice has the same hair color but is 20cm shorter than Bob"
        ),
    )
]
chain.invoke({"context": docs})
Description
The problem is that when creating the chain with I ended up patching "create_stuff_documents_chain" to ignore output_parser when not supplied (or worse, reimplementing it in my code). What is the right way of using it?
Replies: 2 comments 2 replies
Ran into the same issue, and here is what I did to override the default StrOutputParser:
custom_parser = lambda x: x
stuff_chain = create_stuff_documents_chain(llm, prompt=my_prompt, output_parser=custom_parser)
# Then invoke normally - this replaces the default StrOutputParser with this no-op lambda
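For anyone wondering why the no-op parser helps, here is a toy sketch using only the standard library. It is not LangChain's implementation: build_stuff_chain, str_parser, and fake_structured_llm are hypothetical stand-ins, and the dataclasses replace the pydantic models from the question. It assumes the chain falls back to a text-only parser when output_parser is not supplied, which a structured-output model then violates by returning an object instead of text.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Person:
    name: Optional[str] = None
    hair_color: Optional[str] = None


@dataclass
class People:
    people: list = field(default_factory=list)


def str_parser(output: Any) -> str:
    """Hypothetical stand-in for a text-only parser: accepts strings only."""
    if not isinstance(output, str):
        raise TypeError(f"expected text, got {type(output).__name__}")
    return output


def fake_structured_llm(prompt_text: str) -> People:
    """Hypothetical stand-in for a structured-output model (no API call)."""
    return People(people=[Person(name="Bob", hair_color="black")])


def build_stuff_chain(llm, output_parser=None):
    """Toy chain: join documents into one context, call llm, then parse."""
    parser = output_parser or str_parser  # assumed fallback to a text parser

    def invoke(docs: list) -> Any:
        context = "\n\n".join(docs)
        return parser(llm(context))

    return invoke


docs = ["The dev team consists of Bob and Alice."]

# Default parser: fails because the model returns an object, not text.
try:
    build_stuff_chain(fake_structured_llm)(docs)
    default_failed = False
except TypeError:
    default_failed = True

# Pass-through parser: the structured object reaches the caller unchanged.
result = build_stuff_chain(fake_structured_llm, output_parser=lambda x: x)(docs)
```

In this toy version the pass-through lambda simply hands the People object back to the caller, which is the same role custom_parser plays in the snippet above.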
Ensure that the model you use supports tool calling.