docs/how_to/multimodal_inputs/ #27702
Replies: 9 comments 10 replies
-
Hi, is there any LCEL way of implementing image input?
-
Is there a way to have chat history, and also automatic memory management, using chains/runnables?
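One way to sketch this (assuming an in-memory history store and an OpenAI chat model; both are illustrative choices, not the only option) is `RunnableWithMessageHistory`, which wraps a runnable and loads/saves history per session automatically:

```python
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder("history"),
    ("human", "{input}"),
])
chain = prompt | ChatOpenAI(model="gpt-4o-mini")

# Simple per-session store; swap for a persistent backend in a real app.
store: dict[str, InMemoryChatMessageHistory] = {}

def get_history(session_id: str) -> InMemoryChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

chat = RunnableWithMessageHistory(
    chain,
    get_history,
    input_messages_key="input",
    history_messages_key="history",
)

# History is injected and recorded automatically, keyed by session_id.
chat.invoke({"input": "Hi there"}, config={"configurable": {"session_id": "abc"}})
```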
-
LCEL way: messages | llm
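A minimal sketch of that for the image-input question above (the model name and image URL are placeholders; I'm assuming an OpenAI vision-capable model and a base64-encoded image): a `ChatPromptTemplate` containing an image block, piped into the chat model.

```python
import base64

import httpx
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Replace with a real image URL, or read the bytes from disk instead.
image_data = base64.b64encode(httpx.get("https://example.com/photo.jpg").content).decode("utf-8")

prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer questions about the supplied image."),
    (
        "human",
        [
            {"type": "text", "text": "{question}"},
            {
                "type": "image_url",
                "image_url": {"url": "data:image/jpeg;base64,{image_data}"},
            },
        ],
    ),
])

chain = prompt | ChatOpenAI(model="gpt-4o-mini")
response = chain.invoke({"question": "What is in this picture?", "image_data": image_data})
print(response.content)
```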
-
What about audio input?
-
If anyone wants to allow the agent to use multimodal output from a tool call (a common scenario in complex tasks), for example letting the model capture a screenshot, check this discussion on how to do it:
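As a rough sketch of one approach (the `capture_screen()` helper is hypothetical; `response_format="content_and_artifact"` is a real tool option, the rest is illustrative): have the tool return the raw image as an artifact, then feed it back to the model as an image block in a follow-up human message, since some providers reject image content inside tool-role messages.

```python
import base64

from langchain_core.messages import HumanMessage
from langchain_core.tools import tool


@tool(response_format="content_and_artifact")
def take_screenshot() -> tuple[str, dict]:
    """Capture a screenshot of the current screen."""
    png_bytes = capture_screen()  # hypothetical helper returning PNG bytes
    encoded = base64.b64encode(png_bytes).decode("utf-8")
    # The first element becomes the ToolMessage content the model sees;
    # the second is stored on ToolMessage.artifact for downstream use.
    return "Screenshot captured.", {"image_base64": encoded}


def artifact_to_message(artifact: dict) -> HumanMessage:
    """Turn the stored screenshot into a message the model can actually see."""
    return HumanMessage(content=[
        {"type": "text", "text": "Here is the screenshot from the tool call."},
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{artifact['image_base64']}"},
        },
    ])
```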
-
This simply does not work with
-
Hi all, I'm also facing the same issue that @Albaeld reported. On the llm node I have:

```python
messages = prompt + state["messages"]
chat = ChatPromptTemplate.from_messages(messages)
logger.debug("-------------------------------------------------------------------------------->> invoking LLM with config")
logger.debug(f"{chat}")
logger.debug("-------------------------------------------------------------------------------->> invoking LLM...")
response = await model.ainvoke(chat.format_prompt(), config)
```

This outputs:

```
== APP == -------------------------------------------------------------------------------->> invoking LLM with config
== APP == input_variables=[] input_types={} partial_variables={} messages=[SystemMessage(content="You are a concierge agent. You are very helpful and friendly. You are very polite and respectful. You are always ready to help.\n\ncurrent date/time: 2025-06-05 19:32:40\n\nuser context: {'id': '123123123', 'name': 'Bruno Figueiredo', 'avatar': ''}", additional_kwargs={}, response_metadata={}), HumanMessage(content='olá', additional_kwargs={}, response_metadata={}, id='8fa5db18-69b0-4085-9d4e-aa16847da243'), AIMessage(content='Olá, Bruno! 😊 Como posso ajudar você hoje?', additional_kwargs={}, response_metadata={'finish_reason': 'stop', 'model_name': 'gpt-4o-2024-11-20', 'system_fingerprint': 'fp_ee1d74bde0'}, id='run-2df83ba9-a843-4029-a020-0d41465fddf3'), HumanMessage(content=[{'type': 'text', 'text': 'resume'}, {'type': 'file', 'source_type': 'url', 'url': 'https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf'}], additional_kwargs={}, response_metadata={}, id='363b871a-d9bc-4f40-ab56-5138486e7f07')]
== APP == -------------------------------------------------------------------------------->> invoking LLM...
== APP == Error invoking model: Error code: 400 - {'error': {'message': "Missing required parameter: 'messages[3].content[1].file'.", 'type': 'invalid_request_error', 'param': 'messages[3].content[1].file', 'code': 'missing_required_parameter'}}
```

any thoughts? thanks
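Not certain, but that 400 usually means the `file` block could not be converted to OpenAI's `file` parameter; as far as I know the chat completions API does not accept URL-sourced PDFs, only base64 data or an uploaded file ID. A sketch of a possible workaround (downloading the PDF and passing it as a base64 `file` block; the `ChatOpenAI` setup here is just an example, use your own configured model):

```python
import base64

import httpx
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

pdf_url = "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"
pdf_data = base64.b64encode(httpx.get(pdf_url).content).decode("utf-8")

message = HumanMessage(content=[
    {"type": "text", "text": "resume"},
    {
        "type": "file",
        "source_type": "base64",
        "mime_type": "application/pdf",
        "data": pdf_data,
        "filename": "dummy.pdf",  # OpenAI appears to require a filename for PDF inputs
    },
])

model = ChatOpenAI(model="gpt-4o")  # example model; reuse your existing instance
response = model.invoke([message])
print(response.content)
```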
-
Bro, I don't understand why multimodality is still not really supported and definitely not well documented. They should've added multimodal handling before LCEL even existed.
-
I am having a bit of a misunderstanding. I am following the guidelines of this page to build the following input message:
However, I am getting the following error with both the latest langchain-openai and langchain-google-vertexai library versions:
-
docs/how_to/multimodal_inputs/
Here we demonstrate how to pass multimodal input directly to models.
https://python.langchain.com/docs/how_to/multimodal_inputs/
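A minimal example in the spirit of that page (assuming an OpenAI vision-capable model; the model name and image URL are placeholders):

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")

message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe the weather in this image."},
        {
            "type": "image_url",
            "image_url": {"url": "https://example.com/boardwalk.jpg"},  # placeholder URL
        },
    ],
}

response = llm.invoke([message])
print(response.content)
```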