When crawling Confluence, send encountered pictures to an LLM to include textual descriptions in the knowledge base #31683
Hi @Dennis-E! I'm Dosu and I'm helping the langchain team. You can achieve this workflow in LangChain by combining a few modular components:

- Use the ConfluenceLoader to extract both the page text and the image attachments from Confluence pages.
- Process each image with an image captioning or OCR step. LangChain provides an ImageCaptionLoader for this purpose, or you can use the SceneXplainTool, which wraps the SceneXplain API for generating descriptive captions from images.
- Attach the generated descriptions to your knowledge base as metadata or as new documents, making them accessible to your RAG pipeline and improving both accessibility and searchability.

LangChain's design encourages chaining these steps together, but you will need to integrate your preferred image-to-text model or API for the captioning step. For more on these components, see the ImageCaptionLoader, ConfluenceLoader, and SceneXplainTool documentation. LangChain also supports multimodal inputs for OpenAI models, so you can pass images directly to LLMs that support this feature if needed (see the multimodal support details).
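A minimal sketch of that pipeline, assuming the image attachments have already been downloaded to local paths: `include_attachments=True` on ConfluenceLoader extracts text from attachments but does not hand back the raw image files, so the download step is left out, and the credentials, space key, and file paths below are placeholders. Exact constructor arguments also vary between LangChain versions.

```python
from langchain_community.document_loaders import ConfluenceLoader, ImageCaptionLoader

# 1. Load the Confluence pages themselves (URL, credentials, and space key are placeholders).
page_loader = ConfluenceLoader(
    url="https://yoursite.atlassian.net/wiki",
    username="me@example.com",
    api_key="YOUR_API_TOKEN",
    space_key="SPACE",
    include_attachments=True,  # also pulls text extracted from attachments
    limit=50,
)
pages = page_loader.load()

# 2. Caption images you have downloaded separately (paths are placeholders).
#    ImageCaptionLoader runs a local BLIP model by default and needs
#    `transformers` and `pillow` installed.
image_paths = ["./attachments/architecture-diagram.png", "./attachments/screenshot.png"]
caption_loader = ImageCaptionLoader(image_paths)
caption_docs = caption_loader.load()

# 3. Tag the captions so they can be traced back to their source,
#    then index them alongside the page documents in your vector store.
for doc in caption_docs:
    doc.metadata["content_type"] = "image_description"

all_docs = pages + caption_docs
# e.g. vectorstore = FAISS.from_documents(all_docs, embeddings)
```

Keeping the captions as separate documents (rather than only metadata) lets image-only content match retrieval queries directly, which is what enables image-based search over the knowledge base.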
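If you would rather skip a separate captioning model, here is a hedged sketch of the multimodal route: base64-encode each image and ask a vision-capable OpenAI chat model for a description, then wrap the answer as a new Document. The model name, file path, and page URL are assumptions for illustration, not requirements.

```python
import base64

from langchain_core.documents import Document
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")  # any vision-capable chat model works here


def describe_image(path: str, source_page: str) -> Document:
    """Send one image to the LLM and wrap its description as a Document."""
    with open(path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    message = HumanMessage(
        content=[
            {"type": "text",
             "text": "Describe this image in detail for a knowledge base entry."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ]
    )
    description = llm.invoke([message]).content
    return Document(
        page_content=description,
        metadata={"source": source_page, "content_type": "image_description"},
    )


# Usage (placeholder path and page URL):
doc = describe_image(
    "./attachments/diagram.png",
    "https://yoursite.atlassian.net/wiki/spaces/SPACE/pages/123",
)
```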
I'm exploring a workflow to enhance our RAG chatbots, which are built on a Confluence knowledge base, by automatically processing embedded images:
Goal: Improve accessibility, enable image-based search queries, and enrich the knowledge base with textual context for visual content.
Any suggestions on how to achieve this or how to integrate it with LangChain?