A Streamlit app powered by LangGraph and OpenRouter to analyze images (charts, tables, documents, UIs, etc.) and extract structured insights.
This agent can:
- OCR and reformat tables.
- Extract data points from time series plots, bar charts, pie charts, and other visualizations.
- Perform Visual Question Answering on general images.
- Summarize min/max values and trends.
- Multimodal: Handles charts, tables, and documents in one flow.
- Automatic: Uses LangGraph to orchestrate the process.
- Markdown Outputs: Neatly formatted tables and summaries.
- Interactive UI: Built with Streamlit.
git clone https://github.com/<your-username>/universal-image-agent.git
cd universal-image-agent
Use Python 3.9+:
pip install -r requirements.txt
requirements.txt
streamlit
langgraph
openai
pillow
This project uses OpenRouter for model inference.
-
Get your OpenRouter API Key from: https://openrouter.ai
-
Add it to your environment:
export OPENROUTER_API_KEY="your-key-here"
Or replace directly in the code:
client = OpenAI( api_key="your-key", base_url="https://openrouter.ai/api/v1" )
streamlit run app.py
Then open: 👉 http://localhost:8501
-
Upload an image (charts, tables, UI screenshots, etc.)
-
Ask a question like:
"Show all data points"
"Extract the table"
"What is the max value and when did it occur?"
-
Get neatly formatted Markdown tables and summaries.
- LangGraph: Defines a minimal state machine with a single agent node.
- OpenRouter (Qwen2.5 VL 32B): Performs multimodal reasoning.
- Streamlit: Interactive web UI.
graph TD;
User -->|Upload Image + Question| Streamlit;
Streamlit --> LangGraph;
LangGraph --> UniversalAgent;
UniversalAgent -->|Image + Prompt| OpenRouter;
OpenRouter -->|Markdown Answer| Streamlit;
Input: Prophet forecast chart Output:
### Max Value: 432 on 2024-03-15
### Min Value: 98 on 2024-01-02
| Date | Series Name | Value |
|------------|-------------|-------|
| 2024-01-02 | yhat | 98 |
| 2024-02-01 | yhat | 250 |
| 2024-03-15 | yhat | 432 |
MIT License © 2025