ArtAgents is a prototype framework designed for artists, designers, and creators to experiment with LLM-based prompt engineering and creative content generation. It leverages Ollama for local model serving, allowing users to interact with various text and multimodal models through specialized AI 'agents' and structured, configurable workflows ("Teams").
Select predefined agents, load custom agents, or utilize multi-agent "Teams" to generate detailed prompts, descriptions, image captions, or other text outputs. Provide text instructions and optionally images as input. Fine-tune generation using Ollama API parameters, prompt style limiters, and agent presets. Experiment systematically using the Sweep feature and manage image captions directly within the application.
Core Functionality:
- Ollama Integration: Connects to a running Ollama instance to utilize locally served LLMs (text & multimodal) with startup check.
- Agent System: Define and use specialized agents (Designer, Photographer, Styler, etc.) with unique instructions and optional API overrides (`agent_roles.json`, `custom_agent_roles.json`); see the role-merge sketch after this list.
- Agent Team / Workflow Execution: Define (`agent_teams.json`) and run multi-step agent sequences ("Teams"). Supports sequential execution with context passing and multiple result assembly strategies (`concatenate`, `refine_last`, `summarize_all`, `structured_concatenate`); see the assembly sketch after this list.
- Team Editor: Create, edit, save, and delete Agent Teams via a dedicated UI tab.
- Chat Interface: Main tab for direct interaction with selected agents or teams, including session history and response refinement.
- Multimodal Input: Supports single image upload or processing images within a specified folder for chat or captioning context.
- Image Captioning: Dedicated tab to load images from a folder, view/edit associated `.txt` caption files, save changes, and generate captions using selected agents/teams and vision models (a caption-pairing sketch follows this list).
- Experiment Sweeps: Systematically run base prompts across multiple selected Agent Teams and Worker Models. Saves detailed JSON protocol files for each run and separate `.txt` files containing the raw generated prompts per model.
- Configuration Management: External JSON files for easy customization of settings, models, limiters, API profiles, agent roles, and agent teams.
- App Settings UI: Dedicated tab to configure the Ollama URL, agent loading preferences, default behaviors, UI theme, and detailed Ollama API parameters (with loadable profiles).
- Persistent History: Logs all single interactions and detailed workflow steps to `core/history.json`, viewable and clearable in the "Full History" tab.
- Utilities: Copy-to-clipboard for responses, optional prompt artifact cleaning, model release functions, contextual help tooltips, setup scripts.
- Modular Codebase: Organized structure (`core`, `agents`, `ui`) for maintainability.
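Agent definitions live in `agents/agent_roles.json` (defaults) and `agents/custom_agent_roles.json` (user additions), and `agents/roles_config.py` merges them. The exact schema is not reproduced in this README, so the sketch below only illustrates a plausible shape for a custom role and a naive merge; every key name shown (`description`, `ollama_api_options`) is an assumption, not the project's schema.

```python
# Hedged sketch of loading and merging role files; key names are illustrative
# assumptions, not the project's actual schema (see agents/roles_config.py).
import json

def load_roles(path):
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

default_roles = load_roles("agents/agent_roles.json")
custom_roles = load_roles("agents/custom_agent_roles.json")

# Assume custom roles win on name clashes (the real merge logic lives in roles_config.py).
merged = {**default_roles, **custom_roles}

# A hypothetical custom agent entry might look roughly like this:
merged.setdefault("Set Designer", {
    "description": "Describes physical sets, props, and staging for a scene.",
    "ollama_api_options": {"temperature": 0.9},  # optional per-agent API override
})
print(sorted(merged))
```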
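The four assembly strategies combine per-step outputs of a Team into a single result; they are implemented in `core/agent_manager.py`. The sketch below is only a simplified illustration of what `concatenate`, `refine_last`, and `structured_concatenate` style assembly could look like, not the project's code; `summarize_all` (and a real `refine_last`) would normally involve an extra LLM call, which is omitted here.

```python
# Simplified illustration of assembly strategies for multi-step team output.
# Not taken from core/agent_manager.py; the summarize/refine variants would
# typically add another LLM pass, elided in this sketch.

def assemble(step_outputs, agent_names, strategy="concatenate"):
    if strategy == "concatenate":
        # Join every step's output in order.
        return "\n\n".join(step_outputs)
    if strategy == "refine_last":
        # Treat the final agent's output as the refined, authoritative result.
        return step_outputs[-1]
    if strategy == "structured_concatenate":
        # Keep all outputs, but label each section with the agent that produced it.
        return "\n\n".join(
            f"[{name}]\n{text}" for name, text in zip(agent_names, step_outputs)
        )
    raise ValueError(f"Unknown strategy: {strategy}")

steps = ["A wide, rain-slicked plaza at night.", "Shot on 35mm, low angle, neon reflections."]
print(assemble(steps, ["Designer", "Photographer"], strategy="structured_concatenate"))
```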
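The Image Captioning tab pairs each image with a `.txt` caption file. Assuming the common convention of a sibling `.txt` file sharing the image's basename (this README does not spell the layout out), a minimal sketch of reading such pairs:

```python
# Minimal sketch: pair images in a folder with sibling .txt caption files.
# Assumes the "same basename, .txt extension" convention; the captioning tab
# may organize files differently.
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}

def load_caption_pairs(folder):
    pairs = {}
    for image in Path(folder).iterdir():
        if image.suffix.lower() in IMAGE_EXTS:
            caption_file = image.with_suffix(".txt")
            caption = caption_file.read_text(encoding="utf-8") if caption_file.exists() else ""
            pairs[image.name] = caption
    return pairs

for name, caption in load_caption_pairs("my_dataset").items():  # "my_dataset" is a placeholder folder
    print(name, "->", caption[:60] or "(no caption yet)")
```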
Project Structure:

```
ArtAgent/
│
├── app.py # Main Gradio App: UI Structure, Event Wiring, State Mgmt
├── requirements.txt # Python Dependencies (Consider migrating to pyproject.toml/Poetry)
├── settings.json # App Config: Ollama URL, defaults, global API opts, theme
├── models.json # Ollama models known to the app (name, vision)
├── limiters.json # Prompt style limiters (name, tokens, format string)
├── ollama_profiles.json # Presets for Ollama API options
├── agent_teams.json # Stores PREDEFINED & USER-SAVED Agent Team/Workflow definitions
│
├── agents/ # --- Agent Logic & Definitions ---
│ ├── __init__.py
│ ├── roles_config.py # Logic to load/merge roles
│ ├── ollama_agent.py # Interacts with Ollama API (get_llm_response)
│ ├── agent_roles.json # Default agent definitions
│ ├── custom_agent_roles.json # User's custom persistent agents
│ └── examples/ # --- Optional: Example Agent Files ---
│ └── *.json
│
├── core/ # --- Core Logic & Utilities ---
│ ├── __init__.py
│ ├── app_logic.py # Callback logic functions (router, UI callbacks)
│ ├── refinement_logic.py # Logic for comment/refinement feature
│ ├── agent_manager.py # Orchestrates Agent Team Workflows
│ ├── captioning_logic.py # Logic for caption editing & generation
│ ├── history_manager.py # Loads/saves persistent history
│ ├── ollama_checker.py # Ollama startup check logic
│ ├── ollama_manager.py # Ollama model release logic
│ ├── sweep_manager.py # Logic for running experiment sweeps
│ ├── utils.py # Common utilities (JSON loading, cleaning etc.)
│ ├── help_content.py # Stores help text for UI
│ └── history.json # Persistent history data file
│
├── ui/ # --- UI Tab Definitions (Gradio components) ---
│ ├── __init__.py
│ ├── chat_tab.py
│ ├── captions_tab.py # UI for caption editing & generation
│ ├── team_editor_tab.py # UI for editing teams
│ ├── sweep_tab.py # UI for experiment sweeps
│ ├── history_tab.py
│ ├── info_tab.py # Consolidated info tab (replaces roles_tab.py)
│ ├── app_settings_tab.py
│ └── common_ui_elements.py
│
├── scripts/ # --- Utility & Setup Scripts ---
│ ├── (Batch files: setup.bat, setupvenv.bat, go.bat, govenv.bat)
│ ├── full_project_creator.py
│ └── (Optional: .sh equivalents)
│
├── docs/ # --- Detailed Documentation ---
│ ├── index.md # Overview (Placeholder)
│ ├── user-guide.md # User manual (Placeholder)
│ ├── architecture.md # System design (Placeholder)
│ └── api.md # Core function details (Placeholder, Optional)
│
├── sweep_runs/ # Default Output folder for Sweep Protocols (add to .gitignore)
│
├── tests/ # --- Automated Tests ---
│ ├── __init__.py
│ ├── test_agent.py # Example tests (Needs Expansion)
│ └── (Placeholder: other test files)
│
├── .gitignore
└── README.md # This file
```
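The tree above annotates `models.json` as holding the models known to the app (name, vision). The real schema is not shown in this README, so the snippet below only illustrates one plausible shape and how an entry might be appended; the keys `name` and `vision`, and the assumption that the file is a JSON list, are guesses based on that annotation.

```python
# Hypothetical models.json entry; the actual keys and file structure are
# whatever the app uses ("name"/"vision" and the list layout are assumptions).
import json

new_model = {"name": "llava:13b", "vision": True}

with open("models.json", "r", encoding="utf-8") as f:
    models = json.load(f)

models.append(new_model)  # assumes the file is a JSON list of model entries

with open("models.json", "w", encoding="utf-8") as f:
    json.dump(models, f, indent=2)
```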
Installation:

1. Install Ollama: Download and install from ollama.com. Ensure the `ollama` command is available in your terminal.
2. Clone Repository: `git clone https://github.com/sandner-art/ArtAgents.git`, then navigate into the cloned directory (`cd ArtAgents`).
3. Setup Python Environment (Recommended):
   - Using Venv (Manual): Create and activate a virtual environment (Python 3.9+ recommended, 3.10+ required for a potential Gradio 5 upgrade), then install the requirements:
     ```
     python -m venv venv
     # On Windows:
     .\venv\Scripts\activate
     # On Linux/macOS:
     source venv/bin/activate
     pip install --upgrade pip
     pip install -r requirements.txt
     ```
   - (Alternative) Using Scripts: Run `.\scripts\setupvenv.bat` (Windows) or the equivalent `.sh` script to automate venv creation and `pip install`.
   - (Future) Using Poetry: If Poetry is implemented, replace step 3 with `poetry install`.
4. Setup Ollama Models: Run `.\scripts\setup.bat` (Windows) or the equivalent `.sh` script. This checks Ollama connectivity and downloads the recommended models listed in `models.json`. Alternatively, use `ollama pull <model_name>` manually for the models you want (see the connectivity sketch after these steps).
5. Configure (Optional): Review and edit the JSON files (`settings.json`, `models.json`, `agent_teams.json`, etc.) to customize the application.
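The setup script's connectivity check is described above but its internals are not shown here. As a rough illustration, the sketch below queries Ollama's `/api/tags` endpoint to confirm the server is reachable and to list locally available models; it mirrors what `core/ollama_checker.py` presumably does, but is not taken from it.

```python
# Hedged sketch: verify Ollama is reachable and list local models via /api/tags.
# Not the project's actual startup check (see core/ollama_checker.py for that).
import requests

OLLAMA_URL = "http://localhost:11434"  # default Ollama endpoint; match your settings.json

try:
    resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=5)
    resp.raise_for_status()
    models = [m["name"] for m in resp.json().get("models", [])]
    print("Ollama is running. Local models:", ", ".join(models) or "(none pulled yet)")
except requests.RequestException as exc:
    print(f"Could not reach Ollama at {OLLAMA_URL}: {exc}")
```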
Running:

1. Start Ollama Service: Ensure the Ollama service is running (e.g., launch the Ollama Desktop application or run `ollama serve` in a separate terminal).
2. Activate Environment: If using a virtual environment, activate it (`source venv/bin/activate` or `.\venv\Scripts\activate`).
3. Run ArtAgents:
   - If using venv: `python app.py`
   - Using Scripts: `.\scripts\govenv.bat` (Windows) or the equivalent `.sh` script.
   - (Future) Using Poetry: `poetry run python app.py`
4. Access UI: Open the local URL provided in the console (usually `http://127.0.0.1:7860`) in your web browser.
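Under the hood, ArtAgents talks to the Ollama HTTP API (`get_llm_response` in `agents/ollama_agent.py`). If the UI misbehaves, a quick way to confirm that Ollama itself is generating correctly is a direct call to the `/api/generate` endpoint. The snippet below is a standalone debugging sketch, not the application's internal code; the model name, system instruction, and option values are made-up examples.

```python
# Standalone sanity check against a local Ollama server; not ArtAgents' own code.
import requests

OLLAMA_URL = "http://localhost:11434"  # default Ollama endpoint; match your settings.json

payload = {
    "model": "llama3",  # any text model you have pulled
    "prompt": "Describe a foggy harbor at dawn for an image prompt.",
    "system": "You are a concise visual prompt designer.",  # example agent-style instruction
    "stream": False,
    "options": {  # Ollama API parameters, like those exposed in the App Settings tab
        "temperature": 0.8,
        "num_predict": 200,
    },
}

resp = requests.post(f"{OLLAMA_URL}/api/generate", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["response"])
```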
Documentation:

For more detailed information, please refer to the documents in the `/docs` directory:
- `/docs/user-guide.md`
- `/docs/architecture.md`
Roadmap:

Phase 0: Stabilization & Core Refinement (Complete)
- Agent Captioning functionality stabilized.
- Agent Team Editor implemented and stabilized.
- Core assembly strategies (`concatenate`, `refine_last`, `summarize_all`, `structured_concatenate`) tested.
- Sweep output format implemented (per-model `.txt` prompt files + JSON protocols).
- Optional prompt artifact cleaner added.
- Copy-to-clipboard button added.
- Consolidated "Info" tab implemented.
- Error handling reviewed and improved.
Phase 1: Foundational Expansion & Modernization (Current Focus)
- Gradio 5.x Upgrade: Evaluate and execute upgrade from Gradio 3.x.
- Hydra Integration: Migrate `.json` configurations to Hydra (`.yaml`) for improved experiment management.
- Implement Select Novel Synthesis Strategies: Add 2-3 creative strategies (e.g., Metaphorical Synthesis, Conceptual Blending) to `agent_manager.py` and the Team Editor UI.
- NLP Library Integration (`nlpaug`): Integrate for noise/synonym capabilities within strategies or as agent steps.
- Unit Testing Expansion: Write comprehensive `pytest` tests for core logic and new features.
Future / Planned Enhancements (Phase 2+):
- Advanced Agent Teams (Hierarchical agents, conditional logic, feedback loops).
- Advanced Experimentation (Parameter sweeping via Hydra, potentially MLFlow integration).
- Direct Image Generation API Integration (e.g., ComfyUI, A1111).
- Workflow Visualization.
- Enhanced UI/UX (Improved Team Editor, potential Gradio custom components).
- Explainability / XAI Features.
- More Novel Synthesis Strategies & NLP features.
Contributions are welcome! Please refer to CONTRIBUTING.md for guidelines on reporting issues, suggesting features, or submitting pull requests.
ArtAgents by Daniel Sandner © 2024 - 2025. Adapt and use creatively. No guarantees provided. MIT LICENSE.