AI Creative

AI Creative is an end-to-end pipeline for generating 3D models from text prompts, leveraging cutting-edge AI technologies including text-to-image generation and image-to-3D model transformation.

https://www.loom.com/share/a74df385d75e48c294fa9ebd656e0b6a?sid=24a7b54f-ff45-46c6-a477-8f8c840af591

Features

Text-to-Image Generation: Convert descriptive text prompts into detailed, high-quality images
Text Prompt Enhancement: Automatically enhance user prompts for better results using LLM
Image-to-3D Conversion: Transform 2D images into textured 3D models in GLB format
Video Preview: Generate video previews of the 3D models for quick visualization
User-friendly Interface: Simple web UI for interacting with the application
Openfabric Integration: Leverages Openfabric's powerful AI applications ecosystem

System Requirements

Python 3.10+
8GB RAM (16GB+ recommended for larger models)
Modern CPU (GPU recommended for improved performance)
1GB free disk space for the application and generated assets
13GB free disk space for the LLM model (if downloaded)
Internet connection (for Openfabric API access)
Operating Systems: Windows 10/11, macOS, Linux

Default Model Used

The LLM used is the google/gemma-3-1b-it model, which is a lightweight version of the Llama-3.2 model. It is designed to run on consumer-grade hardware with 8GB of RAM. The model is capable of enhancing text prompts for better image generation results.
This is a gated model; you would need your huggingface token to access it. You can request access to the model here.
The model is downloaded using the download_model.sh script, which uses the huggingface-cli to authenticate and download the model files. Set your Hugging Face token in the .env file before running the script.
If you don't want to add your Huggingface token, the application will fall back to using non-gated models.

    fallback_models = [
        "google/gemma-3-1b-it",
        # Can add different models
    ]

Installation

1. Clone the repository

git clone https://github.com/LuniaKunal/Openfabric-Interview.git

2. Create a Python virtual environment

# Using conda
conda create -n text-image-3d python=3.10
conda activate text-image-3d

3. Install dependencies

pip install -r requirements.txt

4. Download the LLM model (optional but recommended)

Model size is approximately 13GB. To use the local LLM for prompt enhancement:

bash download_model.sh

5. Set up environment variables

Create a .env file in the project root. Check .env.example for the required variables.

Usage

Starting the LLM Service

Start the local LLM service first:

python app/llm/service.py

Wait for the model to load (you should see "LLM initialized successfully" in the console).

Running the UI Application

In a new terminal window:

python app/ui/app.py

This will launch the web interface, accessible at http://localhost:7860.

Using the Application

Enter a prompt: Enter a descriptive text prompt for the 3D model you want to create
Generate: Click the "Generate" button to start the creation process
View results: The application will display:
- The original prompt
- The enhanced prompt (if LLM enhancement worked)
- The generated image
- The 3D model viewer with the created model
- Download links for both image and 3D model

Here's a screenshot of the application in action, showing the input prompt, generated image, and the expanded prompt. The total time taken for the generation process is also displayed.

Model Viewer

Image Gallery

Model Gallery

Memory Implementation

AI Creative implements a file-based persistence system for storing generated assets and their metadata. This allows the application to maintain a history of creations and reload them even after application restarts.

Image Storage

When a text-to-image generation is completed:

Image Files: Generated images are saved in the app/data/images/ directory as PNG files
Metadata Files: For each image, a corresponding JSON metadata file is created with the same base name
Metadata Content: The metadata includes:
- Original prompt
- Enhanced prompt
- Timestamp
- Generation parameters
- Result IDs from Openfabric
- Download status

Example of image metadata:

{
  "id": "28def601-6e40-4924-834f-9327446142e4",
  "timestamp": 1746984813,
  "prompt": "A weathered, crimson-red rabit, perched atop a crumbling stone altar, dominates the frame, its golden eyes gleaming with a mixture of defiance and ancient wisdom. Sunlight streams through a shattered stained-glass window, casting a mosaic of violet and emerald hues across the dusty floor, illuminating swirling motes of gold dust and highlighting the intricate details of the altar\u2019s carvings. The scene is rendered in a dramatic, chiaroscuro style reminiscent of Caravaggio, evoking a sense of melancholic grandeur and a palpable sense of forgotten power. The rabit\u2019s fur is a patchwork of deep browns and blacks, subtly textured with a rough, almost granular appearance, and the altar itself is constructed from aged, moss-covered granite.",
  "parameters": {},
  "result_id": "data_blob_31f62264a315fb69d3018eec1f5950e78dbecfe90558615026321b2e37498064/executions/4feb83f688be421c9ae21bd9597357ca",
  "type": "image",
  "needs_download": false,
  "base_name": "Rabit_with_a_cr",
  "original_prompt": "Rabit with a crown on his head",
  "expanded_prompt": "A weathered, crimson-red rabit, perched atop a crumbling stone altar, dominates the frame, its golden eyes gleaming with a mixture of defiance and ancient wisdom. Sunlight streams through a shattered stained-glass window, casting a mosaic of violet and emerald hues across the dusty floor, illuminating swirling motes of gold dust and highlighting the intricate details of the altar\u2019s carvings. The scene is rendered in a dramatic, chiaroscuro style reminiscent of Caravaggio, evoking a sense of melancholic grandeur and a palpable sense of forgotten power. The rabit\u2019s fur is a patchwork of deep browns and blacks, subtly textured with a rough, almost granular appearance, and the altar itself is constructed from aged, moss-covered granite."
}

Model Storage

For 3D model generation:

Model Files: Generated 3D models are saved in the app/data/models/ directory as GLB files
Metadata Files: Each model has an associated JSON metadata file
Metadata Content: The model metadata includes:
- Source image reference
- Model format (always GLB for better compatibility)
- Timestamp
- Generation parameters
- Video preview path (if available)

Persistence and Loading

On-disk Persistence: All assets and metadata are written to disk immediately after generation
Directory Structure: The application maintains a clean directory structure with separate folders for images and models
Auto-discovery: When the application starts, it scans the data directories to find and load existing assets
UI Integration: Previously generated assets are accessible through the galleries in the UI
Download Management: For remotely stored assets, the application tracks download status and can retrieve them when needed

Blob Storage Integration

The application can also handle remote assets stored in Openfabric's blob storage:

When an image or model is generated, it may initially exist only as a reference to a remote blob
The application downloads these blobs as needed and stores them locally
The metadata files are updated to reflect the local storage path once downloaded

I couldn't fully implement the instructed memory specs in openfabric_guide.md, but I could if I had more time. The current implementation focuses on a simple file-based approach rather than the more sophisticated database solution outlined in the guide.

Architecture

AI Creative consists of several key components:

Core Pipeline: Orchestrates the end-to-end creative workflow
LLM Service: Enhances text prompts using a local language model
Text-to-Image Service: Generates images from text (via Openfabric)
Image-to-3D Service: Creates 3D models from images (via Openfabric)
Web UI: Gradio-based user interface

The application uses Openfabric's platform for the compute-intensive image and 3D model generation tasks, while running a lightweight LLM locally for prompt enhancement.

Logging

AI Creative maintains detailed logs in the following locations:

app/core/openfabric_service.log: Logs for Openfabric service operations
app/llm/llm_service.log: Logs for the local LLM service

Troubleshooting

Common Issues

LLM Service not starting: Ensure you have sufficient RAM for the model, and check that the model download was completed successfully
Connection errors: Verify your internet connection and check that your Openfabric app IDs are correct
Generation timeout: Image or 3D model generation may take time, depending on complexity and service load
"No module found" errors: Ensure all dependencies are installed and you're using the correct Python environment

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
app		app
assets		assets
onto/dc8f06af066e4a7880a5938933236037		onto/dc8f06af066e4a7880a5938933236037
.env.example		.env.example
.gitignore		.gitignore
config.py		config.py
download_model.sh		download_model.sh
openfabric_guide.md		openfabric_guide.md
readme.md		readme.md
requirements.txt		requirements.txt
temp.ipynb		temp.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

AI Creative

Features

System Requirements

Default Model Used

Installation

1. Clone the repository

2. Create a Python virtual environment

3. Install dependencies

4. Download the LLM model (optional but recommended)

5. Set up environment variables

Usage

Starting the LLM Service

Running the UI Application

Using the Application

Memory Implementation

Image Storage

Model Storage

Persistence and Loading

Blob Storage Integration

Architecture

Logging

Troubleshooting

Common Issues

About

Uh oh!

Releases

Packages

Uh oh!

Languages

LuniaKunal/text-to-3D

Folders and files

Latest commit

History

Repository files navigation

AI Creative

Features

System Requirements

Default Model Used

Installation

1. Clone the repository

2. Create a Python virtual environment

3. Install dependencies

4. Download the LLM model (optional but recommended)

5. Set up environment variables

Usage

Starting the LLM Service

Running the UI Application

Using the Application

Memory Implementation

Image Storage

Model Storage

Persistence and Loading

Blob Storage Integration

Architecture

Logging

Troubleshooting

Common Issues

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages