Odexpo Gallery Scraper

A Playwright-based web scraper that downloads high-resolution artwork images from the Fabienne Vincent gallery website.

Features

🎨 Gallery Crawling - Automatically discovers and downloads from all gallery categories
🖼️ High-Resolution Images - Gets full-size images from lightbox popups
📁 Smart Organization - Organizes images into category folders
🏷️ Auto-Renaming - Renames files using artwork titles for better organization
📊 Metadata Collection - Saves detailed information about each artwork
🔄 Duplicate Prevention - Avoids re-downloading existing images

Quick Start

Run the complete workflow automatically - crawl, download, rename, and organize:

python gallery_tool.py

This will:

Crawl the gallery and download images
Automatically rename files using artwork titles
Organize images into category folders with cleaned-up names
Save metadata for all images

Requirements

Python 3.11+
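Install dependencies: pip install -r requirements.txt
Playwright also needs its browser binaries; if they are not already installed, run: playwright install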

How It Works

Discovers Categories - Finds all gallery categories automatically
Crawls Systematically - Downloads from each category completely before moving to the next
High-Quality Images - Clicks each image to open its lightbox and download the full-resolution version
Smart Naming - Renames files like tigre_du_bengale_419.jpg instead of 739467410189419.jpg
Organized Storage - Creates folders like fauves/, animaux-d-afrique/, etc.
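
The Smart Naming step can be illustrated with a minimal sketch. It assumes the new name is the artwork title turned into a slug, followed by the last three digits of the original numeric filename, which matches the tigre_du_bengale_419.jpg example above. The function names slugify and smart_name are hypothetical and are not taken from rename_files.py.

```python
import re
import unicodedata
from pathlib import Path


def slugify(title: str) -> str:
    """Lowercase the title, strip accents, and join words with underscores."""
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "_", ascii_title.lower()).strip("_")


def smart_name(title: str, original_filename: str, id_digits: int = 3) -> str:
    """Build a readable filename (hypothetical helper).

    Assumption: a short suffix from the original numeric ID is kept so that
    two artworks with the same title do not overwrite each other.
    """
    original = Path(original_filename)
    suffix = original.stem[-id_digits:]
    return f"{slugify(title)}_{suffix}{original.suffix.lower()}"


print(smart_name("Tigre du Bengale", "739467410189419.jpg"))
# prints: tigre_du_bengale_419.jpg
```

Keeping part of the original ID in the name is one simple way to stay readable while avoiding collisions; the actual tool may resolve duplicates differently.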

Output Structure

assets/
└── crawl_runs/
    └── 20241208_143022/          # Timestamped run
        ├── images/               # Downloaded images
        │   ├── fauves/          # Category folders
        │   ├── animaux-d-afrique/
        │   └── ...
        └── metadata.json        # Image details and metadata

Configuration

Edit config.py to change:

Target website URL
Download delays and timeouts
File organization preferences
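
As a rough illustration of what such settings might look like, here is a hypothetical sketch. Every name and value below is an assumption made for the example, not the actual contents of config.py, and the URL is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScraperConfig:
    """Hypothetical settings; the real config.py may use different names."""

    base_url: str = "https://example.com/gallery"  # placeholder, not the real site
    request_delay_s: float = 2.0      # pause between requests, in seconds
    page_timeout_ms: int = 30_000     # how long to wait for a page to load
    output_root: str = "assets"       # parent folder of crawl_runs/
    rename_with_titles: bool = True   # rename files using artwork titles


CONFIG = ScraperConfig()
```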

Notes

Each crawl session creates a timestamped folder
Waits between requests to avoid overloading the website (the delay is set in config.py)
Handles pagination automatically within each category
Skips duplicate downloads across multiple runs
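
The delay and duplicate-skipping behavior could be sketched as follows. This sketch assumes duplicates are detected by comparing filenames against the images stored by earlier runs; the real tool may compare URLs or metadata instead. The names already_downloaded and download_new are hypothetical, and fetch stands in for whatever function performs the actual download.

```python
import time
from pathlib import Path
from typing import Callable, Iterable


def already_downloaded(crawl_runs: Path) -> set[str]:
    """Collect filenames of images stored by any previous run."""
    return {p.name for p in crawl_runs.glob("*/images/*/*") if p.is_file()}


def download_new(
    names: Iterable[str],
    crawl_runs: Path,
    fetch: Callable[[str], None],
    delay_s: float = 1.0,
) -> list[str]:
    """Fetch only names not seen before, pausing after each download."""
    seen = already_downloaded(crawl_runs)
    fetched = []
    for name in names:
        if name in seen:
            continue  # stored by an earlier run, or repeated in this one
        fetch(name)
        seen.add(name)
        fetched.append(name)
        time.sleep(delay_s)  # fixed pause between requests
    return fetched
```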
