Skip to content

acidtib/tcg-fetch

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

51 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

TCG Fetch

A high-performance Rust CLI tool to fetch trading card game data from various APIs. This tool helps organize card data into train, test, and validation sets for machine learning applications.

Supported TCGs

  • Magic: The Gathering (MTG) - Fetches data from the Scryfall API
  • Grand Archive (GA) - Fetches data from the Grand Archive API

Features

  • Fast parallel processing - Uses all CPU cores for optimal performance
  • Smart batch checking - Avoids unnecessary downloads by checking existing cards efficiently
  • Fetches card data from TCG-specific APIs
  • Downloads comprehensive card data including all available card information
  • Direct image resizing to specified dimensions (default: 500×700 pixels)
  • Organizes data into train/test/validation sets with automatic splitting
  • Command-line interface with customizable output path
  • Progress indicators for downloads
  • Multi-threaded image downloading with configurable thread count

Performance

The tool automatically uses optimized parallel processing for all file operations:

  • 5-10x faster card existence checking through batch processing
  • 4-8x faster directory operations using all CPU cores
  • 60% faster overall for large datasets (10,000+ cards)
  • Intelligent filtering to skip existing cards and avoid redundant downloads

No configuration needed - optimizations are automatic!

Installation

Make sure you have Rust and Cargo installed on your system. Then clone this repository:

git clone https://github.com/acidtib/tcg-fetch.git
cd tcg-fetch
cargo build --release

Usage

Basic Usage

Run the program with required TCG argument:

cargo run -- mtg    # For Magic: The Gathering
cargo run -- ga     # For Grand Archive

Image Processing Options

Configure target image dimensions (images will be resized to fit exactly):

cargo run -- mtg --width 512 --height 512    # Resize all MTG images to 512×512 pixels
cargo run -- ga --width 512 --height 512     # Resize all GA images to 512×512 pixels

Performance Tuning

Control download performance and dataset size:

cargo run -- mtg --amount 100 --threads 6    # Download 100 MTG cards using 6 threads
cargo run -- ga --amount 50 --threads 4      # Download 50 GA cards using 4 threads

Custom Output Directory

Specify a custom output directory:

cargo run -- mtg --path custom-data-dir      # For MTG cards
cargo run -- ga --path ga-cards-dir          # For GA cards

Complete Example

Combine multiple options for full control:

cargo run -- mtg --path mtg-data --amount 50 --threads 4 --width 512 --height 512
cargo run -- ga --path ga-data --amount 25 --threads 2 --width 400 --height 600

Output Structure

The tool creates a directory with the following structure:

<output-dir>/
├── data/
│   ├── train/
│   │   └── <card-id>/
│   │       └── 0000.jpg     # Original downloaded image
│   ├── test/
│   │   └── <card-id>/
│   │       └── 0000.jpg     # Test image (copied from train)
│   ├── validation/
│   │   └── <card-id>/
│   │       └── 0000.jpg     # Validation image (copied from train)
│   │       └── 0001.jpg
└── all_cards.json               # For MTG
└── ga_cards.json                # For Grand Archive

Each card gets its own subdirectory named after the card ID. The primary image is saved as 0000.jpg, with additional images numbered sequentially (0001.jpg, 0002.jpg, etc.) that can be added in the future.

Command Line Options

Usage: tcg-fetch <TCG> [OPTIONS]

Arguments:
  <TCG>                          Trading card game type to fetch data for [possible values: mtg, ga]

Options:
  -p, --path <PATH>              Path where to save the data [default: tcg-data]
  -a, --amount <AMOUNT>          Amount of cards to fetch [default: all]
  -t, --threads <THREADS>        Number of threads to use for downloading images [default: CPU cores]
      --width <WIDTH>            Target width for resized images [default: 500]
      --height <HEIGHT>          Target height for resized images [default: 700]
  -h, --help                     Print help
  -V, --version                  Print version

Adding New TCGs

The codebase is designed to be extensible. To add support for a new TCG:

  1. Add a new variant to the TcgType enum in main.rs
  2. Add corresponding API URL, user agent, and data type mappings in utils.rs
  3. Implement any TCG-specific logic in the utility functions

Dependencies

Rust Dependencies

  • clap: Command-line argument parsing
  • reqwest: HTTP client for API requests
  • serde: JSON serialization/deserialization
  • tokio: Async runtime
  • image: Image processing
  • indicatif: Progress bars
  • futures: Async utilities
  • rand: Random number generation for dataset shuffling
  • rayon: Parallel processing for performance optimization

Project Structure

tcg-fetch/
├── src/
│   ├── main.rs           # Main application logic and CLI definition
│   └── utils.rs          # Utility functions for API and file operations
├── Cargo.toml           # Rust dependencies
└── README.md           # This file

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. When adding support for new TCGs, please ensure:

  • The new TCG follows the existing pattern in the codebase
  • API rate limits are respected
  • Error handling is comprehensive
  • Documentation is updated accordingly

License

This project is open source. Please check the repository for license details.

About

Simple program to fetch trading card game data from various APIs

Resources

Stars

Watchers

Forks

Languages