πŸ” AI-powered PDF metadata extractor and file renamer using Google Gemini Vision AI. Automatically analyzes PDF content, extracts metadata (title, author, year), and renames files in standardized format for better document organization.

Web3NL/AI-PDF-Renamer

AI-PDF-Renamer

πŸ” Automatically extract metadata from PDF files and rename them using AI

This tool uses Google's Gemini AI to analyze PDF documents, extract key metadata (title, author, publication year), and automatically rename files in a standardized format for better organization.

🚀 Features

  • AI-Powered Extraction: Uses Google Gemini AI to read PDF content and extract metadata
  • Multiple AI Models: Support for Flash (gemini-2.5-flash-preview-05-20) and Pro (gemini-2.5-pro-preview-06-05) models
  • Smart Renaming: Automatically names the output files in the format YEAR - AUTHOR - TITLE.pdf
  • Batch Processing: Process entire directories of PDF files at once
  • Non-Destructive: Creates renamed copies while preserving original files
  • Rate Limiting: Respects API limits with intelligent retry logic
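
The README does not show how the retry logic works. One common approach is exponential backoff with jitter, sketched below under stated assumptions: `RateLimitError` and `call_with_retry` are hypothetical names, not taken from this project or from the Gemini client library, and the delay values are illustrative.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error an API client raises when rate limited."""


def call_with_retry(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on RateLimitError with exponential backoff plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles on each attempt: base, 2*base, 4*base, ...
            delay = base_delay * 2 ** (attempt - 1)
            # Up to 10% random jitter keeps parallel workers from retrying in lockstep
            sleep(delay + random.uniform(0, delay * 0.1))
```

Passing `sleep` as a parameter makes the wait behaviour easy to replace in tests without actually pausing.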

📦 Installation

The run.sh script handles environment setup automatically, including virtual environment creation and dependency installation.

📖 Usage

# Setup: Get API key from https://aistudio.google.com/app/apikey
echo "GEMINI_API_KEY=your-actual-api-key-here" > .env

# Basic usage - process PDFs and rename them
./run.sh ./documents ./organized

# Extract metadata only (no file copying, saves JSON to source dir)
./run.sh ./documents ./results --no-copy

# Process only first page (faster/cheaper)
./run.sh ./documents ./organized --max-pages 1

# Use Pro model (gemini-2.5-pro-preview-06-05) for better accuracy
./run.sh ./documents ./organized --model pro

# Automation mode (skip confirmations)
./run.sh ./documents ./organized --force

# Combine options
./run.sh ./papers ./organized --max-pages 1 --force
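
The setup step above writes a single `KEY=value` line to `.env`. How the tool reads that file is not documented here; the following is a minimal sketch of one way to do it, assuming a plain `KEY=value` format. The function names `load_env_file` and `get_api_key` are hypothetical.

```python
import os


def load_env_file(path=".env"):
    """Parse simple KEY=value lines into a dict, skipping blanks and # comments."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            # Split on the first "=" only, so values may themselves contain "="
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path=".env"):
    """Prefer an already-exported variable, then fall back to the .env file."""
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        return key
    return load_env_file(path).get("GEMINI_API_KEY")
```

In this sketch `get_api_key` returns `None` when the key is missing from both places, and raises `FileNotFoundError` if the variable is unset and the file does not exist.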

📊 Example Results

Input: sample.pdf
Output: 2015 - André Koch Torres Assis - A new method for inductance calculation.pdf

Results are also saved to pdf_metadata_results.json with detailed metadata for each processed file.
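
To illustrate the YEAR - AUTHOR - TITLE.pdf naming scheme, here is a rough sketch of how extracted metadata could be turned into a safe filename. This is not the project's actual code: `build_filename`, the set of stripped characters, the "Unknown" fallback, and the length cap are all assumptions.

```python
import re


def build_filename(year, author, title, max_length=200):
    """Return 'YEAR - AUTHOR - TITLE.pdf' with filesystem-unsafe characters removed."""

    def clean(text):
        # Drop characters that are invalid on common filesystems, then collapse whitespace
        text = re.sub(r'[\\/:*?"<>|]', "", str(text))
        return re.sub(r"\s+", " ", text).strip()

    parts = [clean(part) for part in (year, author, title)]
    # Substitute a placeholder for any field the extraction left empty
    stem = " - ".join(part or "Unknown" for part in parts)
    # Truncate the stem (not the extension) to stay under filesystem name limits
    return stem[:max_length].rstrip() + ".pdf"
```

Called with the metadata from the example above (year 2015, the author name, and the title), this sketch produces the same output filename shown there.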
