pixelbrow720/dataset_scrape_crypto

📈 Crypto OHLCV Data Scraper

A Python tool for scraping cryptocurrency OHLCV (Open, High, Low, Close, Volume) data from multiple exchanges, with both a command-line interface and a graphical user interface.

✨ Features

  • Multi-Exchange Support: Binance and Coinbase Pro
  • Multiple Timeframes: From 1 minute to 1 week intervals
  • Gap Filling: Automatically fills missing data points
  • Dual Interface: Both CLI and GUI versions available
  • Flexible Date Ranges: Custom date ranges with quick presets
  • CSV Export: Clean, structured data export
  • Rate Limiting: Built-in API rate limiting protection
  • Error Handling: Robust error handling and logging

🚀 Quick Start

Prerequisites

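pip install requests pandas

tkinter is part of the Python standard library and cannot be installed with pip. If it is missing (common on Debian/Ubuntu), install the system package, e.g. sudo apt-get install python3-tk.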

Command Line Usage

# Basic usage
python crypto_ohlcv_scraper.py -e binance -s BTC-USDT -t 1h -sd 2024-01-01 -ed 2024-01-31

# Advanced usage with custom output
python crypto_ohlcv_scraper.py \
  --exchange coinbase \
  --symbol ETH-USD \
  --timeframe 4h \
  --start-date "2024-01-01" \
  --end-date "2024-01-31" \
  --output my_eth_data.csv \
  --no-fill-gaps

GUI Usage

python crypto_scraper_gui.py

📋 Parameters

Command Line Arguments

| Parameter | Short | Description | Default | Required |
|---|---|---|---|---|
| --exchange | -e | Exchange to scrape from (binance/coinbase) | binance | No |
| --symbol | -s | Trading pair symbol (e.g., BTC-USDT) | - | Yes |
| --timeframe | -t | Candle timeframe | 1h | No |
| --start-date | -sd | Start date (YYYY-MM-DD) | - | Yes |
| --end-date | -ed | End date (YYYY-MM-DD) | - | Yes |
| --output | -o | Output filename | Auto-generated | No |
| --no-fill-gaps | - | Disable gap filling | False | No |
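The arguments above map naturally onto Python's argparse module. The following is a minimal sketch of such a parser, not the script's actual source: the function name build_parser is hypothetical, while the flag names, defaults, and required markers are taken from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that mirrors the documented command-line flags."""
    parser = argparse.ArgumentParser(description="Crypto OHLCV scraper (sketch)")
    parser.add_argument("-e", "--exchange", choices=["binance", "coinbase"],
                        default="binance", help="Exchange to scrape from")
    parser.add_argument("-s", "--symbol", required=True,
                        help="Trading pair symbol, e.g. BTC-USDT")
    parser.add_argument("-t", "--timeframe", default="1h", help="Candle timeframe")
    parser.add_argument("-sd", "--start-date", required=True, help="YYYY-MM-DD")
    parser.add_argument("-ed", "--end-date", required=True, help="YYYY-MM-DD")
    parser.add_argument("-o", "--output", default=None,
                        help="Output filename (auto-generated when omitted)")
    parser.add_argument("--no-fill-gaps", action="store_true",
                        help="Disable gap filling")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["-s", "BTC-USDT", "-sd", "2024-01-01", "-ed", "2024-01-31"]
)
print(args.exchange, args.timeframe, args.no_fill_gaps)
```

argparse converts hyphens in long option names to underscores, so the values are available as args.start_date, args.end_date, and args.no_fill_gaps.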

Supported Timeframes

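| Timeframe | Description |
|---|---|
| 1m | 1 Minute |
| 3m | 3 Minutes |
| 5m | 5 Minutes |
| 15m | 15 Minutes |
| 30m | 30 Minutes |
| 1h | 1 Hour |
| 2h | 2 Hours |
| 4h | 4 Hours |
| 6h | 6 Hours |
| 8h | 8 Hours |
| 12h | 12 Hours |
| 1d | 1 Day |
| 3d | 3 Days |
| 1w | 1 Week |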
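When working with the exported data, it is often useful to turn a timeframe string into a duration. The helper below is an illustrative sketch; the name timeframe_to_timedelta and the SUPPORTED list are assumptions built from the table, not part of the scraper's documented API.

```python
from datetime import timedelta

# Timeframes listed in the table above.
SUPPORTED = ["1m", "3m", "5m", "15m", "30m", "1h", "2h", "4h",
             "6h", "8h", "12h", "1d", "3d", "1w"]

# Map the unit suffix to the matching timedelta keyword argument.
_UNIT_KWARG = {"m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def timeframe_to_timedelta(timeframe: str) -> timedelta:
    """Convert a timeframe such as '4h' into a timedelta."""
    if timeframe not in SUPPORTED:
        raise ValueError(f"Unsupported timeframe: {timeframe!r}")
    count, unit = int(timeframe[:-1]), timeframe[-1]
    return timedelta(**{_UNIT_KWARG[unit]: count})


print(timeframe_to_timedelta("4h"))  # 4:00:00
```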

🏢 Supported Exchanges

Binance

  • Popular Pairs: BTC-USDT, ETH-USDT, BNB-USDT, ADA-USDT, XRP-USDT, SOL-USDT, DOT-USDT, DOGE-USDT, AVAX-USDT, LUNA-USDT
  • Rate Limit: 1200 requests/minute
  • Max Records: 1000 per request

Coinbase Pro

  • Popular Pairs: BTC-USD, ETH-USD, LTC-USD, BCH-USD, ETC-USD, LINK-USD, XLM-USD, ADA-USD, DOT-USD, UNI-USD
  • Rate Limit: 10 requests/second
  • Max Records: 300 per request
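The per-request record limits determine how many API calls a date range needs. The sketch below works through that arithmetic; requests_needed is a hypothetical helper, and it assumes the end date is treated as an exclusive midnight boundary, which the README does not specify.

```python
import math
from datetime import datetime, timedelta

# Per-request record limits listed above.
MAX_RECORDS = {"binance": 1000, "coinbase": 300}


def requests_needed(exchange: str, start: datetime, end: datetime,
                    candle: timedelta) -> int:
    """Estimate how many requests are needed to cover [start, end)."""
    candles = math.ceil((end - start) / candle)
    return max(1, math.ceil(candles / MAX_RECORDS[exchange]))


start, end = datetime(2024, 1, 1), datetime(2024, 1, 31)
hourly = timedelta(hours=1)

# 30 days of hourly candles is 720 records.
print(requests_needed("binance", start, end, hourly))   # 1
print(requests_needed("coinbase", start, end, hourly))  # 3
```

At a 1m timeframe the same range holds 43,200 candles, which works out to 44 requests on Binance and 144 on Coinbase Pro, so smaller timeframes are where the rate limits start to matter.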

📊 Data Format

The scraped data is saved in CSV format with the following columns:

timestamp,open,high,low,close,volume
2024-01-01 00:00:00,42000.50,42150.25,41980.10,42100.75,125.50
2024-01-01 01:00:00,42100.75,42200.00,42050.30,42180.45,98.25
...
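A file in this format can be loaded with pandas for further analysis. The example below is a sketch that reads the two sample rows from an in-memory string so that it runs without a real export; with an actual file, pass its path to pd.read_csv instead.

```python
import io

import pandas as pd

# The sample rows shown above, held in memory instead of on disk.
sample = """timestamp,open,high,low,close,volume
2024-01-01 00:00:00,42000.50,42150.25,41980.10,42100.75,125.50
2024-01-01 01:00:00,42100.75,42200.00,42050.30,42180.45,98.25
"""

# Parse the timestamp column and use it as the index.
df = pd.read_csv(io.StringIO(sample), parse_dates=["timestamp"],
                 index_col="timestamp")

# Basic sanity check: each candle's high should be its largest price.
high_ok = (df["high"] >= df[["open", "low", "close"]].max(axis=1)).all()
print(len(df), bool(high_ok))
```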

🔧 Configuration

The config.json file allows you to customize:

{
  "exchanges": {
    "binance": {
      "name": "Binance",
      "popular_pairs": ["BTC-USDT", "ETH-USDT", ...]
    },
    "coinbase": {
      "name": "Coinbase Pro",
      "popular_pairs": ["BTC-USD", "ETH-USD", ...]
    }
  },
  "timeframes": [...],
  "default_settings": {
    "exchange": "binance",
    "timeframe": "1h",
    "fill_gaps": true,
    "output_directory": "output"
  }
}
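One way to consume the default_settings block is to merge it over built-in defaults. The following is a sketch under assumptions: load_settings is a hypothetical function, and the fallback values simply repeat those shown above. How the scraper itself reads config.json is not documented here.

```python
import json
from pathlib import Path

# Fallback values, copied from the "default_settings" block above.
DEFAULTS = {
    "exchange": "binance",
    "timeframe": "1h",
    "fill_gaps": True,
    "output_directory": "output",
}


def load_settings(path: str = "config.json") -> dict:
    """Return DEFAULTS overridden by the file's "default_settings", if present."""
    settings = dict(DEFAULTS)
    config_path = Path(path)
    if config_path.exists():
        with config_path.open(encoding="utf-8") as handle:
            config = json.load(handle)
        settings.update(config.get("default_settings", {}))
    return settings


# A missing file simply yields the built-in defaults.
print(load_settings("does_not_exist.json"))
```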

🖥️ GUI Features

The graphical interface provides:

  • Exchange Selection: Dropdown with supported exchanges
  • Trading Pair Selection: Auto-populated based on selected exchange
  • Date Range Picker: With quick preset buttons (7, 30, 90 days)
  • Progress Tracking: Real-time progress bar and status updates
  • Results Display: Detailed scraping results and data preview
  • File Browser: Easy output file selection

📁 Project Structure

crypto-ohlcv-scraper/
├── crypto_ohlcv_scraper.py    # Main CLI scraper
├── crypto_scraper_gui.py      # GUI application
├── config.json                # Configuration file
├── output/                    # Output directory (auto-created)
├── requirements.txt           # Python dependencies
└── README.md                  # This file

🛡️ Error Handling

The scraper includes comprehensive error handling for:

  • Network Issues: Connection timeouts and API errors
  • Rate Limiting: Automatic delays to respect API limits
  • Data Validation: Input parameter validation
  • File Operations: Output directory creation and file saving
  • Threading: Safe GUI updates from background threads
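A common pattern for transient network failures is to retry with exponential backoff. The README does not say whether the scraper retries, so treat the following as a generic sketch: call_with_retry and its parameters are hypothetical, and the simulated failure stands in for a real HTTP call. With the requests library, one would typically pass requests.RequestException as retry_on.

```python
import time


def call_with_retry(func, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on the given exception type(s) with doubling delays."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error to the caller.
            sleep(base_delay * 2 ** (attempt - 1))


# Simulated fetcher that fails twice before succeeding.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok"


delays = []  # Record the delays instead of actually sleeping.
result = call_with_retry(flaky_fetch, ConnectionError, attempts=3,
                         base_delay=0.5, sleep=delays.append)
print(result, delays)  # ok [0.5, 1.0]
```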

📈 Gap Filling

The gap filling feature:

  • Identifies missing time periods in the data
  • Uses forward fill for OHLC values
  • Sets volume to 0 for missing periods
  • Maintains data integrity and chronological order
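The steps above can be sketched with pandas as follows. This is an illustration rather than the scraper's actual code: fill_gaps is a hypothetical name, and it assumes each OHLC column is forward-filled independently from the previous candle.

```python
import pandas as pd

OHLC = ["open", "high", "low", "close"]


def fill_gaps(df: pd.DataFrame, step: pd.Timedelta) -> pd.DataFrame:
    """Insert rows for missing candles: forward-fill OHLC, set volume to 0."""
    df = df.sort_index()  # Keep chronological order.
    full_index = pd.date_range(df.index[0], df.index[-1], freq=step,
                               name="timestamp")
    filled = df.reindex(full_index)        # Missing periods become NaN rows.
    filled[OHLC] = filled[OHLC].ffill()    # Carry the last known prices forward.
    filled["volume"] = filled["volume"].fillna(0)  # No trades in the gap.
    return filled


# Hourly candles with the 02:00 candle missing.
candles = pd.DataFrame(
    {
        "open": [100.0, 101.0, 103.0],
        "high": [102.0, 104.0, 105.0],
        "low": [99.0, 100.0, 102.0],
        "close": [101.0, 103.0, 104.0],
        "volume": [5.0, 4.0, 6.0],
    },
    index=pd.to_datetime(["2024-01-01 00:00", "2024-01-01 01:00",
                          "2024-01-01 03:00"]),
)

filled = fill_gaps(candles, pd.Timedelta(hours=1))
print(filled)
```

The number of gaps filled is len(filled) - len(candles), which is 1 for this sample.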

🔍 Examples

Example 1: Bitcoin Daily Data for January 2024

python crypto_ohlcv_scraper.py -e binance -s BTC-USDT -t 1d -sd 2024-01-01 -ed 2024-01-31

Example 2: Ethereum Hourly Data from Coinbase

python crypto_ohlcv_scraper.py -e coinbase -s ETH-USD -t 1h -sd 2024-01-15 -ed 2024-01-20

Example 3: Custom Output File

python crypto_ohlcv_scraper.py -e binance -s SOL-USDT -t 4h -sd 2024-01-01 -ed 2024-01-31 -o solana_4h_january.csv

🚨 Rate Limiting

The scraper automatically handles rate limiting:

  • Binance: 0.1 second delay between requests
  • Coinbase: Built-in respect for API limits
  • Batch Processing: Large date ranges are split into smaller chunks
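Batch processing can be pictured as cutting the requested range into windows that each hold at most the exchange's maximum number of candles. The generator below is a sketch under that assumption; iter_chunks is a hypothetical name, and the scraper's own chunking logic may differ.

```python
from datetime import datetime, timedelta


def iter_chunks(start: datetime, end: datetime, candle: timedelta,
                max_records: int):
    """Yield (chunk_start, chunk_end) windows of at most max_records candles."""
    span = candle * max_records
    current = start
    while current < end:
        chunk_end = min(current + span, end)
        yield current, chunk_end
        current = chunk_end


# January 2024 at 1h candles with a 300-record limit (the Coinbase Pro figure).
chunks = list(iter_chunks(datetime(2024, 1, 1), datetime(2024, 1, 31),
                          timedelta(hours=1), 300))
for chunk_start, chunk_end in chunks:
    print(chunk_start, "->", chunk_end)
```

In a real loop, each window would become one API request, followed by a short pause such as time.sleep(0.1) to match the Binance delay listed above.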

📝 Logging

Comprehensive logging includes:

  • Request timestamps and data counts
  • Error messages with context
  • Gap filling statistics
  • File save confirmations

🤝 Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

⚠️ Disclaimer

This tool is for educational and research purposes. Please:

  • Respect exchange API terms of service
  • Use appropriate rate limiting
  • Be mindful of data usage costs
  • Verify data accuracy before relying on it for trading decisions

🆘 Troubleshooting

Common Issues

1. "No data retrieved"

  • Check symbol format (BTC-USDT for Binance, BTC-USD for Coinbase)
  • Verify date range is valid
  • Ensure exchange supports the trading pair
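Symbol format problems can be caught before any request is sent. The check below is a sketch: split_pair is a hypothetical helper that only validates the hyphenated BASE-QUOTE shape used throughout this README, and it does not confirm that an exchange actually lists the pair.

```python
import re

# One or more letters/digits, a hyphen, then one or more letters/digits.
_PAIR_PATTERN = re.compile(r"^[A-Z0-9]+-[A-Z0-9]+$")


def split_pair(symbol: str) -> tuple:
    """Normalize a symbol like 'btc-usdt' and return (base, quote)."""
    normalized = symbol.strip().upper()
    if not _PAIR_PATTERN.match(normalized):
        raise ValueError(
            f"Expected BASE-QUOTE (e.g. BTC-USDT), got {symbol!r}"
        )
    base, quote = normalized.split("-")
    return base, quote


print(split_pair("btc-usdt"))  # ('BTC', 'USDT')
```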

2. "Rate limit exceeded"

  • Reduce date range
  • Increase delays between requests
  • Try again after waiting period

3. "GUI not opening"

  • Ensure tkinter is available. It ships with Python and cannot be installed with pip; on Debian/Ubuntu, run sudo apt-get install python3-tk
  • Check Python version compatibility

Support

For issues and questions:

  • Open an issue on GitHub
  • Check existing issues for solutions
  • Provide detailed error messages and steps to reproduce

🎯 Roadmap

  • Additional exchange support (Kraken, Bitfinex)
  • Real-time data streaming
  • Data visualization features
  • Database storage options
  • API key authentication
  • WebSocket support
  • Docker containerization

Happy Trading! 📊💰
