A Python tool for scraping cryptocurrency OHLCV (Open, High, Low, Close, Volume) data from multiple exchanges, with both a command-line interface and a graphical user interface.
- Multi-Exchange Support: Binance and Coinbase Pro
- Multiple Timeframes: From 1 minute to 1 week intervals
- Gap Filling: Automatically fills missing data points
- Dual Interface: Both CLI and GUI versions available
- Flexible Date Ranges: Custom date ranges with quick presets
- CSV Export: Clean, structured data export
- Rate Limiting: Built-in API rate limiting protection
- Error Handling: Robust error handling and logging
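# tkinter is part of the Python standard library and cannot be installed with pip;
# on Debian/Ubuntu it is packaged separately as python3-tk.
pip install requests pandas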
# Basic usage
python crypto_ohlcv_scraper.py -e binance -s BTC-USDT -t 1h -sd 2024-01-01 -ed 2024-01-31
# Advanced usage with custom output
python crypto_ohlcv_scraper.py \
--exchange coinbase \
--symbol ETH-USD \
--timeframe 4h \
--start-date "2024-01-01" \
--end-date "2024-01-31" \
--output my_eth_data.csv \
--no-fill-gaps
python crypto_scraper_gui.py
Parameter | Short | Description | Default | Required |
---|---|---|---|---|
`--exchange` | `-e` | Exchange to scrape from (binance/coinbase) | binance | No |
`--symbol` | `-s` | Trading pair symbol (e.g., BTC-USDT) | - | Yes |
`--timeframe` | `-t` | Candle timeframe | 1h | No |
`--start-date` | `-sd` | Start date (YYYY-MM-DD) | - | Yes |
`--end-date` | `-ed` | End date (YYYY-MM-DD) | - | Yes |
`--output` | `-o` | Output filename | Auto-generated | No |
`--no-fill-gaps` | - | Disable gap filling | False | No |
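
The flags in the table map naturally onto Python's `argparse`. The following is a minimal sketch of how such a parser could be declared, not the scraper's actual source; the function name `build_parser` and the help strings are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags documented in the table."""
    parser = argparse.ArgumentParser(description="Scrape OHLCV candles to CSV")
    parser.add_argument("-e", "--exchange", choices=["binance", "coinbase"],
                        default="binance", help="Exchange to scrape from")
    parser.add_argument("-s", "--symbol", required=True,
                        help="Trading pair symbol, e.g. BTC-USDT")
    parser.add_argument("-t", "--timeframe", default="1h",
                        help="Candle timeframe")
    parser.add_argument("-sd", "--start-date", required=True,
                        help="Start date (YYYY-MM-DD)")
    parser.add_argument("-ed", "--end-date", required=True,
                        help="End date (YYYY-MM-DD)")
    parser.add_argument("-o", "--output", default=None,
                        help="Output filename (auto-generated when omitted)")
    parser.add_argument("--no-fill-gaps", action="store_true",
                        help="Disable gap filling")
    return parser


# Parse the same arguments as the basic usage example.
args = build_parser().parse_args(
    ["-e", "binance", "-s", "BTC-USDT", "-t", "1h",
     "-sd", "2024-01-01", "-ed", "2024-01-31"]
)
print(args.exchange, args.symbol, args.start_date, args.no_fill_gaps)
```

Because `--no-fill-gaps` is a `store_true` flag, it is `False` unless passed, which matches the default listed in the table.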
Timeframe | Description |
---|---|
`1m` | 1 Minute |
`3m` | 3 Minutes |
`5m` | 5 Minutes |
`15m` | 15 Minutes |
`30m` | 30 Minutes |
`1h` | 1 Hour |
`2h` | 2 Hours |
`4h` | 4 Hours |
`6h` | 6 Hours |
`8h` | 8 Hours |
`12h` | 12 Hours |
`1d` | 1 Day |
`3d` | 3 Days |
`1w` | 1 Week |
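
Each timeframe code is a count followed by a unit suffix, so it can be converted to a duration mechanically. The helper below is a hypothetical sketch (the name `timeframe_to_timedelta` does not come from the project) showing one way to do that.

```python
from datetime import timedelta

# Seconds per unit suffix used in the timeframe table.
_UNIT_SECONDS = {"m": 60, "h": 3600, "d": 86400, "w": 604800}


def timeframe_to_timedelta(timeframe: str) -> timedelta:
    """Convert a code such as '15m' or '1w' into a timedelta (hypothetical helper)."""
    if len(timeframe) < 2:
        raise ValueError(f"Unsupported timeframe: {timeframe!r}")
    count, unit = timeframe[:-1], timeframe[-1]
    if not count.isdigit() or unit not in _UNIT_SECONDS:
        raise ValueError(f"Unsupported timeframe: {timeframe!r}")
    return timedelta(seconds=int(count) * _UNIT_SECONDS[unit])


print(timeframe_to_timedelta("4h"))
```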
Binance:
- Popular Pairs: BTC-USDT, ETH-USDT, BNB-USDT, ADA-USDT, XRP-USDT, SOL-USDT, DOT-USDT, DOGE-USDT, AVAX-USDT, LUNA-USDT
- Rate Limit: 1200 requests/minute
- Max Records: 1000 per request
Coinbase Pro:
- Popular Pairs: BTC-USD, ETH-USD, LTC-USD, BCH-USD, ETC-USD, LINK-USD, XLM-USD, ADA-USD, DOT-USD, UNI-USD
- Rate Limit: 10 requests/second
- Max Records: 300 per request
The scraped data is saved in CSV format with the following columns:
timestamp,open,high,low,close,volume
2024-01-01 00:00:00,42000.50,42150.25,41980.10,42100.75,125.50
2024-01-01 01:00:00,42100.75,42200.00,42050.30,42180.45,98.25
...
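
To read the exported file back for analysis, the standard library is enough. The sketch below parses the sample rows shown above from an in-memory string; the function name `parse_ohlcv` is an assumption, and the timestamp format is inferred from the sample.

```python
import csv
import io
from datetime import datetime

SAMPLE = """timestamp,open,high,low,close,volume
2024-01-01 00:00:00,42000.50,42150.25,41980.10,42100.75,125.50
2024-01-01 01:00:00,42100.75,42200.00,42050.30,42180.45,98.25
"""

NUMERIC_COLUMNS = ("open", "high", "low", "close", "volume")


def parse_ohlcv(text_stream):
    """Yield one dict per candle with a datetime timestamp and float values."""
    for row in csv.DictReader(text_stream):
        yield {
            "timestamp": datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S"),
            **{key: float(row[key]) for key in NUMERIC_COLUMNS},
        }


candles = list(parse_ohlcv(io.StringIO(SAMPLE)))
print(candles[0]["timestamp"], candles[0]["close"])
```

For a file on disk, pass `open("my_eth_data.csv", newline="")` instead of the `io.StringIO` object.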
The `config.json` file lets you customize the exchange list, the available timeframes, and the default settings:
{
"exchanges": {
"binance": {
"name": "Binance",
"popular_pairs": ["BTC-USDT", "ETH-USDT", ...]
},
"coinbase": {
"name": "Coinbase Pro",
"popular_pairs": ["BTC-USD", "ETH-USD", ...]
}
},
"timeframes": [...],
"default_settings": {
"exchange": "binance",
"timeframe": "1h",
"fill_gaps": true,
"output_directory": "output"
}
}
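
One way to consume this file is to read `default_settings` and fall back to built-in values for any missing key. The sketch below assumes that behaviour; `load_default_settings` and `FALLBACK_DEFAULTS` are hypothetical names, and the fallback values are copied from the example above.

```python
import json
from pathlib import Path

# Fallback values copied from the "default_settings" block in the example config.
FALLBACK_DEFAULTS = {
    "exchange": "binance",
    "timeframe": "1h",
    "fill_gaps": True,
    "output_directory": "output",
}


def load_default_settings(path="config.json"):
    """Return default_settings from the config file, falling back key by key."""
    config_path = Path(path)
    if not config_path.exists():
        # No config file: use the built-in defaults unchanged.
        return dict(FALLBACK_DEFAULTS)
    with config_path.open(encoding="utf-8") as handle:
        config = json.load(handle)
    return {**FALLBACK_DEFAULTS, **config.get("default_settings", {})}


settings = load_default_settings("config.json")
print(settings["exchange"], settings["timeframe"])
```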
The graphical interface provides:
- Exchange Selection: Dropdown with supported exchanges
- Trading Pair Selection: Auto-populated based on selected exchange
- Date Range Picker: With quick preset buttons (7, 30, 90 days)
- Progress Tracking: Real-time progress bar and status updates
- Results Display: Detailed scraping results and data preview
- File Browser: Easy output file selection
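
Real-time progress updates from a background scrape are commonly implemented by having the worker thread push messages onto a queue that the GUI thread polls. The sketch below shows that pattern without any GUI code; `scrape_worker` and `drain` are hypothetical names, and whether the application uses exactly this approach is an assumption.

```python
import queue
import threading


def scrape_worker(total_batches, progress_queue):
    """Stand-in for the background scrape: report progress after each batch."""
    for done in range(1, total_batches + 1):
        # A real worker would fetch one batch of candles here.
        progress_queue.put(("progress", done / total_batches))
    progress_queue.put(("finished", None))


def drain(progress_queue):
    """Collect every message currently queued without blocking."""
    messages = []
    while True:
        try:
            messages.append(progress_queue.get_nowait())
        except queue.Empty:
            return messages


progress_queue = queue.Queue()
worker = threading.Thread(target=scrape_worker, args=(4, progress_queue), daemon=True)
worker.start()
worker.join()  # a GUI would poll periodically instead of joining
messages = drain(progress_queue)
print(messages[-1])
```

In a tkinter application, `drain` would typically be scheduled repeatedly with the widget method `after(...)`, so that only the main thread touches the progress bar.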
crypto-ohlcv-scraper/
├── crypto_ohlcv_scraper.py   # Main CLI scraper
├── crypto_scraper_gui.py     # GUI application
├── config.json               # Configuration file
├── output/                   # Output directory (auto-created)
├── requirements.txt          # Python dependencies
└── README.md                 # This file
The scraper includes comprehensive error handling for:
- Network Issues: Connection timeouts and API errors
- Rate Limiting: Automatic delays to respect API limits
- Data Validation: Input parameter validation
- File Operations: Output directory creation and file saving
- Threading: Safe GUI updates from background threads
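
As an illustration of the input validation mentioned above, the sketch below checks the two date arguments. It is an assumption about what such a check could look like; `validate_date_range` is a hypothetical name.

```python
from datetime import datetime


def validate_date_range(start_date: str, end_date: str):
    """Parse two YYYY-MM-DD strings and check that they form a valid range."""
    try:
        start = datetime.strptime(start_date, "%Y-%m-%d")
        end = datetime.strptime(end_date, "%Y-%m-%d")
    except ValueError as exc:
        raise ValueError(f"Dates must use YYYY-MM-DD: {exc}") from exc
    if start > end:
        raise ValueError("Start date must not be after end date")
    return start, end


start, end = validate_date_range("2024-01-01", "2024-01-31")
print((end - start).days)
```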
The gap filling feature:
- Identifies missing time periods in the data
- Uses forward fill for OHLC values
- Sets volume to 0 for missing periods
- Maintains data integrity and chronological order
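
The steps above can be sketched without any dependencies. This is an illustrative version, assuming candles are already sorted by timestamp and that "forward fill" means copying the previous row's OHLC values; the project itself may implement this differently (for example with pandas), and `fill_gaps` is a hypothetical name.

```python
from datetime import datetime, timedelta


def fill_gaps(candles, interval):
    """Return a new list in which missing periods are filled.

    Filled rows copy the previous row's open/high/low/close and get volume 0.
    """
    if not candles:
        return []
    filled = [candles[0]]
    for candle in candles[1:]:
        expected = filled[-1]["timestamp"] + interval
        while expected < candle["timestamp"]:
            previous = filled[-1]
            filled.append({**previous, "timestamp": expected, "volume": 0.0})
            expected += interval
        filled.append(candle)
    return filled


raw = [
    {"timestamp": datetime(2024, 1, 1, 0), "open": 42000.5, "high": 42150.25,
     "low": 41980.1, "close": 42100.75, "volume": 125.5},
    # The 01:00 and 02:00 candles are missing.
    {"timestamp": datetime(2024, 1, 1, 3), "open": 42180.45, "high": 42300.0,
     "low": 42100.0, "close": 42250.0, "volume": 80.0},
]
filled = fill_gaps(raw, timedelta(hours=1))
print(len(filled), filled[1]["timestamp"], filled[1]["volume"])
```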
python crypto_ohlcv_scraper.py -e binance -s BTC-USDT -t 1d -sd 2024-01-01 -ed 2024-01-31
python crypto_ohlcv_scraper.py -e coinbase -s ETH-USD -t 1h -sd 2024-01-15 -ed 2024-01-20
python crypto_ohlcv_scraper.py -e binance -s SOL-USDT -t 4h -sd 2024-01-01 -ed 2024-01-31 -o solana_4h_january.csv
The scraper automatically handles rate limiting:
- Binance: 0.1 second delay between requests
- Coinbase: Built-in respect for API limits
- Batch Processing: Large date ranges are split into smaller chunks
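
Batching and pacing can be sketched as follows. The window size follows from the candle interval and the per-request record limit (1000 for Binance, per the limits listed earlier), and the 0.1 second pause is the Binance delay stated above. The function names and the `fetch_batch` callback are hypothetical, and no real API call is made.

```python
import time
from datetime import datetime, timedelta


def split_into_batches(start, end, interval, max_records):
    """Yield half-open (batch_start, batch_end) windows of at most max_records candles."""
    span = interval * max_records
    cursor = start
    while cursor < end:
        batch_end = min(cursor + span, end)
        yield cursor, batch_end
        cursor = batch_end


def fetch_all(batches, fetch_batch, delay_seconds=0.1):
    """Call fetch_batch for each window, sleeping between consecutive requests."""
    rows = []
    for index, (batch_start, batch_end) in enumerate(batches):
        if index:
            time.sleep(delay_seconds)  # pause before every request after the first
        rows.extend(fetch_batch(batch_start, batch_end))
    return rows


# One month of 1-minute candles, 1000 records per request.
batches = list(split_into_batches(
    datetime(2024, 1, 1), datetime(2024, 1, 31), timedelta(minutes=1), 1000
))
print(len(batches), batches[0], batches[-1])
```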
Comprehensive logging includes:
- Request timestamps and data counts
- Error messages with context
- Gap filling statistics
- File save confirmations
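
A minimal logging setup that would produce messages of this kind is sketched below. The logger name, format string, and message wording are assumptions, not the project's actual output.

```python
import logging

# The logger name is an assumption chosen to match the CLI script's filename.
logger = logging.getLogger("crypto_ohlcv_scraper")


def configure_logging(level=logging.INFO):
    """Send timestamped log records to the console."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )


configure_logging()
logger.info("Fetched %d candles for %s", 1000, "BTC-USDT")
logger.info("Filled %d gaps", 3)
logger.info("Saved %s", "output/example.csv")
```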
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
This tool is for educational and research purposes. Please:
- Respect exchange API terms of service
- Use appropriate rate limiting
- Be mindful of data usage costs
- Verify data accuracy before using it for trading decisions
1. "No data retrieved"
- Check symbol format (BTC-USDT for Binance, BTC-USD for Coinbase)
- Verify date range is valid
- Ensure exchange supports the trading pair
2. "Rate limit exceeded"
- Reduce date range
- Increase delays between requests
- Try again after waiting period
3. "GUI not opening"
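- Ensure tkinter is available: it ships with Python and cannot be installed with pip. On Debian/Ubuntu install the `python3-tk` package, then run `python -m tkinter` to confirm that a test window opens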
- Check Python version compatibility
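
For the "Rate limit exceeded" case, waiting and retrying can be automated with exponential backoff. The sketch below is a generic pattern rather than the scraper's own behaviour; `RateLimitError` and `call_with_backoff` are hypothetical names, and the `sleep` parameter exists only so the wait can be replaced in tests.

```python
import time


class RateLimitError(Exception):
    """Hypothetical exception a fetch function raises when the API throttles it."""


def call_with_backoff(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry fetch(), doubling the wait after each rate-limit failure."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)


# Demonstration with a fake fetch that is throttled twice before succeeding.
outcomes = iter([RateLimitError(), RateLimitError(), ["candle"]])
waits = []


def fake_fetch():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


result = call_with_backoff(fake_fetch, sleep=waits.append)
print(result, waits)
```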
For issues and questions:
- Open an issue on GitHub
- Check existing issues for solutions
- Provide detailed error messages and steps to reproduce
- Additional exchange support (Kraken, Bitfinex)
- Real-time data streaming
- Data visualization features
- Database storage options
- API key authentication
- WebSocket support
- Docker containerization
Happy Trading!