A Python package that downloads the images found on a webpage. Built with Selenium, it automates image scraping, scrolls to load dynamic content, and provides error handling and detailed logging.
- Dynamic Content Handling: Automatically scrolls through pages to load dynamic images
- Robust Error Handling: Comprehensive error catching and logging
- URL Validation: Checks that each image URL is valid before downloading
- Customizable Save Locations: Organizes downloaded images into folders based on page titles
- Detailed Logging: Provides comprehensive logging of all operations
- Type Safety: Full type hints for better code reliability
- Resource Management: Proper cleanup of browser resources
- Progress Tracking: Returns list of successfully downloaded files
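To make the URL validation feature concrete, here is a minimal sketch of the kind of check involved. It is not the package's actual implementation: the helper name `looks_like_valid_image_url` is hypothetical, and the rule used here (absolute `http`/`https` URL with a host) is an assumption about what "valid" means.

```python
from urllib.parse import urlparse


# Hypothetical helper for illustration; not part of LpImagesDownloader's API.
def looks_like_valid_image_url(url: str) -> bool:
    """Return True if url is an absolute http(s) URL that names a host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# Made-up candidate values of the sort found in <img src="..."> attributes.
candidates = [
    "https://example.com/images/1.jpg",
    "data:image/png;base64,AAAA",
    "/relative/path.png",
    "",
]
valid = [u for u in candidates if looks_like_valid_image_url(u)]
print(valid)
```

Under this assumed rule, inline `data:` URIs and relative paths are skipped rather than downloaded; a real implementation might instead resolve relative paths against the page URL.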
Install the package from PyPI using pip:
pip install LpImagesDownloader
- Python 3.7+
- Chrome browser installed
- Internet connection
from LpImagesDownloader import download_images
# Download images from a webpage, scrolling 3 times to load dynamic content
downloaded_files = download_images("https://en.wikipedia.org/wiki/India", 3)
print(f"Successfully downloaded {len(downloaded_files)} images")
from LpImagesDownloader import download_images
import logging
# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
try:
    # Download images with custom scroll count
    downloaded_files = download_images("https://example.com", 5)
    print(f"Successfully downloaded {len(downloaded_files)} images")
    # Process downloaded files
    for file_path in downloaded_files:
        print(f"Downloaded: {file_path}")
except Exception as e:
    logging.error(f"Failed to download images: {e}")
2024-03-14 10:30:15 - INFO - Setting up environment...
2024-03-14 10:30:16 - INFO - Loading URL: https://example.com
2024-03-14 10:30:17 - INFO - Running operations in the background...
2024-03-14 10:30:18 - INFO - Scrolling page 1...
2024-03-14 10:30:20 - INFO - Scrolling page 2...
2024-03-14 10:30:22 - INFO - Scrolling page 3...
2024-03-14 10:30:23 - INFO - Total detected images on page: 25
2024-03-14 10:30:24 - INFO - Downloading 1.jpg...
2024-03-14 10:30:25 - INFO - Downloading 2.jpg...
...
2024-03-14 10:30:35 - INFO - Total images downloaded: 25
2024-03-14 10:30:35 - INFO - You can view the saved images at: /path/to/Saved Images/Example
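Because `download_images` returns the list of downloaded files, the result can be post-processed with ordinary Python. The sketch below assumes the list holds file paths as strings; `count_by_extension` is a hypothetical helper, and `sample_files` is made-up data standing in for a real return value.

```python
from collections import Counter
from pathlib import Path
from typing import Dict, Iterable


# Hypothetical helper for illustration; not provided by the package.
def count_by_extension(paths: Iterable[str]) -> Dict[str, int]:
    """Count files per lowercased extension, e.g. '.jpg'."""
    return dict(Counter(Path(p).suffix.lower() for p in paths))


# Stand-in for the list returned by download_images (values are made up).
sample_files = [
    "Saved Images/Example/1.jpg",
    "Saved Images/Example/2.JPG",
    "Saved Images/Example/3.png",
]
summary = count_by_extension(sample_files)
print(summary)
```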
The package uses sensible defaults but can be customized:
- Headless Mode: Browser runs in headless mode by default
- Timeout: Default page load timeout is 30 seconds
- Save Location: Images are saved in a "Saved Images" directory with subdirectories based on page titles
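As an illustration of the save-location default, the following sketch derives a folder under "Saved Images" from a page title. It is an assumption about how such a mapping could work, not the package's code: `folder_for_title` is a hypothetical name, and the sanitizing rule and the "untitled" fallback are illustrative choices.

```python
import re
from pathlib import Path


# Hypothetical helper for illustration; the package's real naming rule may differ.
def folder_for_title(title: str, base: str = "Saved Images") -> Path:
    """Map a page title to a subdirectory of base, replacing unsafe characters."""
    # Replace characters that are invalid in file names on common filesystems.
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", title).strip(" .")
    # Fall back to a fixed name when nothing usable is left (assumed behavior).
    return Path(base) / (cleaned or "untitled")


target = folder_for_title("Example")
print(target)
```

Calling `target.mkdir(parents=True, exist_ok=True)` would then create the directory if it does not already exist.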
- Clone the repository:
git clone https://github.com/LpCodes/LP-All-Images-Downloader.git
cd LP-All-Images-Downloader
- Install development dependencies:
pip install -r requirements.txt
- Run tests:
python -m pytest
Created and maintained by @LpCodes.
This project is licensed under the MIT License. Feel free to use and modify it as needed.
Contributions are welcome! Here's how you can contribute:
- Fork the repository
- Create a new branch:
git checkout -b feature-name
- Make your changes and commit them:
git commit -m 'Add feature-name'
- Push to your branch:
git push origin feature-name
- Open a pull request and describe your changes
- Follow PEP 8 style guide
- Add type hints to all functions
- Include docstrings for all functions
- Add tests for new features
- Update documentation as needed
- Some websites may block automated browsers
- Very large pages may require more memory
- Some dynamic content may not load properly
Have ideas to improve the package or documentation? Open an issue on the GitHub repository.
- Bug Tracker: Report Issues
- Source Code: GitHub Repository
- Documentation: Read the Docs
Thank you for using LP Images Downloader! Your feedback helps make this project better. 😊