imgscraper for artists, by an artist: parser - collecting references - description, features and dependencies

      🎨 Description:
        The program automatically parses images (primarily references for artists), then downloads, sorts and filters them. Supported sources:
        - Pinterest (with support for full-size images)
        - Google Images
        - Bing Images

        💡 No third-party APIs are required: collection runs locally through Selenium, webdriver-manager and icrawler.

       🚀 Main features:
      • Search for images by keywords, categories, poses and additional (separate) requests
      • Automatic filtering:
      
      - Check for a person in the frame (YOLO; the trigger parameter is adjustable)
      - Detect a full-length person in the image (the trigger parameter is adjustable)
      - Filter out low-quality files (<10 KB)
      - Filter for black and white images (you can combine Poses + Additional requests = "black and white" + the black and white filter)
      - Check for and remove duplicates via pHash (the trigger parameter is adjustable)
      
      • History: exclude already downloaded images (seen_urls.json)
      • Save full-size URLs in `full_urls.log`
      • Download full-resolution images
      • Parse Pinterest in windowless (headless) mode
      • Index folders for faster duplicate checking
      • Real-time work log on the "Debug" tab
      • Save program settings between launches
      • Select the minimum image size (width/height) when parsing
      • Select a preset for parsing poses (foreshortening / dynamic action / perspective pack)
      • Select a parsing mode for sketches/line art only
      • Full randomization of the search when parsing (helpful when looking for new material through search engines)
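
The pHash duplicate check in the list above compares perceptual hashes rather than file bytes. The block below is a rough sketch of that comparison step only, not the program's actual code. It assumes each image already has a 64-bit hash stored as a 16-character hex string, which is what `str(imagehash.phash(img))` produces with default settings. Two images count as duplicates when the Hamming distance between their hashes is at or below a threshold. The function names and the default threshold of 5 are hypothetical.

```python
from __future__ import annotations


def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count differing bits between two equal-length hex-encoded hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def is_duplicate(new_hash: str, known_hashes: list[str], threshold: int = 5) -> bool:
    """Return True if new_hash is within `threshold` bits of any known hash."""
    return any(hamming_distance(new_hash, h) <= threshold for h in known_hashes)


def deduplicate(hashes_by_file: dict[str, str], threshold: int = 5) -> list[str]:
    """Keep the first file of each near-duplicate group, in insertion order."""
    kept_files: list[str] = []
    kept_hashes: list[str] = []
    for name, file_hash in hashes_by_file.items():
        if not is_duplicate(file_hash, kept_hashes, threshold):
            kept_files.append(name)
            kept_hashes.append(file_hash)
    return kept_files
```

In this sketch a lower threshold keeps more near-identical images, and a higher one removes resized or lightly edited copies more aggressively.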

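The history feature listed above can be pictured as a set of URLs persisted in `seen_urls.json`. The following is a minimal sketch under stated assumptions, not the program's actual code: the JSON layout (a flat list of URL strings) and the helper names `load_seen`, `filter_new` and `save_seen` are hypothetical.

```python
from __future__ import annotations

import json
from pathlib import Path


def load_seen(path: Path) -> set[str]:
    """Load previously seen URLs; a missing file means an empty history."""
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def filter_new(urls: list[str], seen: set[str]) -> list[str]:
    """Return URLs not yet in the history and record them in `seen`.

    Repeats inside the same batch are dropped as well.
    """
    fresh: list[str] = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh


def save_seen(path: Path, seen: set[str]) -> None:
    """Write the history back as a sorted JSON list."""
    path.write_text(json.dumps(sorted(seen), indent=2), encoding="utf-8")
```

A typical run in this sketch would call `load_seen` once at startup, pass every batch of search results through `filter_new`, and call `save_seen` before exiting.
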
       📝 Notes:
        - The first launch may take a long time if some dependencies (such as YOLO or webdriver-manager) are not yet present on the device

       ⚡ Future improvements:
        - New classes to search (animals and items)
        - Filter by precise poses and number of people in image
        - Pose estimation (keypoint detection) to check that the frame contains visible limbs or a pose
        - Automatic model selection: the standard yolov8n.pt for speed on general tasks, or yolov8m.pt for quality
        - Visualization and filter preview - open a window with examples of recognized frames, highlighted boxes and statistics (how many were filtered out, how many were left)
        - Asynchronous loading + processing
        - Advanced content analysis (YOLO + Pose Estimation + OCR + scene recognition: selection by facial expression/body position)
        - Watermark removal, filters by main (dominant) color and composition
        - Automatic sorting by categories/tags
        - Built-in ban protection: User-Agent rotation, random delays and random browser emulation
        - Export results to ZIP/JSON
        - GUI and UX: a redesign in a Figma/Notion-like style

        👨‍💻 Authors:
        - Logic developer and tester: Hara
        - Development and code: ChatGPT 
      
       📦 Dependencies:
        - Python 3.x
        - PyQt5
        - Selenium
        - webdriver-manager
        - icrawler
        - pillow
        - requests
        - beautifulsoup4
        - imagehash
        - ultralytics

# imgscraper for artists: parser - collecting references - setup

      1. Clone the repository
         git clone https://github.com/rambaeharambae/img_drawing_reference_scraper.git
      2. Install dependencies
         pip install -r requirements.txt
      3. Run via Python
         python imgscraper.py
      4. Build a standalone .exe file
         pyinstaller --onefile --noconsole imgscraper.py

# imgscraper for artists: parser - collecting references - download

      Go to the Releases page and download the .exe file.

# imgscraper for artists: parser - collecting references - known bugs

      If the results contain too many unwanted images (for example, heads instead of full-body figures), change the full-body ratio threshold, use additional requests, or do both.
      ---
      The first launch can take a long time if some dependencies are missing.
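
To make the full-body ratio threshold mentioned above more concrete, here is a small sketch of one way such a check could work. It is an assumption, not the program's actual logic: the ratio is taken to be the height-to-width ratio of a detected person box, and the `Box` type, the function names and the default of 1.8 are all hypothetical. The boxes would come from a person detector such as YOLO.

```python
from __future__ import annotations

from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixels; (x1, y1) is the top-left corner."""
    x1: float
    y1: float
    x2: float
    y2: float


def body_ratio(box: Box) -> float:
    """Height-to-width ratio of a detection box; 0.0 for a degenerate box."""
    width = box.x2 - box.x1
    height = box.y2 - box.y1
    if width <= 0 or height <= 0:
        return 0.0
    return height / width


def looks_full_body(boxes: list[Box], threshold: float = 1.8) -> bool:
    """True if any person box is at least `threshold` times taller than wide."""
    return any(body_ratio(b) >= threshold for b in boxes)
```

Under this reading of the setting, raising the threshold rejects more head-and-shoulders crops, at the cost of also rejecting seated or crouching poses.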
