
# UrlRobotsChecker

Check whether a given URL is allowed to be scraped according to the target website's robots.txt rules.

## Usage

```python
from url_robots_checker import UrlRobotsChecker

url = 'foobar.com'
url_robots_checker = UrlRobotsChecker(url)

# can_fetch() returns False when the site's robots.txt disallows the URL
if not url_robots_checker.can_fetch(url):
    print("Not allowed to scrape")
```
