A web crawler that crawls web pages in BFS (breadth-first search) order and reports, for each page, its depth from the origin, its most frequent word, and the number of valid external links on the page.
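As a rough illustration of the BFS order and the depth value described above, here is a minimal sketch. It is not the code in BFS_Crawler.py: the names `bfs_crawl` and `get_links` are hypothetical, and `get_links` stands in for whatever step fetches a page and extracts its links.

```python
from collections import deque


def bfs_crawl(start_url, get_links, max_pages):
    """Visit pages breadth-first and return [(url, depth), ...] in crawl order.

    get_links is a hypothetical callable mapping a URL to the list of URLs
    found on that page; a real crawler would fetch and parse the page here.
    """
    depths = {start_url: 0}      # depth of every URL discovered so far
    queue = deque([start_url])   # FIFO queue gives breadth-first order
    crawled = []

    while queue and len(crawled) < max_pages:
        url = queue.popleft()
        crawled.append((url, depths[url]))
        for link in get_links(url):
            if link not in depths:           # skip URLs already discovered
                depths[link] = depths[url] + 1
                queue.append(link)
    return crawled


# Usage with an in-memory link graph instead of real HTTP requests.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d", "a"], "d": []}
print(bfs_crawl("a", lambda u: graph.get(u, []), max_pages=10))
```

Because the queue is first-in, first-out, every page at depth 1 is visited before any page at depth 2, which is what makes the recorded depth the shortest link distance from the origin.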
Instructions:
- Install Beautiful Soup using
pip install beautifulsoup4
- Run BFS_Crawler.py using
python BFS_Crawler.py
- Enter a valid URL starting with http:// or https://
- Enter an integer for the maximum number of pages to crawl
- Enter an integer for the timeout (in seconds)
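The three prompts above could be validated along the following lines. This is only a sketch: `parse_inputs` is a hypothetical helper, not a function from BFS_Crawler.py, and the requirement that both integers be positive is an assumption.

```python
from urllib.parse import urlparse


def parse_inputs(url, max_pages, timeout):
    """Validate the three prompt answers; raise ValueError on bad input."""
    url = url.strip()
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("URL must start with http:// or https://")

    pages = int(max_pages)    # int() raises ValueError for non-integer text
    seconds = int(timeout)
    if pages <= 0 or seconds <= 0:   # assumption: both limits must be positive
        raise ValueError("page limit and timeout must be positive integers")
    return url, pages, seconds


print(parse_inputs("https://example.com", "5", "10"))
```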
Results are printed to the console, and a log.txt file is generated.
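For readers curious how the per-page figures might be derived, the sketch below computes a most frequent word and an external link count from one page's HTML. It uses the standard library's html.parser rather than Beautiful Soup so that it runs without extra packages. `page_stats` is a hypothetical name, and treating a link as "valid external" when it resolves to an http(s) URL on a different host is an assumption; the actual script may apply other checks.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _PageParser(HTMLParser):
    """Collect href values and visible text, ignoring script/style bodies."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []
        self._skip = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.text.append(data)


def page_stats(html, page_url):
    """Return (most_frequent_word, external_link_count) for one page."""
    parser = _PageParser()
    parser.feed(html)

    words = re.findall(r"[a-z']+", " ".join(parser.text).lower())
    top_word = Counter(words).most_common(1)[0][0] if words else None

    host = urlparse(page_url).netloc
    external = set()
    for href in parser.links:
        target = urlparse(urljoin(page_url, href))  # resolve relative links
        # Assumption: "external" means an http(s) URL on a different host.
        if target.scheme in ("http", "https") and target.netloc and target.netloc != host:
            external.add(target.geturl())
    return top_word, len(external)
```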