WebCrawler

A web crawler that crawls a webpage in BFS order and returns the depth from the origin, the most frequent word, and the number of valid external links on the page.

Instructions:

  1. Install Beautiful Soup using pip install beautifulsoup4
  2. Run BFS_Crawler.py using python BFS_Crawler.py
  3. Enter a valid URL starting with http:// or https://
  4. Enter an integer specifying the maximum number of pages to crawl
  5. Enter an integer specifying the request timeout (in seconds)

The results will be printed on the console and a log.txt file will be generated.
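For orientation, the crawling approach can be sketched roughly as below. This is a minimal illustration of the BFS idea using requests and Beautiful Soup, not the actual code in BFS_Crawler.py; the crawl() function, its parameters, and the example URL are placeholders.

```python
# Minimal sketch of a BFS crawl (illustrative only, not the repository's implementation).
from collections import Counter, deque
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup


def crawl(start_url, max_pages=10, timeout=5):
    """Visit pages breadth-first; report depth, top word, and external link count per page."""
    origin_host = urlparse(start_url).netloc
    queue = deque([(start_url, 0)])  # (url, depth from origin)
    visited = set()

    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        if url in visited:
            continue
        visited.add(url)

        try:
            response = requests.get(url, timeout=timeout)
        except requests.RequestException:
            continue  # skip pages that fail to load within the timeout

        soup = BeautifulSoup(response.text, "html.parser")

        # Most frequent word on the page (naive whitespace tokenisation).
        words = soup.get_text().lower().split()
        top_word = Counter(words).most_common(1)[0][0] if words else ""

        # Count links to other hosts as external; enqueue same-host links for BFS.
        external = 0
        for tag in soup.find_all("a", href=True):
            link = urljoin(url, tag["href"])
            if urlparse(link).scheme not in ("http", "https"):
                continue
            if urlparse(link).netloc != origin_host:
                external += 1
            else:
                queue.append((link, depth + 1))

        print(f"{url} | depth={depth} | top word={top_word!r} | external links={external}")


if __name__ == "__main__":
    crawl("https://example.com", max_pages=5, timeout=5)
```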
