RoboFinder

RoboFinder is a Bash tool designed to extract unique paths from archived robots.txt files of a target domain using the Wayback Machine. It automates the process of fetching robots.txt files from different timestamps, extracting the disallowed paths, and presenting a unique, sorted list of those paths.
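Under the hood, this kind of lookup relies on the Wayback Machine's CDX API. As a rough illustration (a minimal sketch of the approach, not necessarily the exact commands used in RoboFinder.sh), listing the archived snapshots and fetching each one could look like this:

    # Query the CDX API for robots.txt snapshots that returned HTTP 200,
    # deduplicated by content digest, then fetch each snapshot's body.
    domain="target.tld"
    curl -s "https://web.archive.org/cdx/search/cdx?url=${domain}/robots.txt&output=json&filter=statuscode:200&fl=timestamp&collapse=digest" \
        | jq -r '.[1:][] | .[0]' \
        | while read -r ts; do
              curl -s "https://web.archive.org/web/${ts}/https://${domain}/robots.txt"
          done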

Features

  • Fetch Archived robots.txt Files: Retrieves robots.txt files with a status code of 200 from the Wayback Machine.
  • Extract Disallowed Paths: Extracts all Disallow paths specified in the robots.txt files.
  • Unique and Sorted Output: Outputs a unique and sorted list of paths to a text file (see the sketch after this list).
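The extraction and sorting steps in the last two features boil down to a short text-processing pipeline. A minimal sketch (the exact expressions in RoboFinder.sh may differ):

    # Pull the path from every Disallow line, case-insensitively,
    # then keep a single sorted copy of each path.
    grep -i '^Disallow:' robots.txt | awk '{print $2}' | sort -u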

Requirements

The following tools must be installed on your system:

  • curl: For making HTTP requests.
  • jq: For processing JSON data.
  • grep: For pattern matching.
  • awk: For text processing.
  • sort: For sorting the output.
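A quick way to confirm that all of these are available on your PATH:

    # Report any of the required tools that cannot be found.
    for tool in curl jq grep awk sort; do
        command -v "$tool" >/dev/null || echo "missing: $tool"
    done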

Installation

  1. Clone this repository and enter its directory:

    git clone https://github.com/Mrterrestrial/RoboFinder.git
    cd RoboFinder

  2. Make the script executable:

    chmod +x RoboFinder.sh

Usage

  • Run the script with the target domain as an argument:

    ./RoboFinder.sh target.tld

This command will generate a file named target.tld_robots_paths.txt containing the unique paths extracted from the robots.txt files.
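For security testing, a natural follow-up is to check which of the archived paths still respond on the live site. A minimal sketch (hypothetical, not part of RoboFinder):

    # Print the current HTTP status code for each extracted path.
    while read -r path; do
        code=$(curl -s -o /dev/null -w '%{http_code}' "https://target.tld${path}")
        echo "${code} ${path}"
    done < target.tld_robots_paths.txt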

License

This project is licensed under the MIT License; see the LICENSE file for details.

Contributing

Feel free to fork the repository and submit pull requests. For any issues or feature requests, please open an issue on GitHub.
