RoboFinder is a Bash tool designed to extract unique paths from archived `robots.txt` files of a target domain using the Wayback Machine. It automates the process of fetching `robots.txt` files from different timestamps, extracting the disallowed paths, and presenting a unique, sorted list of those paths.
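For context, the first stage of this flow can be reproduced with the same command-line tools the script depends on. The snippet below is a minimal sketch of that stage, not necessarily how RoboFinder.sh implements it: it queries the Wayback Machine CDX API for archived captures of a domain's `robots.txt` (the domain `target.tld` and the file `snapshot_urls.txt` are example names) and prints a fetchable URL for each capture.

```bash
# Minimal sketch of the fetch stage (assumed approach, not copied from RoboFinder.sh):
# list archived robots.txt captures via the Wayback Machine CDX API, keeping only
# captures that returned HTTP 200 and collapsing duplicate captures.
TARGET="target.tld"   # example domain

curl -s "https://web.archive.org/cdx/search/cdx?url=${TARGET}/robots.txt&output=json&filter=statuscode:200&fl=timestamp,original&collapse=digest" \
  | jq -r '.[1:][] | "https://web.archive.org/web/\(.[0])if_/\(.[1])"' \
  > snapshot_urls.txt
```

The `if_` modifier in the snapshot URL asks the Wayback Machine for the raw file content without its replay banner.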
- Fetch Archived robots.txt Files: Retrieves `robots.txt` files with a status code of 200 from the Wayback Machine.
- Extract Disallowed Paths: Extracts all `Disallow` paths specified in the `robots.txt` files (see the sketch after this list).
- Unique and Sorted Output: Outputs a unique and sorted list of paths to a text file.
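To illustrate the last two features, here is a hedged sketch of the extraction stage under the same assumptions as above: `snapshot_urls.txt` is the hypothetical list of capture URLs from the previous snippet, and the output filename matches the one described in the usage instructions below.

```bash
# Sketch of the extraction stage (an assumed approach, not the script's exact code):
# download each archived robots.txt, pull the values of the Disallow directives,
# then emit a unique, sorted list of paths.
while read -r snapshot_url; do
  curl -s "$snapshot_url"
done < snapshot_urls.txt \
  | grep -i '^Disallow:' \
  | awk '{print $2}' \
  | sort -u > target.tld_robots_paths.txt
```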
The following tools must be installed on your system:
- `curl`: For making HTTP requests.
- `jq`: For processing JSON data.
- `grep`: For pattern matching.
- `awk`: For text processing.
- `sort`: For sorting the output.
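Before running the script, it can be useful to confirm that these tools are on your PATH. The loop below is just a convenience check, not part of RoboFinder itself.

```bash
# Convenience pre-flight check (not part of RoboFinder): report any missing dependency.
for tool in curl jq grep awk sort; do
  command -v "$tool" >/dev/null 2>&1 || echo "Missing dependency: $tool"
done
```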
- Clone this repository: `git clone https://github.com/Mrterrestrial/RoboFinder.git`
- Make the script executable: `chmod +x RoboFinder.sh`
- Run the script with the target domain as an argument: `./RoboFinder.sh target.tld`

This command will generate a file named `target.tld_robots_paths.txt` containing the unique paths extracted from the `robots.txt` files.
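A common follow-up is to probe the recovered paths against the live site. The loop below is a hypothetical example reusing the example domain and the generated output file; it prints the HTTP status code returned for each path.

```bash
# Hypothetical follow-up (not part of RoboFinder): check which archived
# disallowed paths still respond on the live site.
while read -r path; do
  status=$(curl -s -o /dev/null -w '%{http_code}' "https://target.tld${path}")
  echo "${status} ${path}"
done < target.tld_robots_paths.txt
```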
This project is licensed under the MIT License; see the MIT License for details.
Feel free to fork the repository and submit pull requests. For any issues or feature requests, please open an issue on GitHub.