@lkotlus lkotlus commented Aug 2, 2024

Feature Description

  • Optimizations: some general improvements, plus a hash set of visited URLs so that the same URL is never crawled more than once. Repeat visits frequently led to infinite crawling.
  • Bug fixes: fixed potential issues in how domain restrictions were handled.
  • Out of scope paths and domains: users can now enter domains and paths that are out of the scope of the scan (useful for pentests).
  • Headless browser support: requests can now be made through a headless browser rather than only with requests.get(). This is more thorough, because the page is actually rendered and its dynamic content becomes accessible. It does lead to longer waits, but can be worth it depending on how the target site is put together. In the future, user-like interaction with the site could be implemented. This feature was implemented using Selenium.
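
To illustrate the visited-set and out-of-scope ideas described above, here is a minimal sketch. It is not the code from this pull request: the names `normalize`, `in_scope`, `crawl`, and the `get_links` callback are hypothetical, and the prefix-based path matching is an assumed simplification. The fetching step is passed in as a function, so either a requests-based or a headless-browser fetcher could be plugged in.

```python
from collections import deque
from urllib.parse import urldefrag, urlparse


def normalize(url):
    """Drop the #fragment so the same page is not queued under two names."""
    return urldefrag(url).url


def in_scope(url, excluded_domains, excluded_paths):
    """Return False for URLs on an excluded domain or under an excluded path.

    Assumption: paths are matched by simple prefix, so "/admin" also
    excludes "/administrator". A real crawler may want stricter rules.
    """
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    for domain in excluded_domains:
        # Match the domain itself and any of its subdomains.
        if host == domain or host.endswith("." + domain):
            return False
    path = parts.path or "/"
    return not any(path.startswith(prefix) for prefix in excluded_paths)


def crawl(start_url, get_links, excluded_domains=(), excluded_paths=()):
    """Breadth-first crawl that visits each in-scope URL at most once.

    get_links is a hypothetical callback: given a URL, it returns the
    links found on that page.
    """
    visited = set()  # hash set: membership checks are O(1) on average
    queue = deque([normalize(start_url)])
    order = []
    while queue:
        url = queue.popleft()
        if url in visited or not in_scope(url, excluded_domains, excluded_paths):
            continue
        visited.add(url)
        order.append(url)
        for link in get_links(url):
            link = normalize(link)
            if link not in visited:
                queue.append(link)
    return order
```

Because every URL is added to `visited` before its links are expanded, pages that link back to each other cannot trap the crawler in a loop, and out-of-scope URLs are skipped before any request would be made.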

Checklist

  • I wrote at least some documentation for this feature.

Checklist

  • This Pull will not add the same thing as another currently-open request.
  • Your Pull was made against the rivermont:dev branch and not rivermont:master.
  • This Pull does not commit any keys, passwords, personal data, or other private information.
  • I updated lines 20 and 21 in the README to reflect any changes I made.
