Author: watercrawl
Version: 0.5.0
Type: Tool
WaterCrawl is a powerful web crawling tool designed for developers. This plugin allows you to easily crawl websites, extract data, and search the web. It is compatible with WaterCrawl v0.9.* and offers a range of features for efficient data extraction and discovery.
Scrapes a single URL and outputs the content in markdown format with options for including HTML, links, and screenshots.
Initiates a web crawl to extract data from specified URLs with configurable options for including/excluding URL patterns and generating alt text for images.
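To illustrate how include/exclude URL patterns typically narrow a crawl, here is a minimal sketch. It assumes shell-style glob patterns matched against the URL path, with exclusions taking priority; the function name `should_crawl` and its parameters are hypothetical and are not part of the WaterCrawl API, whose actual matching rules may differ.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def should_crawl(url, include_paths=(), exclude_paths=()):
    """Return True if the URL path passes the include/exclude filters.

    Assumption: patterns are shell-style globs matched against the URL path.
    This is an illustration, not WaterCrawl's actual matching logic.
    """
    path = urlparse(url).path or "/"
    # Exclusions win over inclusions.
    if any(fnmatchcase(path, pattern) for pattern in exclude_paths):
        return False
    # With no include patterns, every non-excluded URL is allowed.
    if include_paths:
        return any(fnmatchcase(path, pattern) for pattern in include_paths)
    return True


if __name__ == "__main__":
    for candidate in (
        "https://example.com/blog/post-1",
        "https://example.com/blog/drafts/wip",
        "https://example.com/about",
    ):
        allowed = should_crawl(
            candidate,
            include_paths=["/blog/*"],
            exclude_paths=["/blog/drafts/*"],
        )
        print(candidate, allowed)
```

Checking exclusions first keeps the behavior predictable when a URL matches both lists.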
Manage crawl jobs using a crawl request UUID. You can retrieve the job status, get all results, fetch a single result by its UUID, or cancel an ongoing task.
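A common way to use a job-management tool like this is to poll the job status until it reaches a terminal state. The sketch below shows that pattern in isolation. The `get_status` callable and the status names (`finished`, `failed`, `canceled`) are assumptions made for illustration; substitute whatever your WaterCrawl client actually returns.

```python
import time


def wait_for_job(get_status, job_uuid, interval=2.0, timeout=60.0, sleep=time.sleep):
    """Poll get_status(job_uuid) until it returns a terminal status.

    get_status is a hypothetical callable supplied by the caller; the
    terminal status names below are assumptions, not documented values.
    """
    terminal = {"finished", "failed", "canceled"}
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_uuid)
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_uuid} still '{status}' after {timeout}s")
        sleep(interval)


if __name__ == "__main__":
    # Simulated status sequence standing in for real API responses.
    fake_statuses = iter(["running", "running", "finished"])
    result = wait_for_job(
        lambda uuid: next(fake_statuses),
        "00000000-0000-0000-0000-000000000000",
        interval=0.1,
    )
    print(result)
```

Passing the status function and the sleep function as arguments keeps the loop easy to test without network access.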
Search the web for information using WaterCrawl's search API with configurable options for language, country, time range, and search depth.
Retrieve search results or manage search jobs based on a search request UUID. You can get the status and results of a search, or cancel it.
Generates a sitemap directly from a URL. You can configure it to include subdomains, ignore existing sitemap.xml files, and more.
To install the WaterCrawl plugin, follow these steps:
- Download the latest .difypkg file from the GitHub Releases or from the Dify marketplace.
- Log in to your WaterCrawl account or to your self-hosted WaterCrawl instance. In the dashboard, go to API Keys and create a new key or use an existing one.
- In the Dify plugin management page, go to Plugins and click the + Install button, then enter the API key in the plugin configuration.
To contribute to the WaterCrawl plugin, follow these steps:
- Clone the repository:
git clone https://github.com/watercrawl/watercrawl-dify-plugin.git
- Navigate to the project directory:
cd watercrawl-dify-plugin
- Create a virtual environment:
python -m venv env
- Activate the virtual environment:
source env/bin/activate
- Install dependencies:
pip install -r requirements.txt
- Copy the .env.example file to .env and fill in the necessary values.
- Run the plugin:
python -m main
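If the plugin fails to start, one thing worth checking is whether every key from .env.example has a value in .env. The following is a small sketch of such a check; the helper names `parse_env` and `missing_keys` are hypothetical, the parser handles only simple `KEY=value` lines, and the key names in the usage example are placeholders rather than the plugin's real settings.

```python
def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(example_text, env_text):
    """Return keys listed in the example file that are absent or empty in .env."""
    example = parse_env(example_text)
    actual = parse_env(env_text)
    return sorted(key for key in example if not actual.get(key))


if __name__ == "__main__":
    # Placeholder contents; in practice, read .env.example and .env from disk.
    example = "FIRST_KEY=\nSECOND_KEY=\n"
    env = "FIRST_KEY=some-value\nSECOND_KEY=\n"
    print(missing_keys(example, env))
```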
For support, please contact us at support@watercrawl.dev.
This project is licensed under the MIT License - see the LICENSE file for details.