A web crawler for building automated web scraping workflows from sequences of structured, typed actions. It is written in TypeScript and integrates proxy support and captcha solving.
- Structured Action Format: Define web crawling sequences using typed actions
- Captcha Handling: Integrated captcha solving capabilities
- Proxy Support: Built-in integration with Bright Data for reliable proxy services
- Type Safety: Fully typed actions for better development experience
- Docker Support: Ready for containerized deployment
The crawler supports various action types to handle different web automation scenarios:
type BotActionClass = 'captcha' | 'debug' | 'default';
type BotActionCaptchaType = 'wait' | 'solve' | 'disableAutoSolve';
type BotActionDebugType = 'screenshot' | 'url';

type BotActionType =
  | 'form'
  | 'click'
  | 'delay'
  | 'goto'
  | 'input'
  | 'download'
  | BotActionCaptchaType
  | BotActionDebugType;

type BotAction = {
  type: BotActionType;
  cat?: BotActionClass;
  selector?: string;
  value?: string;
  timeout?: number;
  validationURL?: string;
};
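Because actions arrive over HTTP as untyped JSON, the compile-time types above do not protect the crawler at runtime. The following is a minimal sketch of a runtime check derived only from the `BotAction` shape shown above. The names `isBotAction`, `optionalOfType`, `ACTION_TYPES`, and `ACTION_CLASSES` are hypothetical and not part of this project; the crawler may validate input differently or not at all.

```typescript
type BotActionClass = 'captcha' | 'debug' | 'default';
type BotActionCaptchaType = 'wait' | 'solve' | 'disableAutoSolve';
type BotActionDebugType = 'screenshot' | 'url';
type BotActionType =
  | 'form' | 'click' | 'delay' | 'goto' | 'input' | 'download'
  | BotActionCaptchaType
  | BotActionDebugType;

type BotAction = {
  type: BotActionType;
  cat?: BotActionClass;
  selector?: string;
  value?: string;
  timeout?: number;
  validationURL?: string;
};

// Runtime mirrors of the string unions above (hypothetical constants).
const ACTION_TYPES: readonly string[] = [
  'form', 'click', 'delay', 'goto', 'input', 'download',
  'wait', 'solve', 'disableAutoSolve', 'screenshot', 'url',
];
const ACTION_CLASSES: readonly string[] = ['captcha', 'debug', 'default'];

// True when an optional field is absent or has the expected primitive type.
function optionalOfType(value: unknown, expected: 'string' | 'number'): boolean {
  return value === undefined || typeof value === expected;
}

// Type guard: narrows unknown JSON to BotAction when the shape matches.
// It checks field types only, not whether a given action needs a selector or value.
function isBotAction(input: unknown): input is BotAction {
  if (typeof input !== 'object' || input === null) return false;
  const a = input as Record<string, unknown>;
  const typeOk = typeof a.type === 'string' && ACTION_TYPES.indexOf(a.type) !== -1;
  const catOk =
    a.cat === undefined ||
    (typeof a.cat === 'string' && ACTION_CLASSES.indexOf(a.cat) !== -1);
  return (
    typeOk &&
    catOk &&
    optionalOfType(a.selector, 'string') &&
    optionalOfType(a.value, 'string') &&
    optionalOfType(a.timeout, 'number') &&
    optionalOfType(a.validationURL, 'string')
  );
}
```

A caller could filter a parsed request body with `payload.filter(isBotAction)` or reject the whole request when any element fails the check.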
- Clone the repository
- Install dependencies:
- Create a .env file in the root directory with your Bright Data credentials:
- Build the project:
- Build the Docker image:
- Run the container:
Send a POST request to the crawler with an array of actions. Example:
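As an illustration, the sketch below builds such a request in TypeScript. Several details are assumptions rather than documented behavior: the URL in `CRAWLER_URL` (this README does not state the host, port, or path), the helper name `buildCrawlRequest`, the selectors, and the choice to carry the target URL in `value` and the delay duration in `timeout`. Adjust them to match the actual service.

```typescript
type BotActionClass = 'captcha' | 'debug' | 'default';
type BotActionType =
  | 'form' | 'click' | 'delay' | 'goto' | 'input' | 'download'
  | 'wait' | 'solve' | 'disableAutoSolve'
  | 'screenshot' | 'url';

type BotAction = {
  type: BotActionType;
  cat?: BotActionClass;
  selector?: string;
  value?: string;
  timeout?: number;
  validationURL?: string;
};

// Assumption: placeholder address; replace with the real crawler endpoint.
const CRAWLER_URL = 'http://localhost:3000/';

type CrawlRequest = {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
};

// Hypothetical helper: serializes the action array as the JSON request body.
function buildCrawlRequest(actions: BotAction[]): CrawlRequest {
  return {
    url: CRAWLER_URL,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(actions),
    },
  };
}

// Illustrative sequence: open a page, fill a field, submit, wait, capture.
const actions: BotAction[] = [
  { type: 'goto', value: 'https://example.com/login' },
  { type: 'input', selector: '#username', value: 'demo-user' },
  { type: 'click', selector: 'button[type="submit"]' },
  { type: 'delay', timeout: 2000 },
  { type: 'screenshot', cat: 'debug' },
];

const request = buildCrawlRequest(actions);

// To send it from Node.js 18+ (which provides a global fetch):
//   const response = await fetch(request.url, request.init);
```

Keeping request construction separate from the network call makes the payload easy to inspect or log before anything is sent.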
- default: Basic web automation actions
- captcha: Captcha-related operations
- debug: Debugging and monitoring actions
- goto: Navigate to a URL
- click: Click on an element
- input: Enter text into a field
- form: Submit a form
- delay: Wait for a specified duration
- download: Download a file
- solve: Solve a captcha
- wait: Wait for a captcha
- disableAutoSolve: Disable automatic captcha solving
- screenshot: Take a screenshot
- url: Get current URL
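The grouping above, with three categories and the action types that belong to each, can be expressed as a lookup from action type to category. This is a sketch under the assumption that the category follows directly from the type as listed; `categoryOf` and `withCategory` are hypothetical helpers, and the crawler itself may require `cat` to be set explicitly.

```typescript
type BotActionClass = 'captcha' | 'debug' | 'default';
type BotActionCaptchaType = 'wait' | 'solve' | 'disableAutoSolve';
type BotActionDebugType = 'screenshot' | 'url';
type BotActionType =
  | 'form' | 'click' | 'delay' | 'goto' | 'input' | 'download'
  | BotActionCaptchaType
  | BotActionDebugType;

type BotAction = {
  type: BotActionType;
  cat?: BotActionClass;
  selector?: string;
  value?: string;
  timeout?: number;
  validationURL?: string;
};

// Maps an action type to the category it is listed under.
function categoryOf(type: BotActionType): BotActionClass {
  switch (type) {
    case 'wait':
    case 'solve':
    case 'disableAutoSolve':
      return 'captcha';
    case 'screenshot':
    case 'url':
      return 'debug';
    default:
      // form, click, delay, goto, input, download
      return 'default';
  }
}

// Returns a copy of the action with `cat` filled in when it was omitted.
// An explicitly provided `cat` is kept unchanged.
function withCategory(action: BotAction): BotAction {
  return { ...action, cat: action.cat ?? categoryOf(action.type) };
}
```

For example, `withCategory({ type: 'solve' })` yields an action whose `cat` is `'captcha'`, so callers only need to name the action type.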
- Node.js
- Docker
- Bright Data account
MIT