Flash Web Crawler

Flash Web Crawler lets you build automated web scraping workflows from structured, typed action sequences. It is written in TypeScript and integrates proxy support and captcha solving.

Features

Structured Action Format: Define web crawling sequences using typed actions
Captcha Handling: Integrated captcha solving capabilities
Proxy Support: Built-in integration with Bright Data for reliable proxy services
Type Safety: Fully typed actions for better development experience
Docker Support: Ready for containerized deployment

Action Types

The crawler supports various action types to handle different web automation scenarios:

type BotActionClass = 'captcha' | 'debug' | 'default';

type BotActionCaptchaType = 'wait' | 'solve' | 'disableAutoSolve';

type BotActionDebugType = 'screenshot' | 'url';

type BotActionType =
  | 'form'
  | 'click'
  | 'delay'
  | 'goto'
  | 'input'
  | 'download'
  | BotActionCaptchaType
  | BotActionDebugType;

type BotAction = {
  type: BotActionType;
  cat?: BotActionClass;
  selector?: string;
  value?: string;
  timeout?: number;
  validationURL?: string;
};
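
As a minimal sketch of how these types fit together, the snippet below builds a short action sequence. The types are repeated so the block compiles on its own. The URL, the selectors, the millisecond unit for `timeout`, and the choice of which field carries which value (for example, `value` holding the target URL for `goto`) are assumptions for illustration and are not confirmed by the project.

```typescript
// Types repeated from the definitions above so the example is self-contained.
type BotActionClass = 'captcha' | 'debug' | 'default';
type BotActionCaptchaType = 'wait' | 'solve' | 'disableAutoSolve';
type BotActionDebugType = 'screenshot' | 'url';
type BotActionType =
  | 'form' | 'click' | 'delay' | 'goto' | 'input' | 'download'
  | BotActionCaptchaType
  | BotActionDebugType;

type BotAction = {
  type: BotActionType;
  cat?: BotActionClass;
  selector?: string;
  value?: string;
  timeout?: number;
  validationURL?: string;
};

// Hypothetical login flow; the URL and selectors are placeholders.
const loginFlow: BotAction[] = [
  // Assumption: `value` carries the URL for a `goto` action.
  { type: 'goto', value: 'https://example.com/login' },
  { type: 'input', selector: '#username', value: 'demo-user' },
  { type: 'click', selector: 'button[type="submit"]' },
  // Assumption: `timeout` is expressed in milliseconds.
  { type: 'solve', cat: 'captcha', timeout: 30000 },
  { type: 'screenshot', cat: 'debug' },
];

// Print the sequence as the JSON that would be sent to the crawler.
console.log(JSON.stringify(loginFlow, null, 2));
```

Because `type` is a union of string literals, the compiler rejects a misspelled action name such as `'clik'` before the sequence is ever sent.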

Setup

Clone the repository
Install dependencies:
Create a .env file in the root directory with your Bright Data credentials:

Building and Running

Build the project:
Build the Docker image:
Run the container:

Usage

Send a POST request to the crawler with an array of actions. Example:
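
The following sketch shows one way such a request could be assembled. The endpoint (`CRAWLER_URL`), the `Content-Type` header, and the assumption that the action array is the top-level JSON body are hypothetical; this README does not state the host, port, path, or exact payload shape. `buildCrawlRequest` is an illustrative helper, not part of the project.

```typescript
// Loosely typed copy of BotAction so this block stands alone.
type BotAction = {
  type: string;
  cat?: string;
  selector?: string;
  value?: string;
  timeout?: number;
  validationURL?: string;
};

// Hypothetical endpoint: host, port, and path are not given in this README.
const CRAWLER_URL = 'http://localhost:3000/';

type CrawlRequest = {
  method: 'POST';
  headers: Record<string, string>;
  body: string;
};

// Build request options for a POST whose JSON body is the action array.
// Assumption: the crawler expects the bare array, not a wrapper object.
function buildCrawlRequest(actions: BotAction[]): CrawlRequest {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(actions),
  };
}

// Placeholder actions; the URL and selector are illustrative only.
const actions: BotAction[] = [
  { type: 'goto', value: 'https://example.com' },
  { type: 'click', selector: '#accept' },
  { type: 'url', cat: 'debug' },
];

const request = buildCrawlRequest(actions);
console.log(CRAWLER_URL, request.method, request.body);
```

In an environment with a global `fetch` (a browser, or Node.js 18 and later), the result can be passed along as `fetch(CRAWLER_URL, request)` once the crawler is running.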

Action Classes

default: Basic web automation actions
captcha: Captcha-related operations
debug: Debugging and monitoring actions
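
The type definitions under Action Types imply a natural mapping from each action type to one of these classes. The sketch below makes that mapping explicit. `inferClass` is a hypothetical helper, not part of the project, and whether the crawler needs `cat` to be set explicitly or derives it on its own is not stated in this README.

```typescript
// Types repeated from the Action Types section so the block is self-contained.
type BotActionClass = 'captcha' | 'debug' | 'default';
type BotActionCaptchaType = 'wait' | 'solve' | 'disableAutoSolve';
type BotActionDebugType = 'screenshot' | 'url';
type BotActionType =
  | 'form' | 'click' | 'delay' | 'goto' | 'input' | 'download'
  | BotActionCaptchaType
  | BotActionDebugType;

// Members of the two specialised unions, listed as runtime values.
const CAPTCHA_TYPES: BotActionType[] = ['wait', 'solve', 'disableAutoSolve'];
const DEBUG_TYPES: BotActionType[] = ['screenshot', 'url'];

// Hypothetical helper: derive the action class from the action type.
function inferClass(type: BotActionType): BotActionClass {
  if (CAPTCHA_TYPES.indexOf(type) !== -1) return 'captcha';
  if (DEBUG_TYPES.indexOf(type) !== -1) return 'debug';
  return 'default';
}

console.log(inferClass('solve'));      // prints "captcha"
console.log(inferClass('screenshot')); // prints "debug"
console.log(inferClass('goto'));       // prints "default"
```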

Supported Actions

goto: Navigate to a URL
click: Click on an element
input: Enter text into a field
form: Submit a form
delay: Wait for a specified duration
download: Download a file
solve: Solve a captcha
wait: Wait for a captcha
disableAutoSolve: Disable automatic captcha solving
screenshot: Take a screenshot
url: Get current URL

Requirements

Node.js
Docker
Bright Data account

License

MIT
