crawlio-js is a Node.js SDK for interacting with the Crawlio web scraping and crawling API. It provides programmatic access to scraping, crawling, and batch processing endpoints with built-in error handling.
```bash
npm install crawlio.js
```
```js
import { Crawlio } from 'crawlio.js'

const client = new Crawlio({ apiKey: 'your-api-key' })
const result = await client.scrape({ url: 'https://example.com' })
console.log(result.html)
```
Creates a new Crawlio client.
Options:
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | ✅ | Your Crawlio API key |
| baseUrl | string | ❌ | API base URL (default: `https://crawlio.xyz`) |
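For example, a client that reads its key from the environment and overrides the base URL (the override value below is illustrative):

```js
import { Crawlio } from 'crawlio.js'

const client = new Crawlio({
  apiKey: process.env.CRAWLIO_API_KEY, // required
  baseUrl: 'https://crawlio.xyz',      // optional; this is the default
})
```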
Scrapes a single page.
```js
await client.scrape({ url: 'https://example.com' })
```
ScrapeOptions:
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | ✅ | Target URL |
| exclude | string[] | ✅ | CSS selectors to exclude |
| includeOnly | string[] | ❌ | CSS selectors to include |
| markdown | boolean | ❌ | Convert HTML to Markdown |
| returnUrls | boolean | ❌ | Return all discovered URLs |
| workflow | Workflow[] | ❌ | Custom workflow steps to execute |
| normalizeBase64 | boolean | ❌ | Normalize base64 content |
| cookies | CookiesInfo[] | ❌ | Cookies to include in the request |
| userAgent | string | ❌ | Custom User-Agent header for the request |
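A sketch of a scrape call combining several of these options; the selectors and user agent string are placeholders:

```js
const result = await client.scrape({
  url: 'https://example.com/blog',
  exclude: ['nav', 'footer', '.ads'], // drop boilerplate elements
  markdown: true,                     // also return a Markdown rendering
  returnUrls: true,                   // collect links discovered on the page
  userAgent: 'MyCrawler/1.0',
})

console.log(result.markdown)
console.log(result.urls)
```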
Initiates a site-wide crawl.
CrawlOptions:
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | ✅ | Root URL to crawl |
| count | number | ✅ | Number of pages to crawl |
| sameSite | boolean | ❌ | Limit crawl to same domain |
| patterns | string[] | ❌ | URL patterns to match |
| exclude | string[] | ❌ | CSS selectors to exclude |
| includeOnly | string[] | ❌ | CSS selectors to include |
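For example, a crawl restricted to one domain (the pattern string is illustrative):

```js
const crawl = await client.crawl({
  url: 'https://example.com',
  count: 50,          // stop after 50 pages
  sameSite: true,     // stay on example.com
  patterns: ['/blog/*'],
})
// The response is assumed to carry a job id used for status polling below.
```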
Checks the status of a crawl job.
Gets results from a completed crawl.
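Continuing the crawl above, a polling sketch; the method names `crawlStatus` and `crawlResults` and the `id` field are assumptions for illustration — check the package's type definitions for the exact API:

```js
// Poll until the crawl leaves the queue/running states.
let status = await client.crawlStatus(crawl.id)
while (status.status === 'IN_QUEUE' || status.status === 'RUNNING') {
  await new Promise((resolve) => setTimeout(resolve, 2000)) // wait 2s between polls
  status = await client.crawlStatus(crawl.id)
}

if (status.status === 'SUCCESS') {
  const results = await client.crawlResults(crawl.id)
  console.log(results)
}
```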
Performs a search on scraped content.
SearchOptions:
| Name | Type | Description |
| --- | --- | --- |
| site | string | Limit search to a specific domain |
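A sketch of a search call; the exact signature isn't documented above, so the query argument is an assumption:

```js
// Assumed signature: a query string plus SearchOptions.
const hits = await client.search('getting started', { site: 'example.com' })
console.log(hits)
```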
Initiates scraping for multiple URLs in one request.
BatchScrapeOptions:
| Name | Type | Description |
| --- | --- | --- |
| url | string[] | List of URLs |
| options | Omit<ScrapeOptions, 'url'> | Common options for all URLs |
Checks the status of a batch scrape job.
Fetches results from a completed batch scrape.
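Putting the batch flow together; `batchScrapeStatus`, `batchScrapeResults`, and the `id` field are assumed names, mirroring the crawl polling flow above:

```js
const batch = await client.batchScrape({
  url: ['https://example.com/a', 'https://example.com/b'],
  options: { markdown: true, exclude: ['nav'] }, // applied to every URL
})

const status = await client.batchScrapeStatus(batch.id)
if (status.status === 'SUCCESS') {
  const results = await client.batchScrapeResults(batch.id)
  console.log(results)
}
```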
All Crawlio errors extend from `CrawlioError`. You can catch and inspect these for more context.
- `CrawlioError`
- `CrawlioRateLimit`
- `CrawlioLimitExceeded`
- `CrawlioAuthenticationError`
- `CrawlioInternalServerError`
- `CrawlioFailureError`
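A sketch of branching on these error classes, assuming they are exported from the package root:

```js
import { Crawlio, CrawlioError, CrawlioRateLimit, CrawlioAuthenticationError } from 'crawlio.js'

const client = new Crawlio({ apiKey: 'your-api-key' })

try {
  await client.scrape({ url: 'https://example.com' })
} catch (err) {
  if (err instanceof CrawlioRateLimit) {
    // back off and retry later
  } else if (err instanceof CrawlioAuthenticationError) {
    // check the API key
  } else if (err instanceof CrawlioError) {
    console.error('Crawlio request failed:', err.message)
  } else {
    throw err // not a Crawlio error
  }
}
```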
Scrape result:

```ts
{
  jobId: string
  html: string
  markdown: string
  meta: Record<string, string>
  urls?: string[]
  url: string
}
```
Job status (crawl and batch scrape):

```ts
{
  id: string
  status: 'IN_QUEUE' | 'RUNNING' | 'LIMIT_EXCEEDED' | 'ERROR' | 'SUCCESS'
  error: number
  success: number
  total: number
}
```
CookiesInfo:

```ts
{
  name: string
  value: string
  path: string
  expires?: number
  httpOnly: boolean
  secure: boolean
  domain: string
  sameSite: 'Strict' | 'Lax' | 'None'
}
```
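A sketch of passing a cookie to `scrape` (the cookie values are placeholders):

```js
const result = await client.scrape({
  url: 'https://example.com/dashboard',
  cookies: [
    {
      name: 'session',
      value: 'abc123', // placeholder session token
      path: '/',
      domain: 'example.com',
      httpOnly: true,
      secure: true,
      sameSite: 'Lax',
    },
  ],
})
```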