
🕷️ Crawlio Python SDK

Crawlio is a Python SDK for accessing the Crawlio API — a powerful service for web scraping, crawling, and content analysis. It supports single-page scraping, full-site crawling, batch operations, and structured search across results.

👉 Visit Crawlio | 📚 View API Docs


📦 Installation

pip install crawlio-py

🚀 Getting Started

from crawlio.client import Crawlio
from crawlio.types import ScrapeOptions

client = Crawlio(api_key="your-api-key")

options: ScrapeOptions = {
    "url": "https://example.com",
    "markdown": True
}

result = client.scrape(options)
print(result["markdown"])

🔐 Authentication

You must pass your Crawlio API key via the api_key argument when instantiating the client:

from crawlio.client import Crawlio

client = Crawlio(api_key="your_api_key")

🧭 Usage

scrape(options: ScrapeOptions) -> ScrapeResponse

Scrape a single webpage.

from crawlio.types import ScrapeOptions

client.scrape({
    "url": "https://example.com",
    "exclude": ["nav", "footer"],
    "markdown": True
})

crawl(options: CrawlOptions) -> CrawlResponse

Start a full-site crawl.

from crawlio.types import CrawlOptions

client.crawl({
    "url": "https://example.com",
    "count": 10,
    "sameSite": True
})

crawl_status(crawl_id: str) -> CrawlStatusResponse

Check the status of a crawl job.

client.crawl_status("crawl123")

crawl_results(crawl_id: str) -> CrawlResultResponse

Get results from a completed crawl.

client.crawl_results("crawl123")
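
A typical crawl workflow chains these three calls: start the job, poll its status, then fetch the results. A minimal sketch, assuming the crawl response exposes the job id under "id" and the status response a "status" field (hypothetical names — check the API docs for the actual CrawlResponse and CrawlStatusResponse shapes):

import time

from crawlio.client import Crawlio

client = Crawlio(api_key="your-api-key")

# Start the crawl and keep the job id for polling.
crawl = client.crawl({"url": "https://example.com", "count": 10, "sameSite": True})
crawl_id = crawl["id"]  # hypothetical field name

# Poll until the job reaches a terminal state.
while True:
    status = client.crawl_status(crawl_id)
    if status["status"] in ("completed", "failed"):  # hypothetical values
        break
    time.sleep(5)

# Fetch the scraped pages once the crawl is done.
results = client.crawl_results(crawl_id)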

search(query: str, options: Optional[SearchOptions] = None) -> SearchResponse

Search through previously scraped content.

client.search("privacy policy", {"site": "example.com"})

batch_scrape(options: BatchScrapeOptions) -> BatchScrapeResponse

Submit multiple URLs for scraping in one request; note that the url field takes a list here.

client.batch_scrape({
    "url": ["https://a.com", "https://b.com"],
    "options": {"markdown": True}
})

batch_scrape_status(batch_id: str) -> BatchScrapeStatusResponse

Check the status of a batch scrape.

client.batch_scrape_status("batch456")

batch_scrape_result(batch_id: str) -> BatchScrapeResultResponse

Retrieve results of a completed batch scrape.

client.batch_scrape_result("batch456")
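
The batch lifecycle mirrors the crawl one: submit, poll, fetch. A minimal sketch reusing the client from the crawl sketch above, under the same assumptions about the job id and status field names:

import time

batch = client.batch_scrape({
    "url": ["https://a.com", "https://b.com"],
    "options": {"markdown": True}
})
batch_id = batch["id"]  # hypothetical field name

# Poll until the batch reaches a terminal state (hypothetical status values).
while client.batch_scrape_status(batch_id)["status"] not in ("completed", "failed"):
    time.sleep(5)

results = client.batch_scrape_result(batch_id)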

🧨 Error Handling

All exceptions inherit from CrawlioError.

Exception Types

Exception Class              Description
CrawlioError                 Base error class
CrawlioRateLimit             Too many requests
CrawlioLimitExceeded         API usage limit exceeded
CrawlioAuthenticationError   Invalid or missing API key
CrawlioInternalServerError   Server error
CrawlioFailureError          Other client or server failure

Example:

from crawlio.exception import CrawlioError

try:
    result = client.scrape({"url": "https://example.com"})
except CrawlioError as e:
    print(f"Error: {e}, Details: {e.response}")

📄 Response Format (Example)

Scrape

{
  "jobId": "abc123",
  "html": "<html>...</html>",
  "markdown": "## Title",
  "meta": { "title": "Example" },
  "urls": ["https://example.com/about"],
  "url": "https://example.com"
}
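
These keys map directly onto the dict returned by client.scrape:

from crawlio.client import Crawlio

client = Crawlio(api_key="your-api-key")
result = client.scrape({"url": "https://example.com", "markdown": True})

print(result["jobId"])          # job identifier, e.g. "abc123"
print(result["meta"]["title"])  # page metadata
for link in result["urls"]:     # links discovered on the page
    print(link)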

📃 License

MIT License
