Commit 859b76a

Refactoring: API type adjustment - parameter adjustment - return value adjustment - connection controller
1 parent acf2cb9 commit 859b76a

12 files changed: +402 -273 lines changed

README.md

Lines changed: 6 additions & 5 deletions
@@ -2,20 +2,21 @@
 
 English | [简体中文](https://github.com/coder-hxl/x-crawl/blob/main/docs/cn.md)
 
-x-crawl is a flexible nodejs crawler library. It can crawl pages in batches, network requests in batches, download file resources in batches, polling and crawling, etc. Supports asynchronous/synchronous mode crawling. Running on nodejs, the usage is flexible and simple, friendly to JS/TS developers.
+x-crawl is a flexible nodejs crawler library. It can crawl pages in batches, network requests in batches, download file resources in batches, polling and crawling, etc. Flexible and simple to use, friendly to JS/TS developers.
 
-> If you feel good, you can give [x-crawl repository](https://github.com/coder-hxl/x-crawl) a Star to support it, your Star will be the motivation for my update.
+> If you like x-crawl, you can give [x-crawl repository](https://github.com/coder-hxl/x-crawl) a Star to support it, which is its recognition.
 
 ## Features
 
 - **🔥 Async/Sync** - Just change the mode property to toggle async/sync crawling mode.
 - **⚙️ Multiple functions** - Batch crawling of pages, batch network requests, batch download of file resources, polling crawling, etc.
-- **🖋️ Flexible writing style** - Multiple crawling configurations and ways to get crawling results.
 - **⏱️ Interval crawling** - no interval/fixed interval/random interval, you can use/avoid high concurrent crawling.
-- **🚀 Crawl Repost** - Under development.
+- **🔄 Crawl retry** - under development.
+- **🚀 Priority Queue** - under development.
 - **☁️ Crawl SPA** - Batch crawl SPA (Single Page Application) to generate pre-rendered content (ie "SSR" (Server Side Rendering)).
 - **⚒️ Controlling Pages** - Headless browsers can submit forms, keystrokes, event actions, generate screenshots of pages, etc.
-- **🧾 Capture Record** - Capture and record the crawled results, and highlight the reminders.
+- **🧾 Capture Record** - Capture and record the crawled results, and highlight them on the console.
+- **🖋️ Flexible writing style** - It is very flexible to adapt to various crawling configurations and obtain crawling results.
 - **🦾TypeScript** - Own types, implement complete types through generics.
 
 ## Relationship with puppeteer

docs/cn.md

Lines changed: 6 additions & 6 deletions
@@ -2,21 +2,21 @@
 
 [English](https://github.com/coder-hxl/x-crawl#x-crawl) | 简体中文
 
-x-crawl 是一个灵活的 nodejs 爬虫库。可批量爬取页面、批量网络请求、批量下载文件资源、轮询爬取等。支持 异步/同步 模式爬取。跑在 nodejs 上,用法灵活和简单,对 JS/TS 开发者友好。
+x-crawl 是一个灵活的 nodejs 爬虫库。可批量爬取页面、批量网络请求、批量下载文件资源、轮询爬取等。用法灵活和简单,对 JS/TS 开发者友好。
 
-> 如果感觉不错,可以给 [x-crawl 存储库](https://github.com/coder-hxl/x-crawl) 点个 Star 支持一下,您的 Star 将是我更新的动力
+> 如果你喜欢 x-crawl ,可以给 [x-crawl 存储库](https://github.com/coder-hxl/x-crawl) 点个 Star 支持一下,这是对它的认可
 
 ## 特征
 
 - **🔥 异步/同步** - 只需更改一下 mode 属性即可切换 异步/同步 爬取模式。
 - **⚙️ 多种功能** - 可批量爬取页面、批量网络请求、批量下载文件资源、轮询爬取等。
-- **🖋️ 写法灵活** - 多种爬取配置、获取爬取结果的写法。
 - **⏱️ 间隔爬取** - 无间隔/固定间隔/随机间隔,可以 使用/避免 高并发爬取。
-- **🚀 爬取重发** - 开发中。
-
+- **🔄 爬取重试** - 开发中。
+- **🚀 优先队列** - 开发中。
 - **☁️ 爬取 SPA** - 批量爬取 SPA(单页应用程序)生成预渲染内容(即“SSR”(服务器端渲染))。
 - **⚒️ 控制页面** - 无头浏览器可以表单提交、键盘输入、事件操作、生成页面的屏幕截图等。
-- **🧾 捕获记录** - 对爬取的结果进行捕获记录,并进行高亮的提醒。
+- **🧾 捕获记录** - 对爬取的结果进行捕获记录,并在控制台进行高亮的提醒。
+- **🖋️ 写法灵活** - 适配多种爬取配置、获取爬取结果的写法,非常灵活。
 - **🦾TypeScript** - 拥有类型,通过泛型实现完整的类型。
 
 ## 跟 puppeteer 的关系

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 {
   "private": true,
   "name": "x-crawl",
-  "version": "4.0.0",
+  "version": "5.0.0",
   "author": "coderHXL",
   "description": "x-crawl is a flexible nodejs crawler library.",
   "license": "MIT",
