Commit 1c97855

docs: added printing information description
1 parent 83ad9d9 commit 1c97855

4 files changed: +76 -3 lines changed

README.md

Lines changed: 38 additions & 1 deletion
@@ -57,6 +57,7 @@ x-crawl is an open source project under the MIT license, completely free to use.
 - [Rotate Proxy](#rotate-proxy)
 - [Custom Device Fingerprint](#custom-device-fingerprint)
 - [Priority Queue](#priority-queue)
+- [Print information](#print-information)
 - [About Results](#about-results)
 - [TypeScript](#typescript)
 - [API](#api)
@@ -198,7 +199,7 @@ myXCrawl.startPolling({ d: 1 }, async (count, stopPolling) => {
 running result:
 
 <div align="center">
-<img src="https://raw.githubusercontent.com/coder-hxl/x-crawl/main/assets/run-example-gif.gif" />
+<img src="https://raw.githubusercontent.com/coder-hxl/x-crawl/main/assets/run-example.gif" />
 </div>
 
 **Note:** Please do not crawl indiscriminately; check the site's **robots.txt** before crawling. The website's class names may change; this example only demonstrates how to use x-crawl.
@@ -788,6 +789,34 @@ myXCrawl
 
 The larger the value of the priority attribute, the higher the priority in the current crawling queue.
 
+### Print information
+
+The information printed during a crawl has three parts: start (shows the mode and the total number), process (shows the count and the wait time), and result (shows success and failure information). Each piece of information is prefixed with a label such as **1-page-2**: the leading 1 is the first crawler instance, page is the API type, and the trailing 2 is the second page of the first crawler instance. The prefix makes it easier to tell which API a message came from.
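
As an illustration of the prefix format described above, here is a minimal sketch of a parser for labels like `1-page-2`. `parseLogPrefix` is a hypothetical helper and not part of x-crawl. The regular expression assumes the label always has the shape `<instance>-<API type>-<number>` with an alphabetic API type; the source only shows the single example `1-page-2`.

```javascript
// Hypothetical helper, NOT part of x-crawl: splits a log prefix such as
// "1-page-2" into its three documented parts.
// Assumption: the label is always <digits>-<letters>-<digits>.
function parseLogPrefix(prefix) {
  const match = /^(\d+)-([A-Za-z]+)-(\d+)$/.exec(prefix)
  if (match === null) {
    throw new Error(`Unrecognized log prefix: ${prefix}`)
  }
  const [, instance, apiType, sequence] = match
  return {
    instance: Number(instance), // which crawler instance (the leading 1)
    apiType, // API type, e.g. "page"
    sequence: Number(sequence) // the trailing 2: second page of that instance
  }
}

console.log(parseLogPrefix('1-page-2'))
// { instance: 1, apiType: 'page', sequence: 2 }
```
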
+
+If you do not want the crawl information displayed in the terminal, you can show or hide it through the log option.
+
+```js
+import xCrawl from 'x-crawl'
+
+// Hide only the process information; start and result are still displayed
+const myXCrawl = xCrawl({ log: { process: false } })
+
+// Hide all information
+const myQuietXCrawl = xCrawl({ log: false })
+```
+
+The log option accepts an object or a boolean:
+
+- Boolean
+
+  - true: show all
+  - false: hide all
+
+- Object
+  - start: controls the start information
+  - process: controls the process information
+  - result: controls the result information
+
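
The behaviour of the `log` option listed above can be sketched as a small normalization step. `normalizeLog` is a hypothetical helper, not x-crawl's actual implementation. It assumes that keys omitted from the object form keep the documented default of `true`, which is consistent with the `{ log: { process: false } }` example, where start and result are still displayed.

```javascript
// Hypothetical helper, NOT x-crawl's internal code: expands the documented
// `log` option (boolean or object) into explicit start/process/result flags.
function normalizeLog(log = true) {
  // Boolean form: true shows everything, false hides everything.
  if (typeof log === 'boolean') {
    return { start: log, process: log, result: log }
  }
  // Object form. Assumption: omitted keys fall back to the default of true.
  const { start = true, process = true, result = true } = log
  return { start, process, result }
}

console.log(normalizeLog({ process: false }))
// { start: true, process: false, result: true }
console.log(normalizeLog(false))
// { start: false, process: false, result: false }
```

Resolving the option to one fixed shape up front means the printing code only ever has to check three booleans, whichever form the caller used.
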
 ### About Results
 
 Each crawl target will generate a detail object, which will contain the following properties:
@@ -1491,6 +1520,13 @@ export interface XCrawlConfig extends CrawlCommonConfig {
   enableRandomFingerprint?: boolean
   baseUrl?: string
   intervalTime?: IntervalTime
+  log?:
+    | {
+        start?: boolean
+        process?: boolean
+        result?: boolean
+      }
+    | boolean
   crawlPage?: {
     puppeteerLaunch?: PuppeteerLaunchOptions // puppeteer
   }
@@ -1503,6 +1539,7 @@ export interface XCrawlConfig extends CrawlCommonConfig {
 - enableRandomFingerprint: true
 - baseUrl: undefined
 - intervalTime: undefined
+- log: { start: true, process: true, result: true }
 - crawlPage: undefined
 
 #### Detail target config

assets/run-example-gif.gif

-3.55 MB
Binary file not shown.

assets/run-example.gif

2.52 MB

docs/cn.md

Lines changed: 38 additions & 2 deletions
@@ -57,6 +57,7 @@ x-crawl 是采用 MIT 许可的开源项目,使用完全免费。如果你在
 - [轮换代理](#轮换代理)
 - [自定义设备指纹](#自定义设备指纹)
 - [优先队列](#优先队列)
+- [打印信息](#打印信息)
 - [关于结果](#关于结果)
 - [TypeScript](#TypeScript)
 - [API](#API)
@@ -195,9 +196,8 @@ myXCrawl.startPolling({ d: 1 }, async (count, stopPolling) => {
 运行效果:
 
 <div align="center">
-<img src="https://raw.githubusercontent.com/coder-hxl/x-crawl/main/assets/run-example-gif.gif" />
+<img src="https://raw.githubusercontent.com/coder-hxl/x-crawl/main/assets/run-example.gif" />
 </div>
-
 **注意:** 请勿随意爬取,爬取前可查看 **robots.txt** 协议。网站的类名可能会有变更,这里只是为了演示如何使用 x-crawl 。
 
 ## 核心概念
@@ -782,6 +782,34 @@ myXCrawl
 
 priority 属性的值越大就在当前爬取队列中越优先。
 
+### 打印信息
+
+爬取的打印信息由开始(显示模式和总数)、过程(显示数量和等待多久)、结果(显示成功和失败信息)组成。每段信息前面都会有如 **1-page-2** ,前面的 1 代表第 1 个爬虫实例,中间的 page 代表 API 类型,后面的 2 代表第 1 个爬虫实例的第 2 个 page ,这样做的目的是为了更好区分信息来自哪个 API 。
+
+当您不希望在终端显示爬取信息时,可以通过选项自己控制显示或隐藏。
+
+```js
+import xCrawl from 'x-crawl'
+
+// 只隐藏过程,开始和结果显示
+const myXCrawl = xCrawl({ log: { process: false } })
+
+// 隐藏全部信息
+const myQuietXCrawl = xCrawl({ log: false })
+```
+
+log 选项接收对象或布尔类型:
+
+- 布尔
+
+  - true:全部显示
+  - false:全部隐藏
+
+- 对象
+  - start:对开始信息控制
+  - process:对过程信息控制
+  - result:对结果信息控制
+
 ### 关于结果
 
 每个爬取目标都会产生一个详情对象,该详情对象会包含以下属性:
@@ -1479,6 +1507,13 @@ export interface XCrawlConfig extends CrawlCommonConfig {
   enableRandomFingerprint?: boolean
   baseUrl?: string
   intervalTime?: IntervalTime
+  log?:
+    | {
+        start?: boolean
+        process?: boolean
+        result?: boolean
+      }
+    | boolean
   crawlPage?: {
     puppeteerLaunch?: PuppeteerLaunchOptions // puppeteer
   }
@@ -1491,6 +1526,7 @@ export interface XCrawlConfig extends CrawlCommonConfig {
 - enableRandomFingerprint: true
 - baseUrl: undefined
 - intervalTime: undefined
+- log: { start: true, process: true, result: true }
 - crawlPage: undefined
 
 #### Detail Target Config
