Commit 32ba697

Docs: Adjustment of cases
1 parent a6d7e23 commit 32ba697

6 files changed: +74 -65 lines changed

README.md

Lines changed: 12 additions & 12 deletions
@@ -42,7 +42,7 @@ The crawlPage API internally uses the [puppeteer](https://github.com/puppeteer/p
  - [Interval time](#Interval-time)
  - [Fail retry](#Fail-retry)
  - [Priority queue](#Priority-queue)
- - [About the results](#About the results)
+ - [About the results](#About-the-results)
  - [API](#API)
  - [xCrawl](#xCrawl)
  - [Type](#Type)
@@ -78,17 +78,17 @@ The crawlPage API internally uses the [puppeteer](https://github.com/puppeteer/p
  - [CrawlDataConfig](#CrawlDataConfig)
  - [CrawlFileConfig](#CrawlFileConfig)
  - [StartPollingConfig](#StartPollingConfig)
- - [API Result](#API-Result)
- - [XCrawlInstance](#XCrawlInstance)
- - [CrawlCommonRes](#CrawlCommonRes)
- - [CrawlPageSingleRes](#CrawlPageSingleRes)
- - [CrawlDataSingleRes](#CrawlDataSingleRes)
- - [CrawlFileSingleRes](#CrawlFileSingleRes)
- - [CrawlPageRes](#CrawlPageRes)
- - [CrawlDataRes](#CrawlDataRes)
- - [CrawlFileRes](#CrawlFileRes)
- - [API Other](#API-Other)
- - [AnyObject](#AnyObject)
+ - [API Result](#API-Result)
+ - [XCrawlInstance](#XCrawlInstance)
+ - [CrawlCommonRes](#CrawlCommonRes)
+ - [CrawlPageSingleRes](#CrawlPageSingleRes)
+ - [CrawlDataSingleRes](#CrawlDataSingleRes)
+ - [CrawlFileSingleRes](#CrawlFileSingleRes)
+ - [CrawlPageRes](#CrawlPageRes)
+ - [CrawlDataRes](#CrawlDataRes)
+ - [CrawlFileRes](#CrawlFileRes)
+ - [API Other](#API-Other)
+ - [AnyObject](#AnyObject)
  - [More](#More)

  ## Install
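
The first README.md hunk swaps the spaces in the `#About the results` link target for hyphens, presumably because a link destination with unescaped spaces is generally not recognized as a link by Markdown renderers. A minimal sketch of that convention follows; `toAnchor` and `tocEntry` are hypothetical helpers for illustration, not part of x-crawl, and the sketch only mirrors the case-preserving, hyphenated style this table of contents already uses.

```js
// Hypothetical helper (not part of x-crawl): build a table-of-contents
// link target by replacing each run of whitespace with a single hyphen.
function toAnchor(heading) {
  return '#' + heading.trim().replace(/\s+/g, '-')
}

// Hypothetical helper: build one markdown table-of-contents entry.
function tocEntry(heading) {
  return `- [${heading}](${toAnchor(heading)})`
}

console.log(tocEntry('About the results'))
// prints "- [About the results](#About-the-results)"
```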

assets/cn/crawler-result.png

241 KB

assets/cn/crawler.png

51.1 KB

docs/cn.md

Lines changed: 31 additions & 13 deletions
@@ -101,32 +101,50 @@ npm install x-crawl

  ## Example

- Take automatically fetching the carousel images from the bilibili Guochuang (Chinese animation) home page every day as an example:
+ Take automatically fetching the carousel images from three bilibili pages (home, Guochuang, and movies) every day as an example:

  ```js
  // 1. Import the module (ES/CJS)
  import xCrawl from 'x-crawl'

  // 2. Create a crawler instance
- const myXCrawl = xCrawl({ intervalTime: { max: 3000, min: 2000 } })
+ const myXCrawl = xCrawl({
+   maxRetry: 3,
+   intervalTime: { max: 3000, min: 2000 }
+ })

  // 3. Set up the crawling task
  // Call the startPolling API to start polling; the callback is invoked once a day
  myXCrawl.startPolling({ d: 1 }, async (count, stopPolling) => {
-   // Call the crawlPage API to crawl the page
-   const res = await myXCrawl.crawlPage('https://www.bilibili.com/guochuang/')
-   const { page } = res.data
+   // Call the crawlPage API to crawl the three pages: home, Guochuang, and movies
+   const res = await myXCrawl.crawlPage([
+     'https://www.bilibili.com',
+     'https://www.bilibili.com/guochuang',
+     'https://www.bilibili.com/movie'
+   ])

-   // Set the request configuration: get the URLs of the carousel images
-   const requestConfigs = await page.$$eval('.chief-recom-item img', (imgEls) =>
-     imgEls.map((item) => item.src)
-   )
+   // Store the image URLs
+   const imgUrls: string[] = []
+   const elSelectorMap = ['.carousel-inner img', '.chief-recom-item img', '.bg-item img']
+   for (const item of res) {
+     const { id } = item
+     const { page } = item.data
+
+     // Get the URLs of the page's carousel image elements
+     const urls = await page.$$eval(elSelectorMap[id - 1], (imgEls) =>
+       imgEls.map((item) => item.src)
+     )
+     imgUrls.push(...urls)
+
+     // Close the page
+     page.close()
+   }

    // Call the crawlFile API to crawl the images
-   await myXCrawl.crawlFile({ requestConfigs, fileConfig: { storeDir: './upload' } })
-
-   // Close the page
-   page.close()
+   await myXCrawl.crawlFile({
+     requestConfigs: imgUrls,
+     fileConfig: { storeDir: './upload' }
+   })
  })
  ```
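
The updated example picks a selector with `elSelectorMap[id - 1]`, which suggests that each result `id` is 1-based and follows the order of the URLs passed to `crawlPage`. The sketch below isolates that lookup using mock results instead of a real crawl. The `selectorFor` helper and the mock objects are assumptions made for illustration, not part of x-crawl, and the id ordering is inferred from the diff rather than confirmed by it.

```js
// URLs and selectors copied from the example; their positions must line up.
const urls = [
  'https://www.bilibili.com',
  'https://www.bilibili.com/guochuang',
  'https://www.bilibili.com/movie'
]
const selectors = ['.carousel-inner img', '.chief-recom-item img', '.bg-item img']

// Hypothetical helper: return the selector for a result, assuming 1-based ids.
// Throws instead of silently passing `undefined` on to page.$$eval.
function selectorFor(result) {
  const selector = selectors[result.id - 1]
  if (selector === undefined) {
    throw new RangeError(`No selector configured for result id ${result.id}`)
  }
  return selector
}

// Mock results standing in for what crawlPage is assumed to return.
const mockResults = urls.map((url, index) => ({ id: index + 1, url }))
for (const result of mockResults) {
  console.log(result.url, '->', selectorFor(result))
}
```

Failing loudly on an unknown id makes a mismatch between the URL list and the selector list visible as soon as one of the two arrays is edited without the other.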
