
Commit dd93802

Feat: Version update
1 parent 9ed3599 commit dd93802

File tree

4 files changed (+36, -7 lines)


CHANGELOG.md

Lines changed: 12 additions & 0 deletions
@@ -1,3 +1,15 @@
+# [v7.0.1](https://github.com/coder-hxl/x-crawl/compare/v7.0.0...v7.0.1) (2023-05-04)
+
+### 🐞 Bug fixes
+
+- The params configuration option for the crawlData API is not working.
+
+---
+
+### 🐞 Bug fixes
+
+- The params configuration option for the crawlData API is not working.
+
 # [v7.0.0](https://github.com/coder-hxl/x-crawl/compare/v6.0.1...v7.0.0) (2023-04-26)
 
 ### 🚨 Breaking Changes
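The fixed `params` option boils down to appending query parameters to the request URL. A minimal plain-JavaScript sketch of that behavior (illustrative only — `applyParams` is a hypothetical helper, not x-crawl's internal code):

```javascript
// Sketch of what a `params` option does: merge key/value pairs
// into the URL's query string (illustrative, not x-crawl internals).
function applyParams(url, params) {
  const u = new URL(url)
  for (const [key, value] of Object.entries(params)) {
    u.searchParams.set(key, String(value))
  }
  return u.toString()
}

console.log(applyParams('https://www.example.com/api', { id: 1, name: 'x-crawl' }))
// → https://www.example.com/api?id=1&name=x-crawl
```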

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 {
   "private": true,
   "name": "x-crawl",
-  "version": "7.0.0",
+  "version": "7.0.1",
   "author": "coderHXL",
   "description": "x-crawl is a flexible Node.js multifunctional crawler library.",
   "license": "MIT",

publish/README.md

Lines changed: 22 additions & 5 deletions
@@ -2,7 +2,7 @@
 
 English | [简体中文](https://github.com/coder-hxl/x-crawl/blob/main/docs/cn.md)
 
-x-crawl is a flexible Node.js multipurpose crawler library. The usage is flexible, and there are many built-in functions for crawling pages, interfaces, files, etc.
+x-crawl is a flexible Node.js multifunctional crawler library. Its flexible usage and numerous built-in functions help you crawl pages, interfaces, and files quickly, safely, and stably.
 
 > If you also like x-crawl, you can give [x-crawl repository](https://github.com/coder-hxl/x-crawl) a star to support it, thank you for your support!
@@ -23,7 +23,7 @@ x-crawl is a flexible Node.js multipurpose crawler library. The usage is flexibl
 
 ## Relationship with Puppeteer
 
-The crawlPage API has built-in [puppeteer](https://github.com/puppeteer/puppeteer); you only need to pass in some configuration options to complete some operations, and the result will expose the Browser instance and Page instance. The instances you get are intact; x-crawl will not rewrite them.
+The crawlPage API has built-in [puppeteer](https://github.com/puppeteer/puppeteer); you only need to pass in some configuration options to let x-crawl complete the operations for you. The result exposes the Browser instance and the Page instance. Both are intact: x-crawl will not rewrite them.
 
 # Table of Contents
 
2929

@@ -40,6 +40,7 @@ The crawlPage API has built-in [puppeteer](https://github.com/puppeteer/puppetee
 - [Page Instance](#Page-Instance)
 - [life Cycle](#life-Cycle)
 - [onCrawlItemComplete](#onCrawlItemComplete)
+- [Open Browser](#Open-Browser)
 - [Crawl Interface](#Crawl-Interface)
 - [life Cycle](#life-Cycle-1)
 - [onCrawlItemComplete](#onCrawlItemComplete-1)
@@ -163,7 +164,7 @@ myXCrawl.startPolling({ d: 1 }, async (count, stopPolling) => {
   await new Promise((r) => setTimeout(r, 300))
 
   // Gets the URLs of the page images
-  const urls = await page!.$$eval(
+  const urls = await page.$$eval(
     `${elSelectorMap[id - 1]} img`,
     (imgEls) => {
       return imgEls.map((item) => item.src)
@@ -282,13 +283,13 @@ myXCrawl.crawlPage('https://www.example.com').then((res) => {
 
 #### Browser Instance
 
-When you call the crawlPage API to crawl pages in the same crawler instance, the browser instance used is the same, because the browser instance is shared by the crawlPage API within a crawler instance. It is a headless browser with no UI shell; what it does is bring **all modern web platform features** provided by the browser rendering engine to your code. For specific usage, please refer to [Browser](https://pptr.dev/api/puppeteer.browser).
+When you call the crawlPage API to crawl pages in the same crawler instance, the browser instance used is the same, because the browser instance is shared by the crawlPage API within a crawler instance. For specific usage, please refer to [Browser](https://pptr.dev/api/puppeteer.browser).
 
 **Note:** The browser keeps running, so the process will not terminate on its own. If you want to stop it, execute browser.close(). Do not close it if you still need to call [crawlPage](#crawlPage) or [page](#page) later, because the browser instance is shared by the crawlPage API within a crawler instance.
 
 #### Page Instance
 
-When you call the crawlPage API to crawl pages in the same crawler instance, a new page instance is generated from the browser instance. It can be used for interactive operations. For specific usage, please refer to [Page](https://pptr.dev/api/puppeteer.page).
+When you call the crawlPage API to crawl pages in the same crawler instance, a new page instance is generated from the browser instance. For specific usage, please refer to [Page](https://pptr.dev/api/puppeteer.page).
 
 The browser instance retains a reference to the page instance. If the page is no longer needed, close the page instance yourself, otherwise it will cause a memory leak.
@@ -323,6 +324,22 @@ In the onCrawlItemComplete function, you can get the results of each crawled goa
 
 **Note:** If you need to crawl many pages at one time, use this life cycle function to process the result of each target and close the page instance after each page is crawled. If you do not close the page instances, the program will crash because too many pages are open.
 
+#### Open Browser
+
+Disable running the browser in headless mode.
+
+```js
+import xCrawl from 'x-crawl'
+
+const myXCrawl = xCrawl({
+  maxRetry: 3,
+  // Cancel running the browser in headless mode
+  crawlPage: { launchBrowser: { headless: false } }
+})
+
+myXCrawl.crawlPage('https://www.example.com').then((res) => {})
+```
+
 ### Crawl Interface
 
 Crawl interface data through [crawlData()](#crawlData).
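The "close each page instance" advice in the crawl-page life cycle note above can be sketched without any network or x-crawl dependency. The `crawlAll` function below is a hypothetical plain-JavaScript mock (not x-crawl's API) that only models the bookkeeping: every target opens a page, and an onCrawlItemComplete-style callback is expected to close it, otherwise open pages accumulate:

```javascript
// Mock of the life-cycle pattern: each crawled target produces a "page"
// that stays open until its close() is called. Returns how many pages
// are still open after the crawl finishes.
function crawlAll(targets, onCrawlItemComplete) {
  const openPages = new Set()
  for (const url of targets) {
    const page = { url, close: () => openPages.delete(page) }
    openPages.add(page)
    // Hand each result to the callback, mirroring onCrawlItemComplete.
    onCrawlItemComplete({ data: { page } })
  }
  return openPages.size
}

const stillOpen = crawlAll(
  ['https://www.example.com/1', 'https://www.example.com/2'],
  ({ data: { page } }) => page.close() // close as soon as each item completes
)
console.log(stillOpen) // → 0
```

Omitting the `page.close()` call in the callback leaves every page open, which is exactly the leak the note warns about.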

publish/package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "x-crawl",
-  "version": "7.0.0",
+  "version": "7.0.1",
   "author": "coderHXL",
   "description": "x-crawl is a flexible Node.js multifunctional crawler library.",
   "license": "MIT",
