Commit a4cdc1d

Update API rename
1 parent c3f90bb commit a4cdc1d
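
Callers that still use the old method names could bridge this rename with a thin wrapper. The sketch below is only an illustration: `withLegacyNames` and `fakeCrawler` are hypothetical names that are not part of x-crawl, and the mapping covers only the three renamed methods visible in this diff.

```javascript
// Old-to-new method names, as renamed in this commit's README.
const RENAMED_METHODS = {
  fetchPage: 'crawlPage',
  fetchData: 'crawlData',
  fetchFile: 'crawlFile'
}

// Hypothetical helper (not part of x-crawl): returns a wrapper that exposes
// the old method names and forwards each call to the renamed method.
function withLegacyNames(crawler) {
  // The new names stay reachable through the prototype chain.
  const legacy = Object.create(crawler)
  for (const [oldName, newName] of Object.entries(RENAMED_METHODS)) {
    legacy[oldName] = (...args) => crawler[newName](...args)
  }
  return legacy
}

// Stand-in object for demonstration only; a real instance would come from xCrawl().
const fakeCrawler = {
  crawlPage: (url) => Promise.resolve({ data: { url } })
}
const compat = withLegacyNames(fakeCrawler)
```

With this in place, `compat.fetchPage(url)` and `compat.crawlPage(url)` both reach the same underlying method, so call sites can be migrated gradually.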

12 files changed: +317 -290 lines changed

README.md

Lines changed: 69 additions & 69 deletions
@@ -18,7 +18,7 @@ If it helps you, please give the [repository](https://github.com/coder-hxl/x-cra

 ## Relationship with puppeteer

-The fetchPage API internally uses the [puppeteer](https://github.com/puppeteer/puppeteer) library to crawl pages.
+The crawlPage API internally uses the [puppeteer](https://github.com/puppeteer/puppeteer) library to crawl pages.

 The following can be done:

@@ -45,17 +45,17 @@ The following can be done:
         + [Example](#Example-1)
         + [Mode](#Mode)
         + [IntervalTime](#IntervalTime)
-    * [fetchPage](#fetchPage)
+    * [crawlPage](#crawlPage)
         + [Type](#Type-2)
         + [Example](#Example-2)
         + [About page](#About-page)
-    * [fetchData](#fetchData)
+    * [crawlData](#crawlData)
         + [Type](#Type-3)
         + [Example](#Example-3)
-    * [fetchFile](#fetchFile)
+    * [crawlFile](#crawlFile)
         + [Type](#Type-4)
         + [Example](#Example-4)
-    * [fetchPolling](#fetchPolling)
+    * [crawlPolling](#crawlPolling)
         + [Type](#Type-5)
         + [Example](#Example-5)
 - [Types](#Types)
@@ -65,15 +65,15 @@ The following can be done:
     * [RequestConfig](#RequestConfig)
     * [IntervalTime](#IntervalTime)
     * [XCrawlBaseConfig](#XCrawlBaseConfig)
-    * [FetchBaseConfigV1](#FetchBaseConfigV1)
-    * [FetchPageConfig](#FetchPageConfig )
-    * [FetchDataConfig](#FetchDataConfig)
-    * [FetchFileConfig](#FetchFileConfig)
+    * [CrawlBaseConfigV1](#CrawlBaseConfigV1)
+    * [CrawlPageConfig](#CrawlPageConfig )
+    * [CrawlDataConfig](#CrawlDataConfig)
+    * [CrawlFileConfig](#CrawlFileConfig)
     * [StartPollingConfig](#StartPollingConfig)
-    * [FetchResCommonV1](#FetchResCommonV1)
-    * [FetchResCommonArrV1](#FetchResCommonArrV1)
+    * [CrawlResCommonV1](#CrawlResCommonV1)
+    * [CrawlResCommonArrV1](#CrawlResCommonArrV1)
     * [FileInfo](#FileInfo)
-    * [FetchPage](#FetchPage)
+    * [CrawlPage](#CrawlPage)
 - [More](#More)

 ## Install
@@ -101,8 +101,8 @@ const myXCrawl = xCrawl({
 // 3.Set the crawling task
 // Call the startPolling API to start the polling function, and the callback function will be called every other day
 myXCrawl.startPolling({ d: 1 }, () => {
-  // Call fetchPage API to crawl Page
-  myXCrawl.fetchPage('https://www.youtube.com/').then((res) => {
+  // Call crawlPage API to crawl Page
+  myXCrawl.crawlPage('https://www.youtube.com/').then((res) => {
     const { jsdom } = res.data // By default, the JSDOM library is used to parse Page

     // Get the cover image element of the Promoted Video
@@ -118,8 +118,8 @@ myXCrawl.startPolling({ d: 1 }, () => {
       }
     })

-    // Call the fetchFile API to crawl pictures
-    myXCrawl.fetchFile({ requestConfig, fileConfig: { storeDir: './upload' } })
+    // Call the crawlFile API to crawl pictures
+    myXCrawl.crawlFile({ requestConfig, fileConfig: { storeDir: './upload' } })
   })
 })

@@ -209,17 +209,17 @@ const myXCrawl2 = xCrawl({

 ### Crawl page

-Fetch a page via [fetchPage()](#fetchPage)
+Crawl a page via [crawlPage()](#crawlPage)

 ```js
-myXCrawl.fetchPage('https://xxx.com').then(res => {
+myXCrawl.crawlPage('https://xxx.com').then(res => {
   const { jsdom, page } = res.data
 })
 ```

 ### Crawl interface

-Crawl interface data through [fetchData()](#fetchData)
+Crawl interface data through [crawlData()](#crawlData)

 ```js
 const requestConfig = [
@@ -228,14 +228,14 @@ const requestConfig = [
   { url: 'https://xxx.com/xxxx' }
 ]

-myXCrawl.fetchData({ requestConfig }).then(res => {
+myXCrawl.crawlData({ requestConfig }).then(res => {
   // deal with
 })
 ```

 ### Crawl files

-Fetch file data via [fetchFile()](#fetchFile)
+Crawl file data via [crawlFile()](#crawlFile)

 ```js
 import path from 'node:path'
@@ -246,7 +246,7 @@ const requestConfig = [
   { url: 'https://xxx.com/xxxx' }
 ]

-myXCrawl. fetchFile({
+myXCrawl. crawlFile({
   requestConfig,
   fileConfig: {
     storeDir: path.resolve(__dirname, './upload') // storage folder
@@ -284,9 +284,9 @@ const myXCrawl = xCrawl({
 })
 ```

-Passing **baseConfig** is for **fetchPage/fetchData/fetchFile** to use these values by default.
+Passing **baseConfig** is for **crawlPage/crawlData/crawlFile** to use these values by default.

-**Note:** To avoid repeated creation of instances in subsequent examples, **myXCrawl** here will be the crawler instance in the **fetchPage/fetchData/fetchFile** example.
+**Note:** To avoid repeated creation of instances in subsequent examples, **myXCrawl** here will be the crawler instance in the **crawlPage/crawlData/crawlFile** example.

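One way to picture how instance-level defaults could combine with a single request is sketched below. It is not x-crawl's actual merging logic: `applyBaseConfig` is a hypothetical helper, and the `baseUrl` and `timeout` field names are assumptions chosen for illustration.

```javascript
// Hypothetical helper: apply instance-level defaults to one request config.
// `baseUrl` and `timeout` are assumed field names, used only for illustration.
function applyBaseConfig(baseConfig, requestConfig) {
  // Accept a bare URL string, as crawlPage does in the README examples.
  const request =
    typeof requestConfig === 'string' ? { url: requestConfig } : { ...requestConfig }

  return {
    ...request,
    // Prefix the request path with the base URL when one is set.
    url: (baseConfig.baseUrl ?? '') + request.url,
    // A per-request value wins; otherwise fall back to the default.
    timeout: request.timeout ?? baseConfig.timeout
  }
}

const merged = applyBaseConfig({ baseUrl: 'https://xxx.com', timeout: 10000 }, '/xxx')
// merged.url is 'https://xxx.com/xxx' and merged.timeout is 10000
```

This is why the later examples can pass short paths such as `'/xxx'`: under this assumption, the shared prefix lives on the instance rather than in every call.
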
 #### Mode

@@ -306,26 +306,26 @@ The intervalTime option defaults to undefined . If there is a setting value, it

 The first request is not to trigger the interval.

-### fetchPage
+### crawlPage

-fetchPage is the method of the above [myXCrawl](https://github.com/coder-hxl/x-crawl#Example-1) instance, usually used to crawl page.
+crawlPage is the method of the above [myXCrawl](https://github.com/coder-hxl/x-crawl#Example-1) instance, usually used to crawl page.

 #### Type

-- Look at the [FetchPageConfig](#FetchPageConfig) type
-- Look at the [FetchPage](#FetchPage-2) type
+- Look at the [CrawlPageConfig](#CrawlPageConfig) type
+- Look at the [CrawlPage](#CrawlPage-2) type

 ```ts
-function fetchPage: (
-  config: FetchPageConfig,
-  callback?: (res: FetchPage) => void
-) => Promise<FetchPage>
+function crawlPage: (
+  config: CrawlPageConfig,
+  callback?: (res: CrawlPage) => void
+) => Promise<CrawlPage>
 ```

 #### Example

 ```js
-myXCrawl.fetchPage('/xxx').then((res) => {
+myXCrawl.crawlPage('/xxx').then((res) => {
   const { jsdom } = res.data
   console.log(jsdom.window.document.querySelector('title')?.textContent)
 })
@@ -335,21 +335,21 @@ myXCrawl.fetchPage('/xxx').then((res) => {

 Get the page instance from res.data.page, which can do interactive operations such as events. For specific usage, refer to [page](https://pptr.dev/api/puppeteer.page).

-### fetchData
+### crawlData

-fetchData is the method of the above [myXCrawl](#Example-1) instance, which is usually used to crawl APIs to obtain JSON data and so on.
+crawlData is the method of the above [myXCrawl](#Example-1) instance, which is usually used to crawl APIs to obtain JSON data and so on.

 #### Type

-- Look at the [FetchDataConfig](#FetchDataConfig) type
-- Look at the [FetchResCommonV1](#FetchResCommonV1) type
-- Look at the [FetchResCommonArrV1](#FetchResCommonArrV1) type
+- Look at the [CrawlDataConfig](#CrawlDataConfig) type
+- Look at the [CrawlResCommonV1](#CrawlResCommonV1) type
+- Look at the [CrawlResCommonArrV1](#CrawlResCommonArrV1) type

 ```ts
-function fetchData: <T = any>(
-  config: FetchDataConfig,
-  callback?: (res: FetchResCommonV1<T>) => void
-) => Promise<FetchResCommonArrV1<T>>
+function crawlData: <T = any>(
+  config: CrawlDataConfig,
+  callback?: (res: CrawlResCommonV1<T>) => void
+) => Promise<CrawlResCommonArrV1<T>>
 ```

 #### Example
@@ -361,27 +361,27 @@ const requestConfig = [
   { url: '/xxxx' }
 ]

-myXCrawl.fetchData({ requestConfig }).then(res => {
+myXCrawl.crawlData({ requestConfig }).then(res => {
   console.log(res)
 })
 ```

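Since the promise resolves to an array of results, each carrying `id`, `statusCode`, `headers`, and `data` per the result type in this README, a caller will often want to separate usable responses from failed ones. The sketch below shows one way to do that; `partitionByStatus` is a hypothetical helper, the sample array is hand-written, and treating 2xx as success is an assumption of this sketch.

```javascript
// Split an array shaped like CrawlResCommonArrV1 into successes and failures.
// Treating 2xx status codes as success is an assumption of this sketch.
function partitionByStatus(results) {
  const ok = []
  const failed = []
  for (const res of results) {
    // statusCode may be undefined according to the documented type.
    const isOk =
      typeof res.statusCode === 'number' && res.statusCode >= 200 && res.statusCode < 300
    if (isOk) {
      ok.push(res)
    } else {
      failed.push(res)
    }
  }
  return { ok, failed }
}

// Hand-written sample data; a real array would come from crawlData's promise.
const sample = [
  { id: 1, statusCode: 200, headers: {}, data: { title: 'a' } },
  { id: 2, statusCode: undefined, headers: {}, data: null },
  { id: 3, statusCode: 404, headers: {}, data: null }
]
const { ok, failed } = partitionByStatus(sample)
```

In real use the same function could be applied inside `.then(res => ...)` in place of the `console.log(res)` call.
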
-### fetchFile
+### crawlFile

-fetchFile is the method of the above [myXCrawl](#Example-1) instance, which is usually used to crawl files, such as pictures, pdf files, etc.
+crawlFile is the method of the above [myXCrawl](#Example-1) instance, which is usually used to crawl files, such as pictures, pdf files, etc.

 #### Type

-- Look at the [FetchFileConfig](#FetchFileConfig) type
-- Look at the [FetchResCommonV1](#FetchResCommonV1) type
-- Look at the [FetchResCommonArrV1](#FetchResCommonArrV1) type
+- Look at the [CrawlFileConfig](#CrawlFileConfig) type
+- Look at the [CrawlResCommonV1](#CrawlResCommonV1) type
+- Look at the [CrawlResCommonArrV1](#CrawlResCommonArrV1) type
 - Look at the [FileInfo](#FileInfo) type

 ```ts
-function fetchFile: (
-  config: FetchFileConfig,
-  callback?: (res: FetchResCommonV1<FileInfo>) => void
-) => Promise<FetchResCommonArrV1<FileInfo>>
+function crawlFile: (
+  config: CrawlFileConfig,
+  callback?: (res: CrawlResCommonV1<FileInfo>) => void
+) => Promise<CrawlResCommonArrV1<FileInfo>>
 ```

 #### Example
@@ -393,7 +393,7 @@ const requestConfig = [
   { url: '/xxxx' }
 ]

-myXCrawl.fetchFile({
+myXCrawl.crawlFile({
   requestConfig,
   fileConfig: {
     storeDir: path.resolve(__dirname, './upload') // storage folder
@@ -405,7 +405,7 @@ myXCrawl.fetchFile({

 ### startPolling

-fetchPolling is a method of the [myXCrawl](#Example-1) instance, typically used to perform polling operations, such as getting news every once in a while.
+crawlPolling is a method of the [myXCrawl](#Example-1) instance, typically used to perform polling operations, such as getting news every once in a while.

 #### Type

@@ -423,7 +423,7 @@ function startPolling(
 ```js
 myXCrawl.startPolling({ h: 1, m: 30 }, () => {
   // will be executed every one and a half hours
-  // fetchPage/fetchData/fetchFile
+  // crawlPage/crawlData/crawlFile
 })
 ```

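The `{ h: 1, m: 30 }` configuration above corresponds to a fixed interval, and the arithmetic can be sketched as follows. Note that the heading, signature, and example all use the name `startPolling`, so the sketch follows that name. Both `pollingConfigToMs` and `startPollingSketch` are hypothetical stand-ins rather than x-crawl internals, and running the callback once immediately before the first interval is an assumption.

```javascript
const MS_PER_MINUTE = 60 * 1000
const MS_PER_HOUR = 60 * MS_PER_MINUTE
const MS_PER_DAY = 24 * MS_PER_HOUR

// Hypothetical helper: convert a config like { d, h, m } to milliseconds.
// Missing fields count as zero.
function pollingConfigToMs({ d = 0, h = 0, m = 0 } = {}) {
  return d * MS_PER_DAY + h * MS_PER_HOUR + m * MS_PER_MINUTE
}

// Hypothetical stand-in for startPolling: run the callback once now
// (an assumption), then again on every interval. Returns a stop function.
function startPollingSketch(config, callback) {
  const intervalMs = pollingConfigToMs(config)
  callback()
  const timer = setInterval(callback, intervalMs)
  return () => clearInterval(timer)
}
```

Under this reading, `{ h: 1, m: 30 }` is 5,400,000 ms (one and a half hours) and `{ d: 1 }` is 86,400,000 ms (one day).
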
@@ -485,32 +485,32 @@ interface XCrawlBaseConfig {
 }
 ```

-### FetchBaseConfigV1
+### CrawlBaseConfigV1

 ```ts
-interface FetchBaseConfigV1 {
+interface CrawlBaseConfigV1 {
   requestConfig: RequestConfig | RequestConfig[]
   intervalTime?: IntervalTime
 }
 ```

-### FetchPageConfig
+### CrawlPageConfig

 ```ts
-type FetchPageConfig = string | RequestBaseConfig
+type CrawlPageConfig = string | RequestBaseConfig
 ```

-### FetchDataConfig
+### CrawlDataConfig

 ```ts
-interface FetchDataConfig extends FetchBaseConfigV1 {
+interface CrawlDataConfig extends CrawlBaseConfigV1 {
 }
 ```

-### FetchFileConfig
+### CrawlFileConfig

 ```ts
-interface FetchFileConfig extends FetchBaseConfigV1 {
+interface CrawlFileConfig extends CrawlBaseConfigV1 {
   fileConfig: {
     storeDir: string // Store folder
     extension?: string // Filename extension
@@ -528,21 +528,21 @@ interface StartPollingConfig {
 }
 ```

-### FetchResCommonV1
+### CrawlResCommonV1

 ```ts
-interface FetchCommon<T> {
+interface CrawlCommon<T> {
   id: number
   statusCode: number | undefined
   headers: IncomingHttpHeaders // nodejs: http type
   data: T
 }
 ```

-### FetchResCommonArrV1
+### CrawlResCommonArrV1

 ```ts
-type FetchResCommonArrV1<T> = FetchResCommonV1<T>[]
+type CrawlResCommonArrV1<T> = CrawlResCommonV1<T>[]
 ```

 ### FileInfo
@@ -556,10 +556,10 @@ interface FileInfo {
 }
 ```

-### FetchPage
+### CrawlPage

 ```ts
-interface FetchPage {
+interface CrawlPage {
   httpResponse: HTTPResponse | null // The type of HTTPResponse in the puppeteer library
   data: {
     page: Page // The type of Page in the puppeteer library
