Commit 197239e

updates to README
Updated the names of CrawlRate and BurstRate to match the changes in the model. Also updated the sample main function accordingly.
1 parent d1b9cbe commit 197239e

1 file changed: +14 -10 lines changed

README.md

Lines changed: 14 additions & 10 deletions
@@ -29,6 +29,8 @@ func main() {
     options := octopus.GetDefaultCrawlOptions()
     options.MaxCrawlDepth = 3
     options.TimeToQuit = 10
+    options.CrawlRatePerSec = 5
+    options.CrawlBurstLimitPerSec = 8
     options.OpAdapter = opAdapter

     crawler := octopus.New(options)
@@ -43,21 +45,23 @@ Customizations can be made by supplying the crawler an instance of `CrawlOptions

 ```go
 type CrawlOptions struct {
-    MaxCrawlDepth      int64         // Max Depth of Crawl, 0 is the initial link.
-    MaxCrawledUrls     int64         // Max number of links to be crawled in total.
-    StayWithinBaseHost bool          // [Not-Implemented-Yet]
-    CrawlRate          int64         // Max Rate at which requests can be made (req/sec).
-    CrawlBurstLimit    int64         // Max Burst Capacity (should be atleast the crawl rate).
-    RespectRobots      bool          // [Not-Implemented-Yet]
-    IncludeBody        bool          // Include the Request Body (Contents of the web page) in the result of the crawl.
-    OpAdapter          OutputAdapter // A user defined crawl output handler (See next section for info).
-    ValidProtocols     []string      // Valid protocols to crawl (http, https, ftp, etc.)
-    TimeToQuit         int64         // Timeout (seconds) between two attempts or requests, before the crawler quits.
+    MaxCrawlDepth         int64         // Max Depth of Crawl, 0 is the initial link.
+    MaxCrawledUrls        int64         // Max number of links to be crawled in total.
+    StayWithinBaseHost    bool          // [Not-Implemented-Yet]
+    CrawlRatePerSec       int64         // Max Rate at which requests can be made (req/sec).
+    CrawlBurstLimitPerSec int64         // Max Burst Capacity (should be atleast the crawl rate).
+    RespectRobots         bool          // [Not-Implemented-Yet]
+    IncludeBody           bool          // Include the Request Body (Contents of the web page) in the result of the crawl.
+    OpAdapter             OutputAdapter // A user defined crawl output handler (See next section for info).
+    ValidProtocols        []string      // Valid protocols to crawl (http, https, ftp, etc.)
+    TimeToQuit            int64         // Timeout (seconds) between two attempts or requests, before the crawler quits.
 }
 ```

 A default instance of the `CrawlOptions` can be obtained by calling `octopus.GetDefaultCrawlOptions()`. This can be further customized by overriding individual properties.

+**NOTE:** If rate-limiting is not required, then just ignore(don't set value) both `CrawlRatePerSec` and `CrawlBurstLimitPerSec` in the `CrawlOptions`.
+
 ### Output Adapters

 An Output Adapter is the final destination of a crawler processed request. The output of the crawler is fed here, according to the customizations made before starting the crawler through the `CrawlOptions` attached to the crawler.
