RWI: indexDistribution.minChunkSize does nothing #724

@okybaca

Description

My local RWI index grows a lot while crawling, but its distribution to other peers is really slow, and the index doesn't "dissolve" at all.

Usually, only a small number of URLs/terms is transferred in a single batch:

Index transfer of 7 references for terms [ XXX ..] and 13 URLs to peer ...

Occasionally, up to 1000 references are sent.

I would like to maximise the number of terms/URLs in a single transfer, to speed up the process a bit.

I found a configuration option for that in the configuration file, indexDistribution.minChunkSize = 10, but it does nothing, nor is it referenced anywhere in functional code.

My suggestion is to improve the code (probably in transferRWI.java or Transmission.java) so that it honors the indexDistribution.minChunkSize setting (and indexDistribution.maxChunkSize as well), allowing optimisation of RWI transfer.
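
To illustrate the idea, here is a minimal sketch of how the two settings could bound the size of a transfer batch. It is not taken from the YaCy code base: the class `ChunkSizeSettings` and its methods are hypothetical, the configuration is read from a plain `java.util.Properties` object rather than YaCy's own config classes, and the fallback of 1000 for indexDistribution.maxChunkSize is an assumption (only the value 10 for minChunkSize is from my configuration file).

```java
import java.util.Properties;

// Hypothetical helper, NOT part of YaCy: shows one way the two
// indexDistribution.*ChunkSize settings could bound a transfer batch.
final class ChunkSizeSettings {
    final int minChunkSize;
    final int maxChunkSize;

    ChunkSizeSettings(int min, int max) {
        // Guard against nonsensical values: min >= 1 and max >= min.
        this.minChunkSize = Math.max(1, min);
        this.maxChunkSize = Math.max(this.minChunkSize, max);
    }

    // Reads both settings; falls back to 10 (value from the issue) and
    // 1000 (assumed default) when a key is missing or not a number.
    static ChunkSizeSettings fromProperties(Properties p) {
        int min = parseOrDefault(p.getProperty("indexDistribution.minChunkSize"), 10);
        int max = parseOrDefault(p.getProperty("indexDistribution.maxChunkSize"), 1000);
        return new ChunkSizeSettings(min, max);
    }

    static int parseOrDefault(String value, int fallback) {
        if (value == null) return fallback;
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Returns how many references to send in the next batch:
    // 0 if fewer than minChunkSize are available (wait for more),
    // otherwise everything available, capped at maxChunkSize.
    int nextChunkSize(int availableReferences) {
        if (availableReferences < minChunkSize) return 0;
        return Math.min(availableReferences, maxChunkSize);
    }
}
```

With minChunkSize raised to, say, 100, a pending batch of 7 references would be held back instead of being sent on its own, which is the behaviour I would expect from the option. Where exactly such a check belongs in the transfer code is something I cannot tell.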

Judging by the disk activity I can hear, this is quite an I/O-heavy operation, so the option should remain user-configurable to allow fine-tuning of the local instance.

Labels: bug, index
