Description
My local RWI index grows a lot while crawling, but its distribution to other peers is really slow and the index doesn't "dissolve" at all.
Usually, only a small number of URLs and term references is transferred in a single batch:
Index transfer of 7 references for terms [ XXX ..] and 13 URLs to peer ...
Occasionally, up to 1000 references are sent.
I would like to maximise the number of terms/URLs in a single transfer, to speed up the process a bit.
I found a configuration option for that in the configuration file, `indexDistribution.minChunkSize = 10`, but it does nothing, and it is not referenced anywhere in functional code.
My suggestion is to improve the code (probably in `transferRWI.java` or `Transmission.java`) so that it honours the `indexDistribution.minChunkSize` setting (and `indexDistribution.maxChunkSize` as well), allowing optimisation of RWI transfer.
Judging by the disk activity I can hear, this is quite an I/O-heavy operation, so the option should remain configurable by the user, to fine-tune the local instance.
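
To illustrate the behaviour I am asking for, here is a minimal sketch of how the two settings could bound a transfer batch. It is not taken from the existing code: the class `ChunkSizeSketch`, its methods, the use of `java.util.Properties` as the config source, and the fallback value of 1000 for `indexDistribution.maxChunkSize` are all assumptions made for the example. Only the two key names and the value `minChunkSize = 10` come from the configuration file.

```java
import java.util.Properties;

// Hypothetical helper, not part of the existing code base.
final class ChunkSizeSketch {

    // Reads an integer setting, falling back when the key is missing or malformed.
    static int readInt(Properties cfg, String key, int fallback) {
        String raw = cfg.getProperty(key);
        if (raw == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Decides how many pending references to send in the next batch.
    // Returns 0 when fewer than minChunkSize references are pending,
    // meaning "wait until more have accumulated".
    static int chunkSize(Properties cfg, int pendingReferences) {
        // 10 is the value found in the configuration file.
        int min = Math.max(1, readInt(cfg, "indexDistribution.minChunkSize", 10));
        // 1000 is an assumed fallback, chosen only for this sketch.
        int max = Math.max(min, readInt(cfg, "indexDistribution.maxChunkSize", 1000));

        if (pendingReferences < min) {
            return 0;
        }
        return Math.min(pendingReferences, max);
    }
}
```

With `minChunkSize = 10` and `maxChunkSize = 500`, a selection of 7 references would be held back, 120 would be sent as one batch, and 5000 would be capped at 500 per transfer. Whether holding back small batches is acceptable for the distribution logic is something the maintainers would have to judge; the sketch only shows how the settings could be applied.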