RWIs fill out the whole memory space for YaCy #731

@okybaca

When a YaCy instance crawls a lot, the RWI kelondro index becomes too large and
YaCy gets stuck with an out-of-memory exception. The instance is blocked and
no longer works. The only remedy is to kill the instance, manually delete the
excess RWI data, and restart.

RWI transfer and 'dissolving' are slow (as described in #724), so crawling
easily overfills the RWI index.

Possible solutions:

  • when a certain memory limit is reached, delete the oldest RWIs
    automatically, similar to DISK LIMITS

  • make RWI memory use more efficient (I am not sure, but I guess that
    all RWI indexes are kept in memory), so that a bigger index does not
    consume more and more RAM beyond the limits

  • use a modern garbage collector, as described here:
    https://www.geeksforgeeks.org/java/demystifying-memory-management-in-java/
    which would probably also help other aspects of YaCy operation
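
To make the first suggestion concrete, here is a minimal sketch of "delete the oldest entries once a memory budget is exceeded". It is not YaCy code: `OldestFirstBuffer`, its methods, and the use of a plain term-hash string as the key are all hypothetical, and payload size is only approximated by the byte array length. The real RWI structures in kelondro would need their own size accounting.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch, not YaCy code: a byte-budgeted buffer that drops the oldest entries first. */
final class OldestFirstBuffer {
    private final long budgetBytes;
    private long usedBytes = 0;
    // LinkedHashMap iterates in insertion order, so the first entry is the oldest one.
    private final LinkedHashMap<String, byte[]> entries = new LinkedHashMap<>();

    OldestFirstBuffer(long budgetBytes) {
        this.budgetBytes = budgetBytes;
    }

    /** Derives the budget from a fraction of the JVM's maximum heap (the -Xmx value). */
    static OldestFirstBuffer forHeapFraction(double fraction) {
        return new OldestFirstBuffer((long) (Runtime.getRuntime().maxMemory() * fraction));
    }

    /**
     * Stores a payload, then evicts the oldest entries until the budget holds again.
     * A payload larger than the whole budget ends up evicted as well.
     * Returns the number of evicted entries.
     */
    int put(String termHash, byte[] payload) {
        // Remove first so that a re-inserted key counts as the newest entry.
        byte[] old = entries.remove(termHash);
        if (old != null) usedBytes -= old.length;
        entries.put(termHash, payload);
        usedBytes += payload.length;

        int evicted = 0;
        Iterator<Map.Entry<String, byte[]>> it = entries.entrySet().iterator();
        while (usedBytes > budgetBytes && it.hasNext()) {
            Map.Entry<String, byte[]> eldest = it.next();
            usedBytes -= eldest.getValue().length;
            it.remove();
            evicted++;
        }
        return evicted;
    }

    boolean contains(String termHash) { return entries.containsKey(termHash); }
    int size() { return entries.size(); }
    long usedBytes() { return usedBytes; }
}
```

With a budget of 100 bytes, inserting three 40-byte payloads evicts the first one. In a real instance the budget would more likely come from something like `OldestFirstBuffer.forHeapFraction(0.25)`, where the fraction is an arbitrary example value.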

Also reported in the forum:
https://community.searchlab.eu/t/topic/1088

log:

...
W 2025/09/03 12:00:20 ConcurrentLog * net.yacy.cora.util.SpaceExceededException: 25846200 bytes needed for RowCollection grow after OutOfMemoryError Java heap space: 72781488 free at Wed Sep 03 12:00:20 CEST 2025
net.yacy.cora.util.SpaceExceededException: 25846200 bytes needed for RowCollection grow after OutOfMemoryError Java heap space: 72781488 free at Wed Sep 03 12:00:20 CEST 2025
        at net.yacy.kelondro.index.RowCollection.ensureSize(RowCollection.java:276) 
        at net.yacy.kelondro.index.RowCollection.addUnique(RowCollection.java:425) 
        at net.yacy.kelondro.index.RowCollection.addUnique(RowCollection.java:403) 
        at net.yacy.kelondro.index.RAMIndex.addUnique(RAMIndex.java:216) 
        at net.yacy.kelondro.index.RAMIndexCluster.addUnique(RAMIndexCluster.java:133) 
        at net.yacy.kelondro.index.RowHandleMap.<init>(RowHandleMap.java:116) 
        at net.yacy.kelondro.blob.HeapReader.initIndexReadDump(HeapReader.java:179) 
        at net.yacy.kelondro.blob.HeapReader.<init>(HeapReader.java:91) 
        at net.yacy.kelondro.blob.HeapModifier.<init>(HeapModifier.java:58) 
        at net.yacy.kelondro.blob.ArrayStack.<init>(ArrayStack.java:209) 
        at net.yacy.kelondro.rwi.ReferenceContainerArray.<init>(ReferenceContainerArray.java:68) 
        at net.yacy.kelondro.rwi.IndexCell.<init>(IndexCell.java:99) 
        at net.yacy.search.index.Segment.connectRWI(Segment.java:162) 
        at net.yacy.search.Switchboard.<init>(Switchboard.java:610) 
        at net.yacy.yacy.startup(yacy.java:212) 
        at net.yacy.yacy.main(yacy.java:809) 
I 2025/09/03 12:00:20 KELONDRO * HeapReader: generating index for [REDACTED]/DATA/INDEX/freeworld/SEGMENTS/default/text.index.20250820035645330.blob, 6624 MB. Please wait.
W 2025/09/03 12:01:17 ConcurrentLog * net.yacy.cora.util.SpaceExceededException: 9419140 bytes needed for RowCollection grow after OutOfMemoryError Java heap space: 35786272 free at Wed Sep 03 12:01:17 CEST 2025
net.yacy.cora.util.SpaceExceededException: 9419140 bytes needed for RowCollection grow after OutOfMemoryError Java heap space: 35786272 free at Wed Sep 03 12:01:17 CEST 2025
        at net.yacy.kelondro.index.RowCollection.ensureSize(RowCollection.java:276) 
        at net.yacy.kelondro.index.RowCollection.addUnique(RowCollection.java:425) 
        at net.yacy.kelondro.index.RowCollection.addUnique(RowCollection.java:403) 
        at net.yacy.kelondro.index.RAMIndex.addUnique(RAMIndex.java:216) 
        at net.yacy.kelondro.index.RAMIndexCluster.addUnique(RAMIndexCluster.java:133) 
        at net.yacy.kelondro.index.RowHandleMap.putUnique(RowHandleMap.java:294) 
        at net.yacy.kelondro.index.RowHandleMap$initDataConsumer.call(RowHandleMap.java:499) 
        at net.yacy.kelondro.index.RowHandleMap$initDataConsumer.call(RowHandleMap.java:438) 
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317) 
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) 
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) 
        at java.base/java.lang.Thread.run(Thread.java:1589) 
I 2025/09/03 12:01:34 ConcurrentLog shutdown of ConcurrentLog.Worker void because it was not running. 
E 2025/09/03 12:01:34 UNCAUGHT-EXCEPTION * Thread main: Java heap space 
java.lang.OutOfMemoryError: Java heap space 

java.lang.OutOfMemoryError: Java heap space 
E 2025/09/03 12:01:34 ConcurrentLog Java heap space 
java.lang.OutOfMemoryError: Java heap space 

        at net.yacy.kelondro.blob.HeapModifier.<init>(HeapModifier.java:58) 
        at net.yacy.kelondro.blob.ArrayStack.<init>(ArrayStack.java:209) 
        at net.yacy.kelondro.rwi.ReferenceContainerArray.<init>(ReferenceContainerArray.java:68) 
        at net.yacy.kelondro.rwi.IndexCell.<init>(IndexCell.java:99) 
        at net.yacy.search.index.Segment.connectRWI(Segment.java:162) 
        at net.yacy.search.Switchboard.<init>(Switchboard.java:610) 
        at net.yacy.yacy.startup(yacy.java:212) 
        at net.yacy.yacy.main(yacy.java:809) 
...
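
The log reports about 72 MB and then about 35 MB of free heap while the index for a 6624 MB blob is being generated. To check how much headroom a JVM actually has, and which garbage collector it is running, a small diagnostic along these lines may help. It is a sketch using only standard JDK calls (`Runtime` and `java.lang.management`); the class `HeapReport` and the `reserveBytes` safety margin are assumptions for illustration and are not part of YaCy.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical diagnostic helper, not part of YaCy. */
final class HeapReport {

    /** Heap still obtainable: the maximum heap minus what is currently in use. */
    static long availableBytes() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return rt.maxMemory() - used;
    }

    /** Names of the garbage collectors active in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** True if a request fits into the available heap while leaving a safety reserve. */
    static boolean fits(long neededBytes, long availableBytes, long reserveBytes) {
        return neededBytes <= availableBytes - reserveBytes;
    }
}
```

Applied to the first warning in the log, `HeapReport.fits(25846200L, 72781488L, 0L)` is true, yet the grow still failed, so a plain free-bytes comparison is apparently not a sufficient check here; with a reserve of 50 MB the same request would be rejected up front.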

Labels: bug, index, prio high
