
Cannot restore large index #66

@NicolasAlmerge


Hello,

I am using Python 3.10.9 and vectordb==0.0.20 (the latest as of this writing), and I am running into trouble when restoring saved data.

I have two large files, A and B. When I index, snapshot, and restore them separately, everything works fine.

However, when I read and parse files A and B, index all the documents from both, and then save them together, the snapshotting succeeds, but restoring the data fails with the following error:

Traceback (most recent call last):
  ...
  File "~/.local/lib/python3.10/site-packages/vectordb/db/executors/inmemory_exact_indexer.py", line 86, in restore
    self._indexer = InMemoryExactNNIndex[self._input_schema](index_file_path=snapshot_file)
  File "~/.local/lib/python3.10/site-packages/docarray/index/backends/in_memory.py", line 68, in __init__
    self._docs = DocList.__class_getitem__(
  File "~/.local/lib/python3.10/site-packages/docarray/array/doc_list/io.py", line 810, in load_binary
    return cls._load_binary_all(
  File "~/.local/lib/python3.10/site-packages/docarray/array/doc_list/io.py", line 608, in _load_binary_all
    proto.ParseFromString(d)
google.protobuf.message.DecodeError: Error parsing message

Given my previous tests and the traceback above, I suspect the combined index is simply too large to deserialize, which triggers this error. Does anyone know what can be done to fix this?
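For what it's worth, protocol buffers cannot parse a single serialized message larger than 2 GiB, which would match the "works separately, fails combined" symptom. Since restoring A and B individually works, one generic workaround is to persist the corpus in bounded-size shards and restore them piecewise. The sketch below is not the vectordb API; `save_sharded`/`load_sharded` are hypothetical helpers using plain pickle, just to illustrate the sharding pattern under the assumption that each shard stays well under the parser limit:

```python
import pickle
from pathlib import Path


def save_sharded(docs, directory, shard_size=100_000):
    """Persist docs in fixed-size shards so no single file grows unbounded.

    `docs` is any indexable sequence; each shard is pickled separately,
    so every file stays far below any per-message size limit.
    """
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    for i in range(0, len(docs), shard_size):
        shard_path = directory / f"shard_{i // shard_size:05d}.pkl"
        with open(shard_path, "wb") as f:
            pickle.dump(docs[i:i + shard_size], f)


def load_sharded(directory):
    """Restore the full document list by concatenating shards in order."""
    docs = []
    for path in sorted(Path(directory).glob("shard_*.pkl")):
        with open(path, "rb") as f:
            docs.extend(pickle.load(f))
    return docs
```

In your case that would mean snapshotting the A-index and B-index (or smaller chunks) to separate files, as you already did successfully, and rebuilding the combined in-memory index from those pieces at startup instead of from one monolithic snapshot.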
