Intentionally repeated calls to documentstore.writedocuments and updateindexes to incrementally load DB #3535
I am certainly not going to call delete_documents. Is there special error-trapping code I need to add? Per the tutorials, a duplicate will raise an error, but duplicates will be rare in my app. Will the non-duplicate passages still be inserted correctly, though? The tutorial didn't say what to expect; I did look, so sorry if I overlooked it. Also, does Haystack test for collisions during write_documents on the passage field only, or on the meta.name field as well? So I'm asking two things: first, which of a document's two text fields is checked for duplicates during write_documents, and second, is it OK to call write_documents and update_indexes incrementally? I hope update_indexes can be called that way too.
Replies: 2 comments
Is incremental update supported by update_embeddings? After inserting one or two new docs via docstore.write_documents, as some apps need to, we certainly don't want docstore.update_embeddings to waste time running again on old documents whose embeddings it already computed.
Hi @geoffreya, not sure if I understood you correctly. Of course, it is possible to call `write_documents` on the DocumentStores incrementally / more than once. Duplicates are detected based on the ID of a Document. If you don't provide an ID yourself, this ID will be generated based on the `content` field of a Document. There are three options on how we deal with duplicate Documents:
- `skip`: Ignore the new duplicate documents (this means the existing Document is kept in the DocumentStore while the new duplicate one is discarded).
- `overwrite`: Update any existing documents with the same ID when adding documents.
- `fail`: An error is raised if the document ID of the document being added already exists.

…
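To make the three options concrete, here is a rough, library-free sketch of the behaviour described above. It is not Haystack's implementation: `Doc`, `ToyStore` and `DuplicateIdError` are made-up names, the SHA-256 hashing of `content` is just an assumption standing in for whatever ID generation Haystack really uses, and the default policy here is arbitrary. The `duplicate_documents` parameter name mirrors what I believe `write_documents` accepts in Haystack 1.x, so please check the API reference for your version.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class DuplicateIdError(ValueError):
    """Hypothetical error raised by the toy store under the 'fail' policy."""


@dataclass
class Doc:
    content: str
    meta: dict = field(default_factory=dict)
    id: Optional[str] = None

    def __post_init__(self):
        # Assumption: with no explicit ID, derive one from `content` only,
        # so `meta` (e.g. meta["name"]) plays no part in duplicate detection.
        if self.id is None:
            self.id = hashlib.sha256(self.content.encode("utf-8")).hexdigest()


class ToyStore:
    """Minimal in-memory stand-in for a document store (illustration only)."""

    def __init__(self):
        self.docs: Dict[str, Doc] = {}

    def write_documents(self, docs: List[Doc], duplicate_documents: str = "overwrite") -> int:
        if duplicate_documents not in ("skip", "overwrite", "fail"):
            raise ValueError(f"unknown policy: {duplicate_documents}")
        written = 0
        for doc in docs:
            if doc.id in self.docs:
                if duplicate_documents == "skip":
                    continue  # keep the existing document, discard the new one
                if duplicate_documents == "fail":
                    # Documents earlier in this batch have already been stored.
                    raise DuplicateIdError(f"document ID already exists: {doc.id}")
            self.docs[doc.id] = doc  # new document, or overwrite of an existing one
            written += 1
        return written


store = ToyStore()

# First load: nothing is in the store yet, so even "fail" succeeds.
first = store.write_documents(
    [Doc("alpha", {"name": "a.txt"}), Doc("beta", {"name": "b.txt"})],
    duplicate_documents="fail",
)

# Incremental load: "alpha" is a duplicate by content and is skipped,
# while the non-duplicate "gamma" is still inserted.
second = store.write_documents(
    [Doc("alpha", {"name": "renamed.txt"}), Doc("gamma", {"name": "c.txt"})],
    duplicate_documents="skip",
)

print(first, second, len(store.docs))
```

In this sketch the second call stores only the new passage, and the original `meta` of the duplicate is left untouched because only `content` feeds the ID. If you need `meta.name` to count towards uniqueness, passing your own `id` per document would be one way to do that.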