Using a SQL Document Store and Airflow to automate indexing #4030
-
Hey guys, is there an example of how data is stored in a SQL document store, and how I can trigger a pipeline using an Airflow SQL sensor when data is updated or added to the document store? The data source is also a SQL database. Can I do this using the created and updated fields in the SQL document store?
Replies: 3 comments 11 replies
-
@bogdankostic I had another query: how can I get the storage space occupied by the indexed data in the respective document stores?
-
Hi @bogdankostic, I hope you are well. I had some questions regarding the […]. I generated some data, converted it to Haystack format, and called […]. When I check the database I see […]. The embeddings column is not found in the database and the vector_id column is full of NULLs, but I can access the other data. Am I doing something wrong? I did map the […]. Is this why the […]? If I explicitly create a table with an embedding field that accepts an array of float values in the database and then try to store the data, it will work, right? Can I store Haystack documents that way? Why are there methods like […]
Hi @Ashish-Soni08! Not sure what you mean by how data is stored. You can simply use the `write_documents` method to add Documents to your Document Store (see also our Documentation).

The `SQLDocumentStore` is usually not used on its own but in combination with a vector database, for example FAISS, to store the document embeddings in a FAISS instance and the document's content and metadata in a SQL database.

Also, I'm not very familiar with Airflow, so I'd be happy if you could provide a bit more information in order to help you.
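
For the timestamp idea raised at the top of the thread, here is a minimal sketch of the kind of check a polling sensor could run. It uses only Python's built-in `sqlite3`, not Haystack or Airflow. The `document` table, its `created_at` / `updated_at` columns, and the helper names are all hypothetical stand-ins, so inspect the tables your document store actually creates before relying on any of this.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical schema for illustration only; a real document store's
# table and column names may differ.
SCHEMA = """
CREATE TABLE document (
    id TEXT PRIMARY KEY,
    content TEXT NOT NULL,
    created_at TEXT NOT NULL,
    updated_at TEXT NOT NULL
)
"""

# Counts rows that were added or modified after the :since timestamp.
CHANGED_SQL = """
SELECT COUNT(*) FROM document
WHERE created_at > :since OR updated_at > :since
"""


def insert_doc(conn, doc_id, content, now):
    """Insert a new document; both timestamps start at `now`."""
    ts = now.isoformat(timespec="seconds")
    conn.execute(
        "INSERT INTO document (id, content, created_at, updated_at) "
        "VALUES (?, ?, ?, ?)",
        (doc_id, content, ts, ts),
    )


def update_doc(conn, doc_id, content, now):
    """Change a document's content and bump its updated_at timestamp."""
    conn.execute(
        "UPDATE document SET content = ?, updated_at = ? WHERE id = ?",
        (content, now.isoformat(timespec="seconds"), doc_id),
    )


def count_changed(conn, since):
    """Return how many documents were created or updated after `since`."""
    params = {"since": since.isoformat(timespec="seconds")}
    return conn.execute(CHANGED_SQL, params).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

t0 = datetime(2023, 1, 1, 12, 0, 0)
insert_doc(conn, "doc-1", "first version", t0)

# Pretend the indexing pipeline last ran five minutes after the first insert.
last_run = t0 + timedelta(minutes=5)
no_changes = count_changed(conn, last_run)

# One update and one new document arrive after the last run.
update_doc(conn, "doc-1", "second version", t0 + timedelta(minutes=10))
insert_doc(conn, "doc-2", "new document", t0 + timedelta(minutes=11))
changes = count_changed(conn, last_run)

print(no_changes, changes)
```

As far as I know, Airflow's `SqlSensor` succeeds once the first cell of its query result is non-zero and non-empty, so a `COUNT(*)` query of this shape could plausibly be passed as its `sql` argument, with the last successful run time substituted for `:since`. That wiring is an assumption and is not shown here.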