Recommend me an approach for a Chatbot please #17503
PabloOchoaWITHIN started this conversation in General · Replies: 0 comments
Hello! I'm trying to build a chatbot that can answer questions using my company's internal documentation, and I'd like to know which approach you recommend. The general idea is the following:
I have documents in Google Drive (docs, PDFs, slides), and I want an LLM that can understand this information and answer any questions I have about these documents. The idea is to create a pipeline that adds the files to the LLM's knowledge automatically, so I don't have to add them manually.
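The "files get added automatically" part boils down to an incremental-sync step: compare what was ingested last time against a fresh Drive listing and decide what to (re)ingest. Here is a minimal, self-contained sketch of just that decision; `FileMeta` and `plan_sync` are hypothetical names, not from any library, and the actual Drive listing (via the Drive API's `files.list` with `modifiedTime`) is assumed to happen elsewhere.

```python
# Hypothetical sketch of the incremental-sync decision for a Drive-backed
# knowledge base. `FileMeta` / `plan_sync` are illustrative names only.
from dataclasses import dataclass

@dataclass(frozen=True)
class FileMeta:
    file_id: str   # Drive file ID
    modified: str  # RFC 3339 "modifiedTime" as returned by the Drive API

def plan_sync(previous: dict[str, str],
              current: list[FileMeta]) -> tuple[list[FileMeta], list[str]]:
    """Return (files to ingest or re-ingest, file IDs to remove).

    `previous` maps file_id -> modifiedTime recorded at the last sync.
    A file is (re)ingested if it is new or its modifiedTime changed;
    IDs that vanished from Drive should be purged from the vector store.
    """
    to_ingest = [f for f in current if previous.get(f.file_id) != f.modified]
    current_ids = {f.file_id for f in current}
    to_delete = [fid for fid in previous if fid not in current_ids]
    return to_ingest, to_delete
```

Run on a schedule (cron, Cloud Scheduler, etc.), this keeps the vector store in step with Drive without any manual uploads.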
I'd appreciate your opinion on which approach I should take. Here are the three approaches I tried and some issues I faced:
Custom GPTs - Uploading files
My first approach was using Custom GPTs, but I ran into the issue that the Knowledge base (uploaded files) is limited to 10 files, and the files have to be uploaded manually. I need a much higher limit, and I don't want to upload files by hand.
Custom GPTs - Actions (API calls)
Next, I tried using the Custom GPT's Actions to call an endpoint where my file was located (using the Google Cloud Storage REST API to fetch the file). The problem is that the GPT could only read files up to 100 KB at runtime: to answer, it has to fetch a file and then tell you whether it found the answer in that file. This breaks down when a question spans more than one document (for example, "How does month1.txt compare with month2.txt?"), since the GPT would need to read several files, and it would also need to know the exact name and location of each file in order to fetch it through the API.
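For reference on that last point: the Cloud Storage JSON API downloads object contents from a URL of the form `.../storage/v1/b/{bucket}/o/{object}?alt=media`, where the object name must be percent-encoded (slashes included), which is exactly why the GPT has to know the precise bucket and object name up front. A minimal URL helper (auth omitted; a real request also needs an `Authorization: Bearer <token>` header):

```python
from urllib.parse import quote

def media_url(bucket: str, object_name: str) -> str:
    """Build the Cloud Storage JSON API media-download URL for one object.

    The object name is fully percent-encoded, including any "/" characters,
    per the JSON API's objects.get semantics.
    """
    return (f"https://storage.googleapis.com/storage/v1/b/{bucket}"
            f"/o/{quote(object_name, safe='')}?alt=media")
```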
RAG with LangChain
My third approach was using LangChain with Chroma DB. First I pulled documents from Google Drive, split them into chunks with LangChain, generated embeddings with the Vertex AI API, and stored those embeddings in a Chroma database. I then used PaLM (text-bison@002) as my LLM. Finally, the user could ask any question about the documents and the LLM was able to answer. This seemed like the right approach to me, but I wanted to double-check and hear different opinions from people who know about these topics.
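To make the retrieval flow in that pipeline concrete, here is a toy, fully self-contained sketch of chunk → embed → retrieve. A real pipeline would use LangChain's text splitters, Vertex AI embeddings, and Chroma's similarity search; here a bag-of-words "embedding" with cosine similarity stands in for all three so the flow is visible end to end. All function names are illustrative.

```python
# Toy sketch of the RAG retrieval flow: split documents into chunks,
# "embed" each chunk, and return the chunks most similar to the query.
import math
from collections import Counter

def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping character windows (toy splitter)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def embed(text: str) -> Counter:
    """Stand-in embedding: a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

The retrieved chunks are then pasted into the LLM prompt as context, which is the step the vector store replaces the Custom GPT's Knowledge base with.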
The only downside I found with this approach is that I will probably need to build my own UI, right? Or does anyone know whether it is possible to use this RAG application with the OpenAI interface?
I hope you can help me. If you have any questions, please let me know! Your answers will be a big help.
Thank you.