esponges/gpt-langchain-upload-doc-chatbot

Chat with any PDF you upload

Powered by the OpenAI API and langchain.

Any uploaded document will be parsed and upserted into the Pinecone vector DB.
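As a rough illustration of the step between parsing and upserting, the sketch below splits parsed text into overlapping chunks, a common preparation before embedding. The function name `splitIntoChunks` and the chunk sizes are hypothetical; whether this repo splits text this way, and with which sizes, is an assumption.

```typescript
// Hypothetical helper, not taken from this repo: split parsed PDF text into
// overlapping chunks so each chunk can be embedded and upserted separately.
function splitIntoChunks(text: string, chunkSize = 1000, overlap = 200): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("Require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, at the cost of storing some text twice.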

The contextual chat uses LangChain LLM chain methods: each question is embedded with OpenAI embeddings, the most similar vectorized document chunks are retrieved from Pinecone, and those chunks give the OpenAI model the context it needs to answer within the conversation.
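The retrieval idea can be sketched in memory, without any external service. In the real app Pinecone performs the similarity search; the `StoredChunk` type and the `cosineSimilarity` and `topK` functions below are hypothetical names used only for illustration, and cosine similarity is assumed as the metric.

```typescript
// Hypothetical in-memory stand-in for a vector store lookup.
type StoredChunk = { text: string; embedding: number[] };

// Cosine similarity between two vectors of equal dimension.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same dimension");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as unrelated
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks whose embeddings are most similar to the query embedding.
function topK(query: number[], chunks: StoredChunk[], k: number): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

The retrieved chunk texts would then be placed into the prompt alongside the user's question, which is how the model can answer from the uploaded document.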

The tech stack includes LangChain, Pinecone, TypeScript, OpenAI, and Next.js. LangChain is a framework that makes it easier to build scalable AI/LLM apps and chatbots. Pinecone is a vector store that holds the embeddings together with the text of your PDF, so that similar passages can be retrieved later.

Development

  1. Clone the repo or download the ZIP
git clone [github https url]
  2. Install packages

Run npm install.

  3. Set up your .env file
  • Copy .env.example into .env. Your .env file should look like this:
OPENAI_API_KEY=

PINECONE_API_KEY=
PINECONE_ENVIRONMENT=

PINECONE_INDEX_NAME=

  • Visit OpenAI to retrieve your API keys and insert them into your .env file.
  • Visit Pinecone to create and retrieve your API keys, and also retrieve your environment and index name from the dashboard.
  4. In utils/makechain.ts, change the QA_PROMPT for your own use case. Change modelName in new OpenAI to gpt-4 if you have access to the gpt-4 API. Please verify outside this repo that you have access to the gpt-4 API, otherwise the application will not work.
  • Make sure your Pinecone dashboard environment and index match the ones in the pinecone.ts and .env files.
  • Check that you've set the vector dimensions to 1536.
  • Make sure your Pinecone namespace is in lowercase.
  • Pinecone indexes of users on the Starter (free) plan are deleted after 7 days of inactivity. To prevent this, send an API request to Pinecone to reset the counter before the 7 days elapse.
  • If problems persist, retry from scratch with a new Pinecone project, index, and cloned repo.
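Several of the checks above can be automated before starting the app. The following is a minimal sketch, not part of this repo: `findConfigProblems` is a hypothetical helper, and the caller is assumed to supply the namespace and the index dimension (for example, read from the Pinecone dashboard).

```typescript
// Hypothetical pre-flight check; none of these names come from this repo.
type EnvMap = Record<string, string | undefined>;

// Variable names as listed in the .env file above.
const REQUIRED_KEYS = [
  "OPENAI_API_KEY",
  "PINECONE_API_KEY",
  "PINECONE_ENVIRONMENT",
  "PINECONE_INDEX_NAME",
];

// Vector dimension the checklist above asks for.
const EXPECTED_DIMENSIONS = 1536;

// Collect human-readable problems instead of failing on the first one.
function findConfigProblems(env: EnvMap, namespace: string, indexDimensions: number): string[] {
  const problems: string[] = [];
  for (const key of REQUIRED_KEYS) {
    const value = env[key];
    if (value === undefined || value.trim() === "") {
      problems.push(`${key} is missing or empty`);
    }
  }
  if (namespace !== namespace.toLowerCase()) {
    problems.push(`Namespace "${namespace}" must be lowercase`);
  }
  if (indexDimensions !== EXPECTED_DIMENSIONS) {
    problems.push(`Index has ${indexDimensions} dimensions, expected ${EXPECTED_DIMENSIONS}`);
  }
  return problems;
}
```

In a Node.js script this could be called as `findConfigProblems(process.env, namespace, dimensions)`, printing each returned message and exiting if the list is not empty.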

Credit

This project was mostly taken from the Maayooear project. I added the feature for parsing and uploading any PDF, plus some extra refactoring.
