Gemini-File

PDF query chatbot using the Gemini Pro model and Llama-Index

Gemini-File is a Streamlit web application that lets users upload PDF files, index their contents with the Llama-Index library, and query the documents using the Gemini Pro model.

Preview

Gemini-File.Preview.mp4

Features

Upload PDF files for indexing.
Perform text queries on the indexed documents.
Powered by the Gemini Pro model and Hugging Face embeddings.

Getting Started

Prerequisites

!! Running this code on a machine with a GPU is strongly recommended !!

Before you begin, ensure you have the following installed:

Python (>=3.6)
Streamlit
Llama-Index library
Google API key (set as an environment variable)

You can get a Google Gemini API key from the Google AI Developer website. Signing up is easy, and the key is free.

Google API Key Configuration

The Google API key is set as an environment variable. Ensure it is correctly configured before running the app.
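
A quick check like the one below can confirm the key is visible to Python before you start the app. This is a minimal sketch, not code from this repository: the variable name `GOOGLE_API_KEY` and the helper `get_google_api_key` are assumptions for illustration, so substitute whatever name `app.py` actually reads.

```python
import os

# Assumption: GOOGLE_API_KEY is an illustrative name; use the one app.py reads.
API_KEY_VAR = "GOOGLE_API_KEY"


def get_google_api_key(env=None, var_name=API_KEY_VAR):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before running the app.")
    return key


if __name__ == "__main__":
    try:
        get_google_api_key()
        print(f"{API_KEY_VAR} is configured.")
    except RuntimeError as err:
        print(err)
```

Failing early with a readable message is easier to debug than an authentication error raised later, in the middle of a query.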

Installation

Clone the repository:

git clone https://github.com/AjayK47/Gemini-File.git

Install dependencies:
```
pip install -r requirements.txt
```

Usage

Uploading a PDF

Run the Streamlit app:
```
streamlit run app.py
```
Access the app in your web browser.
Use the "Upload your PDF" button to upload a PDF file.

Querying Documents

Indexing your file takes some time, depending on its size.
Click the search or submit button to run the query; the app will return a response.
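
To give a rough idea of what indexing and querying involve, here is a toy sketch of embedding-based retrieval. It is not the code in `app.py`: the `embed` function is a bag-of-words stand-in for a real Hugging Face embedding model, and `build_index` and `query` are hypothetical names. In the real app, Llama-Index handles these steps and the Gemini Pro model writes the final response.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(chunks):
    """Indexing step: store each text chunk next to its vector."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def query(index, question, top_k=1):
    """Query step: return the chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


chunks = [
    "Invoices are due within thirty days of receipt.",
    "The warranty covers parts and labour for two years.",
]
index = build_index(chunks)
print(query(index, "How long is the warranty?"))
```

Every chunk has to be embedded before it can be searched, which is why larger files take longer to index.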

Customisation

Using Other Embedding Models

You can customize the embedding model used for document indexing. Edit the 'app.py' file and modify the 'HuggingFaceEmbedding' instantiation:

# Example using a different Hugging Face model
embed_model_custom = HuggingFaceEmbedding(model_name="your/own-model-name")

You can find the text embedding model that best suits your needs with the help of the MTEB Leaderboard.

Contributing

Contributions are encouraged! Fork the repository, create a feature branch, make your changes, push to the branch, and open a pull request.

Future improvements

Use Open Source Embedding Models: Explore integrating open-source embedding models instead of relying on proprietary models like the Gemini API.
Improved UI/UX: Enhance the user interface and experience for better usability.
Scalability: Optimize the application for large document collections and improve search speed.
Dockerization: Provide a Docker container for easy deployment.

Authors

Ajay K
