PDF Talker

A web application that allows users to upload PDF files and chat with them using AI. The application processes the PDF content, stores it in a vector database (ChromaDB) and knowledge graph (Neo4j), and enables natural language conversations about the PDF content.

Features

- PDF Upload: Upload PDF files of any size
- PDF Processing: Automatic chunking and embedding of PDF content
- Vector Storage: ChromaDB for semantic search and retrieval
- Knowledge Graph: Neo4j for storing relationships between chunks
- Chat Interface: Ask questions about your PDF content in natural language
- Voice Input: Record and send voice messages to chat with PDFs using speech
- Chat History: View and manage your conversation history
- Agentic AI: LangGraph workflow for intelligent responses
- API Documentation: Interactive Swagger UI to explore and test the API
- Containerization: Docker setup for easy deployment
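
The chunking step listed under PDF Processing can be pictured with a minimal sketch. The function name `chunkText`, the chunk size, and the overlap are illustrative assumptions, not the values the application uses; the project may well rely on a LangChain.js text splitter instead.

```javascript
// Hypothetical helper: split extracted PDF text into overlapping chunks.
// chunkSize and overlap are illustrative defaults, not the project's settings.
function chunkText(text, chunkSize = 1000, overlap = 200) {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError('Require chunkSize > 0 and 0 <= overlap < chunkSize');
  }
  const chunks = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push({
      index: chunks.length,
      start,
      text: text.slice(start, start + chunkSize),
    });
    if (start + chunkSize >= text.length) break; // this chunk reached the end
  }
  return chunks;
}
```

Overlapping windows keep a sentence that straddles a boundary visible in both neighbouring chunks, which tends to help retrieval. Each chunk would then be embedded and stored, with its `index` available for linking adjacent chunks.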

Tech Stack

- Backend: Node.js, Express
- Frontend: EJS templates, Bootstrap, JavaScript
- Databases:
  - MongoDB: For storing user data, PDF metadata, and chat history
  - ChromaDB: Vector database for PDF content embeddings
  - Neo4j: Graph database for knowledge relationships
- AI:
  - LangChain.js: For building AI workflows
  - LangGraph: For creating agentic AI systems
  - OpenAI: For embeddings, text generation, and speech-to-text conversion
- Documentation:
  - Swagger UI: Interactive API documentation
- Deployment:
  - Docker & Docker Compose: For containerization and orchestration

Setup

Prerequisites

- Node.js and npm (for local development)
- Docker and Docker Compose (for containerized setup)
- OpenAI API key

Installation

Option 1: Standard Installation

Clone the repository:

```
git clone https://github.com/yourusername/pdf-talker.git
cd pdf-talker
```

Install dependencies:
```
npm install
```

Create a .env file in the project root with the following variables:

```
PORT=3000
MONGODB_URI=mongodb://localhost:27017/pdf-talker
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_password
OPENAI_API_KEY=your_openai_api_key
```
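
Checking these variables once at startup makes a missing key fail fast instead of surfacing later as an API error. The sketch below is one way to do that; `loadConfig` is a hypothetical helper, not the application's actual code, and the defaults simply mirror the example values above.

```javascript
// Hypothetical config loader: reads the variables listed above from an
// environment object and falls back to the documented example values.
function loadConfig(env = process.env) {
  if (!env.OPENAI_API_KEY) {
    throw new Error('OPENAI_API_KEY is required');
  }
  const port = Number.parseInt(env.PORT ?? '3000', 10);
  if (Number.isNaN(port)) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return {
    port,
    mongodbUri: env.MONGODB_URI ?? 'mongodb://localhost:27017/pdf-talker',
    neo4j: {
      uri: env.NEO4J_URI ?? 'bolt://localhost:7687',
      user: env.NEO4J_USER ?? 'neo4j',
      password: env.NEO4J_PASSWORD, // no default: must be set explicitly
    },
    openaiApiKey: env.OPENAI_API_KEY,
  };
}
```

Passing the environment in as an argument keeps the function easy to test, for example `loadConfig({ OPENAI_API_KEY: 'test-key' })`.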

Start the ChromaDB server:

```
docker run -p 8000:8000 chromadb/chroma:latest
```

Run the application:
```
npm start
```
For development with auto-restart:
```
npm run dev
```
Open your browser and navigate to http://localhost:3000

Option 2: Docker Installation

Clone the repository:

```
git clone https://github.com/yourusername/pdf-talker.git
cd pdf-talker
```

Create a .env file based on the .env.docker template:
```
cp .env.docker .env
```
Edit the .env file and add your OpenAI API key.
Start all services using Docker Compose:
```
docker-compose up -d
```
This will start the following containers:
- MongoDB database
- Neo4j graph database
- ChromaDB server
- PDF Talker Node.js application
Open your browser and navigate to:
- http://localhost:3000 - PDF Talker application
- http://localhost:3000/api-docs - API documentation
- http://localhost:7474 - Neo4j Browser (credentials: neo4j/password)
- http://localhost:8000 - ChromaDB server API
To stop all services:
```
docker-compose down
```
To stop and remove all data volumes:
```
docker-compose down -v
```

Docker Helper Script

For convenience, a helper script is provided to manage Docker operations:

Make the script executable (if not already):
```
chmod +x docker.sh
```

Use the script to manage containers:

```
./docker.sh start    # Start all containers
./docker.sh stop     # Stop all containers
./docker.sh restart  # Restart all containers
./docker.sh status   # Show container status
./docker.sh logs     # View container logs
./docker.sh clean    # Stop and remove volumes
./docker.sh help     # Show help message
```

The script will automatically create a .env file from the template if one doesn't exist.

Docker Architecture

The Docker setup features:

- Service Orchestration: Docker Compose manages all services
- Inter-container Communication: Services communicate over a dedicated network
- Health Checks: Each service monitors its own health
- Dependency Management: The app waits for all services to be ready before starting
- Data Persistence:
  - MongoDB and Neo4j data are stored in Docker volumes
  - ChromaDB vector embeddings are stored in a persistent volume
- Environment Isolation: Configuration via environment variables

Usage

- Upload a PDF from the homepage
- Wait for the PDF to be processed (status will change from "Processing" to "Ready")
- Click on "Chat" to start a conversation about the PDF
- Ask questions in natural language about the PDF content:
  - Type your question and click "Send"
  - Or click "Record" to use voice input, then stop when finished
- View and manage your chat history

Audio Input Support

PDF Talker supports the following audio formats for voice input:

- MP3 (.mp3)
- WAV (.wav)
- OGG/Vorbis (.ogg)
- WebM (.webm)
- FLAC (.flac)
- M4A (.m4a)
- MP4 audio (.mp4)
- MPEG audio (.mpeg, .mpga)
- OGG audio (.oga)

The maximum file size for audio uploads is 20MB.
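
The format and size rules above could be enforced before a recording is sent for transcription. The following is a minimal sketch, assuming that "20MB" means 20 × 1024 × 1024 bytes; `validateAudioUpload` and its return shape are hypothetical and are not taken from the application's code.

```javascript
const path = require('path');

// Extensions taken from the supported-format list above.
const SUPPORTED_AUDIO_EXTENSIONS = new Set([
  '.mp3', '.wav', '.ogg', '.webm', '.flac',
  '.m4a', '.mp4', '.mpeg', '.mpga', '.oga',
]);

// Assumption: the documented 20MB limit is interpreted as 20 MiB.
const MAX_AUDIO_BYTES = 20 * 1024 * 1024;

// Hypothetical validator: returns { ok: true } or { ok: false, reason }.
function validateAudioUpload(filename, sizeInBytes) {
  const ext = path.extname(filename).toLowerCase();
  if (!SUPPORTED_AUDIO_EXTENSIONS.has(ext)) {
    return { ok: false, reason: `Unsupported audio format: ${ext || '(none)'}` };
  }
  if (sizeInBytes > MAX_AUDIO_BYTES) {
    return { ok: false, reason: 'Audio file exceeds the 20MB limit' };
  }
  return { ok: true };
}
```

Checking only the file extension is a convenience, not a guarantee: a renamed file can still fail transcription, so server logs remain the place to look for decoding errors.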

Utility Scripts

The application includes several utility scripts to help with maintenance and debugging:

Test ChromaDB connection:
```
node utils/test-chroma-connection.js
```
Rebuild ChromaDB collections from existing PDFs:
```
node utils/rebuild-chroma-collections.js
```

Test the complete chat pipeline with a specific PDF:

```
node utils/test-chat-pipeline.js <pdfId> "Your test question"
```

API Documentation

The application includes interactive API documentation powered by Swagger UI.

- Start the application
- Navigate to http://localhost:3000/api-docs in your browser
- Explore and test the available API endpoints

You can also access the raw OpenAPI specification at http://localhost:3000/swagger.json

Project Structure

```
pdf-talker/
├── config/              # Configuration files
├── models/              # MongoDB models
├── public/              # Static assets
│   ├── css/             # Stylesheets
│   └── js/              # Client-side JavaScript
├── routes/              # Express route handlers
├── services/            # Business logic and services
├── uploads/             # Uploaded PDF files
│   └── audio/           # Temporary audio recordings
├── utils/               # Utility functions
│   ├── chromaStore.js              # ChromaDB client configuration
│   ├── test-chroma-connection.js   # Test ChromaDB connectivity
│   ├── test-chat-pipeline.js       # Test complete chat workflow
│   └── rebuild-chroma-collections.js # Recreate ChromaDB collections
├── views/               # EJS templates
│   ├── partials/        # Reusable template parts
│   ├── pdf/             # PDF-related templates
│   └── chat/            # Chat-related templates
├── .dockerignore        # Files to exclude from Docker build
├── .env                 # Environment variables
├── .env.docker          # Template for Docker environment variables
├── .gitignore           # Git ignore file
├── app.js               # Main application file
├── docker-compose.yml   # Docker Compose configuration
├── docker-start.sh      # Docker startup script
├── docker.sh            # Docker helper script
├── Dockerfile           # Docker image definition
├── package.json         # Project dependencies
├── swagger.js           # Swagger configuration
├── swagger-routes.js    # Swagger route annotations
├── wait-for-it.sh       # Service availability checker
└── README.md            # Project documentation
```

Troubleshooting

ChromaDB Issues

- The application requires a running ChromaDB server at http://localhost:8000
- If you encounter vector search errors, try rebuilding the collections with `node utils/rebuild-chroma-collections.js`
- To test ChromaDB connectivity, run `node utils/test-chroma-connection.js`

Audio Processing Issues

- Ensure your audio file is in one of the supported formats
- Check that the file size is under 20MB
- If transcription fails, try a different audio format or check server logs for detailed error messages

License

ISC

jogi47/pdf-talker
