pdfquery is a command‑line tool that turns your PDFs into searchable vector indices (FAISS) and lets GPT answer questions using only the relevant passages.
For full usage instructions, run `pdfquery --help` once built, or refer to the Usage section.
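The retrieve-then-answer idea in the description above can be illustrated with a small, dependency-free sketch. This is not pdfquery's implementation: a toy bag-of-words score stands in for real embeddings and a FAISS search, and the names `embed`, `cosine`, and `top_k` are hypothetical. It only shows how the passages most relevant to a question might be selected before being handed to the model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question, passages, k=2):
    """Return the k passages most similar to the question (a vector index would do this step at scale)."""
    q = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


# Hypothetical passages, as if extracted from a PDF.
passages = [
    "The system must support PDF upload.",
    "Invoices are archived for seven years.",
    "Main requirements include offline search and PDF export.",
]

# Only the best-matching passage would be passed to the model as context.
context = top_k("What are the main requirements?", passages, k=1)
print(context)
```

In a real pipeline the scoring function would be replaced by dense embeddings and a nearest-neighbour index; the selection logic stays the same in spirit.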
Follow the instructions below, or use the Makefile targets:
| Command | Description |
|---|---|
| `make venv` | Create virtual environment (`.venv`) |
| `make install` | Install pdfquery in editable mode + dev deps |
| `make test` | Run pytest |
| `make lint` | Run ruff linting |
| `make format` | Run black code formatter |
| `make docker-build` | Build local Docker image (`pdfquery:latest`) |
| `make clean` | Remove caches, build artifacts, and `.venv` |

    # Install locally (editable mode)
    pip install -e .
    # Build an index (stores files under ./vector/<NAME>/)
    pdfquery index --source path/to/file.pdf --name my-pdf
    # Ask a question
    pdfquery query --name my-pdf "What are the main requirements?"

Set your OpenAI key first:

    export OPENAI_API_KEY="sk-..."  # required

    # build the image
    docker build -t pdfquery:latest .
    # run
    docker run --rm pdfquery:latest
    # example: index a PDF inside the container
    docker run -e OPENAI_API_KEY="${OPENAI_API_KEY}" --rm -v "$PWD":/data pdfquery:latest \
      index --source /data/my.pdf --name mydoc
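
To index several PDFs in one go, the `index` command shown above can be wrapped in a short script. The sketch below is an illustration only: the helper names `index_command` and `index_all` are hypothetical, naming each index after the file stem is an assumed convention, and only the `--source` and `--name` flags from the examples above are used.

```python
import subprocess
from pathlib import Path


def index_command(pdf: Path) -> list[str]:
    """Build the `pdfquery index` invocation for one PDF (index named after the file stem)."""
    return ["pdfquery", "index", "--source", str(pdf), "--name", pdf.stem]


def index_all(folder: str, dry_run: bool = True) -> list[list[str]]:
    """Collect one command per *.pdf in `folder`; run them only when dry_run is False."""
    commands = [index_command(p) for p in sorted(Path(folder).glob("*.pdf"))]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first failing command.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `index_all("docs")` returns the planned commands without running anything; pass `dry_run=False` once `OPENAI_API_KEY` is exported and pdfquery is installed.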