This repo is for exploring text embeddings for climate data (DS205 W08 Lab).
You will need the following folders in place, including all the PDFs of the NDC documents:
# local models are downloaded when you run the W07 notebooks
local_models/
data/
ndc-docs-lazy/
ndc-docs-robust/
ndc-pdfs/
The PDFs must already have been preprocessed with the W07 Lecture / Lab notebooks.
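If you want a quick sanity check before starting, something like the sketch below could help. It is not part of the repo: `find_missing_dirs` and `REQUIRED_DIRS` are hypothetical names made up for this example. Because the listing above does not say how the folders nest, the sketch only looks for each folder by name anywhere under the directory you give it.

```python
from pathlib import Path

# Folder names copied from the listing above. How they nest is not assumed,
# so each one is searched for by name anywhere under the root.
REQUIRED_DIRS = [
    "local_models",
    "data",
    "ndc-docs-lazy",
    "ndc-docs-robust",
    "ndc-pdfs",
]


def find_missing_dirs(root, names=REQUIRED_DIRS):
    """Return the names in `names` that match no directory under `root`."""
    root = Path(root)
    missing = []
    for name in names:
        # rglob walks the whole tree, so nested folders are found too.
        if not any(p.is_dir() for p in root.rglob(name)):
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_dirs(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All expected folders are present.")
```

Run it from the repo root. It only checks that the folders exist, not that the PDFs inside them were preprocessed.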
Install the dependencies in a virtual environment (the original implementation used Python 3.12.3) with the following command:
pip install -r requirements.txt