LSE-DSI/DS205-W08-embeddings-clustering

DS205-W08-embeddings-clustering

This repo is for exploring text embeddings for climate data (DS205 W08 Lab).

You will need the following folder structure in place, including all the PDFs of the NDC documents:

# local models are downloaded when you run the W07 notebooks
local_models/
data/
  ndc-docs-lazy/
  ndc-docs-robust/
  ndc-pdfs/ 

These PDFs need to have been preprocessed with the W07 Lecture / Lab notebooks.
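
Before starting the lab, it can help to confirm that the expected folders are in place. The following is a minimal sketch, not part of the repo: the helper name `missing_dirs` and the constant `REQUIRED_DIRS` are hypothetical, and it assumes the script is run from the repository root. It only checks that the directories exist, not that the PDFs were preprocessed correctly.

```python
from pathlib import Path

# Directory names taken from the folder layout listed above.
REQUIRED_DIRS = [
    "local_models",
    "data/ndc-docs-lazy",
    "data/ndc-docs-robust",
    "data/ndc-pdfs",
]


def missing_dirs(root="."):
    """Return the required directories that do not exist under root."""
    root = Path(root)
    return [d for d in REQUIRED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected directories are present.")
```

If any directory is reported as missing, rerunning the W07 notebooks is the likely fix.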

Install the dependencies in a virtual environment (the original implementation used Python 3.12.3) with the following command:

pip install -r requirements.txt
