Author: Roger Camara
Convector is a tiny, practical toolkit that turns .csv datasets into newline-delimited JSON (output.jsonl) with 384-dim sentence embeddings and the original row as payload. It pairs with a simple importer that loads the file into a local Qdrant vector DB (Docker).
- convector.py – reads your CSV, auto-detects columns, builds one text per row, generates 384-dim embeddings, and writes output.jsonl, one record per line:
  {"id":"<uuid>", "text":"<row-as-text>", "vector":[...384 floats...], "payload":{...original row...}}
- qdrantimport.py – asks for the output.jsonl path, lists Qdrant collections, and imports in batches with a progress bar.
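As a rough illustration of the record format above, the following minimal sketch builds one output.jsonl line from a CSV row. It is not the code in convector.py: the helper names, the "column: value" text layout, the UUID namespace, and the stand-in embedder are all assumptions made for the example.

```python
import json
import uuid

# Hypothetical namespace: the one convector.py actually uses is not documented here.
NAMESPACE = uuid.NAMESPACE_URL


def row_to_text(row):
    """Join the non-empty columns of a row into one 'column: value' string."""
    return " | ".join(f"{key}: {value}" for key, value in row.items() if value not in (None, ""))


def row_to_record(row, embed):
    """Build one JSONL record; embed is any callable mapping text to a sequence of floats."""
    text = row_to_text(row)
    return {
        "id": str(uuid.uuid5(NAMESPACE, text)),  # same text always yields the same id
        "text": text,
        "vector": [float(x) for x in embed(text)],
        "payload": dict(row),
    }


def fake_embed(text, dim=384):
    """Stand-in embedder for illustration only; it is not a real model."""
    return [0.0] * dim


record = row_to_record({"name": "Ada", "city": "London"}, fake_embed)
line = json.dumps(record, ensure_ascii=False)  # one line of output.jsonl
```

With the sentence-transformers package installed, fake_embed could be swapped for something like `SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2").encode(text).tolist()`.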
We use a free embedding model (paraphrase-multilingual-MiniLM-L12-v2) that outputs 384 dimensions. If you switch to another provider (e.g., OpenAI), you can use larger vectors; just make sure the vector size of your Qdrant collection matches the model's output dimension.
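A quick way to catch a size mismatch before importing is to check the vector length of every record in the file. This is a minimal sketch; check_vector_size is a hypothetical helper and not part of the toolkit.

```python
import json


def check_vector_size(jsonl_path, expected_size=384):
    """Return (record_count, bad_lines), where bad_lines holds the 1-based
    line numbers whose vector length differs from expected_size."""
    count = 0
    bad_lines = []
    with open(jsonl_path, encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # ignore blank lines
            count += 1
            record = json.loads(line)
            if len(record["vector"]) != expected_size:
                bad_lines.append(lineno)
    return count, bad_lines


# Example call (assumes output.jsonl exists in the current folder):
# count, bad_lines = check_vector_size("output.jsonl", expected_size=384)
```

An empty bad_lines list means every vector has the size the collection was created with.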
- Python 3.9+
- Install deps: pip install -r requirements.txt
- Put your .csv next to convector.py.
- Run: python convector.py
- Paste/drag your CSV path and confirm the detected columns.
- You'll get output.jsonl in the current folder.
docker run -p 6333:6333 --name qdrant --rm qdrant/qdrant
curl -X PUT "http://localhost:6333/collections/my_collection" -H "Content-Type: application/json" -d '{"vectors": {"size": 384, "distance": "Cosine"}}'
python qdrantimport.py
- Enter the output.jsonl path.
- Press Enter to keep the default Qdrant URL (http://localhost:6333).
- Select my_collection.
- Watch the progress bar until ✅ Done.
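For readers curious what a batched import involves, here is a minimal sketch that uses only the standard library and Qdrant's REST endpoint for upserting points. It is not the code in qdrantimport.py: the function names, the batch size of 256, and the way record fields map to point fields are assumptions.

```python
import json
import urllib.request
from itertools import islice


def read_points(jsonl_path):
    """Yield one Qdrant point dict per non-empty line of the JSONL file."""
    with open(jsonl_path, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                rec = json.loads(line)
                # Assumed mapping: the row dict becomes the point payload as-is.
                yield {"id": rec["id"], "vector": rec["vector"], "payload": rec["payload"]}


def batched(iterable, size):
    """Yield lists of at most `size` items until the iterable is exhausted."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def upsert_batch(base_url, collection, points):
    """PUT one batch of points to Qdrant and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{base_url}/collections/{collection}/points?wait=true",
        data=json.dumps({"points": points}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def import_file(jsonl_path, base_url="http://localhost:6333",
                collection="my_collection", batch_size=256):
    """Import the whole file batch by batch, printing a running total."""
    total = 0
    for chunk in batched(read_points(jsonl_path), batch_size):
        upsert_batch(base_url, collection, chunk)
        total += len(chunk)
        print(f"imported {total} points")
    return total
```

Because the IDs are deterministic, re-running an import like this overwrites existing points instead of duplicating them.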
curl -X POST "http://localhost:6333/collections/my_collection/points/search" -H "Content-Type: application/json" -d '{
"vector": [0.1, 0.2, ... 384 floats ...],
"limit": 3
}'
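The search request above needs a complete 384-float query vector, which is awkward to type by hand. The sketch below builds the same request in Python from a vector you already have. build_search_request and search are hypothetical helper names, and with_payload is added so the original row comes back with each hit.

```python
import json
import urllib.request


def build_search_request(base_url, collection, vector, limit=3):
    """Build the POST request for Qdrant's points/search endpoint."""
    body = {"vector": list(vector), "limit": limit, "with_payload": True}
    return urllib.request.Request(
        f"{base_url}/collections/{collection}/points/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(base_url, collection, vector, limit=3):
    """Send the request and return the list of hits (requires a running Qdrant)."""
    req = build_search_request(base_url, collection, vector, limit)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["result"]


# Placeholder vector for illustration; a real query must be embedded with the
# same model that produced the vectors stored in the collection.
request = build_search_request("http://localhost:6333", "my_collection", [0.0] * 384)
```

To get a meaningful query vector, embed the query text with the same model used for the dataset, for example `SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2").encode("your query").tolist()` if sentence-transformers is installed.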
curl -X POST "http://localhost:6333/collections/my_collection/points/scroll" -H "Content-Type: application/json" -d '{
"filter": {
"must": [
{"key": "payload.column_name", "match": {"value": "some_value"}}
]
},
"limit": 3
}'
- Output is always output.jsonl in the current folder.
- IDs are deterministic (UUID5) for reproducible imports.
- Free model = 384 dimensions. Change model & collection size together.