A cloud function that generates and stores embeddings for content received from Supabase webhooks.
This project deploys a Google Cloud Function that:
- Receives webhook requests from Supabase containing post data
- Extracts the post_id, title, and content
- Generates embeddings for both the title and content using a sentence transformer model
- Stores the post_id with its corresponding embeddings in a Supabase table
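The steps above can be sketched as a single function. This is a minimal illustration, not the project's actual code: `process_post`, `encode`, and `store` are hypothetical names, and the embedding model and the Supabase write are passed in as plain callables so the flow can be read and tested without either dependency.

```python
from typing import Any, Callable, Dict, Sequence


def process_post(
    payload: Dict[str, Any],
    encode: Callable[[str], Sequence[float]],
    store: Callable[[Dict[str, Any]], None],
) -> Dict[str, Any]:
    """Extract a post from the webhook payload, embed it, and store the row."""
    record = payload["record"]

    # Embed title and content separately; convert to plain floats so the
    # row is JSON-serializable regardless of what the encoder returns.
    row = {
        "post_id": record["id"],
        "title_embedding": [float(x) for x in encode(record["title_raw"])],
        "content_embedding": [float(x) for x in encode(record["content_raw"])],
    }

    # Hypothetical storage hook; in practice this would write to Supabase.
    store(row)
    return row
```

In a deployment, `encode` would typically wrap the `encode` method of a loaded `SentenceTransformer` model, and `store` would wrap an insert or upsert on the embedding table.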
├── main.py                         # Entry point for the cloud function
├── src/                            # Source code
│   ├── functions/
│   │   └── embedding.py            # Core embedding functionality
│   └── services/
│       └── supabase/
│           └── table_functions.py  # Supabase operations
├── embed_current_post.py           # Utility to embed existing posts
├── deploy.sh                       # Deployment script
└── requirements.txt                # Dependencies
- Python 3.9+
- Google Cloud CLI
- A Supabase account with API credentials
- A sentence-transformers model (its path is set in EMBEDDING_MODEL_PATH)
Create a .env file with the following variables:
SUPABASE_API_URL=your_supabase_url
SUPABASE_API_SECRET_KEY=your_supabase_key
EMBEDDING_MODEL_PATH=your_model_path
EMBEDDING_TABLE_NAME=your_embedding_table_name
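One way to read these variables at startup is sketched below. The `Settings` class and `load_settings` function are hypothetical names, not part of the project. The sketch assumes the values are already in the process environment; a .env file is not loaded automatically, so a loader such as python-dotenv would be needed locally, and whether the project uses one is not stated here.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Variable names taken from the .env listing above.
REQUIRED_VARS = (
    "SUPABASE_API_URL",
    "SUPABASE_API_SECRET_KEY",
    "EMBEDDING_MODEL_PATH",
    "EMBEDDING_TABLE_NAME",
)


@dataclass(frozen=True)
class Settings:
    supabase_url: str
    supabase_key: str
    model_path: str
    table_name: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the required variables, failing early if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        supabase_url=env["SUPABASE_API_URL"],
        supabase_key=env["SUPABASE_API_SECRET_KEY"],
        model_path=env["EMBEDDING_MODEL_PATH"],
        table_name=env["EMBEDDING_TABLE_NAME"],
    )
```

Failing at startup with the full list of missing names tends to be easier to debug than a later error raised from inside the Supabase client or the model loader.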
- Clone the repository
- Install dependencies:
pip install -r requirements.txt
Run the deployment script:
chmod +x deploy.sh
./deploy.sh
This will deploy the cloud function to Google Cloud in the asia-east1 region.
To generate embeddings for all existing posts:
python embed_current_post.py
Configure a Supabase webhook to trigger on post creation/update events. The webhook should point to your deployed cloud function URL and include the following payload:
{
"record": {
"id": "post_id",
"title_raw": "post title",
"content_raw": "post content"
}
}
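A payload of this shape could be parsed and validated roughly as follows. The function names are hypothetical, and treating a missing or null title or content as an empty string is an assumption of this sketch, not documented behavior of the function.

```python
import json
from typing import Any, Dict, Tuple


def extract_post_fields(payload: Dict[str, Any]) -> Tuple[str, str, str]:
    """Return (post_id, title, content) from a webhook payload dict."""
    record = payload.get("record")
    if not isinstance(record, dict):
        raise ValueError("payload has no 'record' object")

    post_id = record.get("id")
    if post_id is None:
        raise ValueError("record has no 'id'")

    # Assumption: missing or null text fields are embedded as empty strings.
    title = record.get("title_raw") or ""
    content = record.get("content_raw") or ""
    return str(post_id), title, content


def parse_webhook_body(raw_body: str) -> Tuple[str, str, str]:
    """Decode a raw JSON request body and extract the post fields."""
    return extract_post_fields(json.loads(raw_body))
```

Rejecting a payload without `record` or `id` before calling the model avoids spending inference time on a request that cannot be stored.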
The embedding table should have the following structure:
- post_id: Foreign key to the posts table
- title_embedding: Vector array for the title embedding
- content_embedding: Vector array for the content embedding
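A row matching this structure might be assembled as in the sketch below. `build_embedding_row` and its `expected_dim` parameter are hypothetical; the vector length depends on the chosen model and on how the columns are declared, neither of which is specified here, so the check is only an illustration of guarding against a model/table mismatch.

```python
from typing import Any, Dict, Sequence


def build_embedding_row(
    post_id: str,
    title_vec: Sequence[float],
    content_vec: Sequence[float],
    expected_dim: int,
) -> Dict[str, Any]:
    """Build a row for the embedding table, checking vector lengths first."""
    for name, vec in (
        ("title_embedding", title_vec),
        ("content_embedding", content_vec),
    ):
        if len(vec) != expected_dim:
            raise ValueError(
                f"{name} has {len(vec)} dimensions, expected {expected_dim}"
            )

    # Plain Python floats keep the row JSON-serializable.
    return {
        "post_id": post_id,
        "title_embedding": [float(x) for x in title_vec],
        "content_embedding": [float(x) for x in content_vec],
    }
```

If the columns have a fixed dimension, checking the length before the write turns a database-side rejection into a clearer error on the function side.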
This project is used in production for OfferLand.