Skip to content

HaohanTsao/Embedding_Model_API

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Embedding Model API

A cloud function that generates and stores embeddings for content received from Supabase webhooks.

Overview

This project deploys a Google Cloud Function that:

  1. Receives webhook requests from Supabase containing post data
  2. Extracts the post_id, title, and content
  3. Generates embeddings for both the title and content using a sentence transformer model
  4. Stores the post_id with its corresponding embeddings in a Supabase table

Architecture

├── main.py                # Entry point for the cloud function
├── src/                   # Source code
│   ├── functions/
│   │   └── embedding.py   # Core embedding functionality
│   └── services/
│       └── supabase/
│           └── table_functions.py  # Supabase operations
├── embed_current_post.py  # Utility to embed existing posts
├── deploy.sh              # Deployment script
└── requirements.txt       # Dependencies

Setup

Prerequisites

  • Python 3.9+
  • Google Cloud CLI
  • A Supabase account with API credentials
  • Sentence-transformers model

Environment Variables

Create a .env file with the following variables:

SUPABASE_API_URL=your_supabase_url
SUPABASE_API_SECRET_KEY=your_supabase_key
EMBEDDING_MODEL_PATH=your_model_path
EMBEDDING_TABLE_NAME=your_embedding_table_name

Installation

  1. Clone the repository
  2. Install dependencies:
    pip install -r requirements.txt
    

Usage

Deploy to Google Cloud

Run the deployment script:

chmod +x deploy.sh
./deploy.sh

This will deploy the cloud function to Google Cloud in the asia-east1 region.

Embedding Existing Posts

To generate embeddings for all existing posts:

python embed_current_post.py

Webhook Integration

Configure a Supabase webhook to trigger on post creation/update events. The webhook should point to your deployed cloud function URL and include the following payload:

{
  "record": {
    "id": "post_id",
    "title_raw": "post title",
    "content_raw": "post content"
  }
}

Database Schema

The embedding table should have the following structure:

  • post_id: Foreign key to the posts table
  • title_embedding: Vector array for the title embedding
  • content_embedding: Vector array for the content embedding

Note

This project is in production for OfferLand

About

API for text embedding

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published