Skip to content

SPANDigital/cel2sql

Repository files navigation

cel2sql

cel2sql converts CEL (Common Expression Language) Github Reoi to SQL condition. It is specifically targeting PostgreSQL standard SQL.

Latest Release - v2.8.0

🚀 Latest New Features:

  • 🔥 Regex Pattern Matching Support: Full support for CEL matches() function with RE2 to POSIX regex conversion
  • Enhanced JSON Field Existence: Improved has() macro support for JSON/JSONB fields with proper path checking
  • PostgreSQL Pattern Compatibility: Automatic conversion of RE2 regex patterns to PostgreSQL-compatible POSIX format
  • Advanced Text Search: Support for complex regex operations with proper escaping and PostgreSQL syntax

Recent Features (v2.6.0):

  • 🔥 JSON/JSONB Comprehensions Support: Full support for CEL comprehensions on JSON/JSONB arrays
  • Advanced JSON Array Operations: Support for exists(), all(), exists_one() on JSON/JSONB arrays
  • Numeric JSON Field Casting: Automatic casting of numeric JSON fields (e.g., (score)::numeric)
  • Nested JSON Array Access: Support for comprehensions on nested JSON arrays (e.g., settings.permissions)
  • JSON Type Safety: Null and type checks for JSON/JSONB comprehensions using jsonb_typeof()

Previous Features (v2.5.0):

  • 🔥 CEL Comprehensions Support: Full support for all(), exists(), exists_one(), filter(), and map() comprehensions
  • PostgreSQL UNNEST Integration: Comprehensions are converted to efficient PostgreSQL SQL using UNNEST() and array functions
  • Nested Comprehensions: Support for complex nested comprehensions like employees.exists(e, e.skills.exists(s, s == 'Go'))
  • Array Field Access: Direct comprehensions on array fields in PostgreSQL schemas

Previous Features (v2.4.0):

  • Comprehensive JSON/JSONB Support: Native PostgreSQL JSON path operations
  • Dynamic Schema Loading: Load table schemas directly from PostgreSQL databases
  • Enhanced Type System: Improved PostgreSQL type mappings and array support
  • Testcontainer Integration: Full test coverage with real PostgreSQL databases

Key Improvements:

  • PostgreSQL-optimized SQL generation (single quotes, proper functions)
  • JSON field access: user.preferences.theme → user.preferences->>'theme'
  • Array operations: size(array) → ARRAY_LENGTH(array, 1)
  • String operations: contains() → POSITION(...) > 0
  • CEL comprehensions: list.all(x, x > 0) → NOT EXISTS (SELECT 1 FROM UNNEST(list) AS x WHERE NOT (x > 0))
  • All tests pass with comprehensive integration coverage

Usage

import (
    "context"
    "fmt"
    
    "github.com/spandigital/cel2sql/v2"
    "github.com/spandigital/cel2sql/v2/pg"
    "github.com/spandigital/cel2sql/v2/sqltypes"
    "github.com/google/cel-go/cel"
)

// PostgreSQL table schema definition
employeeSchema := pg.Schema{
    {Name: "name", Type: "text", Repeated: false},
    {Name: "hired_at", Type: "timestamp with time zone", Repeated: false},
    {Name: "age", Type: "integer", Repeated: false},
    {Name: "active", Type: "boolean", Repeated: false},
}

// Prepare CEL environment
env, _ := cel.NewEnv(
    cel.CustomTypeProvider(pg.NewTypeProvider(map[string]pg.Schema{
        "Employee": employeeSchema,
    })),
    cel.Variable("employee", cel.ObjectType("Employee")),
)

// Convert CEL to SQL
ast, _ := env.Compile(`employee.name == "John Doe" && employee.hired_at >= current_timestamp() - duration("24h")`)
sqlCondition, _ := cel2sql.Convert(ast)

fmt.Println(sqlCondition) // employee.name = 'John Doe' AND employee.hired_at >= CURRENT_TIMESTAMP - INTERVAL '1 DAY'

Dynamic Schema Loading

cel2sql supports dynamically loading table schemas from a PostgreSQL database:

import (
    "context"
    "fmt"
    
    "github.com/spandigital/cel2sql/v2"
    "github.com/spandigital/cel2sql/v2/pg"
    "github.com/spandigital/cel2sql/v2/sqltypes"
    "github.com/google/cel-go/cel"
)

func main() {
    ctx := context.Background()
    
    // Create a type provider with database connection
    provider, err := pg.NewTypeProviderWithConnection(ctx, "postgres://user:pass@localhost/mydb?sslmode=disable")
    if err != nil {
        panic(err)
    }
    defer provider.Close()
    
    // Load table schema dynamically from database
    err = provider.LoadTableSchema(ctx, "employees")
    if err != nil {
        panic(err)
    }
    
    // Use the loaded schema in CEL environment
    env, err := cel.NewEnv(
        cel.CustomTypeProvider(provider),
        cel.Variable("employee", cel.ObjectType("employees")),
    )
    if err != nil {
        panic(err)
    }
    
    // Convert CEL to SQL using the dynamically loaded schema
    ast, issues := env.Compile(`employee.name == "John Doe" && employee.age > 30`)
    if issues != nil && issues.Err() != nil {
        panic(issues.Err())
    }
    
    sqlCondition, err := cel2sql.Convert(ast)
    if err != nil {
        panic(err)
    }
    
    fmt.Println(sqlCondition)
    // Output: employee.name = 'John Doe' AND employee.age > 30
}

This approach is particularly useful when:

  • Database schemas change frequently
  • You want to avoid manually defining schemas
  • Working with multiple tables with different structures
  • Building dynamic query builders

Type Conversion

CEL Type PostgreSQL Data Type
int bigint
uint Unsupported but treated as bigint
double double precision
bool boolean
string text
bytes bytea
list ARRAY
map JSONB (for complex objects)
null_type NULL
timestamp timestamp with time zone
duration INTERVAL

JSON/JSONB Support

cel2sql provides comprehensive support for PostgreSQL JSON and JSONB columns:

// Example with JSON/JSONB fields
userSchema := pg.Schema{
    {Name: "name", Type: "text", Repeated: false},
    {Name: "preferences", Type: "jsonb", Repeated: false},
    {Name: "profile", Type: "json", Repeated: false},
}

env, _ := cel.NewEnv(
    cel.CustomTypeProvider(pg.NewTypeProvider(map[string]pg.Schema{
        "User": userSchema,
    })),
    cel.Variable("user", cel.ObjectType("User")),
)

// CEL expressions automatically convert to PostgreSQL JSON path operations
ast, _ := env.Compile(`user.preferences.theme == "dark"`)
sqlCondition, _ := cel2sql.Convert(ast)
fmt.Println(sqlCondition) // user.preferences->>'theme' = 'dark'

// Nested JSON access
ast, _ = env.Compile(`user.profile.settings.notifications == "enabled"`)
sqlCondition, _ = cel2sql.Convert(ast)
fmt.Println(sqlCondition) // user.profile->>'settings'->>'notifications' = 'enabled'

Supported JSON Operations:

  • Field access: user.preferences.theme → user.preferences->>'theme'
  • Nested access: user.profile.settings.key → user.profile->>'settings'->>'key'
  • Works with both json and jsonb column types
  • Automatically detects JSON columns and applies proper PostgreSQL syntax

Regex Pattern Matching

cel2sql provides comprehensive support for CEL matches() function with automatic RE2 to POSIX regex conversion:

// Define schema with text fields for pattern matching
schema := pg.Schema{
    {Name: "email", Type: "text", Repeated: false},
    {Name: "phone", Type: "text", Repeated: false},
    {Name: "description", Type: "text", Repeated: false},
}

env, _ := cel.NewEnv(
    cel.CustomTypeProvider(pg.NewTypeProvider(map[string]pg.Schema{
        "Contact": schema,
    })),
    cel.Variable("contact", cel.ObjectType("Contact")),
)

// Email validation pattern
ast, _ := env.Compile(`contact.email.matches(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")`)
sqlCondition, _ := cel2sql.Convert(ast)
fmt.Println(sqlCondition) 
// Output: contact.email ~ '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'

// Phone number pattern matching
ast, _ = env.Compile(`contact.phone.matches(r"^\+?1?[-.\s]?\(?([0-9]{3})\)?[-.\s]?([0-9]{3})[-.\s]?([0-9]{4})$")`)
sqlCondition, _ = cel2sql.Convert(ast)
fmt.Println(sqlCondition)
// Output: contact.phone ~ '^\+?1?[-.\s]?\(?([0-9]{3})\)?[-.\s]?([0-9]{3})[-.\s]?([0-9]{4})$'

// Case-insensitive pattern matching
ast, _ = env.Compile(`contact.description.matches(r"(?i)urgent|priority")`)
sqlCondition, _ = cel2sql.Convert(ast)
fmt.Println(sqlCondition)
// Output: contact.description ~* 'urgent|priority'

Supported Regex Features:

  • RE2 to POSIX Conversion: Automatic conversion of CEL RE2 patterns to PostgreSQL-compatible POSIX regex
  • Case-insensitive Matching: (?i) flag converts to PostgreSQL ~* operator
  • Standard Patterns: Email validation, phone numbers, URLs, and custom text patterns
  • Escape Handling: Proper escaping of special characters for PostgreSQL
  • Pattern Optimization: Efficient regex compilation and execution in PostgreSQL

Regex Operators:

  • ~ - Case-sensitive POSIX regex match
  • ~* - Case-insensitive POSIX regex match (when (?i) flag detected)
  • !~ - Case-sensitive POSIX regex non-match
  • !~* - Case-insensitive POSIX regex non-match

Enhanced JSON Field Existence

The has() macro provides enhanced support for checking JSON/JSONB field existence:

// Check if JSON field exists
ast, _ := env.Compile(`has(user.preferences.theme)`)
sqlCondition, _ := cel2sql.Convert(ast)
fmt.Println(sqlCondition) 
// Output: user.preferences ? 'theme'

// Check nested JSON field existence
ast, _ = env.Compile(`has(user.profile.settings.notifications)`)
sqlCondition, _ := cel2sql.Convert(ast)
fmt.Println(sqlCondition)
// Output: user.profile->'settings' ? 'notifications'

// Combined existence and value check
ast, _ = env.Compile(`has(user.preferences.theme) && user.preferences.theme == "dark"`)
sqlCondition, _ = cel2sql.Convert(ast)
fmt.Println(sqlCondition)
// Output: user.preferences ? 'theme' AND user.preferences->>'theme' = 'dark'

JSON Field Existence Features:

  • Path-aware Checking: Properly handles nested JSON path existence
  • JSONB Optimization: Uses PostgreSQL's efficient ? operator for JSONB fields
  • Null Safety: Prevents errors when accessing non-existent JSON fields
  • Combined Operations: Works seamlessly with value comparisons and other JSON operations

Supported CEL Operators/Functions

Symbol Type SQL
! (bool) -> bool NOT bool
- (unary) (int) -> int -int
(double) -> double -double
!= (A, A) -> bool A != A
(bool, bool) -> bool bool IS NOT bool
(A, null) -> bool A IS NOT NULL
% (int, int) -> int MOD(int, int)
&& (bool, bool) -> bool bool AND bool
* (int, int) -> int int * int
(double, double) -> double double * double
+ (int, int) -> int int + int
(double, double) -> double double + double
(string, string) -> string string || string
(bytes, bytes) -> bytes bytes || bytes
(list(A), list(A)) -> list(A) list(A) || list(A)
(google.protobuf.Timestamp, google.protobuf.Duration) -> google.protobuf.Timestamp TIMESTAMP_ADD(timestamp, INTERVAL duration date_part)
(google.protobuf.Duration, google.protobuf.Timestamp) -> google.protobuf.Timestamp TIMESTAMP_ADD(timestamp, INTERVAL duration date_part)
- (binary) (int, int) -> int int - int
(double, double) -> double double - double
(google.protobuf.Timestamp, google.protobuf.Duration) -> google.protobuf.Timestamp TIMESTAMP_SUB(timestamp, INTERVAL duration date_part)
/ (int, int) -> int int / int
(double, double) -> double double / double
<= (bool, bool) -> bool bool <= bool
(int, int) -> bool int <= int
(double, double) -> bool double <= double
(string, string) -> bool string <= string
(bytes, bytes) -> bool bytes <= bytes
(google.protobuf.Timestamp, google.protobuf.Timestamp) -> bool timestamp <= timestamp
< (bool, bool) -> bool bool < bool
(int, int) -> bool int < int
(double, double) -> bool double < double
(string, string) -> bool string < string
(bytes, bytes) -> bool bytes < bytes
(google.protobuf.Timestamp, google.protobuf.Timestamp) -> bool timestamp < timestamp
== (A, A) -> bool A = A
(bool, bool) -> bool A IS A
(A, null) -> bool A IS NULL
>= (bool, bool) -> bool bool >= bool
(int, int) -> bool int >= int
(double, double) -> bool double >= double
(string, string) -> bool string >= string
(bytes, bytes) -> bool bytes >= bytes
(google.protobuf.Timestamp, google.protobuf.Timestamp) -> bool timestamp >= timestamp
> (bool, bool) -> bool bool > bool
(int, int) -> bool int > int
(double, double) -> bool double > double
(string, string) -> bool string > string
(bytes, bytes) -> bool bytes > bytes
(google.protobuf.Timestamp, google.protobuf.Timestamp) -> bool timestamp > timestamp
? : (bool, A, A) -> A IF(bool, A, A)
[ ] (list(A), int) -> A list[OFFSET(int)]
(map(A, B), A) -> B map.`A`
in (A, list(A)) -> bool A IN UNNEST(list)
|| (bool, bool) -> bool bool OR bool
bool (int) -> bool CAST(int AS BOOL)
(string) -> bool CAST(string AS BOOL)
bytes (string) -> bytes CAST(stringAS BYTES)
contains string.(string) -> bool POSITION(string IN string) > 0
double (int) -> double CAST(int AS FLOAT64)
(string) -> double CAST(string AS FLOAT64)
duration (string) -> google.protobuf.Duration INTERVAL duration date_part
endsWith string.(string) -> bool ENDS_WITH(string, string)
getDate google.protobuf.Timestamp.() -> int EXTRACT(DAY FROM timestamp)
google.protobuf.Timestamp.(string) -> int EXTRACT(DAY FROM timestamp AT string)
getDayOfMonth google.protobuf.Timestamp.() -> int EXTRACT(DAY FROM timestamp) - 1
google.protobuf.Timestamp.(string) -> int EXTRACT(DAY FROM timestamp AT string) - 1
getDayOfWeek google.protobuf.Timestamp.() -> int EXTRACT(DAYOFWEEK FROM timestamp) - 1
google.protobuf.Timestamp.(string) -> int EXTRACT(DAYOFWEEK FROM timestamp AT string) - 1
getDayOfYear google.protobuf.Timestamp.() -> int EXTRACT(DAYOFYEAR FROM timestamp) - 1
google.protobuf.Timestamp.(string) -> int EXTRACT(DAYOFYEAR FROM timestamp AT string) - 1
getFullYear google.protobuf.Timestamp.() -> int EXTRACT(YEAR FROM timestamp)
google.protobuf.Timestamp.(string) -> int EXTRACT(YEAR FROM timestamp AT string)
getHours google.protobuf.Timestamp.() -> int EXTRACT(HOUR FROM timestamp)
google.protobuf.Timestamp.(string) -> int EXTRACT(HOUR FROM timestamp AT string)
getMilliseconds google.protobuf.Timestamp.() -> int EXTRACT(MILLISECOND FROM timestamp)
google.protobuf.Timestamp.(string) -> int EXTRACT(MILLISECOND FROM timestamp AT string)
getMinutes google.protobuf.Timestamp.() -> int EXTRACT(MINUTE FROM timestamp)
google.protobuf.Timestamp.(string) -> int EXTRACT(MINUTE FROM timestamp AT string)
getMonth google.protobuf.Timestamp.() -> int EXTRACT(MONTH FROM timestamp) - 1
google.protobuf.Timestamp.(string) -> int EXTRACT(MONTH FROM timestamp AT string) - 1
getSeconds google.protobuf.Timestamp.() -> int EXTRACT(SECOND FROM timestamp)
google.protobuf.Timestamp.(string) -> int EXTRACT(SECOND FROM timestamp AT string)
has (map) -> bool JSON/JSONB field ? 'key' OR nested_path -> 'parent' ? 'key'
int (bool) -> int CAST(bool AS INT64)
(double) -> int CAST(double AS INT64)
(string) -> int CAST(string AS INT64)
(google.protobuf.Timestamp) -> int UNIX_SECONDS(timestamp)
matches string.(string) -> bool string ~ regex_pattern OR string ~* regex_pattern (case-insensitive)
size (string) -> int CHAR_LENGTH(string)
(bytes) -> int BYTE_LENGTH(bytes)
(list(A)) -> int ARRAY_LENGTH(list, 1)
startsWith string.(string) -> bool STARTS_WITHstring, string)
string (bool) -> string CAST(bool AS STRING)
(int) -> string CAST(int AS STRING)
(double) -> string CAST(double AS STRING)
(bytes) -> string CAST(bytes AS STRING)
(timestamp) -> string CAST(timestamp AS STRING)
timestamp (string) -> google.protobuf.Timestamp TIMESTAMP(string)

Standard SQL Types/Functions

cel2sql supports time related types bellow.

  • DATE
  • TIME
  • DATETIME

cel2sql contains time related functions bellow.

  • current_date()
  • current_time()
  • current_datetime()
  • current_timestamp()
  • interval(N, date_part)

CEL Comprehensions

cel2sql now supports CEL comprehensions for working with lists and arrays. Comprehensions are converted to PostgreSQL-compatible SQL using UNNEST() and various array functions.

Supported Comprehension Types

CEL Expression Description Generated SQL Pattern
list.all(x, condition) All elements satisfy condition NOT EXISTS (SELECT 1 FROM UNNEST(list) AS x WHERE NOT (condition))
list.exists(x, condition) At least one element satisfies condition EXISTS (SELECT 1 FROM UNNEST(list) AS x WHERE condition)
list.exists_one(x, condition) Exactly one element satisfies condition (SELECT COUNT(*) FROM UNNEST(list) AS x WHERE condition) = 1
list.filter(x, condition) Return elements that satisfy condition ARRAY(SELECT x FROM UNNEST(list) AS x WHERE condition)
list.map(x, transform) Transform all elements ARRAY(SELECT transform FROM UNNEST(list) AS x)

Examples

Simple Array Comprehensions

// Check if all numbers are positive
cel: [1, 2, 3, 4, 5].all(x, x > 0)
sql: NOT EXISTS (SELECT 1 FROM UNNEST(ARRAY[1, 2, 3, 4, 5]) AS x WHERE NOT (x > 0))

// Check if any number is greater than 3
cel: [1, 2, 3, 4, 5].exists(x, x > 3)
sql: EXISTS (SELECT 1 FROM UNNEST(ARRAY[1, 2, 3, 4, 5]) AS x WHERE x > 3)

// Filter even numbers
cel: [1, 2, 3, 4, 5].filter(x, x % 2 == 0)
sql: ARRAY(SELECT x FROM UNNEST(ARRAY[1, 2, 3, 4, 5]) AS x WHERE MOD(x, 2) = 0)

// Double all numbers
cel: [1, 2, 3, 4, 5].map(x, x * 2)
sql: ARRAY(SELECT x * 2 FROM UNNEST(ARRAY[1, 2, 3, 4, 5]) AS x)

PostgreSQL Schema-based Comprehensions

// Define schema with array fields
schema := pg.Schema{
    {Name: "name", Type: "text", Repeated: false},
    {Name: "age", Type: "bigint", Repeated: false},
    {Name: "skills", Type: "text", Repeated: true}, // Array field
}

// CEL expressions with comprehensions
cel: employees.all(e, e.age >= 18)
sql: NOT EXISTS (SELECT 1 FROM UNNEST(employees) AS e WHERE NOT (e.age >= 18))

cel: employees.filter(e, e.age > 30).map(e, e.name)
sql: ARRAY(SELECT e.name FROM UNNEST(ARRAY(SELECT e FROM UNNEST(employees) AS e WHERE e.age > 30)) AS e)

cel: emp.skills.exists(s, s == 'Go')
sql: EXISTS (SELECT 1 FROM UNNEST(emp.skills) AS s WHERE s = 'Go')

Nested Comprehensions

// Check if any employee has Go skills (nested comprehension)
cel: employees.exists(e, e.skills.exists(s, s == 'Go'))
sql: EXISTS (SELECT 1 FROM UNNEST(employees) AS e WHERE EXISTS (SELECT 1 FROM UNNEST(e.skills) AS s WHERE s = 'Go'))

// Filter employees with all high scores
cel: employees.filter(e, e.scores.all(s, s >= 80))
sql: ARRAY(SELECT e FROM UNNEST(employees) AS e WHERE NOT EXISTS (SELECT 1 FROM UNNEST(e.scores) AS s WHERE NOT (s >= 80)))

Working with Composite Types

// Define nested schema
addressSchema := pg.Schema{
    {Name: "city", Type: "text", Repeated: false},
    {Name: "country", Type: "text", Repeated: false},
}

employeeSchema := pg.Schema{
    {Name: "name", Type: "text", Repeated: false},
    {Name: "address", Type: "composite", Schema: addressSchema},
}

// CEL with nested field access
cel: employees.filter(e, e.address.city == 'New York')
sql: ARRAY(SELECT e FROM UNNEST(employees) AS e WHERE e.address.city = 'New York')

cel: employees.map(e, e.address.city)
sql: ARRAY(SELECT e.address.city FROM UNNEST(employees) AS e)

Performance Considerations

  • UNNEST with large arrays: PostgreSQL's UNNEST() function is efficient but consider indexing strategies for large datasets
  • Nested comprehensions: May generate complex SQL; consider restructuring data or using materialized views for frequently accessed patterns
  • Map operations: Return new arrays which may use memory; consider streaming for large results

Usage in Practice

package main

import (
    "fmt"
    "github.com/google/cel-go/cel"
    "github.com/spandigital/cel2sql/v2"
    "github.com/spandigital/cel2sql/v2/pg"
)

func main() {
    // Define PostgreSQL schema
    schema := pg.Schema{
        {Name: "id", Type: "bigint", Repeated: false},
        {Name: "name", Type: "text", Repeated: false},
        {Name: "skills", Type: "text", Repeated: true},
        {Name: "scores", Type: "bigint", Repeated: true},
    }

    provider := pg.NewTypeProvider(map[string]pg.Schema{"Employee": schema})

    env, _ := cel.NewEnv(
        cel.CustomTypeProvider(provider),
        cel.Variable("employees", cel.ListType(cel.ObjectType("Employee"))),
    )

    // Compile and convert CEL comprehension to SQL
    ast, _ := env.Compile(`employees.filter(e, e.scores.all(s, s >= 80)).map(e, e.name)`)
    sqlCondition, _ := cel2sql.Convert(ast)
    
    fmt.Println(sqlCondition)
    // Output: ARRAY(SELECT e.name FROM UNNEST(ARRAY(SELECT e FROM UNNEST(employees) AS e WHERE NOT EXISTS (SELECT 1 FROM UNNEST(e.scores) AS s WHERE NOT (s >= 80)))) AS e)
}

About

CEL to SQL condition, compatible with PostgreSQL

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 3

  •  
  •  
  •