diff --git a/.gitignore b/.gitignore index 09f943d7..55d2f401 100644 --- a/.gitignore +++ b/.gitignore @@ -203,4 +203,7 @@ release/ # jupyter notebook .ipynb_checkpoints -__pycache__ \ No newline at end of file +__pycache__ + +# dbdev +eql--*.sql \ No newline at end of file diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md index 40e053e3..3240ea93 100644 --- a/DEVELOPMENT.md +++ b/DEVELOPMENT.md @@ -274,6 +274,28 @@ To cut a [release](https://github.com/cipherstash/encrypt-query-language/release This will trigger the [Release EQL](https://github.com/cipherstash/encrypt-query-language/actions/workflows/release-eql.yml) workflow, which will build and attach artifacts to [the release](https://github.com/cipherstash/encrypt-query-language/releases/). +### dbdev + +We publish a Trusted Language Extension for PostgreSQL for use on [dbdev](https://database.dev/). +You can find the extension on [dbdev's extension catalog](https://database.dev/cipherstash/eql). + +#### Publishing to dbdev + +**DISCLAIMER:** At the moment, we are manually publishing the extension to dbdev and the versions might not be in sync with the releases on GitHub until we automate this process. + +Steps to publish + +> [!NOTE] +> Make sure you have the [dbdev CLI](https://supabase.github.io/dbdev/cli/) installed and logged in using the `dbdev shared token` in 1Password. + +1. Run `mise run build` to build the extension which will create the following file in the `dbdev` directory. (Note: this release artifact is built from the Supabase release artifact). +2. After the build is complete, you will have a file in the `dbdev` directory called `eql--0.0.0.sql`. +3. Update the file name from `eql--0.0.0.sql` replacing `0.0.0` with the version number of the release. +4. Also update the `eql.control` file with the new version number. +5. Run `dbdev publish` to publish the extension to dbdev. + +Reach out to @calvinbrewer if you need help. 
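The manual publishing steps above can be sketched as a small shell helper. This is an illustration only and not part of the repository: the function name `prepare_dbdev_release` and its arguments are hypothetical. It assumes `mise run build` has already produced `eql--0.0.0.sql` in the `dbdev` directory, and that `eql.control` contains a `default_version = ...` line, as in the `dbdev/eql.control` file added by this change. It covers only the rename and the control-file update (steps 3 and 4); running `dbdev publish` stays a separate, deliberate step.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of the repo): prepare the dbdev directory
# for publishing a given version.
# Usage: prepare_dbdev_release <dbdev-dir> <version>
prepare_dbdev_release() {
  dir="$1"
  version="$2"

  # Refuse to continue without both arguments.
  if [ -z "$dir" ] || [ -z "$version" ]; then
    echo "usage: prepare_dbdev_release <dbdev-dir> <version>" >&2
    return 1
  fi

  # Step 2: the build is expected to have written eql--0.0.0.sql.
  if [ ! -f "$dir/eql--0.0.0.sql" ]; then
    echo "missing $dir/eql--0.0.0.sql; run 'mise run build' first" >&2
    return 1
  fi

  # Step 3: rename the artifact so it carries the release version.
  mv "$dir/eql--0.0.0.sql" "$dir/eql--$version.sql"

  # Step 4: rewrite default_version in eql.control.
  # A temp file is used instead of `sed -i` to stay portable across sed variants.
  sed "s/^default_version = .*/default_version = $version/" \
    "$dir/eql.control" > "$dir/eql.control.tmp"
  mv "$dir/eql.control.tmp" "$dir/eql.control"
}
```

After a successful build, a call such as `prepare_dbdev_release dbdev 2.1.2` followed by `dbdev publish` (step 5) would mirror the documented process. The helper fails early if the build artifact is missing, so a stale or already-renamed file is not published by accident.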
+ ## Building ### Dependencies diff --git a/README.md b/README.md index ea95be18..fb5f05d8 100644 --- a/README.md +++ b/README.md @@ -5,42 +5,25 @@ Encrypt Query Language (EQL) is a set of abstractions for transmitting, storing, and interacting with encrypted data and indexes in PostgreSQL. -> [!TIP] > **New to EQL?** Start with the higher level helpers for EQL in [Python](https://github.com/cipherstash/eqlpy), [Go](https://github.com/cipherstash/goeql), or [JavaScript](https://github.com/cipherstash/jseql) and [TypeScript](https://github.com/cipherstash/jseql), or the [examples](#helper-packages-and-examples) for those languages. +> [!TIP] +> **New to EQL?** +> EQL is the basis for searchable encryption functionality when using [Protect.js](https://github.com/cipherstash/protectjs) and/or [CipherStash Proxy](https://github.com/cipherstash/proxy). Store encrypted data alongside your existing data: - Encrypted data is stored using a `jsonb` column type -- Query encrypted data with specialized SQL functions +- Query encrypted data with specialized SQL functions (equality, range, full-text, etc.) - Index encrypted columns to enable searchable encryption -- Integrate with [CipherStash Proxy](/docs/tutorials/PROXY.md) for transparent encryption/decryption. 
## Table of Contents - [Installation](#installation) - - [CipherStash Proxy](#cipherstash-proxy) -- [Documentation](#documentation) + - [dbdev](#dbdev) - [Getting started](#getting-started) - [Enable encrypted columns](#enable-encrypted-columns) - - [Configuring the column](#configuring-the-column) - - [Activating configuration](#activating-configuration) - - [Refreshing CipherStash Proxy Configuration](#refreshing-cipherstash-proxy-configuration) -- [Storing data](#storing-data) - - [Inserting Data](#inserting-data) - - [Reading Data](#reading-data) -- [Configuring indexes for searching data](#configuring-indexes-for-searching-data) - - [Adding an index](#adding-an-index) -- [Searching data with EQL](#searching-data-with-eql) - - [Equality search](#equality-search) - - [Full-text search](#full-text-search) - - [Range queries](#range-queries) - - [Array Operations](#array-operations) - - [JSON Path Operations](#json-path-operations) -- [JSON and JSONB support](#json-and-jsonb-support) -- [Frequently Asked Questions](#frequently-asked-questions) -- [Helper packages](#helper-packages-and-examples) -- [Releasing](#releasing) +- [Encrypt configuration](#encrypt-configuration) +- [CipherStash integrations using EQL](#cipherstash-integrations-using-eql) - [Developing](#developing) -- [Testing](#testing) --- @@ -60,16 +43,12 @@ The simplest way to get up and running with EQL is to execute the install SQL fi psql -f cipherstash-encrypt.sql ``` -### CipherStash Proxy +### dbdev -EQL relies on [CipherStash Proxy](docs/tutorials/PROXY.md) for low-latency encryption & decryption. -We plan to support direct language integration in the future. +> [!WARNING] +> The version released on dbdev may not be in sync with the version released on GitHub until we automate the publishing process. -If you want to use CipherStash Proxy with the below examples or the [helper packages](#helper-packages-and-examples), you can use the [playground environment](playground/README.md). 
- -## Documentation - -You can read more about the EQL concepts and reference guides in the [documentation directory](docs/README.md). +You can find the EQL extension on [dbdev's extension catalog](https://database.dev/cipherstash/eql) with instructions on how to install it. ## Getting started @@ -77,7 +56,7 @@ Once the custom types and functions are installed in your PostgreSQL database, y ### Enable encrypted columns -Define encrypted columns using the `eql_v2_encrypted` type, which extends the `jsonb` type with additional constraints to ensure data integrity. +Define encrypted columns using the `eql_v2_encrypted` type, which stores encrypted data as `jsonb` with additional constraints to ensure data integrity. **Example:** @@ -88,314 +67,22 @@ CREATE TABLE users ( ); ``` -### Configuring the column - -Initialize the column using the `eql_v2.add_column` function to enable encryption and decryption via CipherStash Proxy. - -```sql -SELECT eql_v2.add_column('users', 'encrypted_email'); -``` - -**Note:** This function allows you to encrypt and decrypt data but does not enable searchable encryption. See [Searching data with EQL](#searching-data-with-eql) for enabling searchable encryption. - - - -**Important:** These functions must be run after any modifications to the configuration. - -#### Refreshing CipherStash Proxy Configuration - -CipherStash Proxy refreshes the configuration every 60 seconds. To force an immediate refresh, run: - -```sql -SELECT eql_v2.reload_config(); -``` - -> Note: This statement must be executed when connected to CipherStash Proxy. -> When connected to the database directly, it is a no-op. - -## Storing data - -Encrypted data is stored as `jsonb` values in the PostgreSQL database, regardless of the original data type. - -You can read more about the data format [here](docs/reference/PAYLOAD.md). - -### Inserting Data - -When inserting data into the encrypted column, wrap the plaintext in the appropriate EQL payload. 
These statements must be run through the CipherStash Proxy to **encrypt** the data. - -**Example:** - -```sql -INSERT INTO users (encrypted_email) VALUES ( - '{"v":2,"k":"pt","p":"test@example.com","i":{"t":"users","c":"encrypted_email"}}' -); -``` - -Data is stored in the PostgreSQL database as: - -```json -{ - "c": "generated_ciphertext", - "i": { - "c": "encrypted_email", - "t": "users" - }, - "k": "ct", - "bf": null, - "ob": null, - "u": null, - "v": 2 -} -``` - -### Reading Data - -When querying data, select the encrypted column. CipherStash Proxy will **decrypt** the data automatically. - -**Example:** - -```sql -SELECT encrypted_email FROM users; -``` - -Data is returned as: - -```json -{ - "k": "pt", - "p": "test@example.com", - "i": { - "t": "users", - "c": "encrypted_email" - }, - "v": 2, - "q": null -} -``` - -> Note: If you execute this query directly on the database, you will not see any plaintext data but rather the `jsonb` payload with the ciphertext. - -## Configuring indexes for searching data - -In order to perform searchable operations on encrypted data, you must configure indexes for the encrypted columns. - -> **IMPORTANT:** If you have existing data that's encrypted and you add or modify an index, all the data will need to be re-encrypted. -> This is due to the way CipherStash Proxy handles searchable encryption operations. - -### Adding an index - -Add an index to an encrypted column using the `eql_v2.add_search_config` function: - -```sql -SELECT eql_v2.add_search_config( - 'table_name', -- Name of the table - 'column_name', -- Name of the column - 'index_name', -- Index kind ('unique', 'match', 'ore', 'ste_vec') - 'cast_as', -- PostgreSQL type to cast decrypted data ('text', 'int', etc.) - 'opts' -- Index options as JSONB (optional) -); -``` - -You can read more about the index configuration options [here](docs/reference/INDEX.md). 
- -**Example (Unique index):** - -```sql -SELECT eql_v2.add_search_config( - 'users', - 'encrypted_email', - 'unique', - 'text' -); -``` - -After adding an index, you have to activate the configuration: - -```sql -SELECT eql_v2.migrate_config(); -SELECT eql_v2.activate_config(); -``` - -## Searching data with EQL - -EQL provides specialized functions to interact with encrypted data, supporting operations like equality checks, range queries, and unique constraints. - -In order to use the specialized functions, you must first configure the corresponding indexes. - -### Equality search - -Enable equality search on encrypted data using the `eql_v2.hmac_256` function. - -**Index configuration example:** - -```sql -SELECT eql_v2.add_search_config( - 'users', - 'encrypted_email', - 'unique', - 'text' -); -``` - -**Example:** - -```sql -SELECT * FROM users -WHERE eql_v2.hmac_256(encrypted_email) = eql_v2.hmac_256( - '{"v":2,"k":"pt","p":"test@example.com","i":{"t":"users","c":"encrypted_email"},"q":"hmac_256"}' -); -``` - -Equivalent plaintext query: - -```sql -SELECT * FROM users WHERE email = 'test@example.com'; -``` - -### Full-text search - -Enables basic full-text search on encrypted data using the `eql_v2.bloom_filter` function. - -**Index configuration example:** - -```sql -SELECT eql_v2.add_search_config( - 'users', - 'encrypted_email', - 'match', - 'text', - '{"token_filters": [{"kind": "downcase"}], "tokenizer": { "kind": "ngram", "token_length": 3 }}' -); -``` - -**Example:** - -```sql -SELECT * FROM users -WHERE eql_v2.bloom_filter(encrypted_email) @> eql_v2.bloom_filter( - '{"v":2,"k":"pt","p":"test","i":{"t":"users","c":"encrypted_email"},"q":"match"}' -); -``` - -Equivalent plaintext query: - -```sql -SELECT * FROM users WHERE email LIKE '%test%'; -``` - -### Range queries - -Enable range queries on encrypted data using the `eql_v2.ore_block_u64_8_256`, `eql_v2.ore_cllw_u64_8`, or `eql_v2.ore_cllw_var_8` functions. 
Supports: - -- `ORDER BY` -- `WHERE` - -**Example (Filtering):** - -```sql -SELECT * FROM users -WHERE eql_v2.ore_block_u64_8_256(encrypted_date) < eql_v2.ore_block_u64_8_256( - '{"v":2,"k":"pt","p":"2023-10-05","i":{"t":"users","c":"encrypted_date"},"q":"ore"}' -); -``` - -Equivalent plaintext query: - -```sql -SELECT * FROM users WHERE date < '2023-10-05'; -``` - -**Example (Ordering):** - -```sql -SELECT id FROM users -ORDER BY eql_v2.ore_block_u64_8_256(encrypted_field) DESC; -``` - -Equivalent plaintext query: - -```sql -SELECT id FROM users ORDER BY field DESC; -``` - -### Array Operations - -EQL supports array operations on encrypted data: - -```sql --- Get array length -SELECT eql_v2.jsonb_array_length(encrypted_array) FROM users; - --- Get array elements -SELECT eql_v2.jsonb_array_elements(encrypted_array) FROM users; - --- Get array element ciphertexts -SELECT eql_v2.jsonb_array_elements_text(encrypted_array) FROM users; -``` - -### JSON Path Operations - -EQL supports JSON path operations on encrypted data using the `->` and `->>` operators: - -```sql --- Get encrypted value at path -SELECT encrypted_data->'$.field' FROM users; - --- Get ciphertext at path -SELECT encrypted_data->>'$.field' FROM users; -``` - -## JSON and JSONB support - -EQL supports encrypting entire JSON and JSONB data sets. -This warrants a separate section in the documentation. -You can read more about the JSONB support in the [JSONB reference guide](docs/reference/JSON.md). - -## Frequently Asked Questions - -### How do I integrate CipherStash EQL with my application? - -Use CipherStash Proxy to intercept PostgreSQL queries and handle encryption and decryption automatically. -The proxy interacts with the database using the EQL functions and types defined in this documentation. - -Use the [helper packages](#helper-packages) to integate EQL functions into your application. - -### Can I use EQL without the CipherStash Proxy? 
- -No, CipherStash Proxy is required to handle the encryption and decryption operations based on the configurations and indexes defined. - -### How is data encrypted in the database? - -Data is encrypted using CipherStash's cryptographic schemes and stored in the `eql_v2_encrypted` column as a JSONB payload. -Encryption and decryption are handled by CipherStash Proxy. +## Encrypt configuration -## Helper packages and examples +In order to enable searchable encryption, you will need to configure your CipherStash integration appropriately. -We've created a few langauge specific packages to help you interact with the payloads: +- If you are using [CipherStash Proxy](https://github.com/cipherstash/proxy), see [this guide](docs/tutorials/proxy-configuration.md). +- If you are using [Protect.js](https://github.com/cipherstash/protectjs), use the [Protect.js schema](https://github.com/cipherstash/protectjs/blob/main/docs/reference/schema.md). -| Language | ORM | Example | Package | -| ---------- | ----------- | ---------------------------------------------------------------- | --------------------------------------------- | -| Go | Xorm | [Go/Xorm examples](./examples/go/xorm/README.md) | [goeql](https://github.com/cipherstash/goeql) | -| TypeScript | Drizzle | [Drizzle examples](./examples/javascript/apps/drizzle/README.md) | [jseql](https://github.com/cipherstash/jseql) | -| TypeScript | Prisma | [Drizzle examples](./examples/javascript/apps/prisma/README.md) | [jseql](https://github.com/cipherstash/jseql) | -| Python | SQL Alchemy | [Python examples](./examples/python/jupyter_notebook/README.md) | [eqlpy](https://github.com/cipherstash/eqlpy) | +## CipherStash integrations using EQL -### Language specific packages +These frameworks use EQL to enable searchable encryption functionality in PostgreSQL. 
-- [Go](https://github.com/cipherstash/goeql) -- [JavaScript/TypeScript](https://github.com/cipherstash/jseql) -- [Python](https://github.com/cipherstash/eqlpy) +| Framework | Repo | +| ----------- | ------------------------------------------ | +| Protect.js | [Protect.js](https://github.com/cipherstash/protectjs) | +| Protect.php | [Protect.php](https://github.com/cipherstash/protectphp) | +| CipherStash Proxy | [CipherStash Proxy](https://github.com/cipherstash/proxy) | ## Developing diff --git a/dbdev/README.md b/dbdev/README.md new file mode 120000 index 00000000..32d46ee8 --- /dev/null +++ b/dbdev/README.md @@ -0,0 +1 @@ +../README.md \ No newline at end of file diff --git a/dbdev/eql.control b/dbdev/eql.control new file mode 100644 index 00000000..e3d24851 --- /dev/null +++ b/dbdev/eql.control @@ -0,0 +1,3 @@ +default_version = 2.1.2 +comment = 'Index and search encrypted data in PostgreSQL with SQL' +relocatable = true \ No newline at end of file diff --git a/docs/README.md b/docs/README.md index c34f3d53..9660ba4d 100644 --- a/docs/README.md +++ b/docs/README.md @@ -6,14 +6,12 @@ This directory contains the documentation for the Encrypt Query Language (EQL). 
- [Postgres data security with CipherStash](concepts/WHY.md) -## How-to guides +## Reference -- [Getting started](tutorials/GETTINGSTARTED.md) -- [Using CipherStash Proxy](tutorials/PROXY.md) +- [EQL index configuration for CipherStash Proxy](reference/index-configuration.md) +- [EQL with JSON and JSONB](reference/json-support.md) +- [EQL payload data format](reference/eql-payload.md) -## Reference +## Tutorials -- [EQL index configuration](reference/INDEX.md) -- [EQL with JSON and JSONB](reference/JSON.md) -- [CipherStash Migrator](reference/MIGRATOR.md) -- [EQL payload data format](reference/PAYLOAD.md) +- [CipherStash Proxy Configuration with EQL functions](tutorials/proxy-configuration.md) diff --git a/docs/concepts/WHY.md b/docs/concepts/WHY.md index 2f6bc3ae..7df39fc5 100644 --- a/docs/concepts/WHY.md +++ b/docs/concepts/WHY.md @@ -9,11 +9,11 @@ This page gives a high-level overview of CipherStash's encryption in use solutio - [Why use encryption in use?](#why-use-encryption-in-use) 2. [CipherStash Proxy](#cipherstash-proxy) - [How it works](#how-it-works) -3. [Encrypt Query Language (EQL)](#encrypt-query-language-eql) -4. [Best practices](#best-practices) -5. [Advanced topics](#advanced-topics) - - [Integrating without proxy](#integrating-without-proxy) -6. [Conclusion](#conclusion) +3. [Protect.js](#protectjs) + - [How it works](#how-it-works-1) +4. [Encrypt Query Language (EQL)](#encrypt-query-language-eql) +5. [Best practices](#best-practices) +6. [Getting started](#getting-started) ## Encryption in use @@ -52,41 +52,37 @@ This enables encryption in use without significant changes to your application c - **Encrypts data**: For write operations, it encrypts the plaintext data before sending it to the database. - **Decrypts data**: For read operations, it decrypts the encrypted data retrieved from the database before returning it to the client. 
- **Maintains searchability**: Ensures that the encrypted data is searchable and retrievable without sacrificing performance or application functionality. -- **Manages encryption keys**: Securely handles encryption keys required for encrypting and decrypting data. + +## Protect.js + +Protect.js is an NPM package that provides a set of functions to encrypt and decrypt data. +It is a client-side library that can be used to encrypt and decrypt data in your JS/TS application. + +### How it works + +- **Encrypts data**: Protect.js encrypts the plaintext data before sending it to the database. +- **Decrypts data**: Protect.js decrypts the encrypted data retrieved from the database before returning it to the client. +- **Maintains searchability**: Ensures that the encrypted data is searchable and retrievable without sacrificing performance or application functionality. ## Encrypt Query Language (EQL) Encrypt Query Language (EQL) is a set of PostgreSQL functions and data types provided by CipherStash to work with encrypted data and indexes. EQL allows you to perform queries on encrypted data without decrypting it, supporting operations like equality checks, range queries, and unique constraints. -To get started, read the [Getting started](https://github.com/cipherstash/encrypt-query-language/blob/main/GETTINGSTARTED.md) guide. - ## Best practices -- **Use CipherStash Proxy** to handle encryption/decryption transparently. - **Use EQL functions** when interacting with encrypted data. - **Define database constraints**to maintain data integrity. - **Secure key management** of encryption keys. - **Monitor query performance** and optimize as needed. -## Advanced topics - -### Integrating without CipherStash Proxy - -> The SDK approach is currently in development, but if you're interested in contributing, please start a discussion [here](https://github.com/cipherstash/encrypt-query-language/discussions). 
- -For advanced users who prefer to handle encryption within their application: - -- **SDKs available**: Use CipherStash SDKs (at the moment, Rust and TypeScript) to manage encryption/decryption. -- **Manual encryption**: Implement encryption logic in your application code. -- **Data conformity**: Ensure encrypted data matches the expected `jsonb` schema. -- **Key management**: Handle encryption keys securely within your application. - -**Note**: This approach increases complexity and is recommended only if CipherStash Proxy does not meet specific requirements. - ## Getting started -To get started using CipherStash's encryption is use solution, see the [Getting Started](https://github.com/cipherstash/encrypt-query-language/blob/main/GETTINGSTARTED.md) guide. +Use one of the CipherStash integrations using EQL to get started. + +- [Protect.js](https://github.com/cipherstash/protectjs) +- [CipherStash Proxy](https://github.com/cipherstash/proxy) +- [Protect.php](https://github.com/cipherstash/protectphp) For further help, raise an issue [here](https://github.com/cipherstash/encrypt-query-language/issues). diff --git a/docs/reference/MIGRATOR.md b/docs/reference/MIGRATOR.md deleted file mode 100644 index d5876cf6..00000000 --- a/docs/reference/MIGRATOR.md +++ /dev/null @@ -1,78 +0,0 @@ -# CipherStash Migrator - -CipherStash Migrator is a tool that can be used to migrate plaintext data in a database to its encrypted equivalent. -It works inside the CipherStash Proxy Docker container and can handle different data types such as text, JSONB, integers, booleans, floats, and dates. -By specifying the relevant columns in your table, CipherStash Migrator will seamlessly encrypt the existing data and store it in designated encrypted columns. - -## Prerequisites - -- [CipherStash Proxy](PROXY.md) -- [Have set up EQL in your database](GETTINGSTARTED.md) - - Ensure that the columns where data will be migrated already exist. 
- -## Usage - -The CipherStash Migrator allows you to specify key-value pairs where the key is the plaintext column, and the value is the corresponding encrypted column. -Multiple key-value pairs can be specified, and the tool will perform a migration for each specified column. - -### Running the migrator - -You will need to SSH into the CipherStash Proxy Docker container to run the migrator. - -```bash -docker exec -it eql-cipherstash-proxy bash -``` - -Once inside the container, you have access to the migrator tool. - -```bash -cipherstash-migrator --version -``` - -#### Flags - -| Flag | Description | Required | -| --- | --- | --- | -| `--columns` | Specifies the plaintext columns and their corresponding encrypted columns. The format is `plaintext_column=encrypted_column`. | Yes | -| `--table` | Specifies the table where the data will be migrated. | Yes | -| `--database-name` | Specifies the database name. | Yes | -| `--username` | Specifies the database username. | Yes | -| `--password` | Specifies the database password. 
| Yes | - -#### Supported data types - -- Text -- JSONB -- Integer -- Boolean -- Float -- Date - -### Example - -The following is an example of how to run the migrator with a single column: - -```bash -cipherstash-migrator --columns example_column=example_column_encrypted --table examples --database-name postgres --username postgres --password postgres -``` - -If you require additional data types, please [raise an issue](https://github.com/cipherstash/encrypt-query-language/issues) - -### Running migrations with multiple columns - -To run a migration on multiple columns at once, specify multiple key-value pairs in the `--columns` option: - -```bash -cipherstash-migrator --columns test_text=encrypted_text test_jsonb=encrypted_jsonb test_int=encrypted_int test_boolean=encrypted_boolean --table examples --database-name migrator_test --username postgres --password postgres -``` - -## Notes - -- Ensure that the corresponding encrypted columns already exist in the table before running the migration. -- Data migration operations should be tested in a development environment before being executed in production. - ---- - -### Didn't find what you wanted? - -[Click here to let us know what was missing from our docs.](https://github.com/cipherstash/encrypt-query-language/issues/new?template=docs-feedback.yml&title=[Docs:]%20Feedback%20on%20MIGRATOR.md) diff --git a/docs/reference/PAYLOAD.md b/docs/reference/PAYLOAD.md index 933c305d..d5e7f445 100644 --- a/docs/reference/PAYLOAD.md +++ b/docs/reference/PAYLOAD.md @@ -37,8 +37,6 @@ CipherStash Proxy will handle the plaintext payload and create the encrypted pay ## Data format -The format is defined as a [JSON Schema](../../sql/schemas/cs_encrypted_v2.schema.json). - It should never be necessary to directly interact with the stored `jsonb`. CipherStash Proxy handles the encoding, and EQL provides the functions. 
diff --git a/docs/reference/INDEX.md b/docs/reference/index-config.md similarity index 97% rename from docs/reference/INDEX.md rename to docs/reference/index-config.md index d145e8ad..b4cf25d1 100644 --- a/docs/reference/INDEX.md +++ b/docs/reference/index-config.md @@ -1,4 +1,8 @@ -# EQL index configuration +# EQL index configuration for CipherStash Proxy + +> [!NOTE] +> This guide is for CipherStash Proxy. +> If you are using Protect.js, see the [Protect.js schema](https://github.com/cipherstash/protectjs/blob/main/docs/reference/schema.md). The following functions allow you to configure indexes for encrypted columns. All these functions modify the `eql_v2_configuration` table in your database, and are added during the EQL installation. diff --git a/docs/reference/JSON.md b/docs/reference/json-support.md similarity index 100% rename from docs/reference/JSON.md rename to docs/reference/json-support.md diff --git a/docs/tutorials/GETTINGSTARTED.md b/docs/tutorials/GETTINGSTARTED.md deleted file mode 100644 index bd02c067..00000000 --- a/docs/tutorials/GETTINGSTARTED.md +++ /dev/null @@ -1,588 +0,0 @@ -## Getting started - -## Setup - -Before we begin using EQL and the Proxy, we'll need to do some setup to get the necessary keys and configuration. - -1. Create an [account](https://cipherstash.com/signup). - -2. Install the CLI: - -```shell -brew install cipherstash/tap/stash -``` - -3. Login: - -```shell -stash login -``` - -4. Create a [dataset](https://cipherstash.com/docs/how-to/creating-datasets) and [client](https://cipherstash.com/docs/how-to/creating-clients), and record them as `CS_CLIENT_ID` and `CS_CLIENT_KEY`. - -```shell -stash datasets create eql-test -# grab dataset ID and export CS_DATASET_ID= - -stash clients create eql-test --dataset-id $CS_DATASET_ID -``` - -5. 
Create an [access key](https://cipherstash.com/docs/how-to/creating-access-keys) for CipherStash Proxy: - -```shell -stash workspaces -# grab the workspace ID and export CS_WORKSPACE_ID= -stash access-keys create --workspace-id $CS_WORKSPACE_ID eql-test -``` - -6. Go to the [EQL playground](../../playground\) and copy over the example `.envrc` file: - -```shell -cd playground -cp .envrc.example .envrc -``` - -Update the `.envrc` file with these environment variables `CS_WORKSPACE_ID`, `CS_CLIENT_ACCESS_KEY`, `CS_ENCRYPTION__CLIENT_ID`, `CS_ENCRYPTION__CLIENT_KEY` and `CS_DATASET_ID`: - -```shell -source .envrc -``` - -7. Start PostgreSQL and CipherStash Proxy and install EQL: - -```shell -docker compose up -``` - -This will: - -- spin up a docker container for the CipherStash Proxy and Postgres -- install EQL - -8. Check Postgres and the Proxy are running: - -```shell -docker ps -``` - -You should see 2 containers running, `postgres_proxy` and `eql-playground-pg`. - -## Example - -These examples will show how EQL works using raw SQL. - -Prerequisites: - -- PostgreSQL and CipherStash Proxy are running in docker containers. - -Let's step through an example of how we go from a plaintext text field to an encrypted text field. - -This guide will include: - -- How to [setup your database](#setup-your-database) -- How to [add indexes](#adding-indexes) -- How to [encrypt existing plaintext data](#encrypting-existing-plaintext-data) -- How to [insert data](#inserting-data) -- How to [query data](#querying-data) - -Connect to your postgres docker container: - -```bash -docker exec -it eql-playground-pg bash -``` - -Start `psql`: - -```bash -PGPASSWORD=postgres PGUSER=postgres psql -``` - -We will use a `users` table with an email field for this example. 
- -In psql, run: - -```sql -CREATE TABLE IF NOT EXISTS users ( - id serial PRIMARY KEY NOT NULL, - email VARCHAR(100) -); -``` - -Our `users` schema looks like this: - -| Column | Type | Nullable | -| ------- | ------------------------ | -------- | -| `email` | `character varying(100)` | | - -Seed plaintext data into the users table: - -```sql -INSERT INTO users (email) VALUES -('adalovelace@example.com'), -('gracehopper@test.com'), -('edithclarke@email.com'); -``` - -### Setup your database - -In the previous step we: - -- setup a basic users table with a plaintext email (text) field. -- seeded the db with plaintext emails. - -In this part we will add a new column to store our encrypted email data. - -When we add the column we use a `Type` of `cs_encrypted_v2`. - -This type will enforce constraints on the field to ensure that: - -- the payload is in the format EQL and CipherStash Proxy expects. -- the payload has been encrypted before inserting. - -If there are issues with the payload being inserted into a field with a type of `cs_encrypted_v2`, an error will be returned describing what the issue with the payload is. - -To add a new column called `email_encrypted` with a type of `cs_encrypted_v2`: - -```sql -ALTER TABLE users ADD email_encrypted cs_encrypted_v2; -``` - -Our `users` schema now looks like this: - -| Column | Type | Nullable | -| ----------------- | ------------------------ | -------- | -| `email` | `character varying(100)` | | -| `email_encrypted` | `cs_encrypted_v2` | | - -### Adding indexes - -We now have our database schema setup to store encrypted data. - -In this part we will learn about why we need to add indexes and how to add them. - -When you install EQL, a table called `eql_v2_configuration` is created in your database. 
- -Adding indexes updates this table with the details and configuration needed for CipherStash Proxy to know how to encrypt your data, and what types of queries are able to be performed - -We will also need to add the relevant native database indexes to be able to perform these queries. - -In this example, we want to be able to execute these types of queries on our `email_encrypted` field: - -- free text search -- equality -- order by -- comparison - -This means that we need to add the below indexes for our new `email_encrypted` field. - -For free text queries (e.g `LIKE`, `ILIKE`) we add a `match` index and a GIN index: - -```sql -SELECT cs_add_index_v2('users', 'email_encrypted', 'match', 'text'); -CREATE INDEX ON users USING GIN (cs_match_v2(email_encrypted)); -``` - -For equality queries we add a `unique` index: - -```sql -SELECT cs_add_index_v2('users', 'email_encrypted', 'unique', 'text', '{"token_filters": [{"kind": "downcase"}]}'); -CREATE UNIQUE INDEX ON users(cs_unique_v2(email_encrypted)); -``` - -For ordering or comparison queries we add an `ore` index: - -```sql -SELECT cs_add_index_v2('users', 'email_encrypted', 'ore', 'text'); -CREATE INDEX ON users (ore_block_u64_8_256(email_encrypted)); -``` - -After adding these indexes, our `eql_v2_configuration` table will look like this: - -```bash -id | 1 -state | pending -data | {"v": 2, "tables": {"users": {"email_encrypted": {"cast_as": "text", "indexes": {"ore": {}, "match": {"k": 6, "bf": 2048, "tokenizer": {"kind": "ngram", "token_length": 3}, "token_filters": [{"kind": "downcase"}], "include_original": true}, "unique": {"token_filters": [{"kind": "downcase"}]}}}}}} -``` - -The initial `state` will be set as pending. - -To activate this configuration run: - -```sql -SELECT cs_encrypt_v2(); -SELECT cs_activate_v2(); -``` - -The `cs_configured_v2` table will now have a state of `active`. 
- -```bash -id | 1 -state | active -data | {"v": 2, "tables": {"users": {"email_encrypted": {"cast_as": "text", "indexes": {"ore": {}, "match": {"k": 6, "bf": 2048, "tokenizer": {"kind": "ngram", "token_length": 3}, "token_filters": [{"kind": "downcase"}], "include_original": true}, "unique": {"token_filters": [{"kind": "downcase"}]}}}}}} -``` - -### Encrypting existing plaintext data - -Prerequisites: - -- [Database is setup](#setup-your-database) -- [Indexes added](#adding-indexes) - -Ensure CipherStash Proxy has the most up to date configuration from the `eql_v2_configuration` table. - -CipherStash Proxy pings the database every 60 seconds to refresh the configuration but we can force the refresh by running: - -```sql -SELECT cs_refresh_encrypt_config(); -``` - -Bundled in with the CipherStash Proxy is a [migrator tool](./MIGRATOR.md). - -This tool encrypts the plaintext data from the plaintext `email` field, and inserts it into the encrypted field, `email_encrypted`. - -We access the migrator tool by requesting a shell inside the CipherStash Proxy container. - -```bash -docker exec -it postgres_proxy bash -``` - -Run: - -```bash -cipherstash-migrator --columns email=email_encrypted --table users --database-name postgres --username postgres --password postgres -``` - -We now have encrypted data in our `email_encrypted` field that we can query. - -Drop the plaintext email column: - -```sql -ALTER TABLE users DROP COLUMN email; -``` - -**Note: In production ensure data is backed up before dropping any columns** - -### Insert a new record - -Before inserting or querying any records, we need to connect to our database via the Proxy. - -We do this so our data is encrypted and decrypted. - -In another terminal run: - -```bash -PGPASSWORD=postgres psql -h localhost -p 6432 -U postgres -d postgres -``` - -When inserting data into the encrypted column we need to wrap the plaintext in an EQL payload. 
- -The reason for this is that the CipherStash Proxy expects the EQL payload to be able to encrypt the data, and to be able to decrypt the data. - -These statements must be run through the CipherStash Proxy in order to **encrypt** the data. - -For a plaintext of `test@test.com`. - -An EQL payload will look like this: - -```json -{ - "k": "pt", // The kind of EQL payload. The client will always send through plaintext "pt" - "p": "test@test.com", // The plaintext data - "i": { - "t": "users", // The table - "c": "email_encrypted" // The encrypted column - }, - "v": 2, - "q": null // Used in queries only. -} -``` - -**Example:** - -A query to insert an email into the plaintext `email` field in the `users` table looks like this: - -```sql -INSERT INTO users (email) VALUES ('test@test.com'); -``` - -The equivalent of this query to insert a plaintext email and encrypt it into the `email_encrypted` column using EQL: - -```sql -INSERT INTO users (email_encrypted) VALUES ('{"v":2,"k":"pt","p":"test@test.com","i":{"t":"users","c":"email_encrypted"}}'); -``` - -**What is happening?** - -The CipherStash Proxy takes this EQL payload and encrypts the plaintext data. - -It creates an EQL payload that looks similar to this and inserts this into the encrypted field in the database. - -```json -{ - "k": "ct", // The kind of EQL payload. The Proxy will insert a json payload of a ciphertext or "ct". - "c": "encrypted test@test.com", // The source ciphertext of the plaintext email. - "e": { - "t": "users", // Table - "c": "email_encrypted" // Encrypted column - }, - "bf": [42], // The ciphertext used for free text queries i.e match index - "u": "unique ciphertext", // The ciphertext used for unique queries. i.e unique index - "ob": ["a", "b", "c"], // The ciphertext used for order or comparison queries. i.e ore index - "v": 2 -} -``` - -This is what is stored in the `email_encrypted` column. - -### Querying data - -In this part we will step through how to read our encrypted data. 
- -We will cover: - -- simple queries -- free text search queries -- exact/unique queries -- order by and comparison queries - -#### Simple query - -If we don't need to execute any searchable operations (free text, exact) on the encrypted field. - -The query will look similar to a plaintext query except we will use the encrypted column. - -A plaintext query to select all emails from the users table would look like this: - -```sql -SELECT email FROM users; -``` - -The EQL equivalent of this query is: - -```sql -SELECT email_encrypted FROM users; -``` - -Returns: - -```bash - email_encrypted -------------------------------------------------------------------------------------------------- - {"k":"pt","p":"adalovelace@example.com","i":{"t":"users","c":"email_encrypted"},"v":2,"q":null} - {"k":"pt","p":"gracehopper@test.com","i":{"t":"users","c":"email_encrypted"},"v":2,"q":null} - {"k":"pt","p":"edithclarke@email.com","i":{"t":"users","c":"email_encrypted"},"v":2,"q":null} - {"k":"pt","p":"test@test.com","i":{"t":"users","c":"email_encrypted"},"v":2,"q":null} -``` - -**What is happening?** - -The json stored in the database looks similar to this: - -```json -{ - "k": "ct", // The kind of EQL payload. The Proxy will insert a json payload of a ciphertext or "ct". - "c": "encrypted test@test.com", // The source ciphertext of the plaintext email. - "e": { - "t": "users", // Table - "c": "email_encrypted" // Encrypted column - }, - "bf": [42], // The ciphertext used for free text queries i.e match index - "u": "unique ciphertext", // The ciphertext used for unique queries. i.e unique index - "ob": ["a", "b", "c"], // The ciphertext used for order or comparison queries. 
i.e ore index - "v": 2 -} -``` - -The Proxy decrypts the json above and returns a plaintext json payload that looks like this: - -```json -{ - "k": "pt", - "p": "test@test.com", // The returned plaintext data - "i": { - "t": "users", - "c": "email_encrypted" - }, - "v": 2, - "q": null -} -``` - -> When working with EQL in an application you would likely be using an ORM. - -> We are currently building out [packages and examples](../../README.md#helper-packages) to make it easier to work with EQL json payloads. - -#### Advanced querying - -EQL provides specialized functions to be able to interact with encrypted data and to support operations like equality checks, comparison queries, and unique constraints. - -#### Full-text search - -Prerequsites: - -- A [match index](#adding-indexes) is needed on the encrypted column to support this operation. -- Connected to the database via the Proxy. - -EQL function to use: `cs_match_v2(val JSONB)` - -EQL query payload for a match query: - -```json -{ - "k": "pt", - "p": "grace", // The text we want to use for search - "i": { - "t": "users", - "c": "email_encrypted" - }, - "v": 2, - "q": "match" // This field is required on queries. This specifies the type of query we are executing. 
-} -``` - -A plaintext query, to search for any records that have an email like `grace`, looks like this: - -```sql -SELECT * FROM users WHERE email LIKE '%grace%'; -``` - -The EQL equivalent of this query is: - -```sql -SELECT * FROM users WHERE cs_match_v2(email_encrypted) @> cs_match_v2( - '{"v":2,"k":"pt","p":"grace","i":{"t":"users","c":"email_encrypted"},"q":"match"}' - ); -``` - -This query returns: - -| id | email_encrypted | -| --- | -------------------------------------------------------------------------------------------- | -| 2 | {"k":"pt","p":"gracehopper@test.com","i":{"t":"users","c":"email_encrypted"},"v":2,"q":null} | - -#### Equality query - -Prerequsites: - -- A [unique index](#adding-indexes) is needed on the encrypted column to support this operation. - -EQL function to use: `cs_unique_v2(val JSONB)` - -EQL query payload for a match query: - -```json -{ - "k": "pt", - "p": "adalovelace@example.com", // The text we want to use for the equality query - "i": { - "t": "users", - "c": "email_encrypted" - }, - "v": 2, - "q": "unique" // This field is required on queries. This specifies the type of query we are executing. -} -``` - -A plaintext query to search for any records that equal `adalovelace@example.com` looks like this: - -```sql -SELECT * FROM users WHERE email = 'adalovelace@example.com'; -``` - -The EQL equivalent of this query is: - -```sql -SELECT * FROM users WHERE cs_unique_v2(email_encrypted) = cs_unique_v2( - '{"v":2,"k":"pt","p":"adalovelace@example.com","i":{"t":"users","c":"email_encrypted"},"q":"unique"}' - ); -``` - -This query returns: - -| id | email_encrypted | -| --- | ----------------------------------------------------------------------------------------------- | -| 1 | {"k":"pt","p":"adalovelace@example.com","i":{"t":"users","c":"email_encrypted"},"v":2,"q":null} | - -#### Order by query - -Prerequsites: - -- An [ore index](#adding-indexes) is needed on the encrypted column to support this operation. 
- -EQL function to use: `ore_block_u64_8_256(val JSONB)`. - -A plaintext query order by email looks like this: - -```sql -SELECT * FROM users ORDER BY email ASC; -``` - -The EQL equivalent of this query is: - -```sql -SELECT * FROM users ORDER BY ore_block_u64_8_256(email_encrypted) ASC; -``` - -This query returns: - -| id | email_encrypted | -| --- | ----------------------------------------------------------------------------------------------- | -| 1 | {"k":"pt","p":"adalovelace@example.com","i":{"t":"users","c":"email_encrypted"},"v":2,"q":null} | -| 3 | {"k":"pt","p":"edithclarke@email.com","i":{"t":"users","c":"email_encrypted"},"v":2,"q":null} | -| 2 | {"k":"pt","p":"gracehopper@test.com","i":{"t":"users","c":"email_encrypted"},"v":2,"q":null} | -| 4 | {"k":"pt","p":"test@test.com","i":{"t":"users","c":"email_encrypted"},"v":2,"q":null} | - -#### Comparison query - -Prerequsites: - -- A [unique index](#adding-indexes) is needed on the encrypted column to support this operation. - -EQL function to use: `ore_block_u64_8_256(val JSONB)`. - -EQL query payload for a comparison query: - -```json -{ - "k": "pt", - "p": "gracehopper@test.com", // The text we want to use for the equality query - "i": { - "t": "users", - "c": "email_encrypted" - }, - "v": 2, - "q": "ore" // This field is required on queries. This specifies the type of query we are executing. 
-} -``` - -A plaintext text query to compare email values looks like this: - -```sql -SELECT * FROM users WHERE email > 'gracehopper@test.com'; -``` - -The EQL equivalent of this query is: - -```sql -SELECT * FROM users WHERE ore_block_u64_8_256(email_encrypted) > ore_block_u64_8_256( - '{"v":2,"k":"pt","p":"gracehopper@test.com","i":{"t":"users","c":"email_encrypted"},"q":"ore"}' - ); -``` - -This query returns: - -| id | email_encrypted | -| --- | ------------------------------------------------------------------------------------- | -| 4 | {"k":"pt","p":"test@test.com","i":{"t":"users","c":"email_encrypted"},"v":2,"q":null} | - -#### Summary - -This tutorial showed how we can go from a plaintext text field to an encrypted field and how to query the encrypted fields. - -We have some [examples here](../../README.md#helper-packages) of what this would look like if you are using an ORM. - ---- - -### Didn't find what you wanted? - -[Click here to let us know what was missing from our docs.](https://github.com/cipherstash/encrypt-query-language/issues/new?template=docs-feedback.yml&title=[Docs:]%20Feedback%20on%20GETTINGSTARTED.md) diff --git a/docs/tutorials/PROXY.md b/docs/tutorials/PROXY.md deleted file mode 100644 index 32e1f292..00000000 --- a/docs/tutorials/PROXY.md +++ /dev/null @@ -1,91 +0,0 @@ -# CipherStash Proxy - -The CipherStash Proxy is a lightweight proxy that can be used to encrypt and decrypt data in your database. 
- -## Table of Contents - -- [Getting Started](#getting-started) -- [Create a dataset and client](#create-a-dataset-and-client) -- [Configuring CipherStash Proxy](#configuring-cipherstash-proxy) -- [Running the Proxy](#running-the-proxy) -- [Using the Proxy](#using-the-proxy) -- [How EQL works with CipherStash Proxy](#how-eql-works-with-cipherstash-proxy) - - [Writes](#writes) - - [Reads](#reads) - -## Getting Started - -To get started, you'll need to sign up for a free account at [https://dashboard.cipherstash.com](https://dashboard.cipherstash.com). - -Once you've signed up, you can create an access key from your default workspace. - -## Create a dataset and client - -Before you can start using the proxy, you'll need to create a dataset and client. - -You can do this using the [CipherStash CLI](https://cipherstash.com/docs/reference/cli) - -1. [Create a dataset.](https://cipherstash.com/docs/how-to/creating-datasets) -1. [Create a client key for cryptographic operations.](https://cipherstash.com/docs/how-to/creating-clients) - -## Configuring CipherStash Proxy - -You can then create a `cipherstash-proxy.toml` file in the root of this directory. You can use the `cipherstash-proxy.toml.example` file as a starting point. - -Populate the following fields with your values: - -- `workspace_id`: The ID of your workspace. -- `client_access_key`: The access key for your client. -- `client_id`: The ID of your client.` -- `client_key`: The key of your client. -- `database.name`: The name of your database. -- `database.username`: The username for your database. -- `database.password`: The password for your database. -- `database.host`: The host for your database. -- `database.port`: The port for your database. - -## Running the Proxy - -To run the proxy, you can use `docker compose` to start the proxy using the configuration in the `cipherstash-proxy.toml` file. 
-Run the following command from the `cipherstash-proxy` directory: - -```bash -docker compose up -``` - -## Using the Proxy - -Once the proxy is running, you can use the different language examples to test the proxy and EQL. - -## How EQL works with CipherStash Proxy - -EQL uses **CipherStash Proxy** to mediate access to your PostgreSQL database and provide low-latency encryption & decryption. - -At a high level: - -- encrypted data is stored as `jsonb` -- references to the column in sql statements are wrapped in a helper function -- Cipherstash Proxy transparently encrypts and indexes data - -### Writes - -1. Database client sends `plaintext` data encoded as `jsonb` -1. CipherStash Proxy encrypts the `plaintext` and encodes the `ciphertext` value and associated indexes into the `jsonb` payload -1. The data is written to the encrypted column - -![Insert](/diagrams/overview-insert.drawio.svg) - -### Reads - -1. Wrap references to the encrypted column in the appropriate EQL function -1. CipherStash Proxy encrypts the `plaintext` -1. PostgreSQL executes the SQL statement -1. CipherStash Proxy decrypts any returned `ciphertext` data and returns to client - -![Select](/diagrams/overview-select.drawio.svg) - ---- - -### Didn't find what you wanted? - -[Click here to let us know what was missing from our docs.](https://github.com/cipherstash/encrypt-query-language/issues/new?template=docs-feedback.yml&title=[Docs:]%20Feedback%20on%20PROXY.md) diff --git a/docs/tutorials/proxy-configuration.md b/docs/tutorials/proxy-configuration.md new file mode 100644 index 00000000..f39c0b3c --- /dev/null +++ b/docs/tutorials/proxy-configuration.md @@ -0,0 +1,330 @@ +# CipherStash Proxy Configuration with EQL functions + +Initialize the column using the `eql_v2.add_column` function to enable encryption and decryption via CipherStash Proxy. 
+ +```sql +SELECT eql_v2.add_column('users', 'encrypted_email'); -- where users is the table name and encrypted_email is the column name of type eql_v2_encrypted +``` + +**Note:** This function allows you to encrypt and decrypt data but does not enable searchable encryption. See [Searching data with EQL](#searching-data-with-eql) for enabling searchable encryption. + +## Refreshing CipherStash Proxy Configuration + +CipherStash Proxy refreshes the configuration every 60 seconds. To force an immediate refresh, run: + +```sql +SELECT eql_v2.reload_config(); +``` + +> Note: This statement must be executed when connected to CipherStash Proxy. +> When connected to the database directly, it is a no-op. + +## Storing data + +Encrypted data is stored as `jsonb` values in the PostgreSQL database, regardless of the original data type. + +You can read more about the data format [here](docs/reference/payload.md). + +### Inserting Data + +When inserting data into the encrypted column, wrap the plaintext in the appropriate EQL payload. These statements must be run through the CipherStash Proxy to **encrypt** the data. + +**Example:** + +```sql +INSERT INTO users (encrypted_email) VALUES ( + '{"v":2,"k":"pt","p":"test@example.com","i":{"t":"users","c":"encrypted_email"}}' +); +``` + +Data is stored in the PostgreSQL database as: + +```json +{ + "c": "generated_ciphertext", + "i": { + "c": "encrypted_email", + "t": "users" + }, + "k": "ct", + "bf": null, + "ob": null, + "u": null, + "v": 2 +} +``` + +### Reading Data + +When querying data, select the encrypted column. CipherStash Proxy will **decrypt** the data automatically. 
+ +**Example:** + +```sql +SELECT encrypted_email FROM users; +``` + +Data is returned as: + +```json +{ + "k": "pt", + "p": "test@example.com", + "i": { + "t": "users", + "c": "encrypted_email" + }, + "v": 2, + "q": null +} +``` + +> Note: If you execute this query directly on the database, you will not see any plaintext data but rather the `jsonb` payload with the ciphertext. + +## Configuring indexes for searching data + +In order to perform searchable operations on encrypted data, you must configure indexes for the encrypted columns. + +> **IMPORTANT:** If you have existing data that's encrypted and you add or modify an index, all the data will need to be re-encrypted. +> This is due to the way CipherStash Proxy handles searchable encryption operations. + +### Adding an index + +Add an index to an encrypted column using the `eql_v2.add_search_config` function: + +```sql +SELECT eql_v2.add_search_config( + 'table_name', -- Name of the table + 'column_name', -- Name of the column + 'index_name', -- Index kind ('unique', 'match', 'ore', 'ste_vec') + 'cast_as', -- PostgreSQL type to cast decrypted data ('text', 'int', etc.) + 'opts' -- Index options as JSONB (optional) +); +``` + +You can read more about the index configuration options [here](docs/reference/index-config.md). + +**Example (Unique index):** + +```sql +SELECT eql_v2.add_search_config( + 'users', + 'encrypted_email', + 'unique', + 'text' +); +``` + +Configuration changes are automatically migrated and activated. + +## Searching data with EQL + +EQL provides specialized functions to interact with encrypted data, supporting operations like equality checks, range queries, and unique constraints. + +In order to use the specialized functions, you must first configure the corresponding indexes. + +### Equality search + +Enable equality search on encrypted data using the `eql_v2.hmac_256` function. 
+ +**Index configuration example:** + +```sql +SELECT eql_v2.add_search_config( + 'users', + 'encrypted_email', + 'unique', + 'text' +); +``` + +**Example:** + +```sql +SELECT * FROM users +WHERE eql_v2.hmac_256(encrypted_email) = eql_v2.hmac_256( + '{"v":2,"k":"pt","p":"test@example.com","i":{"t":"users","c":"encrypted_email"},"q":"hmac_256"}' +); +``` + +Equivalent plaintext query: + +```sql +SELECT * FROM users WHERE email = 'test@example.com'; +``` + +### Full-text search + +Enables basic full-text search on encrypted data using the `eql_v2.bloom_filter` function. + +**Index configuration example:** + +```sql +SELECT eql_v2.add_search_config( + 'users', + 'encrypted_email', + 'match', + 'text', + '{"token_filters": [{"kind": "downcase"}], "tokenizer": { "kind": "ngram", "token_length": 3 }}' +); +``` + +**Example:** + +```sql +SELECT * FROM users +WHERE eql_v2.bloom_filter(encrypted_email) @> eql_v2.bloom_filter( + '{"v":2,"k":"pt","p":"test","i":{"t":"users","c":"encrypted_email"},"q":"match"}' +); +``` + +Equivalent plaintext query: + +```sql +SELECT * FROM users WHERE email LIKE '%test%'; +``` + +### Range queries + +Enable range queries on encrypted data using the `eql_v2.ore_block_u64_8_256` function. 
Supports: + +- `ORDER BY` +- `WHERE` with comparison operators (`<`, `<=`, `>`, `>=`, `=`, `<>`) + +**Index configuration example:** + +```sql +SELECT eql_v2.add_search_config( + 'users', + 'encrypted_date', + 'ore', + 'date' +); +``` + +**Example (Filtering):** + +```sql +SELECT * FROM users +WHERE eql_v2.ore_block_u64_8_256(encrypted_date) < eql_v2.ore_block_u64_8_256( + '{"v":2,"k":"pt","p":"2023-10-05","i":{"t":"users","c":"encrypted_date"},"q":"ore"}' +); +``` + +Equivalent plaintext query: + +```sql +SELECT * FROM users WHERE date < '2023-10-05'; +``` + +**Example (Ordering):** + +```sql +SELECT id FROM users +ORDER BY eql_v2.ore_block_u64_8_256(encrypted_field) DESC; +``` + +Equivalent plaintext query: + +```sql +SELECT id FROM users ORDER BY field DESC; +``` + +### Array Operations + +EQL supports array operations on encrypted data: + +```sql +-- Get array length +SELECT eql_v2.jsonb_array_length(encrypted_array) FROM users; + +-- Get array elements +SELECT eql_v2.jsonb_array_elements(encrypted_array) FROM users; + +-- Get array element ciphertexts +SELECT eql_v2.jsonb_array_elements_text(encrypted_array) FROM users; +``` + +### JSON Path Operations + +EQL supports JSON path operations on encrypted data using the `->` and `->>` operators: + +```sql +-- Get encrypted value at path +SELECT encrypted_data->'$.field' FROM users; + +-- Get ciphertext at path +SELECT encrypted_data->>'$.field' FROM users; +``` + +### Containment Operations + +For encrypted JSONB data, EQL provides containment operations using the `@>` and `<@` operators: + +```sql +-- Check if encrypted_data contains specific structure +SELECT * FROM users +WHERE encrypted_data @> '{"v":2,"k":"pt","p":{"account":{"roles":["admin"]}},"i":{"t":"users","c":"encrypted_data"},"q":"ste_vec"}'::eql_v2_encrypted; + +-- Check if structure is contained in encrypted_data +SELECT * FROM users +WHERE 
'{"v":2,"k":"pt","p":{"roles":["admin"]},"i":{"t":"users","c":"encrypted_data"},"q":"ste_vec"}'::eql_v2_encrypted <@ encrypted_data; +``` + +### Text Pattern Matching + +EQL supports pattern matching with the `~~` (LIKE) and `~~*` (ILIKE) operators: + +```sql +-- Pattern matching (case-sensitive) +SELECT * FROM users +WHERE encrypted_name ~~ '{"v":2,"k":"pt","p":"Alice%","i":{"t":"users","c":"encrypted_name"},"q":"match"}'::eql_v2_encrypted; + +-- Pattern matching (case-insensitive) +SELECT * FROM users +WHERE encrypted_name ~~* '{"v":2,"k":"pt","p":"alice%","i":{"t":"users","c":"encrypted_name"},"q":"match"}'::eql_v2_encrypted; +``` + +## JSON and JSONB support + +EQL supports encrypting entire JSON and JSONB data sets. +Because this is a large topic, it is documented separately. +You can read more about the JSONB support in the [JSONB reference guide](docs/reference/json-support.md). + +## Frequently Asked Questions + +### How do I integrate CipherStash EQL with my application? + +Use CipherStash Proxy to intercept PostgreSQL queries and handle encryption and decryption automatically. +The proxy interacts with the database using the EQL functions and types defined in this documentation. + +Use the [helper packages](#helper-packages-and-examples) to integrate EQL functions into your application. + +### Can I use EQL without the CipherStash Proxy? + +No, CipherStash Proxy is required to handle the encryption and decryption operations based on the configurations and indexes defined. + +### How is data encrypted in the database? + +Data is encrypted using CipherStash's cryptographic schemes and stored in the `eql_v2_encrypted` column as a JSONB payload. +Encryption and decryption are handled by CipherStash Proxy. + +### What index types are available?
+ +EQL supports the following index types: + +- `unique` - For exact equality searches using HMAC-256 +- `match` - For full-text search using bloom filters +- `ore` - For range queries and ordering using Order-Revealing Encryption +- `ste_vec` - For JSON/JSONB containment operations using Structured Encryption + +### How do I manage configurations? + +Use these functions to manage your EQL configurations: + +- `eql_v2.add_column()` - Add a new encrypted column +- `eql_v2.remove_column()` - Remove an encrypted column +- `eql_v2.add_search_config()` - Add a search index +- `eql_v2.remove_search_config()` - Remove a search index +- `eql_v2.modify_search_config()` - Modify an existing search index +- `eql_v2.config()` - View current configuration in tabular format \ No newline at end of file diff --git a/tasks/build.sh b/tasks/build.sh index 42af26f0..0f1cb0b6 100755 --- a/tasks/build.sh +++ b/tasks/build.sh @@ -80,6 +80,8 @@ cat src/deps-supabase.txt | tsort | tac > src/deps-ordered-supabase.txt cat src/deps-ordered-supabase.txt | xargs cat | grep -v REQUIRE >> release/cipherstash-encrypt-supabase.sql +cat src/deps-ordered-supabase.txt | xargs cat | grep -v REQUIRE >> dbdev/eql--0.0.0.sql + cat tasks/uninstall.sql >> release/cipherstash-encrypt-uninstall-supabase.sql diff --git a/tasks/postgres.toml b/tasks/postgres.toml index 3232ffdd..df4ecd2b 100644 --- a/tasks/postgres.toml +++ b/tasks/postgres.toml @@ -17,3 +17,10 @@ run = """ mise run postgres:down mise run postgres:up --extra-args "--detach --wait" """ + +["postgres:psql"] +description = "Run psql" +run = """ +{% set default_service = "postgres-" ~ get_env(name="POSTGRES_VERSION",default="17") %} +psql -U {{arg(name="user",default="cipherstash")}} -d {{arg(name="db",default="cipherstash")}} -h localhost -p {{arg(name="port",default="7432")}} --service {{arg(name="service",default=default_service)}} +"""