|
1 |
| -# pgvectorscale |
2 | 1 |
|
3 |
| -A vector index for speeding up ANN search in `pgvector`. |
| 2 | +<p></p> |
| 3 | +<div align=center> |
| 4 | +<picture align=center> |
| 5 | + <source media="(prefers-color-scheme: dark)" srcset="https://assets.timescale.com/docs/images/timescale-logo-dark-mode.svg"> |
| 6 | + <source media="(prefers-color-scheme: light)" srcset="https://assets.timescale.com/docs/images/timescale-logo-light-mode.svg"> |
| 7 | + <img alt="Timescale logo" > |
| 8 | +</picture> |
4 | 9 |
|
5 |
| -## 💾 Building and Installing pgvectorscale |
| 10 | +<h3>Use pgvectorscale to build scalable AI applications with higher-performance |
| 11 | +embedding search and cost-efficient storage.</h3> |
6 | 12 |
|
7 |
| -### From source |
| 13 | +[](https://docs.timescale.com/) |
| 14 | +[](https://timescaledb.slack.com/archives/C4GT3N90X) |
| 15 | +[](https://console.cloud.timescale.com/signup) |
| 16 | +</div> |
8 | 17 |
|
9 |
| -#### Prerequisites |
10 | 18 |
|
11 |
| -Building the extension requires valid rust, along with the postgres headers for whichever version of postgres you are running, and pgrx. We recommend installing rust using the official instructions: |
12 |
| -```shell |
13 |
| -curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh |
14 |
| -``` |
15 |
| - |
16 |
| -You should install the appropriate build tools and postgres headers in the preferred manner for your system. You may also need to install OpenSSL. For Ubuntu you can follow the postgres install instructions then run |
| 19 | +pgvectorscale complements [pgvector][pgvector], the open-source vector data extension for PostgreSQL, and introduces the following key innovations: |
| 20 | +- A DiskANN index: based on research from Microsoft |
| 21 | +- Statistical Binary Quantization: developed by Timescale researchers, this feature improves on standard |
| 22 | + Binary Quantization. |
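For context on the second bullet, standard binary quantization is commonly described as keeping one bit per dimension (set when the value is positive) and comparing the resulting bit strings by Hamming distance. The Python sketch below illustrates only that baseline. It is not pgvectorscale's implementation, and it does not attempt to reproduce Statistical Binary Quantization, whose details are not given here. The function names `binary_quantize` and `hamming_distance` and the sample vectors are hypothetical.

```python
from typing import Sequence


def binary_quantize(vector: Sequence[float]) -> int:
    """Pack one bit per dimension into an int: bit i is 1 if vector[i] > 0."""
    bits = 0
    for i, value in enumerate(vector):
        if value > 0.0:
            bits |= 1 << i
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the dimensions whose bits differ between two quantized vectors."""
    return bin(a ^ b).count("1")


# Illustrative 4-dimensional vectors (made-up values, not benchmark data).
query = binary_quantize([0.12, -0.40, 0.03, -0.07])
candidate = binary_quantize([0.30, 0.25, -0.10, -0.50])
distance = hamming_distance(query, candidate)
```

Each float shrinks to a single bit, which is why this family of techniques reduces storage. The cost is that only the sign of each dimension survives, so the distance is an approximation.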
17 | 23 |
|
18 |
| -```shell |
19 |
| -sudo apt-get install make gcc pkg-config clang postgresql-server-dev-16 libssl-dev |
20 |
| -``` |
| 24 | +Timescale’s benchmarks reveal that with pgvectorscale, PostgreSQL achieves **28x lower p95 latency**, and |
| 25 | +**16x higher query throughput** for approximate nearest neighbor queries at 99% recall. |
21 | 26 |
|
22 |
| -Next you need cargo-pgrx, which can be installed with |
23 |
| -```shell |
24 |
| -cargo install --locked cargo-pgrx |
25 |
| -``` |
| 27 | +<div align=center> |
26 | 28 |
|
27 |
| -You must reinstall cargo-pgrx whenever you update your Rust compiler, since cargo-pgrx needs to be built with the same compiler as pgvectorscale. |
| 29 | + |
28 | 30 |
|
29 |
| -Finally, setup the pgrx development environment with |
30 |
| -```shell |
31 |
| -cargo pgrx init --pg16 pg_config |
32 |
| -``` |
| 31 | +PostgreSQL costs are 21% of those of Pinecone s1. |
| 32 | +</div> |
33 | 33 |
|
34 |
| -#### Building and installing the extension |
| 34 | +In contrast to pgvector, which is written in C, pgvectorscale is developed in [Rust][rust-language], |
| 35 | +offering the PostgreSQL community a new avenue for contributing to vector support. |
35 | 36 |
|
36 |
| -Download or clone this repository, and switch to the extension subdirectory, e.g. |
37 |
| -```shell |
38 |
| -git clone https://github.com/timescale/pgvectorscale && \ |
39 |
| -cd pgvectorscale/pgvectorscale |
40 |
| -``` |
| 37 | +Timescale offers the following high-performance journeys: |
41 | 38 |
|
42 |
| -Then run |
43 |
| -```shell |
44 |
| -cargo pgrx install --release |
45 |
| -``` |
| 39 | +* **App developer and DBA**: try out pgvectorscale functionality in Timescale Cloud. |
| 40 | + * [Enable pgvectorscale in a Timescale service](#enable-pgvectorscale-in-a-timescale-service) |
| 41 | +* **Extension contributor**: contribute to pgvectorscale. |
| 42 | + * [Build pgvectorscale from source in a developer environment](./DEVELOPMENT.md) |
| 43 | +* **Everyone**: check the benchmark results for yourself. |
| 44 | + * [Test pgvectorscale performance](#test-pgvectorscale-performance) |
46 | 45 |
|
47 |
| -To initialize the extension after installation, enter the following into psql: |
| 46 | +## Enable pgvectorscale in a Timescale service |
48 | 47 |
|
49 |
| -```sql |
50 |
| -CREATE EXTENSION vectorscale; |
51 |
| -``` |
| 48 | +To enable pgvectorscale: |
52 | 49 |
|
53 |
| -## ✏️ Get Involved |
| 50 | +1. Create a new [Timescale Service](https://console.cloud.timescale.com/dashboard/create_services). |
54 | 51 |
|
55 |
| -The pgvectorscale project is still in it's early stage as we decide our priorities and what to implement. As such, now is a great time to help shape the project's direction! Have a look at the list of features we're thinking of working on and feel free to comment on the features, expand the list, or hop on the Discussions forum for more in-depth discussions. |
| 52 | + If you want to use an existing service, pgvectorscale is added as an available extension on the first maintenance window |
| 53 | + after the pgvectorscale release date. |
56 | 54 |
|
57 |
| -### 🔨 Testing |
58 |
| -See above for prerequisites and installation instructions. |
| 55 | +1. Connect to your Timescale service: |
| 56 | + ```bash |
| 57 | + psql -d "postgres://<username>:<password>@<host>:<port>/<database-name>" |
| 58 | + ``` |
59 | 59 |
|
60 |
| -You can run tests against a postgres version pg16 using |
61 |
| -```shell |
62 |
| -cargo pgrx test ${postgres_version} |
63 |
| -``` |
| 60 | +1. Create the pgvectorscale extension: |
64 | 61 |
|
65 |
| -To run all tests run: |
66 |
| -```shell |
67 |
| -cargo test -- --ignored && cargo pgrx test ${postgres_version} |
68 |
| -``` |
| 62 | + ```postgresql |
| 63 | + CREATE EXTENSION IF NOT EXISTS vectorscale CASCADE; |
| 64 | + ``` |
69 | 65 |
|
70 |
| -### 🐯 About Timescale |
| 66 | + The `CASCADE` option automatically installs the extensions that pgvectorscale depends on. |
71 | 67 |
|
72 |
| -TimescaleDB is a distributed time-series database built on PostgreSQL that scales to over 10 million of metrics per second, supports native compression, handles high cardinality, and offers native time-series capabilities, such as data retention policies, continuous aggregate views, downsampling, data gap-filling and interpolation. |
| 68 | +## Test pgvectorscale performance |
73 | 69 |
|
74 |
| -TimescaleDB also supports full SQL, a variety of data types (numerics, text, arrays, JSON, booleans), and ACID semantics. Operationally mature capabilities include high availability, streaming backups, upgrades over time, roles and permissions, and security. |
| 70 | +To check the Timescale benchmarks in your pgvectorscale environment: |
75 | 71 |
|
76 |
| -TimescaleDB has a large and active user community (tens of millions of downloads, hundreds of thousands of active deployments, Slack channels with thousands of members). |
| 72 | +1. Jonetas, this is for you :-). |
| 73 | +
|
| 74 | +## Get involved |
| 75 | +
|
| 76 | +pgvectorscale is still at an early stage. Now is a great time to help shape the |
| 77 | +direction of this project; we are currently deciding priorities. Have a look at the |
| 78 | +list of features we're thinking of working on. Feel free to comment, expand |
| 79 | +the list, or hop on the Discussions forum. |
| 80 | +
|
| 81 | +## About Timescale |
| 82 | +
|
| 83 | +Timescale Cloud is a high-performance, developer-focused cloud that provides PostgreSQL services |
| 84 | +enhanced with our blazing-fast vector search. Timescale services are built using TimescaleDB and |
| 85 | +PostgreSQL extensions, like this one. Timescale Cloud provides high availability, streaming |
| 86 | +backups, upgrades over time, roles and permissions, and great security. |
| 87 | +
|
| 88 | +TimescaleDB is an open-source time-series database designed for scalability and performance, |
| 89 | +built on top of PostgreSQL. It provides SQL support for time-series data, allowing users to |
| 90 | +leverage PostgreSQL's rich ecosystem while optimizing for high ingest rates and fast query |
| 91 | +performance. TimescaleDB includes features like automated data retention policies, compression |
| 92 | +and continuous aggregates, making it ideal for applications like monitoring, IoT, AI and |
| 93 | +real-time analytics. |
| 94 | +
|
| 95 | +
|
| 96 | +[pgvector]: https://github.com/pgvector/pgvector/blob/master/README.md |
| 97 | +[rust-language]: https://www.rust-lang.org/ |