pg_catalog_rs is a PostgreSQL system catalog compatibility layer for Apache DataFusion.
It enables PostgreSQL clients (e.g., DBeaver, pgAdmin, JDBC, ODBC, BI platforms) to connect to a DataFusion-backed database by emulating the behavior of PostgreSQL's `pg_catalog` schema.
This makes it possible to support tooling that expects PostgreSQL system metadata, even if the underlying engine is not Postgres.
Note: this project is a heavy work in progress and the API may change.
- Emulates core system tables from the `pg_catalog` schema: `pg_class`, `pg_attribute`, `pg_namespace`, `pg_type`, `pg_proc`
- Supports PostgreSQL-specific built-in functions such as `pg_get_constraintdef(oid)`, `current_database()`, and `has_schema_privilege(...)`
- Allows standard metadata queries used by:
  - DBeaver, DataGrip, and other GUI tools
  - BI tools using JDBC or ODBC
  - the Postgres CLI (`psql`)
- Compatible with `pgwire` for Postgres wire protocol handling
- Fully in-memory and extensible via DataFusion APIs
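
As a rough illustration of what these features enable, here is a minimal sketch of querying the emulated functions through DataFusion SQL. It assumes a `SessionContext` built with `get_base_session_context` as shown in the usage example below; the exact argument forms the emulated functions accept are an assumption to verify.

```rust
use datafusion::prelude::SessionContext;

// Illustrative only: `ctx` is a SessionContext created with
// get_base_session_context (see the usage example below).
async fn probe_builtins(ctx: &SessionContext) -> datafusion::error::Result<()> {
    // Built-in functions emulated by the crate.
    ctx.sql("SELECT current_database()").await?.show().await?;
    ctx.sql("SELECT has_schema_privilege('public', 'USAGE')").await?.show().await?;
    Ok(())
}
```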
Add the crate from GitHub to your `Cargo.toml`:

```toml
[dependencies]
datafusion_pg_catalog = { git = "https://github.com/ybrs/pg_catalog" }
```
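
Since the dependency is pulled directly from GitHub, you may want to pin a specific `rev` or `tag` in the entry above for reproducible builds.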
Create a `SessionContext` preloaded with the catalog tables and register your own schema:

```rust
use std::collections::BTreeMap;
use pg_catalog_rs::{
    get_base_session_context, register_user_database, register_schema,
    register_user_tables, ColumnDef,
};

// Build a SessionContext preloaded with the pg_catalog tables.
let (ctx, _log) = get_base_session_context(
    Some("pg_catalog_data/pg_schema"),
    "pgtry".to_string(),
    "public".to_string(),
).await?;

// Register an additional user database and schema.
register_user_database(&ctx, "crm").await?;
register_schema(&ctx, "crm", "public").await?;

// Describe the columns of a user table so it appears in the catalog.
let mut cols = BTreeMap::new();
cols.insert(
    "id".to_string(),
    ColumnDef { col_type: "int".to_string(), nullable: false },
);
register_user_tables(&ctx, "crm", "public", "users", vec![cols]).await?;
```
Then you can run queries like:

```sql
SELECT oid, relname FROM pg_class WHERE relnamespace = 2200;
SELECT attname FROM pg_attribute WHERE attrelid = 12345;
```

Or use DBeaver/psql to introspect the schema.
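
Continuing from the snippet above, the same introspection can also be done programmatically (a sketch; the OID literals in the SQL above are examples only and will differ in your catalog):

```rust
// Run a catalog query through DataFusion and print the result table.
let df = ctx.sql("SELECT oid, relname FROM pg_class").await?;
df.show().await?;
```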
When executing SQL, call `dispatch_query` to automatically route catalog queries to the internal handler while forwarding all other statements to your application logic.

```rust
use pg_catalog_rs::router::dispatch_query;

let result = dispatch_query(&ctx, "SELECT * FROM pg_class", |ctx, sql| async move {
    ctx.sql(sql).await?.collect().await
}).await?;
```
Integration notes:

- Works with the `pgwire` crate for wire protocol emulation
- Can be combined with custom `MemTable` providers or real storage backends (Parquet, Arrow, etc.); see the sketch after this list
- Designed to be embedded in hybrid SQL engines or compatibility layers
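
To ground the storage-backend point, here is a minimal sketch using plain DataFusion APIs to expose an in-memory Arrow table and a Parquet file alongside the emulated catalog. The table names and file path are placeholders, and whether tables registered this way appear in the emulated catalog automatically or still need `register_user_tables` is an assumption to verify against the crate:

```rust
use std::sync::Arc;

use datafusion::arrow::array::{ArrayRef, Int32Array};
use datafusion::arrow::datatypes::{DataType, Field, Schema};
use datafusion::arrow::record_batch::RecordBatch;
use datafusion::datasource::MemTable;
use datafusion::prelude::{ParquetReadOptions, SessionContext};

async fn register_backends(ctx: &SessionContext) -> datafusion::error::Result<()> {
    // An in-memory table backed by an Arrow RecordBatch.
    let schema = Arc::new(Schema::new(vec![Field::new("id", DataType::Int32, false)]));
    let batch = RecordBatch::try_new(
        schema.clone(),
        vec![Arc::new(Int32Array::from(vec![1, 2, 3])) as ArrayRef],
    )?;
    let users = MemTable::try_new(schema, vec![vec![batch]])?;
    // register_table returns the previously registered provider, if any.
    let _ = ctx.register_table("users", Arc::new(users))?;

    // A Parquet file exposed as a table (path is a placeholder).
    ctx.register_parquet("events", "data/events.parquet", ParquetReadOptions::default())
        .await?;

    Ok(())
}
```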
Launch a PostgreSQL-compatible endpoint using the provided helper:

```rust
use std::sync::Arc;
use pg_catalog_rs::{get_base_session_context, start_server};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let (ctx, _log) = get_base_session_context(
        Some("pg_catalog_data/pg_schema"),
        "pgtry".to_string(),
        "public".to_string(),
    ).await?;

    // Serve the catalog over the Postgres wire protocol on port 5433.
    start_server(Arc::new(ctx), "127.0.0.1:5433", "pgtry", "public", None).await?;
    Ok(())
}
```
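
Once the server is running, a stock client such as `psql -h 127.0.0.1 -p 5433 -d pgtry` should be able to connect and introspect the catalog; the database name and any authentication requirements depend on the arguments passed to `start_server`.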
If you already expose your own SQL server, you can integrate pg_catalog_rs without starting the bundled pgwire endpoint. Build a `SessionContext`, register your tables, then call `dispatch_query` from your handler. Catalog queries are executed internally and all other statements are forwarded.

```rust
use datafusion::arrow::record_batch::RecordBatch;
use datafusion::error::Result as DFResult;
use datafusion::prelude::SessionContext;
use pg_catalog_rs::router::dispatch_query;

async fn handle_request(ctx: &SessionContext, sql: &str) -> DFResult<Vec<RecordBatch>> {
    let (batches, _schema) = dispatch_query(ctx, sql, None, None, |ctx, sql| async move {
        ctx.sql(sql).await?.collect().await
    }).await?;
    Ok(batches)
}
```
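
As a usage sketch (the `demo` function and its query are illustrative only), the handler can then be driven like any other async function and the resulting batches printed with Arrow's pretty printer:

```rust
use datafusion::arrow::util::pretty::print_batches;

// Illustrative driver for the handler above.
async fn demo(ctx: &SessionContext) -> DFResult<()> {
    let batches = handle_request(ctx, "SELECT relname FROM pg_class").await?;
    print_batches(&batches)?;
    Ok(())
}
```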
Current limitations:

- No persistence: the catalog is in-memory only
- Partial function support (more can be added)
- Schema reflection of user-defined tables must be tracked manually
- No write-back to the catalog (read-only)
Planned improvements:

- Hook into `CREATE TABLE` to auto-populate metadata
- Add missing catalog tables (`pg_index`, `pg_constraint`, etc.)
- Catalog persistence to disk or an external store
- Enhanced type inference and function overloads
Functional tests are written in Python and executed with pytest. After installing the dependencies from `requirements.txt`, run:

```bash
pytest
```
Licensed under either of:
at your option.
Pull requests are welcome. Contributions that improve PostgreSQL compatibility or support new clients are especially appreciated.
Before submitting a change, please run the Rust unit tests and the Python functional tests:

```bash
cargo test
pytest
```
This project is inspired by:
- PostgreSQL's system catalog architecture
- Apache DataFusion as the execution backend
- `pgwire` for wire protocol support