Commit c41cf41

Type Model (#4)
This PR models almost all of the types that will be necessary for optimization. This includes:

- generic relational algebra operators that allow us to use the same "type" for both expressions in the memo table and operators in the plans
- logical / physical plans
- scalar operators and expressions
- partially materialized logical plans for rule binding
- transformation rule + implementation rule trait and some empty structs that implement them

I've named the crate itself `optd-core`. This can be subject to change, but I feel this is a reasonable default for now.

~~TODO: need to wait on #3 and #12 to be merged before proper CI checks can happen~~

Edit: I removed the `cargo rustdoc` check because it's creating more problems than it would solve, see #14

---------

Co-authored-by: Alexis Schlomer <aschlome@andrew.cmu.edu>
1 parent 6ffbc5a commit c41cf41

44 files changed, +663 -39 lines changed

.github/workflows/check.yml

Lines changed: 0 additions & 17 deletions
@@ -75,23 +75,6 @@ jobs:
   # components: rustfmt
   # - name: cargo-semver-checks
   # uses: obi1kenobi/cargo-semver-checks-action@v2
-  doc:
-    # run docs generation on nightly rather than stable. This enables features like
-    # https://doc.rust-lang.org/beta/unstable-book/language-features/doc-cfg.html which allows an
-    # API be documented as only available in some specific platforms.
-    runs-on: ubuntu-latest
-    name: nightly / doc
-    steps:
-      - uses: actions/checkout@v4
-        with:
-          submodules: true
-      - name: Install nightly
-        uses: dtolnay/rust-toolchain@nightly
-      - name: Install cargo-docs-rs
-        uses: dtolnay/install@cargo-docs-rs
-      - name: cargo docs-rs
-        # TODO: Once we figure out the crates, rename this.
-        run: cargo docs-rs -p optd-tmp
   hack:
     # cargo-hack checks combinations of feature flags to ensure that features are all additive
     # which is required for feature unification

Cargo.lock

Lines changed: 51 additions & 1 deletion
Generated file; diff not rendered by default.

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 [workspace]
-members = ["optd-tmp"]
+members = ["optd-core"]
 resolver = "2"

optd-core/Cargo.toml

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+[package]
+name = "optd-core"
+version = "0.1.0"
+edition = "2021"
+
+[dependencies]
+trait-variant = "0.1.2"
+
+# Pin more recent versions for `-Zminimal-versions`.
+proc-macro2 = "1.0.60" # For a missing feature (https://github.com/rust-lang/rust/issues/113152).

optd-core/src/expression.rs

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+//! Types for logical and physical expressions in the optimizer.
+
+use crate::memo::GroupId;
+use crate::operator::relational::logical::LogicalOperator;
+use crate::operator::relational::physical::PhysicalOperator;
+
+/// A logical expression in the memo table.
+///
+/// References children using [`GroupId`]s for expression sharing
+/// and memoization.
+pub type LogicalExpression = LogicalOperator<GroupId, GroupId>;
+
+/// A physical expression in the memo table.
+///
+/// Like [`LogicalExpression`] but with specific implementation
+/// strategies.
+pub type PhysicalExpression = PhysicalOperator<GroupId, GroupId>;
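
The aliases above instantiate one generic operator type with `GroupId` children. A minimal sketch of that idea follows, showing the same operator used once as a memo expression and once as a plan node. The `LogicalOperator` enum, its `Scan` variant, the `LogicalPlan` wrapper, the use of `String` as a scalar placeholder, and the `memo_filter`, `plan_filter`, and `depth` helpers are all assumptions made for illustration; the crate's real definitions are not shown in this diff.

```rust
// Simplified stand-ins; names and shapes are assumptions, not the crate's API.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct GroupId(pub u64);

#[derive(Clone, Debug, PartialEq)]
pub struct Filter<Relation, Scalar> {
    pub child: Relation,
    pub predicate: Scalar,
}

/// Hypothetical two-variant operator enum, generic over its child types.
#[derive(Clone, Debug, PartialEq)]
pub enum LogicalOperator<Relation, Scalar> {
    Scan { table_name: String },
    Filter(Filter<Relation, Scalar>),
}

/// Memo form: children are references to groups.
pub type LogicalExpression = LogicalOperator<GroupId, GroupId>;

/// Plan form (assumed shape): children are owned sub-plans, and scalars are
/// plain strings as a placeholder.
#[derive(Clone, Debug, PartialEq)]
pub struct LogicalPlan(pub Box<LogicalOperator<LogicalPlan, String>>);

/// Builds a filter expression whose child and predicate live in memo groups.
pub fn memo_filter() -> LogicalExpression {
    LogicalOperator::Filter(Filter {
        child: GroupId(1),
        predicate: GroupId(2),
    })
}

/// Builds the plan `Filter(Scan(t))` with fully materialized children.
pub fn plan_filter() -> LogicalPlan {
    let scan = LogicalPlan(Box::new(LogicalOperator::Scan {
        table_name: "t".to_string(),
    }));
    LogicalPlan(Box::new(LogicalOperator::Filter(Filter {
        child: scan,
        predicate: "a > 1".to_string(),
    })))
}

/// Counts the operators on the path from the root to the leaf.
pub fn depth(plan: &LogicalPlan) -> usize {
    match &*plan.0 {
        LogicalOperator::Scan { .. } => 1,
        LogicalOperator::Filter(filter) => 1 + depth(&filter.child),
    }
}
```

Only the type arguments differ between the two forms, so code that inspects an operator's shape can be written once against the generic type.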

optd-core/src/lib.rs

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pub mod expression;
+pub mod memo;
+pub mod operator;
+pub mod plan;
+pub mod rules;

optd-core/src/memo.rs

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
+//! Memo table implementation for query optimization.
+//!
+//! The memo table is a core data structure that stores expressions and their logical equivalences
+//! during query optimization. It serves two main purposes:
+//!
+//! - Avoiding redundant optimization by memoizing already explored expressions
+//! - Grouping logically equivalent expressions together to enable rule-based optimization
+//!
+//! # Structure
+//!
+//! - Each unique expression is assigned an expression ID (either [`LogicalExpressionId`],
+//!   [`PhysicalExpressionId`], or [`ScalarExpressionId`])
+//! - Logically equivalent expressions are grouped together under a [`GroupId`]
+//! - Logically equivalent scalar expressions are grouped together under a [`ScalarGroupId`]
+//!
+//! # Usage
+//!
+//! The memo table provides methods to:
+//! - Add new expressions and get their IDs
+//! - Add expressions to existing groups
+//! - Retrieve expressions in a group
+//! - Look up group membership of expressions
+//! - Create new groups for expressions
+
+use crate::expression::LogicalExpression;
+
+/// A unique identifier for a logical expression in the memo table.
+#[repr(transparent)]
+#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
+pub struct LogicalExpressionId(u64);
+
+/// A unique identifier for a physical expression in the memo table.
+#[repr(transparent)]
+#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
+pub struct PhysicalExpressionId(u64);
+
+/// A unique identifier for a scalar expression in the memo table.
+#[repr(transparent)]
+#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
+pub struct ScalarExpressionId(u64);
+
+/// A unique identifier for a group of relational expressions in the memo table.
+#[repr(transparent)]
+#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
+pub struct GroupId(u64);
+
+/// A unique identifier for a group of scalar expressions in the memo table.
+#[repr(transparent)]
+#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
+pub struct ScalarGroupId(u64);
+
+/// TODO(alexis) Add fields & link to storage layer.
+pub struct Memo;
+
+/// TODO(alexis) Stabilize API by first expanding the Python code.
+impl Memo {
+    /// TODO(alexis) Add docs.
+    pub async fn add_logical_expr_to_group(
+        &mut self,
+        _group_id: GroupId,
+        _logical_expr: LogicalExpression,
+    ) -> LogicalExpressionId {
+        todo!()
+    }
+}
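
The module docs list the operations a memo table offers, while `Memo` itself is still a stub. The sketch below is one possible in-memory reading of those operations, not the crate's implementation: `InMemoryMemo`, the simplified `Expr` enum, and the synchronous `HashMap`-backed methods are all hypothetical, and the real `Memo` is expected to be async and linked to a storage layer.

```rust
use std::collections::HashMap;

// Simplified stand-ins for the crate's ID newtypes.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct GroupId(u64);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct LogicalExpressionId(u64);

/// Hypothetical stand-in for `LogicalExpression`; children are group references.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum Expr {
    Scan(String),
    Filter { child: GroupId, predicate: String },
}

/// Hypothetical in-memory memo table (an assumption, not the crate's `Memo`).
#[derive(Default)]
pub struct InMemoryMemo {
    ids: HashMap<Expr, LogicalExpressionId>,
    group_of: HashMap<LogicalExpressionId, GroupId>,
    members: HashMap<GroupId, Vec<LogicalExpressionId>>,
    next_group: u64,
}

impl InMemoryMemo {
    /// Creates a new, empty group.
    pub fn new_group(&mut self) -> GroupId {
        let id = GroupId(self.next_group);
        self.next_group += 1;
        self.members.insert(id, Vec::new());
        id
    }

    /// Adds an expression to a group and returns its ID. An expression that
    /// was already added keeps its existing ID and group; this sketch does
    /// not merge groups.
    pub fn add_logical_expr_to_group(
        &mut self,
        group_id: GroupId,
        expr: Expr,
    ) -> LogicalExpressionId {
        if let Some(&existing) = self.ids.get(&expr) {
            return existing;
        }
        let id = LogicalExpressionId(self.ids.len() as u64);
        self.ids.insert(expr, id);
        self.group_of.insert(id, group_id);
        self.members.entry(group_id).or_default().push(id);
        id
    }

    /// Looks up the group an expression belongs to.
    pub fn group_of(&self, id: LogicalExpressionId) -> Option<GroupId> {
        self.group_of.get(&id).copied()
    }

    /// Returns the expressions in a group, in insertion order.
    pub fn exprs_in_group(&self, group_id: GroupId) -> Vec<LogicalExpressionId> {
        self.members.get(&group_id).cloned().unwrap_or_default()
    }
}
```

Adding the same expression twice returns the same ID, which is the memoization property the module docs describe. A full optimizer would also need to merge groups when an expression turns out to belong to two of them, which this sketch leaves out.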

optd-core/src/operator/mod.rs

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+//! This module contains type definitions related to query plan operators, both relational (logical
+//! / physical) and scalar.
+
+pub mod relational;
+pub mod scalar;

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+/// Logical filter operator that selects rows matching a condition.
+///
+/// Takes input relation (`Relation`) and filters rows using a boolean predicate (`Scalar`).
+#[derive(Clone)]
+pub struct Filter<Relation, Scalar> {
+    pub child: Relation,
+    pub predicate: Scalar,
+}

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+/// Logical join operator that combines rows from two relations.
+///
+/// Takes left and right relations (`Relation`) and joins their rows using a join condition
+/// (`Scalar`).
+#[derive(Clone)]
+pub struct Join<Relation, Scalar> {
+    pub join_type: String,
+    pub left: Relation,
+    pub right: Relation,
+    pub condition: Scalar,
+}
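
Because `Join` is generic over `Relation` and `Scalar`, a plan-level join can in principle be rewritten into a memo-level join by replacing each child with a group reference. The sketch below illustrates this with a `map_children` helper; that helper, the `GroupId` stand-in, the `demo` function, and the string placeholders for relations and conditions are assumptions for illustration and do not appear in this commit.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Join<Relation, Scalar> {
    pub join_type: String,
    pub left: Relation,
    pub right: Relation,
    pub condition: Scalar,
}

impl<Relation, Scalar> Join<Relation, Scalar> {
    /// Hypothetical helper: converts the children to other types while
    /// keeping the join type unchanged.
    pub fn map_children<R2, S2>(
        self,
        mut relation: impl FnMut(Relation) -> R2,
        scalar: impl FnOnce(Scalar) -> S2,
    ) -> Join<R2, S2> {
        Join {
            join_type: self.join_type,
            left: relation(self.left),
            right: relation(self.right),
            condition: scalar(self.condition),
        }
    }
}

/// Simplified stand-in for the crate's `GroupId`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct GroupId(pub u64);

/// Turns a join over placeholder strings into a join over group references.
pub fn demo() -> Join<GroupId, GroupId> {
    let plan_join = Join {
        join_type: "inner".to_string(),
        left: "orders",
        right: "customers",
        condition: "orders.customer_id = customers.id",
    };
    // Hand out fresh group IDs for the relations; 100 is an arbitrary
    // placeholder for the scalar group.
    let mut next = 0u64;
    plan_join.map_children(
        |_| {
            next += 1;
            GroupId(next)
        },
        |_| GroupId(100),
    )
}
```

Keeping `join_type` as a `String` mirrors the struct in the diff; an enum would let the compiler reject unknown join kinds, but that is a design choice the commit does not make.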
