@sgrebnov sgrebnov commented Nov 4, 2024

Upgrade the `datafusion-federation` crate to include the following improvement:
* Run `optimize_projections` as part of federated plan optimization
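
For readers unfamiliar with projection optimization, here is a minimal sketch of the general idea, assuming a toy plan type. The `Plan` enum and the `prune_columns` and `to_sql` helpers are hypothetical names invented for illustration; they are not the API of DataFusion or `datafusion-federation`, and the real optimizer rule covers many more plan node types. The sketch only shows why pruning unused columns before a plan is federated can reduce what is requested from the remote source.

```rust
// Toy logical plan for illustration only. This is NOT DataFusion's `LogicalPlan`.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Scan of a remote table; `columns` is what would be requested from the source.
    Scan { table: String, columns: Vec<String> },
    /// Keeps only the named columns from its input.
    Projection { columns: Vec<String>, input: Box<Plan> },
}

/// Hypothetical helper: narrow each scan to the columns that the
/// projection directly above it actually uses. With `required = None`
/// (no enclosing projection) a scan is left unchanged.
fn prune_columns(plan: Plan, required: Option<&[String]>) -> Plan {
    match plan {
        Plan::Scan { table, columns } => {
            let columns = match required {
                // Keep only the columns the parent asked for, preserving scan order.
                Some(req) => columns.into_iter().filter(|c| req.contains(c)).collect(),
                None => columns,
            };
            Plan::Scan { table, columns }
        }
        Plan::Projection { columns, input } => {
            // The projection's own column list is what its input must provide.
            let input = prune_columns(*input, Some(columns.as_slice()));
            Plan::Projection { columns, input: Box::new(input) }
        }
    }
}

/// Hypothetical helper: render the SQL a federated executor might send for a plan.
fn to_sql(plan: &Plan) -> String {
    match plan {
        Plan::Scan { table, columns } => {
            format!("SELECT {} FROM {}", columns.join(", "), table)
        }
        Plan::Projection { columns, input } => {
            format!("SELECT {} FROM ({})", columns.join(", "), to_sql(input))
        }
    }
}
```

In this toy model, a projection of `id` over a scan of `orders(id, name, payload)` renders as `SELECT id FROM (SELECT id, name, payload FROM orders)` before pruning and as `SELECT id FROM (SELECT id FROM orders)` after calling `prune_columns(plan, None)`.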

@sgrebnov sgrebnov force-pushed the sgrebnov/upgrade-federation branch from 51926a4 to 6e15ccc Compare November 4, 2024 19:06
@sgrebnov sgrebnov force-pushed the sgrebnov/upgrade-federation branch from 5e7b03a to 0534092 Compare November 4, 2024 22:14
sgrebnov and others added 2 commits November 4, 2024 15:55
Co-authored-by: Phillip LeBlanc <phillip@leblanc.tech>
@sgrebnov sgrebnov merged commit df4bab1 into spiceai Nov 5, 2024
3 checks passed
@sgrebnov sgrebnov deleted the sgrebnov/upgrade-federation branch November 5, 2024 00:16
zeroxaa pushed a commit to ReByteAI/datafusion-table-providers that referenced this pull request Nov 22, 2024
zeroxaa added a commit to ReByteAI/datafusion-table-providers that referenced this pull request Nov 27, 2024
phillipleblanc added a commit that referenced this pull request Mar 12, 2025
* DuckDB streaming (#41)

* wip

* duckdb streaming

* clippy

* arrow to arrow stream

* error message

* fix: Support `INTERVAL` in SQLite (#85)

* poc: Support interval in SQLite using an AST analyzer

* Refactoring

* u64 -> i64

* fix: Support INTERVAL expressions in SQLite

* docs: Add comment about flattening arguments list

* refactor: Rename SQLiteVisitor to SQLiteIntervalVisitor

* test: Add some tests

---------

Co-authored-by: Phillip LeBlanc <phillip@leblanc.tech>

* Use DuckDB streaming

* Fixes

* Fix feature flagging

* Fix lint

* Add spiceai branch to pull_request

---------

Co-authored-by: peasee <98815791+peasee@users.noreply.github.com>
Co-authored-by: Phillip LeBlanc <phillip@leblanc.tech>

* Add feature flag to disable postgres federation

* fix: Disable federation in memory mode databases (#86)

* Only disable federation for tableproviderfactory

* SQLite: Validate expected indexes when attaching local datasets (#88)

* SQLite: Validate expected indexes when attaching local datasets

* Add test for indexes creation and retrieval (SQLite)

* Update warning messages

* SQLite: Validate expected primary keys when attaching local datasets (#89)

* Change to use Spice AI fork of sea_query for SQLite decimal support (#90)

* fix: Don't silence blocking task errors (#91)

* fix: Don't silence blocking task errors

* fix: Cover Ok(Err()) match arm for DuckDB writer handle

* refactor: Rename overloaded error e

* fix: Re-attach databases on each DuckDB query (#92)

* fix: Re-attach databases on each query

* Update src/sql/db_connection_pool/dbconnection/duckdbconn.rs

---------

Co-authored-by: Phillip LeBlanc <phillip@leblanc.tech>

* Correctly handle mysql timestamp() and datetime() types (#96)

* Correctly handle mysql timestamp() and datetime() types

* Restructure MySQL test, add test for timestamp() types

* Include test for datetime types

* Postgres enum support (#100)

* Postgres enum support

* Add enum test as part of integration test

* update

* Remove the duplicate function

* fix

---------

Co-authored-by: Phillip LeBlanc <phillip@leblanc.tech>

* Fix SQLite Invalid column type Real bug (#98)

* Prevent SQLite from writing incomplete data on errors (#101)

* DuckDBTableProviderFactory keeps track of opened instances (#105)

* Ignore CHECKPOINT errors (#107)

* Don't attempt to CHECKPOINT after writing to DuckDB (#108)

* SqliteTableProviderFactory keeps track of opened instances (#109)

* wip

* wip

* wip

* tweak

* Support all time() types in MySQL (#97)

* Support all time() types in MySQL

* Include test for time types

* Upgrade to Arrow 53, DataFusion 42 and DuckDB 1.1 (#111)

* Handle inconsistent scale in Postgres Numeric Type data (#110)

* Verify MySQL parameters and connections before creating connection pool (#113)

* Verify MySQL parameters and connections before creating connection pool

* Update

* Propagate MySQL wrong table error (#114)

* Fix MySQL timestamp type (#116)

* Postgres should respect target decimal precision and scale (#120)

* Update row -> arrow conversion for all MYSQL_TYPE_VAR_STRING and MYSQL_TYPE_STRING types (#118)

* Use Decimal256 instead of Decimal128 for MySQL decimal type (#115)

* Fix mysql blob & text types (#117)

* Add sqlite_busytimeout parameter as user configurable param (#121)

* Add sqlite_busytimeout parameter as user configurable param

* Remove debug log

* Fix lint, fix integration test

---------

Co-authored-by: Phillip LeBlanc <phillip@leblanc.tech>

* Use arrow dictionary type for mysql enum type (#119)

* Remove prefix for sqlite busy timeout param (#123)

* Support parsing sqlite busy_timeout durations with units (#124)

* Support retries when writing data to SQLite (#125)

* Implement write retry for DuckDB (#128)

* Preserve records batch order (update datafusion-federation) (#130)

* fix: add ast analyzer for mysql rank (#131)

* fix: Add AST analyzer for rewriting rank() in MySQL

* test: Add new test, fix other SQLite tests

* docs: Clarify what frame clauses are ignored in

* chore: Clippy

* fix: Remove NULLS FIRST/LAST in more MySQL Window functions (#133)

* Update datafusion-federation crate to include `unnest` support (#136)

* Update datafusion-federation to the latest (#137)

* Update datafusion-federation (improve filters pushdown) (#149)

* tests: Ensure all tests run on PRs, try to fix flaky DuckDB test (#150) (#152)

* Implement native schema inference for PostgreSQL (#151)

* wip

* test cleanup

* Implement native schema inference for PostgreSQL

* handle bpchar

* fix uuid

* Fix test

* align

* fix snapshot

* Fix DuckDB error messages (#154)

* SQLite: use projected schema when converting records (#158)

* feat: Enable in-memory federation (#159)

* feat: Enable federation for in-memory tables

* test: Update SQLite test

* test: Update tests

* Always read TimezoneTZ from PostgreSQL as UTC (#161)

* fix: Prevent absolute sequences in file paths (#160)

* fix: Prevent absolute sequences in file paths

* refactor: Make checks more robust

* refactor: Check for symlinks, make errors more robust

* test: Update test

* fix: Use path::absolute instead of canonicalize

* Remove restriction on file being in working directory (#163)

* Include unnecessary columns pruning step during federated plan creation (#162)

* feat: add duckdb checks for unsupported column types (#164)

* feat: Add DuckDB checks for unsupported column types

* deps: Update Cargo.toml

* fix: Make serde non-optional, let InvalidTypeAction be Copy

* fix: More features shenanigans

* fix: DuckDB boolean list support (#169)

* Upgrade to DataFusion 43 (#167)

* Upgrade to DataFusion 43

* Support Utf8View & BinaryView

* Support nested utf8view & binaryview

* Use DuckDB Dialect and update Datafusion patch (#170)

* Free up disk space for integration test (#171)

* Drop postgres containers once test is finished

* Add step to free disk space in integration test job

* update

* Fix `MySQLConnection::get_schema` for uppercase `TableReference` (#166)

* fix MySQLConnection::get_schema for uppercase TableReference

* Update mysqlconn.rs

* PR reviews

* fix clippy

* flatten

* Update mysqlconn.rs

---------

Co-authored-by: Phillip LeBlanc <phillip@leblanc.tech>

* Fix MySQL timestamp conversion when running via `MySQLSQLExec` (#173)

* Set MySQL session default time zone to UTC to match Datafusion (#174)

* Fix MySQL test (#181)

* Increase the waiting time for starting MySQL test container

* Update health check command

* Improve MySQL errors (#180)

* Further improve MySQL error to be concise and specific (#184)

* Further improve MySQL error to be concise and specific

* Update src/sql/db_connection_pool/mysqlpool.rs

Co-authored-by: Scott Lyons <scottalyons@gmail.com>

---------

Co-authored-by: Scott Lyons <scottalyons@gmail.com>

* Update DuckDB Error messages (#182)

* Improve Postgres errors (#183)

* Separate MySQL error source to separate line (#185)

* refactor: Update dbconnection errors (#188)

* Handle invalid data types for Postgres (#191)

* Handle invalid data types for Postgres

* Fix lint issues

* Fix tests

* Fix integration test

* Fix writing to a Postgres table with a schema (#195)

* Fix insert statement when all columns are constraint columns (#196)

* Revert "Fix writing to a Postgres table with a schema (#195)"

This reverts commit afd31a7.

* Support Postgres table with a schema write (#197)

* Allow overriding the default DuckDB dialect (#201)

* Fix DuckDBDialect creation according to DataFusion DuckDBDialect update (#202)

* DuckDB: support for nested types in Lists (Struct, List, FixedSizeList) (#203)

* Fix datafusion federation (#200)

* Update datafusion-federation to fix unnest support (#206)

* Use random id (#205)

* Fix column name rewrite when column alias has same name as table (#207)

* fix: Don't silence disk full errors with SQLite (#208)

* fix: Don't silence disk full in SQLite

* chore: Remove commented out line

* fix: Validate schema for SQLite connections (#209)

* fix: SQLite validate schema to stop panicking

* fix: SQLite does not support dictionary

* chore: Fix clippy

* fix: SQLite does not support Map

* chore: Clippy

* fix: Optional dependencies

* chore: Clippy

* chore: More clippy

* refactor: Move SchemaValidator implementations into DB modules

* Fix dremio subquery unparsing (#210)

* fix: Don't panic on unsupported data insert with Postgres (#211)

* fix: update datafusion-federation with the fix to preserve OFFSET and LIMIT in logical plan (#212)

* fix: update datafusion-federation with the fix to preserve OFFSET and LIMIT in logical plan

* fix: update sql dependency

* fix: revert formatting

* Update `datafusion-federation` to support multi-level table references (#213)

* Update datafusion-federation

* Update federation

* Postgres: add schema validation for record batches during write (#215)

* Fix TableScan filter rewrite & column expressions rewrite (#214)

* Fix TableScan filter rewrite

* update datafusion federation patch

* Federation fix for outer ref columns (#217)

* Fix table_reference (#218)

* Revert "Fix table_reference (#218)"

This reverts commit 3d5336e.

* Revert "Federation fix for outer ref columns (#217)"

This reverts commit 7dc094e.

* Update federation to fix correlated subquery bug

* fix: Use Unparser for expr to sql (#226)

* fix: Use Unparser for expr to sql

* chore: Remove println

* fix: Always cast to BIGINT

* Fix MySQL docker image for PR tests (#225)

* chore: Clippy

* fix: Install SQLite3

---------

Co-authored-by: Phillip LeBlanc <phillip@leblanc.tech>

* fix: Revert using unparser for filter pushdown (#227)

* Update federation commit: Add support for more AST expressions for multi-level rewrites

* fix: Postgres LargeUtf8 is equivalent to Utf8 (#231)

* fix: Postgres LargeUtf8 is equivalent to Utf8

* fix: Normalize both schemas

* fix: Optimize postgres schema loop

* fix: Field Clone/Copy shenanigans

* fix: Test

* fix: Test

* MySQL: include column name when failed to get a row value (#232)

* MySQL: treat MySQL special “zero” date '0000-00-00' as NULL (#233)

* Rename `InvalidTypeAction` to `UnsupportedTypeAction` (#234)

* Handle JSONB as UnsupportedTypeAction::String (#235)

* Fix constraint verification for columns with uppercase letters (#237)

* Upgrade to DuckDB v1.2.0 (#239)

* DuckDB: Use temp table only for append with defined resolution strategy (#242)

* Fix arrow-rs and chrono's quarter() conflict (#244)

* Revert "DuckDB: Use temp table only for append with defined resolution strate…" (#243)

This reverts commit 39be511.

* DuckDB: fix error handling during record batch insertion (#245)

* Bump secrecy version (#248)

* Add memory_limit support for DuckDB (#251)

* Revert "Bump secrecy version (#248)" (#252)

This reverts commit a1f7173.

* cargo lock

* fix merge

* compiling

* Fix lint and test issues

* Fix unit tests

* Fix integration tests

---------

Co-authored-by: yfu <fevin86@gmail.com>
Co-authored-by: peasee <98815791+peasee@users.noreply.github.com>
Co-authored-by: Sevenannn <qianqliu@uw.edu>
Co-authored-by: Sergei Grebnov <sergei.grebnov@gmail.com>
Co-authored-by: Qianqian <130200611+Sevenannn@users.noreply.github.com>
Co-authored-by: Jack Eadie <jack.eadie0@gmail.com>
Co-authored-by: Scott Lyons <scottalyons@gmail.com>
Co-authored-by: Evgenii Khramkov <hey@ewgenius.me>