Delta Tables and Parquet Files #2882
hpatner
started this conversation in
Integrations
Replies: 1 comment
-
👋 @hpatner I think it should now be possible to work with Delta Lake tables using the ClickHouse `deltaLake` table function (https://clickhouse.com/docs/en/sql-reference/table-functions/deltalake) in ClickHouse cells (livebook-dev/kino_db#83). Alternatively, DuckDB with the Delta extension (https://duckdb.org/2024/06/10/delta.html) might also work. Or a wrapper around https://github.com/delta-io/delta-kernel-rs could be used to get the data directly into Explorer quickly.
-
It would be cool if Livebook had the ability to work with Parquet files through the Delta table format. It takes CSV files, converts them to Parquet files, and includes a JSON log. The JSON log records every operation, which allows versioning and rollbacks (time travel). Together, you get ACID transactions, metadata handling, and a great base for working with large datasets. Databricks and Microsoft Fabric have built implementations of this, but the format is open source and seems like a nice fit for Livebook: storage could be local or on an S3 instance, with no need for a database engine for compute.
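To make the time-travel idea concrete, here is a rough sketch in Python of how replaying a JSON log up to a given version can tell you which Parquet files make up the table at that point. It assumes each commit is newline-delimited JSON with `add` and `remove` actions that carry a file `path`, which is how the Delta transaction log records file changes. The function name `live_files`, the in-memory `commits` list, and the file names are made up for illustration, and a real reader would also handle checkpoints, metadata, and protocol actions.

```python
import json


def live_files(commits, version=None):
    """Replay commits 0..version and return the set of data files in the table.

    `commits` is a list of strings, one per commit, each holding
    newline-delimited JSON actions (hypothetical in-memory stand-in for
    the files under _delta_log/).
    """
    files = set()
    for v, commit in enumerate(commits):
        if version is not None and v > version:
            break  # stop replaying once the requested version is reached
        for line in commit.splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                files.add(action["add"]["path"])
            elif "remove" in action:
                files.discard(action["remove"]["path"])
            # other action types (metadata, commit info, ...) are ignored here
    return files


# Hypothetical three-commit log: two appends, then a rewrite of the first file.
commits = [
    '{"add": {"path": "part-0000.parquet"}}',
    '{"add": {"path": "part-0001.parquet"}}',
    '{"remove": {"path": "part-0000.parquet"}}\n'
    '{"add": {"path": "part-0002.parquet"}}',
]

latest = live_files(commits)               # table as of the newest commit
as_of_v1 = live_files(commits, version=1)  # "time travel" to version 1
```

Since old Parquet files are only marked as removed in the log rather than edited in place, reading an earlier version is just a matter of stopping the replay sooner.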