Description
(Sections correspond to sections in the blog post.)
The dev-prod dilemma
- The Kedro-Datasets link is very old. Can we just point to stable or something?
- Update pandas API on Spark link: https://spark.apache.org/pandas-on-spark/
- May need to use an archive for BigQuery DataFrames link: https://web.archive.org/web/20250512141326/https://voltrondata.com/news/google-bigframes-ibis (or point to something else like https://cloud.google.com/bigquery/docs/reference/bigquery-dataframes, which was used here with a footnote: https://github.com/kedro-org/kedro-devrel/blob/main/blog_post_collaboration/kedro_ibis.md#the-dev-prod-dilemma)
- Nit: I find this a bit weird in brackets:
[Although that begs the question of how standardised is the SQL standard? Gil Forsyth's PyData NYC 2022 talk demonstrates challenges arising from differences between SQL dialects. Even the dbt-labs/jaffle_shop GitHub repository README disclaims, "If this steps fails, it might mean that you need to make small changes to the SQL in the models folder to adjust for the flavor of SQL of your target database"].
I think slightly better phrasing might be, as a separate paragraph in parentheses:
(Although that begs the question: how standardised is the SQL standard? Gil Forsyth's PyData NYC 2022 talk demonstrates challenges arising from differences between SQL dialects. Even the dbt-labs/jaffle_shop GitHub repository README disclaims, "If this steps fails, it might mean that you need to make small changes to the SQL in the models folder to adjust for the flavor of SQL of your target database.")
Creating a custom ibis.Table dataset
Can we put a callout here? Or however it works, basically we can replace
with
Note: Since this article was originally published, Ibis datasets have been contributed to the official Kedro-Datasets repository. Find the complete dataset implementations on GitHub.
Configuring backends with the OmegaConfigLoader using variable interpolation
I guess we can update the examples to simply replace jaffle_shop.datasets.ibis.TableDataset with ibis.TableDataset
Building pipelines
I've significantly simplified the type hints by:
- Removing from __future__ import annotations (Python 3.8 is EOL)
- Removing the TYPE_CHECKING import and TYPE_CHECKING block
- Replacing ir.Table with ibis.Table
Try it yourself
Nit: Change kedro viz run to kedro viz