Description
- June 3rd: Snowflake announces Polaris Catalog, a catalog for Apache Iceberg that it says will be open sourced "in the next 90 days" (announcement)
- June 4th: Databricks announces it has agreed to acquire Tabular (announcement)
- June 13th: Databricks announces the open sourcing of Unity Catalog https://github.com/unitycatalog/unitycatalog (announcement)
At the moment, each platform offers full read and write capabilities only for its own catalog, and read-only capabilities for its competitors' catalogs (source).
More importantly, data catalogs aren't new, but we're now seeing catalogs created for different use cases and business needs: technical, business, and operational (source).
These are just some of the open source ones [1] that have been in the news recently. There are also Apache Nessie, the Hive Metastore, the Iceberg REST Catalog, and probably others I'm missing. Then there are the commercial, vendor-driven ones.
And then we have... the Kedro Catalog!
We sometimes get questions like "how does the Kedro Catalog compare to the Unity Catalog?" The answer is that they're complementary, but this is not immediately clear to users (see kedro-org/kedro-plugins#542).
This is clearly going to be a hot topic in the data engineering space in the coming months, so we should have a good answer to how Kedro interacts with all of these catalogs.
Footnotes
1. Counting Polaris as open source.