Dynamically load Dagster project modules and components #29247
milicevica23
started this conversation in
Ideas
Replies: 1 comment 3 replies
-
Is the primary objective here to be able to use a single template for all the different project variants? Or do you need more flexibility than that?
-
TLDR: Dagster project loading should provide a mechanism to dynamically load only the Python modules where the needed Dagster definitions live, and to ignore the remaining components/modules.
Let me first explain the general idea of what we do right now. Then I will describe some problems and, at the end, one possible solution.
We are a platform team and want to make a template project for our company, which is then used by different teams across companies. We use cruft (https://cruft.github.io/cruft/) for templating, and the core of the project is Dagster, together with other parts that are important for the general infrastructure around the project.
You can see more about why we do this here: https://www.youtube.com/watch?v=GX508If6ol0&t=2625s
There is also more detail on what one project contains here: https://www.youtube.com/watch?v=u5igKtEiKr8&t=24s
In the beginning, we were building a template project for one type of user group: data warehousing. Over time, data scientists joined, and the template project started to include a lot of stuff that was relevant to one user group but unrelated to the other.
Imagine one project with the following structure:

Over time, the user groups started to diverge in their needs, and the template project could no longer keep up with or fit the project instances. It became hard to update a project instance, as in the following picture.
Data scientists started to delete the dbt project because it was overhead for them: it added dependencies and dbt compilation, and loading the Dagster project started to take longer than needed.
This is not ideal, because they may need this piece of the code in the future, but right now they see it as overhead. The question is how to solve this.
My thoughts:
Ideally, there would be some way to declare that I want to load the custom and bi modules, and Dagster would dynamically load only those modules under the hood.
You can imagine something similar to
pip install "fastapi[standard]"
where you don't install/start everything that FastAPI has to offer, but rather only the pieces of code that you really need.
In our case we use the pixi package manager together with its multi-environment feature, where we can define the set of dependencies needed for each feature: https://pixi.sh/dev/workspace/multi_environment/
I could imagine the recursive loading functions having a parameter that defines which submodules of the defs folder can be loaded in the code location.
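As a rough illustration of that parameter, here is a minimal standard-library sketch, not an existing Dagster API. The function name `load_selected_modules` and its `include` parameter are hypothetical. It imports only the listed top-level submodules of a defs package, plus everything nested below them, and never touches the rest.

```python
import importlib
import pkgutil
from types import ModuleType
from typing import Iterable


def load_selected_modules(package_name: str, include: Iterable[str]) -> list[ModuleType]:
    """Import only the submodules of `package_name` listed in `include`.

    Hypothetical helper: submodules that are not listed are never imported,
    so their dependencies do not have to be installed.
    """
    package = importlib.import_module(package_name)
    wanted = set(include)

    # iter_modules lists the direct submodules without importing them.
    available = {info.name for info in pkgutil.iter_modules(package.__path__)}
    unknown = wanted - available
    if unknown:
        raise ValueError(f"Unknown submodules of {package_name!r}: {sorted(unknown)}")

    loaded: list[ModuleType] = []
    for name in sorted(wanted):
        module = importlib.import_module(f"{package_name}.{name}")
        loaded.append(module)
        # Recurse into the selected subpackage; plain modules have no __path__.
        nested = pkgutil.walk_packages(
            getattr(module, "__path__", []), prefix=f"{module.__name__}."
        )
        for info in nested:
            loaded.append(importlib.import_module(info.name))
    return loaded
```

The returned modules could then be handed to an existing Dagster helper such as `load_assets_from_modules`. A built-in equivalent of the `include` filter is the thing being proposed here, so it is shown only as an assumption.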
The pixi environment definition file would look something like
and definitions.py something like
The idea is to dynamically load only the modules that hold the needed definitions, and thereby make project usage simpler and more modular.
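One way the environment and definitions.py could be connected, sketched under assumptions: each environment exports a comma-separated list of module names, and definitions.py reads it to decide what to load. The variable name `ENABLED_DEFS` and the module name `dbt` are hypothetical; only `custom` and `bi` come from the description above.

```python
import os
from typing import Mapping, Sequence

# Hypothetical variable name; each environment would export its own value.
ENV_VAR = "ENABLED_DEFS"

# Illustrative module names; "dbt" is an assumed name for the dbt part.
ALL_MODULES = ("custom", "bi", "dbt")


def enabled_modules(environ: Mapping[str, str], default: Sequence[str]) -> list[str]:
    """Return the module names selected via ENV_VAR, or `default` if unset or empty."""
    raw = environ.get(ENV_VAR, "")
    names = [part.strip() for part in raw.split(",") if part.strip()]
    return names or list(default)


if __name__ == "__main__":
    # With ENABLED_DEFS="custom,bi" this prints only those two names.
    print(enabled_modules(os.environ, default=ALL_MODULES))
```

A data science environment could then set `ENABLED_DEFS=custom,bi` and skip the dbt part entirely, while leaving the variable unset keeps the current behaviour of loading everything.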