Brand new user, still not sure if I'm conceptualizing things correctly. #28665

Evan0000000000 · 2025-03-21T18:45:29Z

Hi all,
I suspect this is going to sound like I'm using Dagster 'wrong' to some number of people, but bear with me as I'm hoping to accomplish some essential tasks first before moving on to ETL and more strongly data related items. This is for work so I don't have much flexibility and am trying to make the best of the tools I have available, of which Dagster is one.

Despite being new, I'm playing around with Ops, as the things I have in mind don't seem to fit the model of an Asset (i.e., they embody things I want to happen, not bodies of data).

An example might be keeping a directory of log files pruned to some size and recency requirements as a precursor to later work to analyze them.

I think I would set something like that up along these lines:

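from dagster import OpExecutionContext, RunConfig, job, op

# CustomConfig is assumed to be a dagster.Config subclass defined elsewhere.

@op
def first_step(context: OpExecutionContext, config: CustomConfig) -> list[str]:
    # Scan a path and gather files that match some criteria
    ...

@op
def second_step(context: OpExecutionContext, config: CustomConfig, files: list[str]) -> None:
    # Take the output from first_step, compress those files and delete the originals
    ...

# Keys under ops must match the op names, and the values are config instances
# (the constructor arguments are omitted here).
@job(config=RunConfig(ops={'first_step': CustomConfig(),
                           'second_step': CustomConfig()}))
def a_job():
    files_to_prune = first_step()
    second_step(files_to_prune)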
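
The two op bodies are left elided above. As a rough illustration of what they might contain, here is a standard-library sketch; the function names, the `*.log` pattern, the age-based criterion, and the gzip format are all assumptions, not details from the post.

```python
import gzip
import shutil
import time
from pathlib import Path


def find_stale_logs(directory: Path, max_age_seconds: float) -> list[str]:
    """Return paths of *.log files last modified more than max_age_seconds ago."""
    cutoff = time.time() - max_age_seconds
    return sorted(
        str(p)
        for p in directory.glob("*.log")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def compress_and_remove(paths: list[str]) -> list[str]:
    """Gzip each file next to the original, then delete the original."""
    archives = []
    for name in paths:
        source = Path(name)
        target = source.with_name(source.name + ".gz")
        with source.open("rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        source.unlink()
        archives.append(str(target))
    return archives
```

Each function could be called from the body of the corresponding op, which keeps the file handling testable without Dagster.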

(And then, down the line, I guess you might make a sensor for each file you're interested in, and when that sensor goes off, run a job to refresh a corresponding asset and then prune the file.)

I think the big thing tripping me up at this initial stage is that my instinct as a Python dev is to want to put more intermediate logic in the job function: logging, converting an attribute of the config object into a more helpful Python type, stuff like that. I'm not clear on whether that's allowed, allowed-but-not-ideal, or something else, as I think I understand correctly that the job function is really just to define relationships between ops and the configuration and other bits that go into them.

I also saw somewhere an example that created a graph that did have some intermediate operations and the graph was then referenced in a job, so is that the way to handle needing to do 'stuff' in between ops?

garethbrickman · 2025-03-21T21:13:30Z

> the job function is really just to define relationships between ops and the configuration and other bits that go into them.

That's correct, it should just be used as a wrapper for calling ops/graphs.

> I also saw somewhere an example that created a graph that did have some intermediate operations and the graph was then referenced in a job

You may be thinking of op graphs.

> I'm playing around with Ops as the things I have in mind don't seem to fit into the model of an Asset (i.e., they embody things I want to happen, not bodies of data)

You shouldn't feel constrained by this conception: ultimately both @ops and @assets are just wrappers for executing Python logic.
Ops are precursors to assets, but under the hood assets are actually just fundamentally ops themselves!

The benefit of using assets comes from their richer metadata and from special features, like Declarative Automation, that ops can't use.

So my advice would be: unless you run into a use case that absolutely requires complex op functionality like Dynamic Graphs, you're better off building with assets from the outset.

P.S. you can even build an asset out of a graph of ops!

Evan0000000000 Mar 24, 2025
Author

Hey Gareth, thanks for the insight. My impression from reading the documentation matches what you're saying about assets having more interesting functionality, so I'd be happy to give them a shot instead.

My worry at the start was that defying the concepts might result in my goals not fitting well into the asset abstraction but it sounds like that's not likely to be the case.
