Sanity check on Dagster use case #31172

nicpottier · 2025-07-11T22:06:58Z

I feel like I might be trying to put a square peg in a round hole here, so I want to double-check whether using Dagster for this particular use case makes any sense.

We are building a pipeline to turn PDFs into websites. That involves various steps: image extraction, text extraction, translation, upscaling, filtering, etc. At a high level, a lot of these things feel like software-defined assets, and the Dagster model of dependencies and lineage makes a lot of sense. I've put together a prototype and I rather like the structure Dagster is forcing us into; it feels maintainable and easy to reason about.

Except that we are building this as a tool that can run on a variety of user-defined PDFs, and that's where things start to feel wonky. I can make the PDF path and output directory part of a config, and that's natural enough, but it gets weird with IOManagers. I can't figure out how to make an IOManager cleanly write its output to a different directory per "book". It doesn't cleanly fit into partitions because the books are defined by config, and I can't access the config from the IOManager. It also feels like I need to have these outputs segmented somehow; otherwise I could be overwriting other books' assets when I rerun things.
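To make the layout I'm after concrete, here is a rough sketch in plain Python, outside of Dagster entirely: each book gets its own subdirectory keyed by a book id, so rerunning one book can't touch another book's files. The function names, the id scheme, and the example paths are all made up for illustration and are not Dagster APIs. How the book id would actually reach an IOManager is exactly the part I haven't figured out.

```python
import re
from pathlib import Path


def book_id_from_pdf(pdf_path: str) -> str:
    """Derive a filesystem-safe id from the PDF file name (hypothetical scheme)."""
    stem = Path(pdf_path).stem.lower()
    # Collapse every run of non-alphanumeric characters into a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", stem).strip("-")
    if not slug:
        raise ValueError(f"cannot derive a book id from {pdf_path!r}")
    return slug


def asset_output_path(base_dir: str, book_id: str, asset_name: str) -> Path:
    """Return <base_dir>/<book_id>/<asset_name>: one subdirectory per book."""
    return Path(base_dir) / book_id / asset_name


if __name__ == "__main__":
    # Hypothetical inputs; two different PDFs land in two different directories.
    for pdf in ["/data/in/Moby Dick.pdf", "/data/in/walden.pdf"]:
        book_id = book_id_from_pdf(pdf)
        print(asset_output_path("/data/out", book_id, "extracted_text"))
```

The point of the sketch is only the directory shape: the asset name stays the same across books, and the book id is the one thing that varies.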

I'm very much at the "I don't know what I don't know" stage of Dagster, but I'm starting to wonder whether this is the right fit. I realize Dagster is very much made for more typical ETL workflows, which are very different, so perhaps I'm off the mark to use it at all.

For those who know Dagster better: is this a fool's errand, and should I be looking elsewhere?

Replies: 0 comments
