I don't think it makes sense to run dags in separate threads. This:
Is something that should never happen. If you read our best practices for Dags, it's explicitly wrong to run time-consuming operations while a Dag is being parsed. Don't. You can do long-running operations inside the "execute" methods of your tasks, yes, but not when the Dags are parsed to establish their structure and to let them be serialized for scheduling.

Also, threads in Python most likely do something different than you think they do. At least until Python 3.14 there is a GIL (global interpreter lock), and threads do not help with running Python code in parallel (they run concurrently but not in parallel). They only help when you have I/O operations and several threads waiting on them (which again is a strict "no-no" when it comes to best practices of Dag parsing). So I see no particular reason why you would want to run Dag parsing in separate threads, as it gives you nothing if you follow good practice.

Every Dag file is already parsed in a separate process to achieve parallelism, so if you want parallelism, make several Dag files, import the common code from them, and you are done. There is absolutely no need (and again, it makes no sense IMHO) to run Dag parsing in separate threads.
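To make the parse-time versus run-time split concrete, here is a minimal plain-Python sketch. It does not use Airflow at all: `ExampleTask` and `slow_lookup` are hypothetical stand-ins, chosen only to show that building the structure stays cheap while the slow work waits until `execute` is called.

```python
calls = []  # records every time the slow operation actually runs


def slow_lookup():
    """Hypothetical stand-in for a slow operation (database query, HTTP call)."""
    calls.append("slow_lookup")
    return ["a", "b", "c"]


class ExampleTask:
    """Hypothetical minimal task class; NOT an Airflow operator."""

    def __init__(self, task_id):
        # Cheap: only records structure, so it is safe at parse time.
        self.task_id = task_id

    def execute(self, context=None):
        # The slow work lives here and runs only when the task runs.
        return slow_lookup()


# "Parse time": defining the structure must not trigger the slow call.
task = ExampleTask("load_items")
parse_time_calls = list(calls)

# "Run time": only now is the slow operation invoked.
result = task.execute()
```

The anti-pattern would be calling `slow_lookup()` at module level (for example to decide how many tasks to create), because that call would then run on every parse of the file.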
If you happen to have multiple threads that are creating DAGs via `with DAG(...) as dag:`, but don't use the corresponding `dag` variable, and instead implicitly rely on the `DagContext`/`get_current_dag()`, you'll run into issues because the current DAG is global to the process, but is modified by each thread. Here's a pseudo-y example case of what I mean:
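As an illustration under assumptions (not Airflow code), the sketch below reproduces the kind of race described, using a hypothetical `FakeDagContext` class as a stand-in for a process-global "current DAG" holder. Events force the interleaving so the outcome is deterministic: thread A enters its block, thread B enters its own, and then A looks up the "current" DAG implicitly.

```python
import threading


class FakeDagContext:
    """Hypothetical stand-in for a process-global "current DAG" holder."""

    _stack = []  # one list shared by every thread in the process

    @classmethod
    def push(cls, dag_id):
        cls._stack.append(dag_id)

    @classmethod
    def current(cls):
        return cls._stack[-1] if cls._stack else None


seen = {}
a_pushed = threading.Event()
b_pushed = threading.Event()


def build_a():
    FakeDagContext.push("dag_a")   # like entering `with DAG("dag_a")`
    a_pushed.set()
    b_pushed.wait()                # thread B enters its own block meanwhile
    # Implicit lookup instead of using the `dag` variable directly.
    seen["a"] = FakeDagContext.current()


def build_b():
    a_pushed.wait()
    FakeDagContext.push("dag_b")   # like entering `with DAG("dag_b")`
    b_pushed.set()
    seen["b"] = FakeDagContext.current()


threads = [threading.Thread(target=build_a), threading.Thread(target=build_b)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After both threads finish, `seen["a"]` holds `"dag_b"`: thread A attached its work to the wrong DAG because the shared state was overwritten underneath it.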
The best solution is probably to use the `dag` object and pass it down as needed instead of relying on the current DAG.
Though I'm kind of curious: is there a reason that the DagContext can't be thread-local? Would making that object's storage thread-local allow the original pattern to work? (Could it break other things?)
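For what the thread-local idea would look like in isolation, here is a minimal sketch built on Python's `threading.local`. The `push_dag` and `current_dag` helpers are hypothetical and unrelated to Airflow's real `DagContext`; the sketch only shows that per-thread storage keeps each thread's "current DAG" separate, and says nothing about what such a change might break inside Airflow.

```python
import threading

_local = threading.local()  # each thread gets its own attribute namespace


def _stack():
    # Lazily create a separate stack for the calling thread.
    if not hasattr(_local, "stack"):
        _local.stack = []
    return _local.stack


def push_dag(dag_id):
    _stack().append(dag_id)


def current_dag():
    stack = _stack()
    return stack[-1] if stack else None


seen_local = {}
first_pushed = threading.Event()
second_pushed = threading.Event()


def worker_a():
    push_dag("dag_a")
    first_pushed.set()
    second_pushed.wait()           # the other thread pushes in between
    seen_local["a"] = current_dag()


def worker_b():
    first_pushed.wait()
    push_dag("dag_b")
    second_pushed.set()
    seen_local["b"] = current_dag()


workers = [threading.Thread(target=worker_a), threading.Thread(target=worker_b)]
for t in workers:
    t.start()
for t in workers:
    t.join()

main_thread_view = current_dag()  # the main thread never pushed anything
```

With the same forced interleaving that breaks a shared stack, each worker here still sees its own DAG id, and the main thread sees none.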