Description
Copied from the original discussion in #317 (@dbmikus).
Make child workflows create OpenTelemetry child spans so that you can track execution across sub-workflows.
Testing:
- Tested by running against a local workflow that creates child workflows
Issues to fix:
`Queue("...").enqueue_async` workflows do not create child spans
When I create child workflows like so:
    from dbos import DBOS, Queue

    myqueue = Queue("myqueue", concurrency=25)

    @DBOS.workflow()
    async def wf():
        # Enqueue the child workflow from inside the parent workflow
        await myqueue.enqueue_async(sub_wf)

    @DBOS.workflow()
    async def sub_wf():
        pass
the `sub_wf` workflows show up as new traces.
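For illustration, here is a minimal sketch of the behavior I would expect: the parent's trace context travels with the enqueued workflow, and the child starts its span inside the same trace. It uses only the standard library and the W3C `traceparent` header format rather than the OpenTelemetry SDK. `SpanContext`, `inject`, and `start_child_span` are hypothetical names for this sketch, not DBOS or OTel APIs, and where DBOS would actually store the header is an assumption.

```python
import re
import secrets
from dataclasses import dataclass
from typing import Optional

# W3C Trace Context header layout: version-traceid-parentid-flags
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


@dataclass(frozen=True)
class SpanContext:
    """Hypothetical stand-in for a span's identifying context."""
    trace_id: str                         # 32 hex characters
    span_id: str                          # 16 hex characters
    parent_span_id: Optional[str] = None  # None for a root span


def new_root_span() -> SpanContext:
    """Start a brand-new trace (what the child workflows do today)."""
    return SpanContext(secrets.token_hex(16), secrets.token_hex(8))


def inject(ctx: SpanContext) -> str:
    """Serialize the parent's context at enqueue time."""
    return f"00-{ctx.trace_id}-{ctx.span_id}-01"


def start_child_span(traceparent: Optional[str]) -> SpanContext:
    """Start the child workflow's span when it is dequeued."""
    match = _TRACEPARENT.match(traceparent or "")
    if match is None:
        # No usable parent context: fall back to a new trace.
        return new_root_span()
    trace_id, parent_span_id, _flags = match.groups()
    # Same trace ID, fresh span ID, parent pointer set to the enqueuing span.
    return SpanContext(trace_id, secrets.token_hex(8), parent_span_id)


parent = new_root_span()
header = inject(parent)           # assumed to be stored with the enqueued workflow
child = start_child_span(header)  # lands in the parent's trace
orphan = start_child_span(None)   # current behavior: a separate trace
```

With propagation in place, `child` shares `parent`'s trace ID and points at the parent span, while `orphan` shows the behavior described above.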
It would be useful to use standard OTel tooling for tracking workflows in DBOS, if possible.
I understand that there are pain points with OTel and very long-running traces. I've previously put traces on Kafka and SQS, but those were consumed relatively quickly. TBH, I'm not sure of the ramifications of having a trace that can exist for days. There might be no problems, or it might break OTel collection. There are ways to link two traces together, which might alleviate long-lived trace problems.
Another, simpler solution is to keep child workflows within the same trace as the parent as long as they are not executed on a different executor.
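The two options above could be combined. A rough sketch, assuming the executor IDs of both workflows are known when the child's span is created: use a normal parent/child span on the same executor, and otherwise start a new trace that carries a link back to the parent span. `Span` and `start_child_workflow_span` are hypothetical names for this sketch, and the executor IDs are made-up strings. In the real OpenTelemetry API, links are passed when the span is created.

```python
import secrets
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Span:
    """Hypothetical, simplified span record for this sketch."""
    name: str
    trace_id: str
    span_id: str
    parent_span_id: Optional[str] = None
    # Each link is (trace_id, span_id) of a related span in another trace.
    links: List[Tuple[str, str]] = field(default_factory=list)


def start_root_span(name: str) -> Span:
    return Span(name, secrets.token_hex(16), secrets.token_hex(8))


def start_child_workflow_span(
    name: str, parent: Span, parent_executor: str, child_executor: str
) -> Span:
    if parent_executor == child_executor:
        # Same executor: ordinary child span inside the parent's trace.
        return Span(
            name, parent.trace_id, secrets.token_hex(8),
            parent_span_id=parent.span_id,
        )
    # Different executor: new trace, linked back to the enqueuing span.
    linked = start_root_span(name)
    linked.links.append((parent.trace_id, parent.span_id))
    return linked


parent = start_root_span("wf")
same = start_child_workflow_span("sub_wf", parent, "executor-a", "executor-a")
remote = start_child_workflow_span("sub_wf", parent, "executor-a", "executor-b")
```

Here `same` stays in the parent's trace, while `remote` gets its own trace but can still be navigated to from the parent via the link, which may sidestep the long-lived trace concern.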
For context, we use OpenTelemetry for observability and sometimes data collection of our LLMs. We record some function inputs/outputs on OTel spans and also record log messages in the spans. Being able to debug the LLM flow across spans is very helpful, and other LLM ops products support ingesting OTel traces.