How can I debug disappearing tasks? #5291
Unanswered
jorendorff asked this question in Q&A
Replies: 1 comment
What we actually did was add timeouts. You can wrap any specific chunk of the work in a timeout and log whether or not it lapses; this lets you bisect your code to find where the problem is happening. But in our case, as soon as we thought of this technique, we realized that we were making some HTTP requests without timeouts. Adding them fixed the behavior.
I have some fairly complicated code that uses the (very beta) Rust Azure SDK to access Azure blob storage. We have Tokio blocking tasks using Rayon, Rayon threads submitting async tasks to Tokio, a hand-written semaphore to prevent too many requests from running at once -- it's a lot.
Recently we started seeing some of the tasks go silent and never come back, apparently while trying to talk to Azure. It happens once every few days, and it tends to happen on multiple hosts at once. The trouble could be in our code, the Azure SDK, Hyper, Tokio, our host VMs, Azure itself... We're really a bit baffled. We don't know how to narrow down the list of suspects.
When we connect with tokio-console, we see a list of tasks, but if the key is in here, it's not jumping out at us:
I gather the warning about tasks having lost their waker is often spurious.
Certainly the azure tasks are not being polled frequently (we didn't see the "Polls" column tick up).
Any ideas? What would you do to attack this problem?