Skip to content

Add IORuntime#liveFiberSnapshot #4444

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jul 23, 2025

Conversation

iRevive
Copy link
Contributor

@iRevive iRevive commented Jul 12, 2025

Fixes #2580, #3025. The implementation is based on #3038.

@iRevive iRevive requested review from djspiewak and armanbilge July 12, 2025 08:17
@iRevive iRevive force-pushed the feature/live-fiber-snapshot branch from 5ff02a2 to 9b4c259 Compare July 12, 2025 08:29
/**
* Mapping of worker threads to their currently active fibers.
*/
def workers: Map[WorkerInfo, List[FiberInfo]]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

workers will always be empty in Scala.js. Should we add FiberSnapshotPlatform and provide a platform-specific API?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah feels like a platform specific API would be better here unless it creates extensive contortions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

/**
* The list of all global (non-worker-local) fibers.
*/
def global: List[FiberInfo]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

global name feels odd. Should we use external or foreign instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

External feels best since we use that term a lot for such things

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@iRevive iRevive force-pushed the feature/live-fiber-snapshot branch from 9b4c259 to 757a237 Compare July 12, 2025 08:49
Copy link
Member

@djspiewak djspiewak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is very good. A couple high level comments

/**
* The underlying fiber.
*/
def fiber: Fiber[IO, ?, ?]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know, this opens up some really weird possible patterns, like taking a live snapshot and then canceling random fibers. Technically that's valid but it might break some assumptions (e.g. in combinators which use local fibers as an implementation detail). Do we need to expose the actual Fiber instance in this way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need fiber in exactly one place:

def pretty: String = {
val id = System.identityHashCode(fiber).toHexString
val prefixedTrace = if (trace.toList.isEmpty) "" else "\n" + Tracing.prettyPrint(trace)
s"cats.effect.IOFiber@$id $state$prefixedTrace"
}

It might be helpful to include an identifier to indicate which specific fiber this FiberInfo refers to.
Perhaps we can replace fiber with a fiberId: String instead? That should be sufficient.

/**
* The list of all global (non-worker-local) fibers.
*/
def global: List[FiberInfo]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

External feels best since we use that term a lot for such things

/**
* Mapping of worker threads to their currently active fibers.
*/
def workers: Map[WorkerInfo, List[FiberInfo]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah feels like a platform specific API would be better here unless it creates extensive contortions

/**
* The underlying thread representing the worker.
*/
def thread: Thread
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this API (which is very fair) makes a decent symmetrical case for producing the Fiber in FiberInfo (as you've done). People are just going to have to be aware that this whole thing is a pretty serious footgun.

Copy link
Contributor Author

@iRevive iRevive Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same story here.

It is used in exactly one place:

val workersStatuses = snapshot.workers.map {
case (worker, fibers) =>
val yielding = fibers.collect {
case fiber: FiberInfo if fiber.state == FiberInfo.State.Yielding => fiber
}.size
printFibers(fibers)(print)
s"${worker.thread} (#${worker.index}): $yielding enqueued"
}

Maybe we should use threadInfo: String instead, which can use Thread#toString under the hood? Sufficient enough for debugging and not enough to break something.

* } yield ()
* }}}
*/
def liveFiberSnapshot(): FiberSnapshot =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to add some documentation to justify why we aren't wrapping this in IO. In particular, the fact that we expect it to be needed from outside an IO context (for debugging and monitoring) makes it necessary to keep it unwrapped. We could hypothetically do SyncIO here instead if we want to be purists, but given that this is in the unsafe package I don't think it's strictly needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a few more bits:

   * This method is intended primarily for external tools, such as debuggers, monitoring agents,
   * or test frameworks. As such, it is '''not''' wrapped in [[cats.effect.IO]] or
   * [[cats.effect.SyncIO]] since this API is expected to be used outside the effect context.
   * Requiring an `IO` here would make integration with external tools more cumbersome or even
   * impossible.
   *
   * ....
   *
   * @note
   *   this method is part of the `unsafe` package and should be used with care. Its primary
   *   purpose is for runtime introspection, not for application logic.

@djspiewak
Copy link
Member

Thought about it more, and I kinda like having the threads and fibers. It's analogous to the APIs the JVM exposes on Thread itself.

@djspiewak djspiewak merged commit d2e396a into typelevel:series/3.x Jul 23, 2025
36 of 38 checks passed
@iRevive iRevive deleted the feature/live-fiber-snapshot branch July 23, 2025 15:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide within-process "public" API for fiber dumps
2 participants