[FR] Consider dial protocol tracing #411

@tallclair

It would be cool to have tracing on the initial dial request, but I think this would require a protocol-level enhancement to DialResponse (grpc).

The initial flow (happy path) looks like:

client -> server -> agent -[ endpoint ]- agent -> server -> client

It would be great to have latency information for each hop. In other words, something like:

  1. Server records dial_req received timestamp (and server ID?)
  2. Agent records dial_req received timestamp (and agent ID?)
  3. Agent records endpoint dial complete timestamp
  4. Agent includes traces in the DialResponse
  5. Server records dial_resp received timestamp
  6. Server adds request & response received timestamps to DialResponse
  7. Client constructs the full trace, records latency metrics for each hop, logs full trace at high verbosity

It looks like there's an OpenCensus gRPC integration that we should investigate, but I'm not sure whether it would work with our multiplexed streams. At the very least, we should make sure our design fits the OpenCensus tracing spec.

Labels: lifecycle/frozen (indicates that an issue or PR should not be auto-closed due to staleness)