[rush] Differentiate remote and local execution in telemetry. #4755

Merged
merged 6 commits into from
Oct 29, 2024

Conversation

aramissennyeydd
Contributor

Summary

Fixes #4737. My goal is to address the data skew questions before we go ahead with #4680, which just adjusts the data skew.

Details

There is currently no good way to determine whether telemetry for an operation was generated on the current machine or on a remote machine. This is likely to cause data skew, depending on how you ingest the Rush telemetry. Either:

  1. You take the duration from nonCachedDurationMs, which produces multiple events with the same duration (+/- a few milliseconds) if you emit events from each cobuild agent. That distorts averages and other aggregates of your data.
  2. You calculate the duration from startTimestampMs and endTimestampMs, which produces a massive spread in the durations collected across your agents, as all agents but the primary report 0.05s and the primary agent reports 15.00s. That also distorts averages and other aggregates.
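To make the two failure modes concrete, here is a minimal sketch of the arithmetic, assuming one operation observed by three cobuild agents. The `OperationTiming` shape and the `mean` helper are hypothetical; only the three field names and the 0.05s / 15.00s figures come from the description above.

```typescript
// Hypothetical record shape: only the three field names come from this PR description.
interface OperationTiming {
  startTimestampMs: number;
  endTimestampMs: number;
  nonCachedDurationMs: number;
}

function mean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Assumed scenario: the primary agent ran the operation for 15.00s,
// while the two other agents finished it in 0.05s each.
const events: OperationTiming[] = [
  { startTimestampMs: 0, endTimestampMs: 15000, nonCachedDurationMs: 15000 },
  { startTimestampMs: 0, endTimestampMs: 50, nonCachedDurationMs: 15000 },
  { startTimestampMs: 0, endTimestampMs: 50, nonCachedDurationMs: 15000 },
];

// Case 1: summing nonCachedDurationMs counts the same 15s of work three times.
const totalNonCachedMs = events
  .map((e) => e.nonCachedDurationMs)
  .reduce((sum, v) => sum + v, 0); // 45000

// Case 2: averaging wall-clock durations mixes 15000ms with two 50ms samples.
const meanWallClockMs = mean(
  events.map((e) => e.endTimestampMs - e.startTimestampMs)
); // about 5033ms, far from the real 15000ms

console.log({ totalNonCachedMs, meanWallClockMs });
```

Neither number describes the operation: the first triples the real work, and the second reports roughly a third of the real duration.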

I propose a new wasExecutedOnThisMachine flag that monorepo maintainers can then use in their plugins to decide whether or not they want to process the given operation's data.
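As a rough sketch of how a plugin might consume the flag, the filter below keeps only operations that ran on the current machine. The `OperationTelemetry` interface and `selectLocalOperations` function are hypothetical stand-ins written for this example, not Rush's actual types; only the `wasExecutedOnThisMachine` and `nonCachedDurationMs` names come from this PR.

```typescript
// Hypothetical stand-in for a per-operation telemetry record.
interface OperationTelemetry {
  nonCachedDurationMs?: number;
  // Proposed in this PR: true when the operation ran on the reporting machine.
  wasExecutedOnThisMachine?: boolean;
}

// Keep only operations that were executed locally.
// Assumption: records without the flag are dropped rather than treated as local.
function selectLocalOperations(
  operationResults: Record<string, OperationTelemetry>
): Record<string, OperationTelemetry> {
  const local: Record<string, OperationTelemetry> = {};
  for (const name of Object.keys(operationResults)) {
    const operation = operationResults[name];
    if (operation.wasExecutedOnThisMachine === true) {
      local[name] = operation;
    }
  }
  return local;
}

// Illustrative data only; the operation names are made up.
const sample: Record<string, OperationTelemetry> = {
  "build (a)": { nonCachedDurationMs: 15000, wasExecutedOnThisMachine: true },
  "build (b)": { nonCachedDurationMs: 15000, wasExecutedOnThisMachine: false },
  "build (c)": { nonCachedDurationMs: 900 },
};

console.log(Object.keys(selectLocalOperations(sample))); // ["build (a)"]
```

Whether a missing flag should count as local is a judgment call. Dropping such records avoids double counting, at the cost of discarding telemetry from Rush versions that predate the flag.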

How it was tested

Tested in this repository, using the sharded-repo sandbox.

Impacted documentation

Anything where Rush describes writing your own telemetry plugin.

@UberMouse
Contributor

Bump on this, would love to get it merged!

@iclanton iclanton enabled auto-merge (squash) October 29, 2024 17:34
@iclanton iclanton merged commit 5a5b413 into microsoft:main Oct 29, 2024
4 checks passed
Development

Successfully merging this pull request may close these issues.

[rush] duplicated cobuild telemetry leading to data skew
4 participants