Skip to content

Plan-Execute-Reflect Agent Tracing #3964

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 15 commits into
base: feature/agent-tracing
Choose a base branch
from

Conversation

chriswlai
Copy link

@chriswlai chriswlai commented Jul 7, 2025

Description

Adds tracing to the Plan-Execute-Reflect agent. Uses custom index mapping for storage. Basic connection with Conversational agent. Also re-structures the tracing class to be more flexible for non-agent tracing.

nLJ1Ri8m33slNv5Z4OKls04qCGa9q2JGxY6r85kRLeuxiJ7-VMvQGcdfj6c7zHBPVizvzbCpiIIHYZG9vDOIIaL29T9QCQt3vMB31w0u1eA_aQX3SaUTMouUGLA1C3Docq2y1Y9jTY9DRkO3HniAT_SwboOqCeL8I2BKsuB03ch4yQpgz0qu4hb4CAvuWmWcTZJhogSWg0Pi3JOKEw0g4mv-hEztegwLL3bjNa0vr4Dc20xa

Related Issues

Resolves #3971

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 7, 2025 07:59 — with GitHub Actions Waiting
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 7, 2025 07:59 — with GitHub Actions Waiting
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 7, 2025 07:59 — with GitHub Actions Waiting
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 7, 2025 07:59 — with GitHub Actions Waiting
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 7, 2025 17:28 — with GitHub Actions Waiting
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 7, 2025 17:28 — with GitHub Actions Waiting
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 7, 2025 17:28 — with GitHub Actions Waiting
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 7, 2025 17:28 — with GitHub Actions Waiting
@chriswlai chriswlai had a problem deploying to ml-commons-cicd-env-require-approval July 14, 2025 18:16 — with GitHub Actions Error
@chriswlai chriswlai had a problem deploying to ml-commons-cicd-env-require-approval July 14, 2025 18:16 — with GitHub Actions Error
@chriswlai chriswlai had a problem deploying to ml-commons-cicd-env-require-approval July 14, 2025 18:16 — with GitHub Actions Failure
@chriswlai chriswlai had a problem deploying to ml-commons-cicd-env-require-approval July 14, 2025 18:16 — with GitHub Actions Failure
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 15, 2025 19:06 — with GitHub Actions Waiting
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 15, 2025 19:06 — with GitHub Actions Waiting
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 15, 2025 19:06 — with GitHub Actions Waiting
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 15, 2025 19:06 — with GitHub Actions Waiting
Signed-off-by: chrislai <chrlaii@amazon.com>
Signed-off-by: chrislai <chrlaii@amazon.com>
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 22, 2025 18:34 — with GitHub Actions Waiting
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 22, 2025 18:34 — with GitHub Actions Waiting

if ("aws.bedrock".equalsIgnoreCase(provider)) {
// Bedrock/Claude format: input_tokens, output_tokens (or inputTokens, outputTokens)
if (usage.containsKey("input_tokens")) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wondering if you check on contains "input_tokens", but if the usage has "input_tokens_meta_data" but no "input_tokens", this will through key not found exception or getting null while toString(), can you verify this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed by adding fallback

if (tensors != null && !tensors.isEmpty()) {
var tensor = tensors.get(0);
// Try result
if (tensor.getResult() != null) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you test it with local model output, even though most of the users should use remote model, but what if a users accidentally passed over a local model output, what would happen here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, very good point. addressed as well by adding fallback for local model

Signed-off-by: chrislai <chrlaii@amazon.com>
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 23, 2025 20:38 — with GitHub Actions Waiting
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 23, 2025 20:38 — with GitHub Actions Waiting
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 23, 2025 20:38 — with GitHub Actions Waiting
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 23, 2025 20:38 — with GitHub Actions Waiting
"version": 1,
"template": {
"mappings": {
"date_detection": false,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious, what is this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disables automatic date detection in OpenSearch to ensure no field gets automatically mapped to date if we don't want it to

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if a cluster with this index name already exists? do you think we should give a setting to configure this index?

}
}
},
"serviceName": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

snake case for consistency

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately it conflicts with OTel format, so must keep the standard of camel case

Comment on lines 250 to 251
Map<String, String> agentAttributes = MLAgentTracer.createAgentTaskAttributes(mlAgent.getName(), apiParams.get(QUESTION_FIELD));
Span agentTaskSpan = MLAgentTracer.getInstance().startSpan(MLAgentTracer.AGENT_TASK_PER_SPAN, agentAttributes, null);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since attributes depend on the type of span, can we have methods like startTaskSpan? What do you think?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, fixed

Comment on lines 487 to 492
Double inputTokens = planResultInfo.usage != null && planResultInfo.usage.get("inputTokens") instanceof Number
? ((Number) planResultInfo.usage.get("inputTokens")).doubleValue()
: null;
Double outputTokens = planResultInfo.usage != null && planResultInfo.usage.get("outputTokens") instanceof Number
? ((Number) planResultInfo.usage.get("outputTokens")).doubleValue()
: null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move attribute creation/setting into methods, keep each method small and reusable

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

Comment on lines 540 to 542
phaseInputTokens.set(0.0);
phaseOutputTokens.set(0.0);
phaseTotalTokens.set(0.0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use a "reset" method?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, good point, addressed

Comment on lines 836 to 838
// if (!allParams.containsKey(LLM_RESPONSE_FILTER) || allParams.get(LLM_RESPONSE_FILTER).isEmpty()) {
// throw new IllegalArgumentException("llm_response_filter not found. Please provide the path to the model output.");
// }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why has this been commented out?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch

Map<String, String> spanContextMap = new HashMap<>();
MLAgentTracer.getInstance().injectSpanContext(executeStepSpan, spanContextMap);
reactParams.putAll(spanContextMap);
log.info("[AGENT_TRACE] PER Agent - Injected parent SpanContext: {}", spanContextMap);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use debug logs

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

Signed-off-by: chrislai <chrlaii@amazon.com>
@chriswlai chriswlai temporarily deployed to ml-commons-cicd-env-require-approval July 24, 2025 00:38 — with GitHub Actions Inactive
@chriswlai chriswlai had a problem deploying to ml-commons-cicd-env-require-approval July 24, 2025 00:38 — with GitHub Actions Failure
@chriswlai chriswlai temporarily deployed to ml-commons-cicd-env-require-approval July 24, 2025 00:38 — with GitHub Actions Inactive
@chriswlai chriswlai had a problem deploying to ml-commons-cicd-env-require-approval July 24, 2025 00:38 — with GitHub Actions Error
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 24, 2025 02:04 — with GitHub Actions Waiting
@chriswlai chriswlai requested a deployment to ml-commons-cicd-env-require-approval July 24, 2025 02:04 — with GitHub Actions Waiting

newSpan.addAttribute("thread.name", Thread.currentThread().getName());
} catch (Exception e) {
log.warn("Failed to create root span, falling back to normal span creation", e);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the consequence if it failed to create root span and become a normal span? is it we lost the relationship of a root span?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but the opposite actually. If we fail to create a root span, the span will still be created except it will show to have a parent span, but that span doesn't exist.

@@ -0,0 +1,182 @@
package org.opensearch.ml.engine.algorithms.agent.tracing;

import static org.junit.Assert.*;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid import *

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, addressed

private Tracer mockTracer;
private MLFeatureEnabledSetting mockFeatureSetting;

@Before
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add proper java doc for you class and test method

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, thanks. done

@@ -0,0 +1,180 @@
package org.opensearch.ml.engine.algorithms.agent.tracing;

import static org.junit.Assert.*;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid *

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

package org.opensearch.ml.engine.algorithms.agent.tracing;

import static org.junit.Assert.*;
import static org.mockito.Mockito.*;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

import org.opensearch.ml.common.output.model.ModelTensors;
import org.opensearch.telemetry.tracing.Span;

public class MLAgentTracerStaticUtilsTests {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

java doc

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@SuppressWarnings("unchecked")
Map<String, Object> usage = (Map<String, Object>) usageObj;

if ("aws.bedrock".equalsIgnoreCase(provider)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are using equalsIgnoreCase for "aws.bedrock" but you are not getting "aws.bedrock".

I think "aws.bedrock" would be better to be "containsKey"

but the rest of the key that you are actually getting, for example, "input_tokens" and "prompt_tokens", before you are getting the keys, it makes more sense to check equalsIgnoreCase

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, very good point. fixed

Signed-off-by: chrislai <chrlaii@amazon.com>
@chriswlai chriswlai temporarily deployed to ml-commons-cicd-env-require-approval July 25, 2025 16:48 — with GitHub Actions Inactive
@chriswlai chriswlai temporarily deployed to ml-commons-cicd-env-require-approval July 25, 2025 16:48 — with GitHub Actions Inactive
@chriswlai chriswlai temporarily deployed to ml-commons-cicd-env-require-approval July 25, 2025 16:48 — with GitHub Actions Inactive
@chriswlai chriswlai temporarily deployed to ml-commons-cicd-env-require-approval July 25, 2025 16:48 — with GitHub Actions Inactive
@chriswlai chriswlai temporarily deployed to ml-commons-cicd-env-require-approval July 25, 2025 17:53 — with GitHub Actions Inactive
@chriswlai chriswlai deployed to ml-commons-cicd-env-require-approval July 25, 2025 17:53 — with GitHub Actions Active
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants