@Vivekbhadauria1 (Contributor) commented Oct 29, 2025

Summary

Add a comprehensive observability CLI (agentcore obs) that queries CloudWatch spans, traces, and runtime logs and renders them with rich terminal visualization.

Commands Added

agentcore obs show                    # Show latest trace from config session
agentcore obs show --last N           # Show Nth most recent trace
agentcore obs show --trace-id <id>    # Show specific trace with full details
agentcore obs show --session-id <id>  # Show session summary table
agentcore obs show --all              # Show all traces in session
agentcore obs show --errors           # Show only failed traces
agentcore obs show --simple           # Minimal view without verbose metadata
agentcore obs show -o trace.json      # Export to JSON
agentcore obs list                    # List all traces in a table

Architecture

  • ObservabilityClient: CloudWatch Logs Insights query execution with batch optimization
  • CloudWatchQueryBuilder: Query construction with camelCase fields (traceId, spanId)
  • Telemetry Models: Span/RuntimeLog/TraceData with hierarchy building and serialization
  • TraceVisualizer: Rich terminal rendering with Tree and Table displays
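
To make the query-construction and batching idea concrete, here is a minimal sketch of how a Logs Insights query covering several traces at once might be assembled. The function name build_spans_query, the selected fields, and the default limit are hypothetical and not taken from this PR; only the camelCase field names traceId and spanId come from the description above.

```python
from typing import Sequence


def build_spans_query(trace_ids: Sequence[str], limit: int = 1000) -> str:
    """Build a CloudWatch Logs Insights query string for one or more trace IDs.

    Hypothetical helper: the name, field selection, and limit are assumptions
    made for illustration, not the PR's actual CloudWatchQueryBuilder API.
    """
    if not trace_ids:
        raise ValueError("at least one trace ID is required")
    for tid in trace_ids:
        # Reject quotes so an ID cannot break out of the string literal.
        if "'" in tid or '"' in tid:
            raise ValueError(f"invalid trace ID: {tid!r}")

    # A single `in [...]` filter lets one query cover many traces,
    # which is the batching idea described in the PR summary.
    quoted = ", ".join(f"'{tid}'" for tid in trace_ids)
    return (
        "fields @timestamp, traceId, spanId, @message\n"
        f"| filter traceId in [{quoted}]\n"
        "| sort @timestamp asc\n"
        f"| limit {limit}"
    )


if __name__ == "__main__":
    print(build_spans_query(["trace-a", "trace-b"], limit=50))
```

A string like this would typically be passed as queryString to the boto3 CloudWatch Logs client's start_query call, with results polled through get_query_results; how the PR's ObservabilityClient actually does this is not shown in the description.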

Features

  • ✅ Hierarchical trace visualization with span parent-child relationships
  • ✅ Batch query optimization for runtime logs (single query for multiple traces)
  • ✅ Field deduplication (session_id, service_name) when common across spans
  • ✅ Clean JSON export (removed redundant pre-computed fields)
  • ✅ Error highlighting and exception display with stack traces
  • ✅ Token usage and model information display
  • ✅ Message deduplication in span hierarchies (delta display)
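
The hierarchical visualization relies on turning a flat list of spans into a tree via their parent IDs. The sketch below shows one common way to do that; SpanNode, build_hierarchy, and render are hypothetical names chosen for illustration, and the PR's own Span model and TraceVisualizer may be structured differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SpanNode:
    """Hypothetical, simplified stand-in for the PR's Span model."""
    span_id: str
    name: str
    parent_span_id: Optional[str] = None
    children: List["SpanNode"] = field(default_factory=list)


def build_hierarchy(spans: List[SpanNode]) -> List[SpanNode]:
    """Attach each span to its parent and return the root spans."""
    by_id: Dict[str, SpanNode] = {s.span_id: s for s in spans}
    roots: List[SpanNode] = []
    for span in spans:
        parent = by_id.get(span.parent_span_id) if span.parent_span_id else None
        if parent is None:
            # No parent ID, or the parent was not returned by the query:
            # treat the span as a root so it is still displayed.
            roots.append(span)
        else:
            parent.children.append(span)
    return roots


def render(span: SpanNode, depth: int = 0) -> List[str]:
    """Return indented lines, one per span, in depth-first order."""
    lines = ["  " * depth + span.name]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


if __name__ == "__main__":
    flat = [
        SpanNode("b", "llm_call", parent_span_id="a"),
        SpanNode("a", "invoke_agent"),
        SpanNode("c", "tool_call", parent_span_id="a"),
    ]
    for root in build_hierarchy(flat):
        print("\n".join(render(root)))
```

Treating spans with a missing parent as roots is a design choice in this sketch: it keeps partial traces visible when a query window cuts off the parent span.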

Test Coverage

  • 115 comprehensive parameterized unit tests (0.68s execution)
    • ObservabilityClient (18 tests): AWS integration, batch queries, error handling
    • CloudWatchQueryBuilder (25 tests): Query construction, field naming
    • Telemetry Models (27 tests): Parsing, hierarchy building, serialization
    • TraceVisualizer (45 tests): Rendering, formatting, deduplication

TODO

  • Add Python/notebook interface for observability
  • Add CLI command integration tests (low coverage for CLI commands currently)

Test Plan

# Run all observability tests
pytest tests/operations/observability/ -v

# Try the CLI
agentcore obs list
agentcore obs show --last 1

Vivekbhadauria1 and others added 3 commits October 28, 2025 23:22
Add comprehensive observability CLI (agentcore obs) for querying CloudWatch
spans, traces, and runtime logs with rich terminal visualization.

Commands Added:
  agentcore obs show                    # Show latest trace from config session
  agentcore obs show --last N           # Show Nth most recent trace
  agentcore obs show --trace-id <id>    # Show specific trace with full details
  agentcore obs show --session-id <id>  # Show session summary table
  agentcore obs show --all              # Show all traces in session
  agentcore obs show --errors           # Show only failed traces
  agentcore obs show --simple           # Minimal view without verbose metadata
  agentcore obs show -o trace.json      # Export to JSON
  agentcore obs list                    # List all traces in a table

Architecture:
- ObservabilityClient: CloudWatch Logs Insights query execution
- CloudWatchQueryBuilder: Query construction with camelCase fields (traceId, spanId)
- Telemetry Models: Span/RuntimeLog/TraceData with hierarchy building
- TraceVisualizer: Rich terminal rendering with Tree and Table displays

Test Coverage:
- 115 comprehensive unit tests (0.68s execution)
- ObservabilityClient (18): AWS integration, batch queries, error handling
- CloudWatchQueryBuilder (25): Query construction, field naming
- Telemetry Models (27): Parsing, hierarchy building, serialization
- TraceVisualizer (45): Rendering, formatting, deduplication

TODO: Add Python/notebook interface for observability

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Add 29 parameterized unit tests for the CLI observability commands to raise
coverage above the previous 7.69%.

Tests Added:
- Helper functions (15 tests):
  - _get_default_time_range: time range calculation with various days
  - _get_agent_config_from_file: config loading, missing fields, errors
  - _create_observability_client: CLI args, config file, error handling
  - _export_trace_data_to_json: successful export and error handling

- show command (8 tests):
  - Trace ID and session ID flows
  - Error validation (both IDs, incompatible flags)
  - Config file session handling
  - Exception handling

- list_traces command (6 tests):
  - Session ID and config file flows
  - Empty results handling
  - Error filtering
  - Exception handling

Total Test Count: 144 tests (115 operations + 29 CLI)
All tests passing in 0.61s

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Added comprehensive test coverage for observability modules:
- CLI commands: Added 40 tests covering helper functions, show/list commands, and internal view functions
- Telemetry models: Added 13 edge case tests for RuntimeLog and Span parsing
- TraceVisualizer: Added 8 verbose formatting tests for messages/events/exceptions

Coverage improvements:
- trace_visualizer.py: 67% → 89% (+22 points)
- telemetry.py: 78% → 88% (+10 points)
- Overall: 84% → 90.12%

Total tests: 1226 passed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@shovalLior

We should expand this to support all primitives, not only runtime.

@shovalLior

agentcore obs --help doesn't show how to list all sessions and select a session

return None

event_name = attributes.get("event.name", "")
if not event_name.startswith("gen_ai."):

This filter appears to be specific to Bedrock LLM generated telemetry. Any reason we do not want to process other agent generated messages such as invocations, event loops, etc?


# Extract role from event name
role = None
if "system.message" in event_name:

This also appears to be tailored to Bedrock.
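
For readers following the two review comments, here is a rough sketch of the behavior being questioned. Only the "event.name" attribute lookup, the "gen_ai." prefix check, and the "system.message" marker appear in the quoted diff; the function name extract_role, the other role markers, and the non-GenAI event name used in the demo are assumptions made for illustration.

```python
from typing import Any, Dict, Optional

# Only "system.message" is visible in the quoted diff. The remaining markers
# are assumptions modeled on OpenTelemetry GenAI-style event names.
ROLE_MARKERS = {
    "system.message": "system",
    "user.message": "user",
    "assistant.message": "assistant",
    "tool.message": "tool",
}


def extract_role(attributes: Dict[str, Any], prefix: str = "gen_ai.") -> Optional[str]:
    """Return the message role for an event, or None if the event is skipped.

    Hypothetical helper: the prefix is a parameter here so the effect of the
    filter is easy to see; the code under review hard-codes "gen_ai.".
    """
    event_name = attributes.get("event.name", "")
    if not event_name.startswith(prefix):
        # Events outside the prefix (for example agent invocations or
        # event-loop telemetry) are dropped, which is the reviewer's concern.
        return None
    for marker, role in ROLE_MARKERS.items():
        if marker in event_name:
            return role
    return None


if __name__ == "__main__":
    print(extract_role({"event.name": "gen_ai.system.message"}))
    print(extract_role({"event.name": "agent.invocation"}))  # hypothetical name
```

Making the prefix configurable, as sketched, is one possible way to address the comments; the PR thread does not say which approach the author intends.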

@rajeshkumarravi

Are there plans to display the span event body? Currently only the span chain is printed. Looks great.
