Skip to content

CLI Troubleshooting and Diagnostics #4030

@manno

Description

@manno

As an On-call Engineer or Developer, I want to run a comprehensive diagnostic CLI command that helps troubleshoot common issues and generate information for support tickets.

First and foremost, the CLI should gather relevant diagnostic information, like resource statuses, Fleet's k8s events and logs.

Additional, detailed diagnostics could check agent registration, resource dependencies, and communication problems, so that root causes can be identified.

Acceptance Criteria:

  • Ability to generate a diagnostic bundle (logs, resource YAMLs) for offline analysis and sharing with support.

  • A fleet troubleshoot or fleet doctor command acts as the main entry point to validate the current cluster.

    • Fetch metrics and analyze queue values (are queues healthy)?
    • provide a summary of checks performed (pass/fail), highlights potential issues, and suggests corrective actions or further commands for deeper investigation (e.g., "try kubectl logs -n cattle-fleet-system fleet-agent-<pod-id> on cluster X").

Metadata

Metadata

Assignees

Projects

Status

🏗 In progress

Relationships

None yet

Development

No branches or pull requests

Issue actions