Implement Kubernetes Validation Webhooks for CRDs #195

@chrisguidry

Problem Statement

Currently, the prefect-operator relies on post-creation validation during reconciliation and status conditions to catch configuration errors. This approach has several drawbacks:

  1. Late feedback: Users don't discover configuration errors until after resources are created and processed
  2. Resource churn: Invalid configurations consume cluster resources and trigger unnecessary reconciliation loops
  3. Poor UX: Error messages are buried in status conditions rather than appearing immediately during kubectl apply

As noted by @tinkerborg in PR #194, we have a TODO for implementing admission webhooks.

Current Validation Issues

1. PrefectServer Storage Backend Validation

Location: api/v1/prefectserver_types.go:51-59

  • Mutually exclusive storage backends: Users can specify multiple storage backends (ephemeral, sqlite, postgres) at once, but the controller silently picks one by precedence order
  • Missing PostgreSQL connection validation: No validation that required fields are present when postgres is specified
  • SQLite storage validation: No validation of storageClassName or reasonable size values
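
As a rough illustration of the checks above, here is a minimal sketch of the storage validation a webhook could run. The types are simplified, hypothetical stand-ins for the fields in api/v1/prefectserver_types.go, and the choice of Host and Database as the required postgres fields is an assumption, not the operator's actual contract.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical, simplified stand-ins for the storage fields on the
// PrefectServer spec. The real types live in api/v1/prefectserver_types.go.
type EphemeralConfig struct{}

type SQLiteConfig struct {
	StorageClassName string
	Size             string
}

type PostgresConfig struct {
	Host     string // assumed required for this sketch
	Database string // assumed required for this sketch
}

type ServerStorage struct {
	Ephemeral *EphemeralConfig
	SQLite    *SQLiteConfig
	Postgres  *PostgresConfig
}

// validateStorage rejects specs that set more than one backend, and
// postgres specs that are missing the fields this sketch treats as required.
func validateStorage(s ServerStorage) error {
	var set []string
	if s.Ephemeral != nil {
		set = append(set, "ephemeral")
	}
	if s.SQLite != nil {
		set = append(set, "sqlite")
	}
	if s.Postgres != nil {
		set = append(set, "postgres")
	}
	if len(set) > 1 {
		return fmt.Errorf("storage backends are mutually exclusive, got: %s",
			strings.Join(set, ", "))
	}

	if p := s.Postgres; p != nil {
		var missing []string
		if p.Host == "" {
			missing = append(missing, "host")
		}
		if p.Database == "" {
			missing = append(missing, "database")
		}
		if len(missing) > 0 {
			return fmt.Errorf("postgres is missing required fields: %s",
				strings.Join(missing, ", "))
		}
	}
	return nil
}
```

Returning an error here, rather than falling back to a precedence order, is what would surface the problem at kubectl apply time.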

2. PrefectDeployment Conversion Validation

Location: internal/prefect/convert.go:29-130

  • Invalid JSON validation: Multiple fields (Parameters, JobVariables, ParameterOpenApiSchema, PullSteps) can contain invalid JSON that fails during conversion
  • Date format validation: Schedule AnchorDate fields must be RFC3339 format but aren't validated
  • Required field validation: Entrypoint is required but not enforced
  • Reference validation: WorkPoolReference can point to non-existent resources
  • Parameter schema validation: When parameterOpenApiSchema is provided, parameters should be validated against it
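
The JSON, date, and required-field checks could be sketched with the standard library alone. DeploymentFields below is a hypothetical flattened subset of the PrefectDeployment spec (the real fields are nested and typed differently), and reference and schema validation are left out because they need cluster access or a JSON Schema library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// DeploymentFields is a hypothetical, flattened subset of the
// PrefectDeployment spec, holding raw JSON as strings for illustration.
type DeploymentFields struct {
	Entrypoint   string
	Parameters   string // raw JSON, optional
	JobVariables string // raw JSON, optional
	AnchorDate   string // RFC3339 timestamp, optional
}

// validateDeployment collects every problem it finds instead of stopping
// at the first one, so a user sees all errors in a single apply.
func validateDeployment(d DeploymentFields) []error {
	var errs []error

	if d.Entrypoint == "" {
		errs = append(errs, fmt.Errorf("Entrypoint is required"))
	}

	// Check each raw JSON field that was provided.
	jsonFields := []struct{ name, raw string }{
		{"Parameters", d.Parameters},
		{"JobVariables", d.JobVariables},
	}
	for _, f := range jsonFields {
		if f.raw != "" && !json.Valid([]byte(f.raw)) {
			errs = append(errs, fmt.Errorf("%s is not valid JSON", f.name))
		}
	}

	if d.AnchorDate != "" {
		if _, err := time.Parse(time.RFC3339, d.AnchorDate); err != nil {
			errs = append(errs, fmt.Errorf("AnchorDate must be RFC3339: %v", err))
		}
	}
	return errs
}
```

For example, a spec with an empty Entrypoint, Parameters set to `{"a":`, and AnchorDate set to `2024-01-01` would be rejected with three errors, while `2024-01-01T00:00:00Z` passes the date check.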

3. PrefectWorkPool Type and Configuration Validation

Location: api/v1/prefectworkpool_types.go:38-39

  • Unsupported work pool types: The Type field accepts any string, but the image selection logic at api/v1/prefectworkpool_types.go:101-114 special-cases only "kubernetes"
  • Configuration mismatches: No validation that settings match the work pool type
  • Resource validation: No bounds checking on resource requests/limits
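
One way to close the free-form Type gap is an allow-list check, sketched below. The set of supported types is purely illustrative. The issue only establishes that "kubernetes" is handled specially, so the other entries are assumptions.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// supportedWorkPoolTypes is an illustrative allow-list. Only "kubernetes"
// is confirmed by the operator code referenced above; the rest are assumed.
var supportedWorkPoolTypes = map[string]bool{
	"kubernetes": true,
	"process":    true,
	"docker":     true,
}

// validateWorkPoolType rejects any type outside the allow-list and lists
// the accepted values (sorted, for a stable message) in the error.
func validateWorkPoolType(t string) error {
	if supportedWorkPoolTypes[t] {
		return nil
	}
	names := make([]string, 0, len(supportedWorkPoolTypes))
	for n := range supportedWorkPoolTypes {
		names = append(names, n)
	}
	sort.Strings(names)
	return fmt.Errorf("unsupported work pool type %q, expected one of: %s",
		t, strings.Join(names, ", "))
}
```

With this in place, a typo such as "kubernets" would fail at admission rather than silently falling through the image selection logic.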

4. PrefectServerReference API Key Validation

Location: api/v1/server_reference.go:50-57

  • Mutually exclusive API key sources: APIKeySpec allows both Value and ValueFrom to be set, but runtime logic at api/v1/prefectworkpool_types.go:152-162 silently prefers ValueFrom
  • Missing required fields for Prefect Cloud: When using Prefect Cloud, both AccountID and WorkspaceID are required
  • Invalid server reference combinations: Users can specify both RemoteAPIURL and in-cluster Name simultaneously
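
The three rules above might be expressed as follows. The types are simplified, hypothetical versions of those in api/v1/server_reference.go (ValueFrom is reduced to a string pointer), and detecting Prefect Cloud by a prefect.cloud hostname is an assumption made for this sketch. It uses errors.Join, which requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"net/url"
	"strings"
)

// Simplified, hypothetical versions of the types in api/v1/server_reference.go.
type APIKeySpec struct {
	Value     *string
	ValueFrom *string // stand-in for a secret or env source reference
}

type ServerReference struct {
	Name         string
	RemoteAPIURL string
	AccountID    string
	WorkspaceID  string
	APIKey       *APIKeySpec
}

// isPrefectCloud guesses whether a URL targets Prefect Cloud by hostname.
// The hostname rule is an assumption for illustration only.
func isPrefectCloud(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	host := u.Hostname()
	return host == "prefect.cloud" || strings.HasSuffix(host, ".prefect.cloud")
}

// validateServerReference returns all violations joined into one error,
// or nil when the reference is consistent.
func validateServerReference(r ServerReference) error {
	var errs []error
	if r.Name != "" && r.RemoteAPIURL != "" {
		errs = append(errs, errors.New("Name and RemoteAPIURL are mutually exclusive"))
	}
	if k := r.APIKey; k != nil && k.Value != nil && k.ValueFrom != nil {
		errs = append(errs, errors.New("APIKey.Value and APIKey.ValueFrom are mutually exclusive"))
	}
	if isPrefectCloud(r.RemoteAPIURL) && (r.AccountID == "" || r.WorkspaceID == "") {
		errs = append(errs, errors.New("AccountID and WorkspaceID are required for Prefect Cloud"))
	}
	return errors.Join(errs...) // nil when errs is empty
}
```

Rejecting a spec that sets both Value and ValueFrom makes the choice explicit, instead of relying on the runtime preference for ValueFrom.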

5. Additional Cross-Field Validation

  • Prefect Cloud configuration: RemoteAPIURL pointing to Prefect Cloud should require AccountID and WorkspaceID
  • Resource name validation: Some fields may have naming constraints (e.g., work pool names with a "prefect" prefix are handled specially at api/v1/prefectworkpool_types.go:118-120)
  • Environment variable conflicts: Settings that conflict with operator-managed environment variables
  • Secret/ConfigMap reference validation: Validate that referenced secrets/configmaps exist and contain required keys

Issue researched and authored by Claude
