Skip to content

OSDOCS#14808: HCP updates additions and updates #94496

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

xenolinux
Copy link
Contributor

@xenolinux xenolinux commented Jun 10, 2025

@xenolinux xenolinux added this to the Continuous Release milestone Jun 10, 2025
@openshift-ci openshift-ci bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jun 10, 2025
@ocpdocs-previewbot
Copy link

ocpdocs-previewbot commented Jun 10, 2025

🤖 Tue Jul 15 11:48:52 - Prow CI generated the docs preview:

https://94496--ocpdocs-pr.netlify.app/openshift-enterprise/latest/hosted_control_planes/hcp-updating.html

@xenolinux xenolinux force-pushed the hcp-updating branch 2 times, most recently from 43c324e to 9bd2769 Compare June 24, 2025 09:59
@openshift-ci openshift-ci bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 24, 2025
@xenolinux xenolinux force-pushed the hcp-updating branch 3 times, most recently from bb5b9d0 to 5d86879 Compare June 24, 2025 10:52
@jiezhao16
Copy link

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 24, 2025
@@ -50,7 +31,7 @@ version:
- eus-4.16
- fast-4.16
- stable-4.16
image: quay.io/openshift-release-dev/ocp-release@sha256:b7517d13514c6308ae16c5fd8108133754eb922cd37403ed27c846c129e67a9a
image: quay.io/openshift-release-dev/ocp-release@sha256:xxxxx
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there placeholder conventions in openshift-docs? xxxxx seems like a surprising choice. I see you using ... a few lines above; maybe you could use sha256:... here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... is reserved for the YAML file to show continuous code flow before and after. Using <id> placeholder instead of xxxxx

----

+
.Example output
[source,yaml]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure who's QE for HyperShift-update docs, but jsonpath output is single-line JSON. You can get multi-line JSON output with kubernetes/kubernetes#89660 (kubernetes/kubernetes@ba386ab)'s jsonpath-as-json, but I'm not aware of a jsonpath-as-yaml. So you either need to stick with -o yaml in the command to match your output-syntax claim here, or otherwise adjust your output-syntax claim and/or the query to be internally consistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Sticking with -o yaml


. Set your desired channel in your hosted cluster by using one of the following ways:

.. Modify the `HostedCluster` CR by running the following command:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: do we say "CR" in docs? I'd have expected "resource" here, to avoid the acronym (and hopefully make this more accessible for new readers) and drop the C/custom (because "is this resource custom?" doesn't seem relevant to this procedure).

Copy link
Contributor Author

@xenolinux xenolinux Jun 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Addressed; removed the acronym CR

publicZoneID: <public_zone_id>
# ...
----
<1> Replace `<4.y>` with the {product-title} release version you specified in `spec.release`. For example, if you set the `spec.release` to `ocp-release:4.18.4-multi`, you must set `spec.channel` to `stable-4.18`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: If we know what we want to replace, I prefer oc patch ... to oc edit .... It gives you a conveniently copy/paste-able log in your terminal for "this is what I changed", and it avoids having to hunt out your particular property from the other information edit would show you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. That's a good point. Sticking with oc patch

----
<1> Replace `<release_payload>` with the {product-title} release payload that you want to use.

. Set your desired channel in your hosted cluster by using one of the following ways:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should set channel before requesting an update (by bumping release). channel tells the cluster what kind of updates you're interested in, and setting channel and then waiting some time (minutes or less?) will give you the cluster and OpenShift Update Service (or a custom updateService) opinions of your update options. You can chose your target release from among those options, or strike out on your own (although personally, striking out on your own seems like it involves taking on more responsibility than I'd want for my own clusters). But your current position here bumping channel after setting release doesn't make sense to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Changed the current position of the setting channel instruction

[source,terminal]
----
$ oc get -n <hosted_cluster_namespace> hostedcluster <hosted_cluster_name> -o yaml
$ oc get -n <hosted_cluster_namespace> hostedcluster <hosted_cluster_name> -o jsonpath='{.status.version.availableUpdates}'
Copy link
Member

@wking wking Jun 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The section heading here is currently Setting channels in a hosted cluster. For that, you want to:

  1. Check the channels currently available for your current release. Something like oc -n <hosted_cluster_namespace> get -o jsonpath='{.status.version.desired.channels}' hostedcluster<hosted_cluster_name>.
  2. Of those channels, select the one that matches your intended semantics, and set it with oc -n <hosted_cluster_namespace> patch hostedcluster <hosted_cluster_name> --type=json -p '[{"op": "add", "path": "/spec/channel", "value": "stable-<4.y>"}]' or whatever.

You don't need to look at availableUpdates (that's something you'd do pre-update to consume the information setting a channel helped you configure). And you don't need to bump spec.release (that's something you'd do once you wanted to launch an update). And you don't need to run oc adm upgrade status (that's something you can optionally do to keep an eye on an in-progress update).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wking So do i need to include the following steps in a different section as these are not part of setting channel?

  • Checking availableUpdates
  • Bumping spec.release
  • Checking in-progress status with oc adm upgrade status

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Checking availableUpdates
  • Bumping spec.release
  • Checking in-progress status with oc adm upgrade status

Yeah, those are all part of Updating a cluster by using the CLI. The standalone/ClusterVersion-update docs currently include setting your desired channel as an early step in the update process, and while I see channel-setting and update-launching as distinct, I don't see many issues with combining in a single section like that. But if they're combined, the name of the combined section should focus on the fact that an update is being launched, because that has a much larger cluster impact (bug fixes, new functionality, etc.) compared to changing the channel (which just gets you different update advice).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Updated the PR

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 26, 2025
Copy link

openshift-ci bot commented Jun 26, 2025

New changes are detected. LGTM label has been removed.

@xenolinux xenolinux force-pushed the hcp-updating branch 6 times, most recently from 009bdc8 to 18bf450 Compare July 1, 2025 11:16
@xenolinux xenolinux added the peer-review-needed Signifies that the peer review team needs to review this PR label Jul 1, 2025
@stevsmit stevsmit removed the peer-review-needed Signifies that the peer review team needs to review this PR label Jul 1, 2025

.Verification

* Verify that the` history` field in the `HostedCluster` resource indicates a status as `Completed`. Run the following command:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Verify that the` history` field in the `HostedCluster` resource indicates a status as `Completed`. Run the following command:
* Verify that the `history` field in the `HostedCluster` resource indicates a status as `Completed`. Run the following command:

@xenolinux
Copy link
Contributor Author

A few comments and things to consider. This one looks great!

Thanks for the review. Updated PR as per the comments

@xenolinux xenolinux force-pushed the hcp-updating branch 2 times, most recently from 81a6513 to 69bfc24 Compare July 2, 2025 10:22

The available updates are not fetched from the Cluster Version Operator (CVO) of a hosted cluster. The list of the available updates can be different from the available updates from the following fields of the `HostedCluster` custom resource (CR):
The available updates are not fetched from the Cluster Version Operator (CVO) of a hosted cluster. The initial `HostedCluster` resource does not have any information in the `status.version.availableUpdates` and `status.version.conditionalUpdates` fields. After you set the `spec.channel` field to the stable {product-title} release version, the HyperShift Operator reconciles the `HostedCluster` resource and updates the `status.version` field with the available and conditional updates.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The available updates are not fetched from the Cluster Version Operator (CVO) of a hosted cluster.

This sentence predates your pull request, but I'm not sure what it's trying to say. When you set an OpenShift Update Service channel per these docs, that will propagate down to the CVO, which will calculate available updates, and those will propagate back up to HostedCluster's status. Maybe we should drop or edit the sentence?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to drop the sentence

+
[source,terminal]
----
$ oc -n <hosted_cluster_namespace> get -o jsonpath='{.status.version.desired.channels}' hostedcluster<hosted_cluster_name>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
$ oc -n <hosted_cluster_namespace> get -o jsonpath='{.status.version.desired.channels}' hostedcluster<hosted_cluster_name>
$ oc -n <hosted_cluster_namespace> get -o jsonpath='{.status.version.desired.channels}' hostedcluster <hosted_cluster_name>

+
[source,terminal]
----
$ oc patch -n <hosted_cluster_namespace> hostedcluster <hosted_cluster_name> --type=merge -p '{"spec":{"release":{"image":"quay.io/openshift-release-dev/ocp-release@sha256:<release_payload>"}}}' <1>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole pullspec (quay.io/...) will be in availableUpdates and conditionalUpdates, so I don't think you need to template that portion here. You can just say:

$ oc patch -n <hosted_cluster_namespace> hostedcluster <hosted_cluster_name> --type=merge -p '{"spec":{"release":{"image":"<release_payload>"}}}' <1>

and in the <1> notes, suggest they use the pullspec they've found in availableUpdates or conditionalUpdates.

+
[source,terminal]
----
$ oc patch -n <hosted_cluster_namespace> hostedcluster <hosted_cluster_name> --type=merge -p '{"spec":{"channel":"stable-<4.y>"}}' <1>
Copy link
Member

@wking wking Jul 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be shifted earlier, because without a channel being set, you won't have the availableUpdates or conditionalUpdates information you used to get the pullspec for the spec.release.image patch. See the standalone update docs, where channel-setting is step 2 ("Based on your organization requirements, set the appropriate update channel...") and requesting an update is step 3 ("Apply an update...").

There's currently a bootstrapping issue with initial channel selection, where you have to guess a channel that includes your current 4.y.z release before you can learn what other channels are available. In standalone, that bootstrapping is handled by the installer's default-channel preference. RFE-6922 is up asking HyperShift to set similar default-channel preferences, but in the absence of that, it may be worth a distinct section on selecting an initial channel.

Copy link
Contributor Author

@xenolinux xenolinux Jul 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a distinct section on selecting an initial channel.

@wking Can this be addressed in a separate jira? And is this section needed for the upcoming GA i.e 14th July?

Rest is addressed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And is this section needed for the upcoming GA i.e 14th July?

None of this is new. Is there a push to doc something specifically for a GA? I thought we were just async filling in some details that we hadn't spent the time to fill in previously?

[source,terminal]
----
$ oc get -n <hosted_cluster_namespace> hostedcluster <hosted_cluster_name> -o yaml
$ oc get hostedcluster <hosted_cluster_name> -o jsonpath='{.status.version.history}'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

history will tell you about whether the update is in-progress or completed, but won't tell you anything about how your update is going. For that, you probably want the ClusterVersionProgressing and ClusterVersionSucceeding conditions, to hear about how the CVO thinks its portion of the update is going. The HostedControlPlane controller also has its own portions of the update outside of the CVO's portion, and I'm not sure what the HyperShift maintainers want you to look at to see how those are going.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed


.Verification

* Verify that the update service URL is successfully set in the `updateService` field by running the following command:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That just shows you "your spec patch worked". You can check that, but it seems likely that the patch command would exit non-zero with an error message if it didn't work, so I dunno if checking the spec after a successful patch adds much protection. For the troubleshooting factors listed above, you probably want to look at the ClusterVersionRetrievedUpdates condition, and also check status.version.desired for metadata like channels to show that the CVO is able to reach the configured Update Service and retrieve information about the current target release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

@xenolinux xenolinux force-pushed the hcp-updating branch 3 times, most recently from f7da41b to a837f00 Compare July 9, 2025 12:28
@xenolinux xenolinux added the peer-review-done Signifies that the peer review team has reviewed this PR label Jul 10, 2025

You can see available updates in the `HostedCluster.Status` field of the `HostedCluster` custom resource (CR).
You can update the hosted cluster by setting a channel in your `HostedCluster` resource or by using the `oc patch` command.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: sometimes folks get stressed about changing their channel, so I think it's worth crisping up the language here. Changing the channel only changes the update advice that the cluster collects from the OpenShift Update Service, evaluates, and pushes through and up the chain to HostedCluster's status.version.*. It doesn't cause any new code to roll out anywhere. If you're about to use HostedCluster status.version.* information to select a new target release and launch an update, you want to be using a channel that matches your desired semantics. But things only start updating when you patch spec.release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed - Rephrased the language.

----
$ oc patch -n <hosted_cluster_namespace> hostedcluster <hosted_cluster_name> --type=merge -p '{"spec":{"channel":"stable-<4.y>"}}' <1>
----
<1> Replace `<4.y>` with the {product-title} release version that you specified in `spec.release`. For example, if you set the `spec.release` to `ocp-release:4.18.4-multi`, you must set `spec.channel` to `stable-4.18`.
Copy link
Member

@wking wking Jul 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have options beyond stable, and also beyond strictly matching your current 4.y. For example, stable-4.18 doesn't include any 4.19 or later releases, and some of the folks currently running 4.18.z releases will eventually want to update to 4.19 releases. Right now, 4.18.4 is in:

$ curl -s 'https://api.openshift.com/api/upgrades_info/graph?channel=stable-4.18' | jq -r '.nodes[] | select(.version == "4.18.4").metadata["io.openshift.upgrades.graph.release.channels"]'
candidate-4.18,eus-4.18,fast-4.18,stable-4.18,candidate-4.19,fast-4.19,candidate-4.20

[source,terminal]
----
$ oc get -n <hosted_cluster_namespace> hostedcluster <hosted_cluster_name> -o yaml
$ oc get hostedcluster <hosted_cluster_name> -o jsonpath='{.status.conditions}'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jsonpath will dump these all out as one line, which can be hard reading. You may want to use jsonpath-as-json to get multiple lines for easier reading. I don't have a HostedCluster I can access at the moment, but just showing that rendering difference on a ClusterVersion in a standalone cluster:

$ oc get clusterversion version -o jsonpath='{.status.conditions}'
[{"lastTransitionTime":"2025-07-07T14:38:59Z","status":"True","type":"RetrievedUpdates"},{"lastTransitionTime":"2025-04-28T19:50:01Z","message":"Capabilities match configured spec","reason":"AsExpected","status":"False","type":"ImplicitlyEnabledCapabilities"},{...
$ oc get clusterversion version -o jsonpath-as-json='{.status.conditions}'
[
    [
        {
            "lastTransitionTime": "2025-07-07T14:38:59Z",
            "status": "True",
            "type": "RetrievedUpdates"
        },
        {
            "lastTransitionTime": "2025-04-28T19:50:01Z",
            "message": "Capabilities match configured spec",
            "reason": "AsExpected",
            "status": "False",
            "type": "ImplicitlyEnabledCapabilities"
        },
        ...

$ oc adm upgrade status
----
+
For more information about how to use the `oc adm upgrade status` command, see "Gathering cluster update status using oc adm upgrade status (Technology Preview)".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while it's tech-preview, you'll also need to set environment variables to get status to work. If you want to inline the oc adm upgrade status, you may also want to inline that environment variable advice. Alternatively, you can probably drop the inline oc adm upgrade status and just link over to that section of the standalone update docs, and folks can find both the environment variable information and the subsequent oc adm upgrade status invocation over there.


* `ClusterVersionSucceeding`: Indicates if the cluster is successfully updating.
* `ClusterVersionProgressing`: Indicates if the update is actively progressing.
* `ClusterVersionAvailable`: Confirms the availability of the updated version.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up above, you said:

Check the status of the ClusterVersionProgressing and ClusterVersionSucceeding conditions...

but here you're also mentioning ClusterVersionAvailable. At the moment, ClusterVersionAvailable is a pretty boring condition, and it mostly (always?) True for all clusters that have finished installing. Official Available docs (which by the time they get up to HostedCluster has become ClusterVersionAvailable), if you want to lift any of that wording. But I think you want to be consistent about which condition types you point to during this step, regardless of whether you include ClusterVersionAvailable` in that set.

Copy link
Contributor Author

@xenolinux xenolinux Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To eliminate the conflict - removed ClusterVersionAvailable : Confirms the availability of the updated version.

+
[source,terminal]
----
$ oc get hostedcluster <hosted_cluster_name> -o jsonpath='{.status.version.history}'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another place where I think you probably want the multi-line jsonpath-as-json instead of the single-line jsonpath.

+
[source,terminal]
----
$ oc get hostedcluster <hosted_cluster_name> -o jsonpath='{.status.conditions}'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can exclude the other conditions here to make it easier to find the one you're asking for with something like:

$ oc -n <hosted_cluster_namespace> get -o jsonpath-as-json='{.status.conditions[?(@.type == "ClusterVersionRetrievedUpdates")]}' hostedcluster <hosted_cluster_name>

+
[TIP]
====
If you face issues, consider the following troubleshooting factors:
Copy link
Member

@wking wking Jul 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

troubleshooting sounds like its own section or something for the Verification section, and not part of the happy-path Procedure section

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed - created a separate section/module to cover troubleshooting factors.

@xenolinux xenolinux force-pushed the hcp-updating branch 2 times, most recently from 5aa29a8 to 5710783 Compare July 15, 2025 11:34
Copy link

openshift-ci bot commented Jul 15, 2025

@xenolinux: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants