OBSDOCS-1701: Upgrading to Logging 6 steps Final #93577
base: enterprise-4.18
Conversation
@theashiot: This pull request references OBSDOCS-1701, which is a valid Jira issue.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Skipping CI for Draft Pull Request.
/test all
🤖 Mon May 26 18:52:17 - Prow CI generated the docs preview: https://93577--ocpdocs-pr.netlify.app/
. In the *Change Subscription Update Channel* window, select the latest major version update channel, *stable-6.x*, and click *Save*. Note the `cluster-logging.v6.y.z` version.

. Wait for a few seconds, and then go to *Operators* -> *Installed Operators* to verify that the {clo} version matches the latest `cluster-logging.v5.9.<z>` version.
If it's being upgraded to Logging 6, then the version verified here should be v6.y.z, not 5.9.z.
fixed
. On the *Operators* -> *Installed Operators* page, wait for the *Status* field to report *Succeeded*.

. Check if the `LokiStack` custom resource contains the `v13` schema version and add it if it is missing. For correctly adding the `v13` schema version, see "Upgrading the LokiStack storage schema".
LokiStack does not have a place here. The ClusterLogging Operator is not the Red Hat LokiStack Operator; they are two different operators, and upgrading the CLO does not upgrade the LokiStack Operator.
This step applies only when upgrading the Red Hat Loki Operator, and it should be removed from here.
done
* You have administrator permissions.
* You have access to the {product-title} web console and are viewing the *Administrator* perspective.

.Procedure
Currently, upgrading to Logging 6, even if Vector is already used in Logging 5, means that the checkpoints end up in a different path, and this has a big impact. The same was reported for the change between Fluentd and Vector in https://issues.redhat.com/browse/OBSDA-540, where it's said:
Also, indicate for the migration that all the logs not yet compressed will be reprocessed by Vector, which can lead to:
- duplicated logs at the moment of the migration
- 429 Too Many Requests in the log storage receiving the logs, or reaching the rate limit
- problems with the log store on disk and with performance as a consequence of the collector re-reading and processing all the old logs
- impact on the Kube API
- a peak of memory and CPU in Vector until all the old logs are processed (these logs can be several GB per node), which could also have a big impact
Then, either provide the steps for moving the Vector checkpoints to the new path before upgrading, or highlight this impact here, because issues with this upgrade have been reported.
Hello @theashiot,
In the article https://access.redhat.com/articles/7089860, the checkpoints migration was incorporated into "Step 4: Delete the ClusterLogging instance and deploy the ClusterLogForwarder observability Custom Resource". If those steps are confirmed, then we could probably incorporate them into this PR and close all of this.
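As a starting point for assessing the checkpoint impact, a minimal sketch for inspecting which host paths the Logging 5 collector currently mounts for its data directory before the upgrade, assuming the collector daemonset is named collector in the openshift-logging namespace:
$ oc get daemonset collector -n openshift-logging -o yaml | grep -B 2 -A 2 hostPath   # daemonset name and namespace are assumptions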
[id="logging-upgrading-clo_{context}"] | ||
= Updating the {clo} | ||
|
||
To update the {clo} to a new major release version, you must modify the update channel for the Operator subscription. |
It's not indicated how to create the new CR with the observability API. Upgrading to Logging 6 is not only upgrading the Logging Operator. This manual "migration" is what https://docs.redhat.com/en/documentation/openshift_container_platform/4.16/html/logging/logging-6-0 tries to explain in points 2.4.5, 2.4.6, 2.4.7 ... 2.4.11, and the following ones.
Basically, what you need to know before doing any upgrade step is how to transform the current configuration to the new API in Logging 6. The configuration changes between Logging 5 and Logging 6 are detailed in https://github.com/openshift/cluster-logging-operator/blob/master/docs/administration/upgrade/v6.0_changes.adoc
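A minimal sketch for capturing the existing Logging 5 configuration before translating it to the observability API, assuming the legacy logging.openshift.io CRDs are still installed (output file names are illustrative):
$ oc get clusterloggings.logging.openshift.io -A -o yaml > logging5-clusterlogging.yaml            # existing ClusterLogging CRs
$ oc get clusterlogforwarders.logging.openshift.io -A -o yaml > logging5-clusterlogforwarder.yaml  # existing legacy forwarders, if any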
. On the *Operators* -> *Installed Operators* page, wait for the *Status* field to report *Succeeded*.

. Check if the `LokiStack` custom resource contains the `v13` schema version and add it if it is missing. For correctly adding the `v13` schema version, see "Upgrading the LokiStack storage schema".
The "Upgrading the LokiStack storage schema" misses the link to the section
[id="uninstall-es-operator_{context}"] | ||
= Uninstalling Elasticsearch | ||
|
||
You can unintall Elasticsearch by using the {product-title} web console. |
"unintall" has a typo, it's -> uninstall
It should be mentioned that you can uninstall Elasticsearch is it's not used for other component as "Jaeger, Service Mesh or Kiali"
done
[id="creating-and-configuring-a-service-account-for-the-log-collector_{context}"] | ||
= Creating and configuring a service account for the log collector | ||
|
||
Create a service account for the log collector and assign it the necessary roles and permissions to collect logs. |
This should only be needed if you don't want to use the legacy "logcollector" service account. It's explained in https://github.com/openshift/cluster-logging-operator/blob/master/docs/administration/upgrade/v6.0_changes.adoc in the "Legacy openshift-logging" section.
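A minimal sketch for checking whether the legacy service account is already present before deciding to create a new one, assuming the default openshift-logging namespace:
$ oc get serviceaccount logcollector -n openshift-logging   # legacy account name per the v6.0 upgrade notes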
$ oc create sa logging-collector -n openshift-logging
----

. Bind the `Cluster Role` role to the service account to be able to write the logs to the Red{nbsp}Hat LokiStack
"Cluster Role" should probably be written as one word, "ClusterRole", and likewise "service account" as "serviceAccount".
Then it should probably read: "Bind the ClusterRole to the serviceAccount to be able to write the logs".
done
$ oc adm policy add-cluster-role-to-user collect-infrastructure-log
s -z logging-collector -n openshift-logging
This should be a single line (the stray "s" is the end of the role name, collect-infrastructure-logs):
$ oc adm policy add-cluster-role-to-user collect-infrastructure-logs -z logging-collector -n openshift-logging
fixed
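A minimal sketch for double-checking the exact role names before binding them, assuming the default roles shipped with the operators:
$ oc get clusterroles | grep -E 'collect-(application|audit|infrastructure)-logs|logging-collector-logs-writer'   # role names are assumptions based on the defaults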
* You installed the {oc-first}.

.Procedure
. Make each step an instruction.
In the article https://access.redhat.com/articles/7089860, step 5 describes CLI steps. This needs to be verified by engineering.
It also needs to be highlighted that once the "old" visualization is removed, and until the COO UI plugin is configured, the console log visualization will be lost.
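For context, a minimal sketch of the UI plugin resource that restores the console log view after the migration, assuming the Cluster Observability Operator is installed and the LokiStack instance is named logging-loki (API version and names are assumptions, not confirmed by this PR):
$ oc apply -f - <<EOF
apiVersion: observability.openshift.io/v1alpha1
kind: UIPlugin
metadata:
  name: logging
spec:
  type: Logging
  logging:
    lokiStack:
      name: logging-loki   # assumed LokiStack instance name
EOF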
. Go to the *Administration* -> *Custom Resource Definitions* page.

. Click the Options menu {kebab} next to *Elasticsearch*, and select *Delete Custom Resource Definition*.
This is missing deleting the Kibana CRD and also deleting the Elasticsearch PVCs.
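A minimal sketch of the equivalent CLI cleanup, assuming the default openshift-logging namespace (verify which PVCs belong to Elasticsearch before deleting anything):
$ oc delete crd kibanas.logging.openshift.io
$ oc delete crd elasticsearches.logging.openshift.io
$ oc get pvc -n openshift-logging                               # identify the Elasticsearch PVCs
$ oc delete pvc <elasticsearch_pvc_name> -n openshift-logging   # placeholder name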
. Click the Options menu {kebab} next to *Elasticsearch*, and select *Delete Custom Resource Definition*.

. Delete the object storage secret.
Elasticsearch does not have an "object storage secret".
removed
+
[source,terminal]
----
$ oc delete clusterlogging <CR name> -n <namespace>
Probably, it would be good to provide the command to list them all beforehand. It is:
$ oc get clusterloggings.logging.openshift.io -A
+
[source,terminal]
----
$ oc get pods -l component=collector -n <namespace>
We can do it this way, or we can list across all namespaces, which sounds better because nobody will verify namespace by namespace:
$ oc get pods -l component=collector -A
done
+
[source,terminal]
----
$ oc get pods -l component=collector -n <namespace>
This command checks the previous step again (whether pods are running), but not whether a ClusterLogForwarder CR exists. I'd suggest using:
$ oc get clusterlogforwarders.logging.openshift.io -A
done
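Once the new resource is deployed, a minimal sketch for confirming that a forwarder exists under the new observability API group, assuming the Logging 6 operator is already installed:
$ oc get clusterlogforwarders.observability.openshift.io -A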
  namespace: openshift-logging
spec:
  serviceAccount:
    name: collector
In the steps, it's documented to create the serviceAccount "logging-collector", but here it's set to "collector", so this won't work.
If the legacy serviceAccount is used, it should be "logcollector"; if a new serviceAccount is created, the name set in the steps here is "logging-collector".
changed to <Name_of_service_account>
[id="deploying-a-clusterlogforwarder-observability-custom-resource_{context}"] | ||
= Deploying a ClusterLogForwarder observability custom resource | ||
|
||
Deploying a ClusterLogForwarder observability custom resource (CR) by using the {oc-first} command. |
This example is only valid if LokiStack is installed and you want to store the logs on it. If someone is not storing the logs in the Red Hat log store, this example is not valid, so this needs to be highlighted. It's similar to how it's documented in the article https://access.redhat.com/articles/7089860, where it says:
"Assuming the current stack looks like the below that represents a fully managed OpenShift Logging stack (Vector, Loki) including collection, forwarding and storage."
/test all
@theashiot: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Version(s): 4.18
Issue: OBSDOCS-1701
Link to docs preview:
QE review:
Additional information: