docs: various small improvements #763

Merged
merged 6 commits into from Sep 30, 2024
12 changes: 8 additions & 4 deletions docs/modules/kafka/pages/getting_started/first_steps.adoc
@@ -1,7 +1,9 @@
= First steps
:description: Deploy and verify a Kafka cluster on Kubernetes with Stackable Operators, including ZooKeeper setup and data testing using kcat.
:kcat-install: https://github.com/edenhill/kcat#install

After going through the xref:getting_started/installation.adoc[] section and having installed all the operators, you will now deploy a Kafka cluster and the required dependencies. Afterwards you can <<_verify_that_it_works, verify that it works>> by producing test data into a topic and consuming it.
After going through the xref:getting_started/installation.adoc[] section and having installed all the operators, you now deploy a Kafka cluster and the required dependencies.
Afterward you can <<_verify_that_it_works, verify that it works>> by producing test data into a topic and consuming it.

== Setup

@@ -10,7 +12,8 @@ Two things need to be installed to create a Kafka cluster:
* A ZooKeeper instance for internal use by Kafka
* The Kafka cluster itself

We will create them in this order, each one is created by applying a manifest file. The operators you just installed will then create the resources according to the manifest.
Create them in this order by applying the corresponding manifest files.
The operators you just installed then create the resources according to the manifest.
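
The exact manifests are shown in the following sections and shipped with the getting-started script.
As a rough sketch of their general shape only (the resource names, product versions and field layout below are assumptions, not copied from the included files), they look roughly like this:

[source,yaml]
----
# Sketch only: names, versions and field names are assumptions, use the included manifests.
---
apiVersion: zookeeper.stackable.tech/v1alpha1
kind: ZookeeperCluster
metadata:
  name: simple-zk           # hypothetical name
spec:
  image:
    productVersion: "3.8.4" # assumed version
  servers:
    roleGroups:
      default:
        replicas: 1
---
apiVersion: kafka.stackable.tech/v1alpha1
kind: KafkaCluster
metadata:
  name: simple-kafka        # hypothetical name
spec:
  image:
    productVersion: "3.7.1" # assumed version
  clusterConfig:
    zookeeperConfigMapName: simple-kafka-znode # discovery ConfigMap of a ZookeeperZnode (assumed field name)
  brokers:
    roleGroups:
      default:
        replicas: 1
----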

=== ZooKeeper

@@ -58,11 +61,12 @@ and apply it:
include::example$getting_started/getting_started.sh[tag=install-kafka]
----

This will create the actual Kafka instance.
This creates the actual Kafka instance.

== Verify that it works

Next you will produce data into a topic and read it via https://github.com/edenhill/kcat#install[kcat]. Depending on your platform you may need to replace `kafkacat` in the commands below with `kcat`.
Next you produce data into a topic and read it via {kcat-install}[kcat].
Depending on your platform you may need to replace `kafkacat` in the commands below with `kcat`.
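
As a preview of the kind of commands used below (the broker address and topic name here are placeholders; use the address exposed by your cluster):

[source,bash]
----
# Produce a couple of test records (placeholder broker address and topic)
echo "some test data" | kcat -b <broker-address>:9092 -t test-data-topic -P

# Consume them again and exit at the end of the partition
kcat -b <broker-address>:9092 -t test-data-topic -C -e
----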

First, make sure that all the Pods in the StatefulSets are ready:

8 changes: 4 additions & 4 deletions docs/modules/kafka/pages/getting_started/index.adoc
@@ -1,19 +1,19 @@
= Getting started
:description: Start with Apache Kafka using Stackable Operator: Install, set up Kafka, and manage topics in a Kubernetes cluster.

This guide will get you started with Apache Kafka using the Stackable Operator.
It will guide you through the installation of the Operator and its dependencies, setting up your first Kafka instance and create, write to and read from a topic.
This guide gets you started with Apache Kafka using the Stackable Operator.
It guides you through the installation of the Operator and its dependencies, setting up your first Kafka instance, and creating, writing to and reading from a topic.

== Prerequisites

You will need:
You need:

* a Kubernetes cluster
* kubectl
* optional: Helm
* https://github.com/edenhill/kcat#install[kcat] for testing
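
A quick, optional sanity check that these tools are available (standard version flags, nothing Stackable-specific):

[source,bash]
----
kubectl version --client   # Kubernetes CLI
helm version               # only needed if you install the operators via Helm
kcat -V                    # or: kafkacat -V, depending on your platform
----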

Resource sizing depends on cluster type(s), usage and scope, but as a starting point we recommend a minimum of the following resources for this operator:
Resource sizing depends on cluster type(s), usage and scope, but as a starting point a minimum of the following resources is recommended for this operator:

* 0.2 cores (e.g. i5 or similar)
* 256MB RAM
38 changes: 20 additions & 18 deletions docs/modules/kafka/pages/getting_started/installation.adoc
@@ -1,21 +1,20 @@
= Installation
:description: Install Stackable Operator for Apache Kafka using stackablectl or Helm, including dependencies like ZooKeeper and required operators for Kubernetes.

On this page you will install the Stackable Operator for Apache Kafka and operators for its dependencies - ZooKeeper -
Install the Stackable Operator for Apache Kafka and operators for its dependencies -- ZooKeeper --
as well as the commons, secret and listener operators, which are required by all Stackable Operators.

== Stackable Operators
There are multiple ways to install the Stackable Operator for Apache Kafka.
xref:management:stackablectl:index.adoc[] is the preferred way, but Helm is also supported.
OpenShift users may prefer installing the operator from the RedHat Certified Operator catalog using the OpenShift web console.

There are 2 ways to install Stackable Operators:

. Using xref:management:stackablectl:index.adoc[stackablectl]
. Using Helm

=== stackablectl

The `stackablectl` command line tool is the recommended way to interact with operators and dependencies. Follow the
xref:management:stackablectl:installation.adoc[installation steps] for your platform if you choose to work with
`stackablectl`.
[tabs]
====
stackablectl::
+
--
The `stackablectl` command line tool is the recommended way to interact with operators and dependencies.
Follow the xref:management:stackablectl:installation.adoc[installation steps] for your platform if you choose to work with `stackablectl`.

After you have installed `stackablectl`, run the following command to install all operators necessary for Kafka:

@@ -24,16 +23,18 @@ After you have installed `stackablectl`, run the following command to install al
include::example$getting_started/getting_started.sh[tag=stackablectl-install-operators]
----
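
For readers of this diff, the included script essentially boils down to a single `stackablectl` call along these lines (the exact operator list and any pinned versions come from the included file; the command below is only a sketch):

[source,bash]
----
# Sketch: operator names assumed from the standard Kafka getting-started flow
stackablectl operator install commons secret listener zookeeper kafka
----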

The tool will show
The tool prints

[source]
include::example$getting_started/install_output.txt[]

TIP: Consult the xref:management:stackablectl:quickstart.adoc[] to learn more about how to use `stackablectl`.
--

=== Helm

You can also use Helm to install the operators. Add the Stackable Helm repository:
Helm::
+
--
Add the Stackable Helm repository:

[source,bash]
----
@@ -47,8 +48,9 @@ Then install the Stackable Operators:
include::example$getting_started/getting_started.sh[tag=helm-install-operators]
----
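
The two included snippets roughly correspond to the following commands (the repository URL and chart names are assumptions based on the public Stackable Helm repository):

[source,bash]
----
# Add the repository (URL assumed)
helm repo add stackable-stable https://repo.stackable.tech/repository/helm-stable/
helm repo update

# Install the operators required for Kafka (chart names assumed)
helm install commons-operator stackable-stable/commons-operator
helm install secret-operator stackable-stable/secret-operator
helm install listener-operator stackable-stable/listener-operator
helm install zookeeper-operator stackable-stable/zookeeper-operator
helm install kafka-operator stackable-stable/kafka-operator
----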

Helm will deploy the operators in a Kubernetes Deployment and apply the CRDs for the Apache Kafka service (as well as
the CRDs for the required operators). You are now ready to deploy Apache Kafka in Kubernetes.
Helm deploys the operators in a Kubernetes Deployment and applies the CRDs for the Apache Kafka service (as well as the CRDs for the required operators).
--
====

== What's next

4 changes: 2 additions & 2 deletions docs/modules/kafka/pages/index.adoc
@@ -21,7 +21,7 @@ It is commonly used for real-time data processing, data ingestion, event streami

== Getting started

Follow the xref:kafka:getting_started/index.adoc[] which will guide you through installing The Stackable Kafka and ZooKeeper operators, setting up ZooKeeper and Kafka and testing your Kafka using `kcat`.
Follow the xref:kafka:getting_started/index.adoc[], which guides you through installing the Stackable Kafka and ZooKeeper operators, setting up ZooKeeper and Kafka, and testing your Kafka using `kcat`.

== Resources

@@ -45,7 +45,7 @@ Kafka requires xref:zookeeper:index.adoc[Apache ZooKeeper] for coordination purp
== Connections to other products

Since Kafka often takes on a bridging role, many other products connect to it.
In the <<demos, demos>> below you will find example data pipelines that use xref:nifi:index.adoc[Apache NiFi with the Stackable operator] to write to Kafka and xref:nifi:index.adoc[Apache Druid with the Stackable operator] to read from Kafka.
In the <<demos, demos>> below you find example data pipelines that use xref:nifi:index.adoc[Apache NiFi with the Stackable operator] to write to Kafka and xref:druid:index.adoc[Apache Druid with the Stackable operator] to read from Kafka.
But you can also connect using xref:spark-k8s:index.adoc[Apache Spark] or with a custom Job written in various languages.

== [[demos]]Demos
@@ -23,7 +23,7 @@ stackable-kafka-operator run --product-config /foo/bar/properties.yaml

*Multiple values:* false

The operator will **only** watch for resources in the provided namespace `test`:
The operator **only** watches for resources in the provided namespace `test`:

[source]
----
@@ -36,7 +36,7 @@ docker run \

*Multiple values:* false

The operator will **only** watch for resources in the provided namespace `test`:
The operator **only** watches for resources in the provided namespace `test`:

[source]
----
@@ -2,16 +2,16 @@

The cluster definition also supports overriding configuration properties and environment variables, either per role or per role group, where the more specific override (role group) has precedence over the less specific one (role).

IMPORTANT: Overriding certain properties which are set by operator (such as the ports) can interfere with the operator and can lead to problems.
IMPORTANT: Overriding operator-set properties (such as the ports) can interfere with the operator and can lead to problems.

== Configuration Properties

For a role or role group, at the same level as `config`, you can specify `configOverrides` for the following files:

- `server.properties`
- `security.properties`
* `server.properties`
* `security.properties`

For example, if you want to set the `auto.create.topics.enable` to disable automatic topic creation, it can be configured in the `KafkaCluster` resource like so:
For example, if you want to set `auto.create.topics.enable` to `false` to disable automatic topic creation, it can be configured in the KafkaCluster resource like so:

[source,yaml]
----
@@ -43,9 +43,13 @@ For a full list of configuration options we refer to the Apache Kafka https://ka

=== The security.properties file

The `security.properties` file is used to configure JVM security properties. It is very seldom that users need to tweak any of these, but there is one use-case that stands out, and that users need to be aware of: the JVM DNS cache.
The `security.properties` file is used to configure JVM security properties.
It is very seldom that users need to tweak any of these, but there is one use-case that stands out, and that users need to be aware of: the JVM DNS cache.

The JVM manages it's own cache of successfully resolved host names as well as a cache of host names that cannot be resolved. Some products of the Stackable platform are very sensible to the contents of these caches and their performance is heavily affected by them. As of version 3.4.0 Apache Kafka performs poorly if the positive cache is disabled. To cache resolved host names, you can configure the TTL of entries in the positive cache like this:
The JVM manages its own cache of successfully resolved host names as well as a cache of host names that cannot be resolved.
Some products of the Stackable platform are very sensitive to the contents of these caches and their performance is heavily affected by them.
As of version 3.4.0 Apache Kafka performs poorly if the positive cache is disabled.
To cache resolved host names, you can configure the TTL of entries in the positive cache like this:

[source,yaml]
----
@@ -6,8 +6,8 @@ You can configure the graceful shutdown as described in xref:concepts:operations

As a default, Kafka brokers have `30 minutes` to shut down gracefully.
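
If you need a different timeout, it can be overridden per role in the KafkaCluster resource; the sketch below assumes the `gracefulShutdownTimeout` field name used elsewhere on the platform:

[source,yaml]
----
spec:
  brokers:
    config:
      gracefulShutdownTimeout: 30m  # assumed field name; adjust the duration as needed
----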

The Kafka broker process will receive a `SIGTERM` signal when Kubernetes wants to terminate the Pod.
After the graceful shutdown timeout runs out, and the process still didn't exit, Kubernetes will issue a `SIGKILL` signal.
The Kafka broker process receives a `SIGTERM` signal when Kubernetes wants to terminate the Pod.
After the graceful shutdown timeout runs out, and the process is still running, Kubernetes issues a `SIGKILL` signal.

This is equivalent to executing the `bin/kafka-server-stop.sh` command, which internally executes `kill <kafka-pid>` (https://github.com/apache/kafka/blob/2c6fb6c54472e90ae17439e62540ef3cb0426fe3/bin/kafka-server-stop.sh#L34[code]).

@@ -25,15 +25,15 @@ The broker logs the received signal as shown in the log below:

The https://kafka.apache.org/35/documentation/#basic_ops_restarting[Kafka documentation] does a very good job at explaining what happens during a graceful shutdown:

The Kafka cluster will automatically detect any broker shutdown or failure and elect new leaders for the partitions on that machine.
This will occur if either a server fails or it is brought down intentionally for maintenance or configuration changes.
The Kafka cluster automatically detects any broker shutdown or failure and elects new leaders for the partitions on that machine.
This occurs if either a server fails or it is brought down intentionally for maintenance or configuration changes.
For the latter cases Kafka supports a more graceful mechanism for stopping a server than just killing it.
When a server is stopped gracefully it has two optimizations it will take advantage of:
When a server is stopped gracefully it has two optimizations it takes advantage of:

1. It will sync all its logs to disk to avoid the need for any log recovery when it restarts (i.e. validating the checksum for all messages in the tail of the log). Log recovery takes time, so this speeds up intentional restarts.
2. It will migrate any partitions the broker is the leader of, to other replicas prior to shutting down. This will make the leadership transfer faster and minimize the time each partition is unavailable to a few milliseconds.
1. It syncs all its logs to disk to avoid the need for any log recovery when it restarts (i.e. validating the checksum for all messages in the tail of the log). Log recovery takes time, so this speeds up intentional restarts.
2. It migrates any partitions the broker is the leader of to other replicas prior to shutting down. This makes the leadership transfer faster and minimizes the time each partition is unavailable to a few milliseconds.

Note that controlled shutdown will only succeed if all the partitions hosted on the broker have replicas (i.e. the replication factor is greater than 1 and at least one of these replicas is alive).
Note that controlled shutdown only succeeds if all the partitions hosted on the broker have replicas (i.e. the replication factor is greater than 1 and at least one of these replicas is alive).
This is generally what you want since shutting down the last replica would make that topic partition unavailable.

This operator takes care of that by only allowing a certain number of brokers to be offline as described in xref:usage-guide/operations/pod-disruptions.adoc[].
2 changes: 1 addition & 1 deletion docs/modules/kafka/pages/usage-guide/operations/index.adoc
@@ -2,4 +2,4 @@

This section of the documentation is intended for the operations teams that maintain a Stackable Data Platform installation.

Please read the xref:concepts:operations/index.adoc[Concepts page on Operations] that contains the necessary details to operate the platform in a production environment.
Read the xref:concepts:operations/index.adoc[Concepts page on Operations] that contains the necessary details to operate the platform in a production environment.
@@ -3,8 +3,8 @@

You can configure the permitted Pod disruptions for Kafka nodes as described in xref:concepts:operations/pod_disruptions.adoc[].

Unless you configure something else or disable our PodDisruptionBudgets (PDBs), we write the following PDBs:
Unless you configure something else or disable the default PodDisruptionBudgets (PDBs), the operator writes the following PDBs:

== Brokers
We only allow a single Broker to be offline at any given time, regardless of the number of replicas or `roleGroups`.
Only a single Broker is allowed to be offline at any given time, regardless of the number of replicas or `roleGroups`.
This is because we can not make any assumptions about topic replication factors.
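
The written PDB is a plain Kubernetes PodDisruptionBudget roughly like the following (the name and label selector are assumptions for illustration):

[source,yaml]
----
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: simple-kafka-broker        # hypothetical name
spec:
  maxUnavailable: 1                # only one Broker may be voluntarily disrupted at a time
  selector:
    matchLabels:                   # labels assumed for illustration
      app.kubernetes.io/name: kafka
      app.kubernetes.io/instance: simple-kafka
      app.kubernetes.io/component: broker
----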
14 changes: 9 additions & 5 deletions docs/modules/kafka/pages/usage-guide/operations/znode-id.adoc
@@ -1,8 +1,9 @@
= Cluster ID

Kafka has an internal check to ensure that a broker cannot join a different cluster to the one in which it was previously registered (this is important to avoid various kinds of metadata inconsistencies in the cluster). The clusterId is stored locally after initial registration and is verified upon cluster startup that it still matches what is in ZooKeeper.
Kafka has an internal check to ensure that a broker cannot join a different cluster to the one in which it was previously registered (this is important to avoid various kinds of metadata inconsistencies in the cluster).
The clusterId is stored locally after initial registration and is verified on cluster startup to still match what is stored in ZooKeeper.

This clusterId is stored in the `meta.properties` file in the folder specified by the `log.dirs` setting: this is persisted on a PVC created by Kafka. This PVC is not removed when the Kafka ZNode is deleted, which means that there are circumstances where this internal check will fail with the following error:
This clusterId is stored in the `meta.properties` file in the folder specified by the `log.dirs` setting: this is persisted on a PVC created by Kafka.
This PVC is not removed when the Kafka ZNode is deleted, which means that there are circumstances where this internal check fails with the following error:

[source,bash]
----
@@ -13,12 +14,15 @@ The Cluster ID <new Cluster ID> doesn't match stored clusterId <old Cluster ID>

=== Restarting a Kafka cluster

When re-starting a Kafka cluster, ensure that the Kafka ZNode is not removed: upon restart the cluster will attempt to register with the ZooKeeper cluster referenced in the ZNode and will check that the cluster IDs match. As the `meta.properties` file has not been changed this should not cause any problems.
When re-starting a Kafka cluster, ensure that the Kafka ZNode is not removed: upon restart the cluster attempts to register with the ZooKeeper cluster referenced in the ZNode and checks that the cluster IDs match.
As the `meta.properties` file has not been changed this should not cause any problems.
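
The Kafka ZNode referred to here is typically a ZookeeperZnode resource along these lines (the names are placeholders and the field layout is an assumption):

[source,yaml]
----
apiVersion: zookeeper.stackable.tech/v1alpha1
kind: ZookeeperZnode
metadata:
  name: simple-kafka-znode   # hypothetical name, referenced by the KafkaCluster
spec:
  clusterRef:
    name: simple-zk          # the ZookeeperCluster this ZNode lives in
----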

=== Replacing an existing ZNode

If the ZNode has been removed, then the Kafka PVC prefixed with `log-dirs-` will also have to be removed. This will result in the loss of topic metadata but is unavoidable since Kafka will need to re-register with ZooKeeper anyway. For instance, this will apply when breaking changes have been made to the ZooKeeper operator.
If the ZNode has been removed, then the Kafka PVC prefixed with `log-dirs-` also has to be removed.
This results in the loss of topic metadata but is unavoidable since Kafka needs to re-register with ZooKeeper anyway.
For instance, this applies when breaking changes have been made to the ZooKeeper operator.
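
A hedged sketch of the cleanup (the PVC names are hypothetical; they depend on your cluster and role group names):

[source,bash]
----
# List the Kafka data PVCs
kubectl get pvc | grep log-dirs-

# Delete the data PVC of each broker Pod, for example:
kubectl delete pvc log-dirs-simple-kafka-broker-default-0
----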

=== Updating the SDP release

Depending on the scope of any breaking changes, it may be possible to upgrade SDP and re-create clusters without having to touch the Kafka PVCs. In cases where deleting the aforementioned PVC is unavoidable this will also result in the loss of topic offset metadata.
Depending on the scope of any breaking changes, it may be possible to upgrade SDP and re-create clusters without having to touch the Kafka PVCs.
In cases where deleting the aforementioned PVC is unavoidable this also results in the loss of topic offset metadata.
2 changes: 1 addition & 1 deletion docs/modules/kafka/pages/usage-guide/security.adoc
@@ -105,7 +105,7 @@ spec:
== [[authorization]]Authorization

If you wish to include integration with xref:opa:index.adoc[Open Policy Agent] and already have an OPA cluster, then you can include an `opa` field pointing to the OPA cluster discovery `ConfigMap` and the required package.
The package is optional and will default to the `metadata.name` field:
The package is optional and defaults to the `metadata.name` field:

[source,yaml]
----
7 changes: 4 additions & 3 deletions docs/modules/kafka/pages/usage-guide/storage-resources.adoc
@@ -17,9 +17,9 @@ brokers:
capacity: 2Gi
----

In the above example, all Kafka brokers in the default group will store data (the location of the property `log.dirs`) on a `2Gi` volume.
In the above example, all Kafka brokers in the default group store data (the location of the property `log.dirs`) on a `2Gi` volume.

If nothing is configured in the custom resource for a certain role group, then by default each Pod will have a `1Gi` large local volume mount for the data location.
If nothing is configured in the custom resource for a certain role group, then by default each Pod gets a local `1Gi` volume mount for the data location.
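
The storage setting shown above is one part of a broader `resources` block that can be set per role or role group; a sketch follows (field names under `resources` are assumptions and the values are only illustrative):

[source,yaml]
----
brokers:
  config:
    resources:
      cpu:
        min: 250m
        max: "1"
      memory:
        limit: 1Gi
      storage:
        logDirs:           # assumed key for the data volume
          capacity: 2Gi
----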

== Resource Requests

@@ -32,7 +32,8 @@ A minimal HA setup consisting of 2 Brokers has the following https://kubernetes
* `2560Mi` memory request and limit
* `4Gi` persistent storage

Of course, additional services, require additional resources. For Stackable components, see the corresponding documentation on further resource requirements.
Of course, additional services require additional resources.
For Stackable components, see the corresponding documentation on further resource requirements.

Corresponding to the values above, the operator uses the following resource defaults:
