Commit ee010b6

Author: Felix Hennig
1 parent e9fde01 commit ee010b6

16 files changed: +88 -77 lines changed

docs/modules/hdfs/pages/getting_started/first_steps.adoc

Lines changed: 4 additions & 4 deletions
@@ -1,7 +1,7 @@
 = First steps
 :description: Deploy and verify an HDFS cluster with Stackable by setting up Zookeeper and HDFS components, then test file operations using WebHDFS API.

-Once you have followed the steps in the xref:getting_started/installation.adoc[] section to install the operator and its dependencies, you will now deploy an HDFS cluster and its dependencies.
+Once you have followed the steps in the xref:getting_started/installation.adoc[] section to install the operator and its dependencies, now deploy an HDFS cluster and its dependencies.
 Afterward, you can <<_verify_that_it_works, verify that it works>> by creating, verifying and deleting a test file in HDFS.

 == Setup
@@ -13,7 +13,7 @@ To deploy a Zookeeper cluster create one file called `zk.yaml`:
 [source,yaml]
 include::example$getting_started/zk.yaml[]

-We also need to define a ZNode that will be used by the HDFS cluster to reference Zookeeper.
+Define a ZNode that is used by the HDFS cluster to reference Zookeeper.
 Create another file called `znode.yaml`:

 [source,yaml]
@@ -94,7 +94,7 @@ Then use `curl` to issue a `PUT` command:
 [source]
 include::example$getting_started/getting_started.sh[tag=create-file]

-This will return a location that will look something like this:
+This returns a location that looks similar to this:

 [source]
 http://simple-hdfs-datanode-default-0.simple-hdfs-datanode-default.default.svc.cluster.local:9864/webhdfs/v1/testdata.txt?op=CREATE&user.name=stackable&namenoderpcaddress=simple-hdfs&createflag=&createparent=true&overwrite=false
@@ -109,7 +109,7 @@ Rechecking the status again with:
 [source]
 include::example$getting_started/getting_started.sh[tag=file-status]

-will now display some metadata about the file that was created in the HDFS cluster:
+now displays some metadata about the file that was created in the HDFS cluster:

 [source,json]
 {

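The `zk.yaml` and `znode.yaml` contents referenced above are pulled in via `include::` directives rather than shown in the diff. For orientation only, a ZookeeperZnode manifest of that kind typically looks roughly like the following sketch; it assumes the Stackable ZooKeeper operator's `zookeeper.stackable.tech/v1alpha1` API, and the names `simple-hdfs-znode` and `simple-zk` are placeholders rather than values from this commit.

[source,yaml]
----
# Illustrative sketch only; the authoritative manifest is the included znode.yaml.
apiVersion: zookeeper.stackable.tech/v1alpha1
kind: ZookeeperZnode
metadata:
  name: simple-hdfs-znode   # placeholder name
spec:
  clusterRef:
    name: simple-zk         # placeholder reference to the ZookeeperCluster
----

Such a manifest is applied like any other resource, for example with `kubectl apply -f znode.yaml`.
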
docs/modules/hdfs/pages/getting_started/index.adoc

Lines changed: 4 additions & 4 deletions
@@ -1,18 +1,18 @@
 = Getting started
 :description: Start with HDFS using the Stackable Operator. Install the Operator, set up your HDFS cluster, and verify its operation with this guide.

-This guide will get you started with HDFS using the Stackable Operator.
-It will guide you through the installation of the Operator and its dependencies, setting up your first HDFS cluster and verifying its operation.
+This guide gets you started with HDFS using the Stackable operator.
+It guides you through the installation of the operator and its dependencies, setting up your first HDFS cluster and verifying its operation.

 == Prerequisites

-You will need:
+You need:

 * a Kubernetes cluster
 * kubectl
 * optional: Helm

-Resource sizing depends on cluster type(s), usage and scope, but as a starting point we recommend a minimum of the following resources for this operator:
+Resource sizing depends on cluster type(s), usage and scope, but as a starting point the following resources are recommended as a minimum requirement for this operator:

 * 0.2 cores (e.g. i5 or similar)
 * 256MB RAM

docs/modules/hdfs/pages/getting_started/installation.adoc

Lines changed: 21 additions & 18 deletions
@@ -1,39 +1,41 @@
11
= Installation
22
:description: Install the Stackable HDFS operator and dependencies using stackablectl or Helm. Follow steps for setup and verification in Kubernetes.
3+
:kind: https://kind.sigs.k8s.io/
34

4-
On this page you will install the Stackable HDFS operator and its dependency, the Zookeeper operator, as well as the
5+
Install the Stackable HDFS operator and its dependency, the Zookeeper operator, as well as the
56
commons, secret and listener operators which are required by all Stackable operators.
67

7-
== Stackable Operators
8-
9-
There are 2 ways to run Stackable Operators
10-
11-
. Using xref:management:stackablectl:index.adoc[]
12-
. Using Helm
13-
14-
=== stackablectl
8+
There are multiple ways to install the Stackable operators.
9+
xref:management:stackablectl:index.adoc[] is the preferred way but Helm is also supported.
10+
OpenShift users may prefer installing the operator from the RedHat Certified Operator catalog using the OpenShift web console.
1511

12+
[tabs]
13+
====
14+
stackablectl::
15+
+
16+
--
1617
`stackablectl` is the command line tool to interact with Stackable operators and our recommended way to install
1718
operators. Follow the xref:management:stackablectl:installation.adoc[installation steps] for your platform.
1819
19-
After you have installed `stackablectl`, run the following command to install all operators necessary for the HDFS
20-
cluster:
20+
After you have installed `stackablectl`, run the following command to install all operators necessary for the HDFS cluster:
2121
2222
[source,bash]
2323
----
2424
include::example$getting_started/getting_started.sh[tag=stackablectl-install-operators]
2525
----
2626
27-
The tool will show
27+
The tool prints
2828
2929
[source]
3030
include::example$getting_started/install_output.txt[]
3131
32-
TIP: Consult the xref:management:stackablectl:quickstart.adoc[] to learn more about how to use `stackablectl`. For
33-
example, you can use the `--cluster kind` flag to create a Kubernetes cluster with link:https://kind.sigs.k8s.io/[kind].
34-
35-
=== Helm
32+
TIP: Consult the xref:management:stackablectl:quickstart.adoc[] to learn more about how to use `stackablectl`.
33+
For example, you can use the `--cluster kind` flag to create a Kubernetes cluster with {kind}[kind].
34+
--
3635
36+
Helm::
37+
+
38+
--
3739
You can also use Helm to install the operators. Add the Stackable Helm repository:
3840
[source,bash]
3941
----
@@ -46,8 +48,9 @@ Then install the Stackable Operators:
4648
include::example$getting_started/getting_started.sh[tag=helm-install-operators]
4749
----
4850
49-
Helm will deploy the operators in a Kubernetes Deployment and apply the CRDs for the HDFS cluster (as well as the CRDs
50-
for the required operators). You are now ready to deploy HDFS in Kubernetes.
51+
Helm deploys the operators in a Kubernetes Deployment and applies the CRDs for the HDFS cluster (as well as the CRDs for the required operators).
52+
--
53+
====
5154

5255
== What's next
5356

docs/modules/hdfs/pages/index.adoc

Lines changed: 1 addition & 3 deletions
@@ -18,9 +18,7 @@ The operator depends on the xref:zookeeper:index.adoc[] to operate a ZooKeeper c

 == Getting started

-Follow the xref:getting_started/index.adoc[Getting started guide] which will guide you through installing the Stackable
-HDFS and ZooKeeper operators, setting up ZooKeeper and HDFS and writing a file to HDFS to verify that everything is set
-up correctly.
+Follow the xref:getting_started/index.adoc[Getting started guide] which guides you through installing the Stackable HDFS and ZooKeeper operators, setting up ZooKeeper and HDFS and writing a file to HDFS to verify that everything is set up correctly.

 Afterwards you can consult the xref:usage-guide/index.adoc[] to learn more about tailoring your HDFS configuration to
 your needs, or have a look at the <<demos, demos>> for some example setups.

docs/modules/hdfs/pages/reference/commandline-parameters.adoc

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ stackable-hdfs-operator run --product-config /foo/bar/properties.yaml

 *Multiple values:* false

-The operator will **only** watch for resources in the provided namespace `test`:
+The operator **only** watches for resources in the provided namespace `test`:

 [source]
 ----

docs/modules/hdfs/pages/reference/environment-variables.adoc

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ docker run \

 *Multiple values:* false

-The operator will **only** watch for resources in the provided namespace `test`:
+The operator **only** watches for resources in the provided namespace `test`:

 [source]
 ----

docs/modules/hdfs/pages/usage-guide/configuration-environment-overrides.adoc

Lines changed: 9 additions & 2 deletions
@@ -50,7 +50,8 @@ nameNodes:
 replicas: 2
 ----

-All override property values must be strings. The properties will be formatted and escaped correctly into the XML file.
+All override property values must be strings.
+The properties are formatted and escaped correctly into the XML file.

 For a full list of configuration options we refer to the Apache Hdfs documentation for https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml[hdfs-site.xml] and https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/core-default.xml[core-site.xml].

@@ -117,4 +118,10 @@ nameNodes:
 replicas: 1
 ----

-IMPORTANT: Some environment variables will be overriden by the operator and cannot be set manually by the user. These are `HADOOP_HOME`, `HADOOP_CONF_DIR`, `POD_NAME` and `ZOOKEEPER`.
+IMPORTANT: Some environment variables are overridden by the operator and cannot be set manually by the user.
+These are `HADOOP_HOME`, `HADOOP_CONF_DIR`, `POD_NAME` and `ZOOKEEPER`.
+
+== Pod overrides
+
+The HDFS operator also supports Pod overrides, allowing you to override any property that you can set on a Kubernetes Pod.
+Read the xref:concepts:overrides.adoc#pod-overrides[Pod overrides documentation] to learn more about this feature.

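As a rough illustration of the override styles covered on that page, a role group can combine config and Pod overrides along these lines. This is a minimal sketch assuming the usual Stackable `roleGroups` layout; the property `dfs.namenode.handler.count`, the label and all values are made-up examples, not content from this commit.

[source,yaml]
----
nameNodes:
  roleGroups:
    default:
      replicas: 2
      # configOverrides values must be strings; they are rendered into the named XML file.
      configOverrides:
        hdfs-site.xml:
          dfs.namenode.handler.count: "100"   # illustrative property and value
      # podOverrides takes a PodTemplateSpec fragment that is merged into the generated Pods.
      podOverrides:
        metadata:
          labels:
            team: data-platform                # illustrative label
----
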
docs/modules/hdfs/pages/usage-guide/fuse.adoc

Lines changed: 6 additions & 6 deletions
@@ -7,9 +7,9 @@ FUSE is short for _Filesystem in Userspace_ and allows a user to export a filesy
 HDFS contains a native FUSE driver/application, which means that an existing HDFS filesystem can be mounted into a Linux environment.

 To use the FUSE driver you can either copy the required files out of the image and run it on a host outside of Kubernetes or you can run it in a Pod.
-This Pod, however, will need some extra capabilities.
+This Pod, however, needs some extra capabilities.

-This is an example Pod that will work _as long as the host system that is running the kubelet does support FUSE_:
+This is an example Pod that works _as long as the host system running the kubelet supports FUSE_:

 [source,yaml]
 ----
@@ -57,7 +57,7 @@ securityContext:
 ----

 Unfortunately, there is no way around some extra privileges.
-In Kubernetes the Pods usually share the Kernel with the host running the Kubelet, which means a Pod wanting to use FUSE will need access to the underlying Kernel modules.
+In Kubernetes the Pods usually share the Kernel with the host running the Kubelet, which means a Pod wanting to use FUSE needs access to the underlying Kernel modules.
 ====

 Inside this Pod you can get a shell (e.g. using `kubectl exec --stdin --tty hdfs-fuse -- /bin/bash`) to get access to a script called `fuse_dfs_wrapper` (it is in the `PATH` of our Hadoop images).
@@ -70,14 +70,14 @@ To mount HDFS call the script like this:
 ----
 fuse_dfs_wrapper dfs://<your hdfs> <target> <1> <2>

-# This will run in debug mode and stay in the foreground
+# This runs in debug mode and stays in the foreground
 fuse_dfs_wrapper -odebug dfs://<your hdfs> <target>

 # Example:
 mkdir simple-hdfs
 fuse_dfs_wrapper dfs://simple-hdfs simple-hdfs
 cd simple-hdfs
-# Any operations in this directory will now happen in HDFS
+# Any operations in this directory now happen in HDFS
 ----
 <1> Again, use the name of the HDFS service as above
-<2> `target` is the directory in which HDFS will be mounted, it must exist otherwise this command will fail
+<2> `target` is the directory in which HDFS is mounted; it must exist, otherwise this command fails

docs/modules/hdfs/pages/usage-guide/index.adoc

Lines changed: 1 addition & 1 deletion
@@ -2,6 +2,6 @@
 :description: Learn to configure and use the Stackable Operator for Apache HDFS. Ensure basic setup knowledge from the Getting Started guide before proceeding.
 :page-aliases: ROOT:usage.adoc

-This Section will help you to use and configure the Stackable Operator for Apache HDFS in various ways.
+This section helps you to use and configure the Stackable operator for Apache HDFS in various ways.
 You should already be familiar with how to set up a basic instance.
 Follow the xref:getting_started/index.adoc[] guide to learn how to set up a basic instance with all the required dependencies (for example ZooKeeper).

docs/modules/hdfs/pages/usage-guide/listenerclass.adoc

Lines changed: 1 addition & 1 deletion
@@ -19,4 +19,4 @@ spec:
 listenerClass: external-stable # <2>
 ----
 <1> DataNode listeners should prioritize having a direct connection, to minimize network transfer overhead.
-<2> NameNode listeners should prioritize having a stable address, since they will be baked into the client configuration.
+<2> NameNode listeners should prioritize having a stable address, since they are baked into the client configuration.

docs/modules/hdfs/pages/usage-guide/operations/graceful-shutdown.adoc

Lines changed: 3 additions & 3 deletions
@@ -6,9 +6,9 @@ You can configure the graceful shutdown as described in xref:concepts:operations

 As a default, JournalNodes have `15 minutes` to shut down gracefully.

-The JournalNode process will receive a `SIGTERM` signal when Kubernetes wants to terminate the Pod.
-It will log the received signal as shown in the log below and initiate a graceful shutdown.
-After the graceful shutdown timeout runs out, and the process still didn't exit, Kubernetes will issue a `SIGKILL` signal.
+The JournalNode process receives a `SIGTERM` signal when Kubernetes wants to terminate the Pod.
+It logs the received signal as shown in the log below and initiates a graceful shutdown.
+After the graceful shutdown timeout runs out and the process still didn't exit, Kubernetes issues a `SIGKILL` signal.

 https://github.com/apache/hadoop/blob/a585a73c3e02ac62350c136643a5e7f6095a3dbb/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalNode.java#L272[This] is the relevant code that gets executed in the JournalNodes as of HDFS version `3.3.4`.

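The timeout itself is set through the role configuration described in the linked operations page. The sketch below assumes the field is named `gracefulShutdownTimeout`, as in other Stackable operators; that field name is an assumption, not taken from this commit.

[source,yaml]
----
# Assumed field name; check the graceful shutdown concepts page referenced above.
spec:
  journalNodes:
    config:
      gracefulShutdownTimeout: 30m   # illustrative value, default is 15 minutes
----
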
docs/modules/hdfs/pages/usage-guide/operations/pod-disruptions.adoc

Lines changed: 12 additions & 10 deletions
@@ -3,22 +3,22 @@

 You can configure the permitted Pod disruptions for HDFS nodes as described in xref:concepts:operations/pod_disruptions.adoc[].

-Unless you configure something else or disable our PodDisruptionBudgets (PDBs), we write the following PDBs:
+Unless you configure something else or disable our PodDisruptionBudgets (PDBs), the operator writes the following PDBs:

 == JournalNodes
-We only allow a single JournalNode to be offline at any given time, regardless of the number of replicas or `roleGroups`.
+Only a single JournalNode is allowed to be offline at any given time, regardless of the number of replicas or `roleGroups`.

 == NameNodes
-We only allow a single NameNode to be offline at any given time, regardless of the number of replicas or `roleGroups`.
+Only a single NameNode is allowed to be offline at any given time, regardless of the number of replicas or `roleGroups`.

 == DataNodes
 For DataNodes the question of how many instances can be unavailable at the same time is a bit harder:
 HDFS stores your blocks on the DataNodes.
 Every block can be replicated multiple times (to multiple DataNodes) to ensure maximum availability.
 The default replication factor is `3` - which can be configured using `spec.clusterConfig.dfsReplication`. However, it is also possible to change the replication factor for a specific file or directory to something other than the cluster default.

-When you have a replication of `3`, you can safely take down 2 DataNodes, as there will always be a third DataNode holding a copy of each block currently assigned to one of the unavailable DataNodes.
-However, you need to be aware that you are now down to a single point of failure - the last of three replicas!
+When you have a replication of `3`, you can safely take down 2 DataNodes, as there is always a third DataNode holding a copy of each block currently assigned to one of the unavailable DataNodes.
+However, you need to be aware that you are now down to a single point of failure -- the last of three replicas!

 Taking this into consideration, our operator uses the following algorithm to determine the maximum number of DataNodes allowed to be unavailable at the same time:

@@ -93,13 +93,15 @@ This results e.g. in the following numbers:
 |===

 == Reduce rolling redeployment durations
-The default PDBs we write out are pessimistic and will cause the rolling redeployment to take a considerable amount of time.
-As an example, when you have 100 DataNodes and a replication factor of `3`, we can safely only take a single DataNode down at a time. Assuming a DataNode takes 1 minute to properly restart, the whole re-deployment would take 100 minutes.
+The default PDBs written out are pessimistic and cause the rolling redeployment to take a considerable amount of time.
+As an example, when you have 100 DataNodes and a replication factor of `3`, only a single DataNode can be taken offline at a time.
+Assuming a DataNode takes 1 minute to properly restart, the whole re-deployment would take 100 minutes.

 You can use the following measures to speed this up:

-1. Increase the replication factor, e.g. from `3` to `5`. In this case the number of allowed disruptions triples from `1` to `3` (assuming >= 5 DataNodes), reducing the time it takes by 66%.
-2. Increase `maxUnavailable` using the `spec.dataNodes.roleConfig.podDisruptionBudget.maxUnavailable` field as described in xref:concepts:operations/pod_disruptions.adoc[].
-3. Write your own PDBs as described in xref:concepts:operations/pod_disruptions.adoc#_using_you_own_custom_pdbs[Using you own custom PDBs].
+* Increase the replication factor, e.g. from `3` to `5`.
+In this case the number of allowed disruptions triples from `1` to `3` (assuming >= 5 DataNodes), reducing the time it takes by 66%.
+* Increase `maxUnavailable` using the `spec.dataNodes.roleConfig.podDisruptionBudget.maxUnavailable` field as described in xref:concepts:operations/pod_disruptions.adoc[].
+* Write your own PDBs as described in xref:concepts:operations/pod_disruptions.adoc#_using_you_own_custom_pdbs[Using your own custom PDBs].

 WARNING: In cases you modify or disable the default PDBs, it's your responsibility to either make sure there are enough DataNodes available or accept the risk of blocks not being available!

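Both DataNode-related settings quoted on that page sit on the HdfsCluster resource. The following minimal sketch only illustrates the two field paths mentioned above (`spec.clusterConfig.dfsReplication` and `spec.dataNodes.roleConfig.podDisruptionBudget.maxUnavailable`); the values are illustrative only.

[source,yaml]
----
spec:
  clusterConfig:
    # Cluster-wide default replication factor, as discussed above.
    dfsReplication: 5
  dataNodes:
    roleConfig:
      podDisruptionBudget:
        # Allow more DataNodes to be unavailable at once, shortening rolling redeployments.
        maxUnavailable: 3
----
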
docs/modules/hdfs/pages/usage-guide/operations/rack-awareness.adoc

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 = HDFS Rack Awareness

 Apache Hadoop supports a feature called Rack Awareness, which allows users to define a topology for the nodes making up a cluster.
-Hadoop will then use that topology to spread out replicas of blocks in a fashion that maximizes fault tolerance.
+Hadoop then uses that topology to spread out replicas of blocks in a fashion that maximizes fault tolerance.

 The default write path, for example, is to put replicas of a newly created block first on a different node, but within the same rack, and the second copy on a node in a remote rack.
 In order for this to work properly, Hadoop needs to have access to the information about the underlying infrastructure it runs on. In a Kubernetes environment, this means obtaining information from the pods or nodes of the cluster.
@@ -29,4 +29,4 @@ spec:
 ...
 ----

-Internally this will be used to create a topology label consisting of the value of the node label `topology.kubernetes.io/zone` and the pod label `app.kubernetes.io/role-group`, e.g. `/eu-central-1/rg1`.
+Internally this is used to create a topology label consisting of the value of the node label `topology.kubernetes.io/zone` and the pod label `app.kubernetes.io/role-group`, e.g. `/eu-central-1/rg1`.

docs/modules/hdfs/pages/usage-guide/resources.adoc

Lines changed: 5 additions & 5 deletions
@@ -5,7 +5,7 @@

 You can mount volumes where data is stored by specifying https://kubernetes.io/docs/concepts/storage/persistent-volumes[PersistentVolumeClaims] for each individual role group.

-In case nothing is configured in the custom resource for a certain role group, each Pod will have one volume mount with `10Gi` capacity and storage type `Disk`:
+In case nothing is configured in the custom resource for a certain role group, each Pod has one volume mount with `10Gi` capacity and storage type `Disk`:

 [source,yaml]
 ----
@@ -35,7 +35,7 @@ dataNodes:
 capacity: 128Gi
 ----

-In the above example, all DataNodes in the default group will store data (the location of `dfs.datanode.name.dir`) on a `128Gi` volume.
+In the above example, all DataNodes in the default group store data (the location of `dfs.datanode.name.dir`) on a `128Gi` volume.

 === Multiple storage volumes

@@ -61,13 +61,13 @@ dataNodes:
 capacity: 5Ti
 storageClass: premium-ssd
 hdfsStorageType: SSD
-# The default "data" PVC will still be created.
+# The default "data" PVC is still created.
 # If this is not desired then the count must be set to 0.
 data:
 count: 0
 ----

-This will create the following PVCs:
+This creates the following PVCs:

 1. `my-disks-hdfs-datanode-default-0` (12Ti)
 2. `my-disks-1-hdfs-datanode-default-0` (12Ti)
@@ -81,7 +81,7 @@ By configuring and using a dedicated https://kubernetes.io/docs/concepts/storage
 ====
 You might need to re-create the StatefulSet to apply the new PVC configuration because of https://github.com/kubernetes/kubernetes/issues/68737[this Kubernetes issue].
 You can delete the StatefulSet using `kubectl delete statefulsets --cascade=orphan <statefulset>`.
-The hdfs-operator will re-create the StatefulSet automatically.
+The hdfs-operator recreates the StatefulSet automatically.
 ====

 == Resource Requests
