Add code changes to enable rack awareness features in HDFS clusters (#429)
* crd changes
* Minor changes, mostly pushing so Sigi can try make run-dev
* Attempt to add a reflector that updates the clusterrolebinding.
Went horribly wrong :)
* WIP, should work, but patch doesn't
* Got clusterrolebinding patching to work.
* Added operator changes to roll out the topology provider.
* Got clusterrolebinding patching to work.
* wip: complete missing code
* merge conflict fixes
* clippy fix
* removed tiltfile and added hadoop path env-var
* restored tiltfile (only meant to delete stack references)
* linting
* cleaned up patching code
* clippy fix
* comment struct fields
* regenerate charts
* wip: working test, but with local image
* cleaned up test
* updated docs and added some code comments
* add listeners to role
* removed listenerclasses from role
* update roles
* class path setting not needed as jar copied to different location in product image
* changelog
* updated docs
* fixed kerberos tests on openshift
* Update rust/operator-binary/src/main.rs
Co-authored-by: Siegfried Weber <mail@siegfriedweber.net>
* Update rust/crd/src/lib.rs
Co-authored-by: Siegfried Weber <mail@siegfriedweber.net>
* clarified docs
* removed function and created dedicated test for rack-awareness
* cleanup comments and log statement
* Move the code for the hdfs_clusterrolebinding_nodes to a separate module
---------
Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Co-authored-by: Sebastian Bernauer <sebastian.bernauer@stackable.de>
Co-authored-by: Siegfried Weber <mail@siegfriedweber.net>
deploy/helm/hdfs-operator/crds/crds.yaml (19 additions, 0 deletions)
```diff
@@ -83,6 +83,25 @@ spec:
               enum:
               - cluster-internal
               type: string
+            rackAwareness:
+              description: Configuration to control HDFS topology (rack) awareness feature
+              items:
+                properties:
+                  labelName:
+                    description: Name of the label that will be used to resolve a datanode to a topology zone.
+                    type: string
+                  labelType:
+                    description: Name of the label type that will be typically either `node` or `pod`, used to create a topology out of datanodes.
+                    enum:
+                    - node
+                    - pod
+                    type: string
+                required:
+                - labelName
+                - labelType
+                type: object
+              nullable: true
+              type: array
             vectorAggregatorConfigMapName:
               description: Name of the Vector aggregator [discovery ConfigMap](https://docs.stackable.tech/home/nightly/concepts/service_discovery). It must contain the key `ADDRESS` with the address of the Vector aggregator. Follow the [logging tutorial](https://docs.stackable.tech/home/nightly/tutorials/logging-vector-aggregator) to learn how to configure log aggregation with Vector.
```
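For illustration, a cluster spec fragment that would validate against the new schema could look like the following. This is a sketch, not taken from the diff: the cluster name and the `clusterConfig` nesting are assumptions based on how other Stackable cluster-wide settings (such as `vectorAggregatorConfigMapName`) are structured.

```yaml
# Hypothetical HdfsCluster fragment exercising the new rackAwareness field.
# The two entries define a two-level topology: node zone, then pod role group.
apiVersion: hdfs.stackable.tech/v1alpha1
kind: HdfsCluster
metadata:
  name: simple-hdfs  # assumed name for illustration
spec:
  clusterConfig:     # nesting assumed from sibling fields in the CRD
    rackAwareness:
    - labelType: node
      labelName: topology.kubernetes.io/zone
    - labelType: pod
      labelName: app.kubernetes.io/role-group
```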
```diff
-Apache Hadoop supports a feature called Rack Awareness, which allows defining a topology for the nodes making up a cluster.
+Apache Hadoop supports a feature called Rack Awareness, which allows users to define a topology for the nodes making up a cluster.
 Hadoop will then use that topology to spread out replicas of blocks in a fashion that maximizes fault tolerance.
 
 The default write path, for example, is to put replicas of a newly created block first on a different node, but within the same rack, and the second copy on a node in a remote rack.
-In order for this to work properly, Hadoop needs to have information about the underlying infrastructure it runs on available - in a Kubernetes environment, this means obtaining information from the pods or nodes of the cluster.
+In order for this to work properly, Hadoop needs to have access to the information about the underlying infrastructure it runs on. In a Kubernetes environment, this means obtaining information from the pods or nodes of the cluster.
 
 In order to enable gathering this information the Hadoop images contain https://github.com/stackabletech/hdfs-topology-provider on the classpath, which can be configured to read labels from Kubernetes objects.
 
-In the current version of the SDP this is not exposed as fully integrated functionality in the operator, but rather needs to be configured via config overrides.
+In the current version of the SDP this is now exposed as fully integrated functionality in the operator, and no longer needs to be configured via config overrides.
 
+NOTE: Prior to SDP release 24.3, it was necessary to manually deploy RBAC objects to allow the Hadoop pods access to the necessary Kubernetes objects. This ClusterRole allows the reading of pods and nodes and needs to be bound to the individual ServiceAccounts that are deployed per Hadoop cluster: this is now performed by the operator itself.
-NOTE: Until the operator code has been merged, users will need to manually deploy RBAC objects to allow the Hadoop pods access to the necessary Kubernetes objects.
-
-Specifically this is a ClusterRole that allows reading pods and nodes, which needs to be bound to the individual ServiceAccounts that are deployed per Hadoop cluster.
-
-The following listing shows the generic objects that need to be deployed:
-
-[source,yaml]
-----
----
-apiVersion: rbac.authorization.k8s.io/v1
-kind: ClusterRole
-metadata:
-  name: hdfs-clusterrole-nodes
-rules:
-- apiGroups:
-  - ""
-  resources:
-  - nodes
-  - pods
-  verbs:
-  - get
-  - list
----
-apiVersion: rbac.authorization.k8s.io/v1
-# This cluster role binding allows anyone in the "manager" group to read secrets in any namespace.
-kind: ClusterRoleBinding
-metadata:
-  name: hdfs-clusterrolebinding-nodes
-roleRef:
-  kind: ClusterRole
-  name: hdfs-clusterrole-nodes
-  apiGroup: rbac.authorization.k8s.io
-----
-
-In addition to this, the ClusterRoleBinding object needs to be patched with an entry for every Hadoop cluster in the `subjects` field:
-
-[source,yaml]
-----
-subjects:
-- kind: ServiceAccount
-  name: hdfs-<clustername>-serviceaccount
-  namespace: <cluster-namespace>
-----
-
-So for an HDFS cluster using the ServiceAccount `hdfs-serviceaccount` in the `stackable` namespace, the full ClusterRoleBinding would look like this:
-[source,yaml]
-----
----
-apiVersion: rbac.authorization.k8s.io/v1
-# This cluster role binding allows anyone in the "manager" group to read secrets in any namespace.
-kind: ClusterRoleBinding
-metadata:
-  name: hdfs-clusterrolebinding-nodes
-subjects:
-- kind: ServiceAccount
-  name: hdfs-serviceaccount
-  namespace: stackable
-roleRef:
-  kind: ClusterRole
-  name: hdfs-clusterrole-nodes
-  apiGroup: rbac.authorization.k8s.io
-----
-
-To then configure the cluster for rack awareness, the following setting needs to be set via config override:
+Configuration of the tool is done by using the field `rackAwareness` under the cluster configuration:

...

 This instructs the namenode to use the topology tool for looking up information from Kubernetes.
-
-Configuration of the tool is then done via the environment variable `TOPOLOGY_LABELS`.
-
-This variable can be set to a semicolon separated list (maximum of two levels are allowed by default) of the following format: [node|pod]:<labelname>
-
-So for example `node:topology.kubernetes.io/zone;pod:app.kubernetes.io/role-group` would resolve to /<value of label topology.kubernetes.io/zone on the node>/<value of label app.kubernetes.io/role-group on the pod>.
-
-A full example of configuring this would look like this:
+Internally this will be used to create a topology label consisting of the value of the node label `topology.kubernetes.io/zone` and the pod label `app.kubernetes.io/role-group`, e.g. `/eu-central-1/rg1`.
```
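To make the label-to-topology mapping concrete, here is a minimal sketch of the resolution logic. It is purely illustrative: the actual hdfs-topology-provider is a Java component, and the function name and the fallback value here are assumptions, not part of its API. It mirrors the semantics the old docs described for `TOPOLOGY_LABELS`, where a semicolon-separated list of `[node|pod]:<labelname>` entries (two levels by default) is resolved against node and pod labels.

```python
# Illustrative sketch only -- NOT the hdfs-topology-provider implementation.
# Resolves a spec such as
#   "node:topology.kubernetes.io/zone;pod:app.kubernetes.io/role-group"
# into a topology path like "/eu-central-1/rg1" for one datanode.

def resolve_topology(spec: str, node_labels: dict, pod_labels: dict) -> str:
    levels = []
    for entry in spec.split(";"):
        # Each entry is "<labelType>:<labelName>"; split only on the first
        # colon because label names may themselves contain "/" and dots.
        label_type, label_name = entry.split(":", 1)
        labels = node_labels if label_type == "node" else pod_labels
        # Fallback value is an assumption for this sketch.
        levels.append(labels.get(label_name, "default-rack"))
    return "/" + "/".join(levels)

print(resolve_topology(
    "node:topology.kubernetes.io/zone;pod:app.kubernetes.io/role-group",
    {"topology.kubernetes.io/zone": "eu-central-1"},
    {"app.kubernetes.io/role-group": "rg1"},
))  # → /eu-central-1/rg1
```

The `rackAwareness` list in the CRD carries the same information as the old environment variable, with one `labelType`/`labelName` entry per topology level.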