Commit c2c79fe

adwk67, sbernauer, and fhennig authored
docs: rego rule explanation and links (#519)
* rego rule explanation and links
* minor changes
* typos and added section ID
* added missing new lines
* Update docs/modules/hdfs/pages/usage-guide/security.adoc
  Co-authored-by: Sebastian Bernauer <sebastian.bernauer@stackable.de>
* Update docs/modules/hdfs/pages/usage-guide/security.adoc
  Co-authored-by: Felix Hennig <fhennig@users.noreply.github.com>
* Update docs/modules/hdfs/pages/usage-guide/security.adoc
  Co-authored-by: Felix Hennig <fhennig@users.noreply.github.com>
* Update docs/modules/hdfs/pages/usage-guide/security.adoc
  Co-authored-by: Felix Hennig <fhennig@users.noreply.github.com>

---------

Co-authored-by: Sebastian Bernauer <sebastian.bernauer@stackable.de>
Co-authored-by: Felix Hennig <fhennig@users.noreply.github.com>
1 parent 03dc777 commit c2c79fe

1 file changed

+139
-3
lines changed

docs/modules/hdfs/pages/usage-guide/security.adoc

Lines changed: 139 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -2,14 +2,14 @@
22

33
== Authentication
44
Currently the only supported authentication mechanism is Kerberos, which is disabled by default.
5-
For Kerberos to work a Kerberos KDC is needed, which the users needs to provide.
5+
For Kerberos to work a Kerberos KDC is needed, which the user needs to provide.
66
The xref:secret-operator:secretclass.adoc#backend-kerberoskeytab[secret-operator documentation] states which kind of Kerberos servers are supported and how they can be configured.
77

88
IMPORTANT: Kerberos is supported starting from HDFS version 3.3.x
99

1010
=== 1. Prepare Kerberos server
1111
To configure HDFS to use Kerberos you first need to collect information about your Kerberos server, e.g. hostname and port.
12-
Additionally you need a service-user, which the secret-operator uses to create create principals for the HDFS services.
12+
Additionally you need a service-user, which the secret-operator uses to create principals for the HDFS services.
1313

1414
=== 2. Create Kerberos SecretClass
1515
Afterwards you need to enter all the needed information into a SecretClass, as described in xref:secret-operator:secretclass.adoc#backend-kerberoskeytab[secret-operator documentation].
@@ -69,7 +69,9 @@ include::example$usage-guide/hdfs-regorules.yaml[]
6969
----
7070

7171
This rego rule is intended for demonstration purposes and allows every operation.
72-
For a production setup you probably want to take a look at our integration tests for a more secure set of rego rules.
72+
For a production setup you will probably need something much more granular.
73+
We provide a more representative rego rule in our integration tests and in the aforementioned hdfs-utils repository.
74+
Details can be found below in the <<fine-granular-rego-rules, fine-granular rego rules>> section.
7375
Reference the rego rule as follows in your HdfsCluster:
7476

7577
[source,yaml]
@@ -109,6 +111,140 @@ The implication is thus that you cannot add users to the `superuser` group, whic
109111
We have decided that this is an acceptable approach as normal operations will not be affected.
110112
In case you really need users to be part of the `superusers` group, you can use a configOverride on `hadoop.user.group.static.mapping.overrides` for that.
111113

114+
[#fine-granular-rego-rules]
115+
=== Fine-granular rego rules
116+
117+
The hdfs-utils repository contains a more production-ready rego-rule https://github.com/stackabletech/hdfs-utils/blob/main/rego/hdfs.rego[here].
118+
With a few minor differences (e.g. Pod names) it is the same rego rule that is used in this https://github.com/stackabletech/hdfs-operator/blob/main/tests/templates/kuttl/kerberos/12-rego-rules.txt.j2[integration test].
119+
120+
Access is granted by looking at three bits of information that must be supplied for every rego-rule callout:
121+
122+
* the *identity* of the user
123+
* the *resource* requested by the user
124+
* the *operation* which the user wants to perform on the resource
125+
126+
Each operation has an implicit action-level attribute, e.g. `create` requires at least read-write permissions.
127+
This action attribute is then checked against the permissions assigned to the user by an ACL and the operation is permitted if this check is fulfilled.
128+
129+
The basic structure of this rego rule is shown below (you can refer to the full rule https://github.com/stackabletech/hdfs-utils/blob/main/rego/hdfs.rego[here]).
130+
131+
.Rego rule outline
132+
[source]
133+
----
134+
package hdfs
135+
136+
import rego.v1
137+
138+
# Turn off access by default.
139+
default allow := false
140+
default matches_identity(identity) := false
141+
142+
# Check access in order of increasing specificity (i.e. identity first).
143+
# Deny access as "early" as possible.
144+
allow if {
145+
some acl in acls
146+
matches_identity(acl.identity)
147+
matches_resource(input.path, acl.resource)
148+
action_sufficient_for_operation(acl.action, input.operationName)
149+
}
150+
151+
# Identity checks based on e.g.
152+
# - explicit matches on the (long) userName or shortUsername
153+
# - regex matches
154+
# - the group membership (simple- or regex-matches on long- or short-username)
155+
matches_identity(identity) if {
156+
...
157+
}
158+
159+
# Resource checks on e.g.
160+
# - explicit file- or directory-mentions
161+
# - inclusion of the file in recursively applied access rights
162+
matches_resource(file, resource) if {
163+
...
164+
}
165+
166+
# Check the operation and its implicit action against an ACL
167+
action_sufficient_for_operation(action, operation) if {
168+
action_hierarchy[action][_] == action_for_operation[operation]
169+
}
170+
171+
action_hierarchy := {
172+
"full": ["full", "rw", "ro"],
173+
"rw": ["rw", "ro"],
174+
"ro": ["ro"],
175+
}
176+
177+
178+
# This should contain a list of all HDFS actions relevant to the application
179+
action_for_operation := {
180+
"abandonBlock": "rw",
181+
...
182+
}
183+
184+
acls := [
185+
{
186+
"identity": "group:admins",
187+
"action": "full",
188+
"resource": "hdfs:dir:/",
189+
},
190+
...
191+
]
192+
----
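
To make the semantics of `action_sufficient_for_operation` concrete, the check can be mirrored in a few lines of Python. This is only an illustrative sketch, not part of hdfs-utils: the constant and function names are chosen here for illustration, and the operation table contains only the two entries named in this document (`abandonBlock` and `create`, both `rw`).

```python
# Illustrative Python model of the action check in the outline above.
# Not part of hdfs-utils; names and table contents are assumptions.

# Each granted action lists every action level it covers.
ACTION_HIERARCHY = {
    "full": ["full", "rw", "ro"],
    "rw": ["rw", "ro"],
    "ro": ["ro"],
}

# Deliberately incomplete: only the operations mentioned in this document.
ACTION_FOR_OPERATION = {
    "abandonBlock": "rw",
    "create": "rw",
}


def action_sufficient_for_operation(action: str, operation: str) -> bool:
    """Return True if the granted action covers the level the operation needs."""
    required = ACTION_FOR_OPERATION.get(operation)
    if required is None:
        # Unknown operations are never sufficient (access is off by default).
        return False
    return required in ACTION_HIERARCHY.get(action, [])
```

A grant of `full` or `rw` is therefore sufficient for `create`, while `ro` is not, and an operation missing from the table is denied.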
193+
194+
The full file in the hdfs-utils repository contains extra documentary information such as a https://github.com/stackabletech/hdfs-utils/blob/main/rego/hdfs.rego#L186-L204[listing] of HDFS actions that would not typically be subject to an ACL.
195+
In hdfs-utils there is also a https://github.com/stackabletech/hdfs-utils/blob/main/rego/hdfs_test.rego[test file] to verify the rules, where different asserts are applied to the rules.
196+
Take the test case below as an example:
197+
198+
[source]
199+
----
200+
test_admin_access_to_developers if {
201+
allow with input as {
202+
"callerUgi": {
203+
"shortUserName": "admin",
204+
"userName": "admin/test-hdfs-permissions.default.svc.cluster.local@CLUSTER.LOCAL",
205+
},
206+
"path": "/developers/file",
207+
"operationName": "create",
208+
}
209+
}
210+
----
211+
212+
This test passes through the following steps:
213+
214+
==== 1. Does the user or group exist in the ACL?
215+
216+
Yes, a match is found on userName via the corresponding group (`admins`, yielded by the mapping `groups_for_user`).
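
The group lookup in this step can be sketched in Python as follows. This is a simplified model under stated assumptions: the contents of `GROUPS_FOR_USER` are a hypothetical stand-in for the `groups_for_user` mapping in the rego file, the function name is made up for this sketch, and only `group:` identities are handled (explicit user and regex matches are left out).

```python
# Illustrative Python model of a group-based identity match.
# GROUPS_FOR_USER is a hypothetical stand-in for `groups_for_user`.
GROUPS_FOR_USER = {
    "admin/test-hdfs-permissions.default.svc.cluster.local@CLUSTER.LOCAL": ["admins"],
}


def matches_group_identity(identity: str, caller_ugi: dict) -> bool:
    """Return True if identity is "group:<name>" and the caller is in that group."""
    prefix = "group:"
    if not identity.startswith(prefix):
        # Other identity kinds (explicit user, regex) are not modelled here.
        return False
    group = identity[len(prefix):]
    user_name = caller_ugi.get("userName", "")
    return group in GROUPS_FOR_USER.get(user_name, [])
```

With the `callerUgi` from the test case above, the identity `group:admins` matches because the (assumed) mapping places that userName in `admins`.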
217+
218+
==== 2. Does this user/group have permission to fulfill the specified operation on the given path?
219+
220+
Yes, as this ACL item
221+
222+
[source]
223+
----
224+
{
225+
"identity": "group:admins",
226+
"action": "full",
227+
"resource": "hdfs:dir:/",
228+
},
229+
----
230+
231+
matches the resource on
232+
233+
[source]
234+
----
235+
# Resource mentions a folder higher up the tree, which will grant access recursively
236+
matches_resource(file, resource) if {
237+
startswith(resource, "hdfs:dir:/")
238+
# directories need to have a trailing slash
239+
endswith(resource, "/")
240+
startswith(file, trim_prefix(resource, "hdfs:dir:"))
241+
}
242+
----
243+
244+
and the action permission required for the operation `create` (`rw`) is a subset of the ACL grant (`full`).
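
The recursive directory match used in this step can also be traced in Python. This is a sketch that mirrors the three conditions of the `matches_resource` rule quoted above; it is not the hdfs-utils implementation, and the paths other than `/developers/file` are made-up examples.

```python
# Illustrative Python model of the recursive directory match shown above.
def matches_resource(file: str, resource: str) -> bool:
    """Return True if `file` lies under the directory named by `resource`."""
    prefix = "hdfs:dir:"
    # The resource must be a directory resource with an absolute path ...
    if not resource.startswith(prefix + "/"):
        return False
    # ... and directories need to have a trailing slash.
    if not resource.endswith("/"):
        return False
    # Strip the "hdfs:dir:" prefix and compare the remaining path as a prefix.
    return file.startswith(resource[len(prefix):])
```

For the test input, `matches_resource("/developers/file", "hdfs:dir:/")` holds because every absolute path starts with `/`. The trailing-slash requirement also keeps a grant on `hdfs:dir:/developers/` from matching a sibling such as `/developers-other/file`.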
245+
246+
NOTE: The various checks for `matches_identity` and `matches_resource` are generic, given that the internal list of HDFS actions is comprehensive and the `input` structure is an internal implementation. This means that only the ACL needs to be adapted to specific customer needs.
247+
112248
== Wire encryption
113249
In case Kerberos is enabled, `Privacy` mode is used for best security.
114250
Wire encryption without Kerberos as well as https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SecureMode.html#Data_confidentiality[other wire encryption modes] are *not* supported.
