@@ -33,3 +33,313 @@ Please open Git Issues if you would like to see updates/other plugin integration
33
33
- Apache Ranger: https://ranger.apache.org/
34
34
- Apache Ranger + Amazon EMR Blog: https://aws.amazon.com/blogs/big-data/implementing-authorization-and-auditing-using-apache-ranger-on-amazon-emr/
35
35
- Apache Ranger Presto Plugin: https://cwiki.apache.org/confluence/display/RANGER/Presto+Plugin
36
+
37
+ ---
38
+
39
+ # Sub Project RANGER-EMR-CLI-INSTALLER: A CLI Tool for Ranger Self Installing and Integrating with AWS EMR Cluster and AD/LDAP
40
+
41
+ This is a command-line tool that installs Ranger and integrates it with an AWS EMR cluster, using a Windows AD or Open LDAP server as the authentication channel. A closely related project, **[ranger-emr-cfn-installer](https://github.com/bluishglc/ranger-emr-cfn-installer)**, does the same job via AWS CloudFormation. The two projects are similar but work independently, so you can pick either one.
42
+
43
+ ## 1. Ranger Introduction
44
+
45
+ Let’s check out Ranger's architecture:
46
+
47
+ ![ranger-architecture](https://user-images.githubusercontent.com/5539582/99872048-f0c24480-2c19-11eb-8c0f-43df2552837c.png)
48
+
49
+ Ranger has 5 parts:
50
+
51
+ 1. Ranger Admin Service
52
+ 2. Ranger UserSync Service
53
+ 3. A Backend RDB for Storing User's Authorization
54
+ 4. A Solr Server for Storing Audit Log
55
+ 5. A Series of Plugins for Big Data Components/Services
56
+
57
+ Besides the above, Ranger integrates with 2 external dependencies:
58
+
59
+ 6. A Windows AD or Open LDAP Server as Authentication Channel
60
+ 7. A Hadoop (AWS EMR) Cluster to Be Managed by Ranger
61
+
62
+ So a full Ranger installation covers the following jobs:
63
+
64
+ 1. Install JDK (Required by Ranger Admin and Solr)
65
+ 2. Install MySQL (As Ranger Backend RDB)
66
+ 3. Install Solr (As Ranger Audit Store)
67
+ 4. Install Ranger Admin (and Integrate with AD/LDAP Server)
68
+ 5. Install Ranger UserSync (and Integrate with AD/LDAP Server)
69
+ 6. Install Ranger Plugins (e.g. HDFS, Hive, HBase and so on)
70
+
71
+ ## 2. Prerequisites
72
+
73
+ Before installing, make sure the following items are ready or done:
74
+
75
+ 1. Make sure the EMR cluster is in the waiting state, with no jobs running
76
+ 2. Upload your private SSH key (the pem file) to the Ranger server, for example `/home/ec2-user/key.pem`
77
+ 3. It is recommended to explore the users and groups on the Windows AD or Open LDAP server with a GUI tool, for example LDAP Admin, so as to determine the AD/LDAP-related parameters
78
+ 4. Check network connectivity among the Ranger server, the Windows AD or Open LDAP server, and the EMR nodes
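
The connectivity check in item 4 can be scripted before running the installer. The following is a minimal sketch, not part of this project: the helper names `expand_targets`, `port_open` and `check_targets` are hypothetical, and the ports are assumptions (22 is the default SSH port, 389 the default port for an `ldap://` URL without an explicit port).

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; not part of ranger-emr-cli-installer.

# Expand a comma-separated host list into "host:port" lines.
expand_targets() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r host; do
        if [ -n "$host" ]; then
            printf '%s:%s\n' "$host" "$2"
        fi
    done
}

# Return 0 if a TCP connection to host ($1) and port ($2) succeeds within
# 3 seconds. Relies on bash's /dev/tcp feature, so bash is invoked explicitly.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" </dev/null 2>/dev/null
}

# Print OK or FAIL for every host in a comma-separated list.
check_targets() {
    expand_targets "$1" "$2" | while IFS=: read -r host port; do
        if port_open "$host" "$port"; then
            echo "OK   $host:$port"
        else
            echo "FAIL $host:$port"
        fi
    done
}
```

For example, `check_targets 10.0.0.177,10.0.0.199,10.0.0.21 22` would probe SSH on three EMR master nodes, and `check_targets 10.0.0.194 389` would probe an AD/LDAP server. The tool's own `test-emr-ssh-connectivity` and `test-ldap-connectivity` actions cover the same ground once it is downloaded.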
79
+
80
+ ## 3. Download
81
+
82
+ 1. First, set up a clean Linux server, log in, and switch to the `root` user.
83
+
84
+ 2. Install git and check out this project.
85
+
86
+ ``` bash
87
+ yum -y install git
88
+ git clone https://github.com/bluishglc/ranger-emr-cli-installer.git /home/ec2-user/ranger-emr-cli-installer
89
+ ```
90
+
91
+ ## 4. Usage
92
+
93
+ After downloading, print the usage to check that the CLI tool is ready to use:
94
+
95
+ ``` bash
96
+ sh /home/ec2-user/ranger-emr-cli-installer/bin/setup.sh help
97
+ ```
98
+ If all goes well, the console prints all actions and options supported by this CLI tool:
99
+
100
+ ```
101
+ ============================= RANGER-EMR-CLI-INSTALLER USAGE =============================
102
+
103
+ SYNOPSIS
104
+
105
+ sudo sh ranger-emr-cli-installer/bin/setup.sh [ACTION] [--OPTION1 VALUE1] [--OPTION2 VALUE2]...
106
+
107
+ ACTIONS:
108
+
109
+ install Install all components
110
+ install-ranger Install ranger only
111
+ install-ranger-plugins Install ranger plugin only
112
+ test-emr-ssh-connectivity Test EMR ssh connectivity
113
+ test-emr-namenode-connectivity Test EMR namenode connectivity
114
+ test-ldap-connectivity Test LDAP connectivity
115
+ install-mysql Install MySQL
116
+ test-mysql-connectivity Test MySQL connectivity
117
+ install-mysql-jdbc-driver Install MySQL JDBC driver
118
+ install-jdk Install JDK8
119
+ download-ranger Download ranger
120
+ install-solr Install solr
121
+ test-solr-connectivity Test solr connectivity
122
+ init-solr-as-ranger-audit-store Test solr connectivity
123
+ init-ranger-admin-db Init ranger admin db
124
+ install-ranger-admin Install ranger admin
125
+ install-ranger-usersync Install ranger usersync
126
+ help Print help
127
+
128
+ OPTIONS:
129
+
130
+ --auth-type [ad|ldap] Authentication type, optional value: ad or ldap
131
+ --ad-domain Specify the domain name of windows ad server
132
+ --ad-url Specify the ldap url of windows ad server, i.e. ldap://10.0.0.1
133
+ --ad-base-dn Specify the base dn of windows ad server
134
+ --ad-bind-dn Specify the bind dn of windows ad server
135
+ --ad-bind-password Specify the bind password of windows ad server
136
+ --ad-user-object-class Specify the user object class of windows ad server
137
+ --ldap-url Specify the ldap url of Open LDAP, i.e. ldap://10.0.0.1
138
+ --ldap-user-dn-pattern Specify the user dn pattern of Open LDAP
139
+ --ldap-group-search-filter Specify the group search filter of Open LDAP
140
+ --ldap-base-dn Specify the base dn of Open LDAP
141
+ --ldap-bind-dn Specify the bind dn of Open LDAP
142
+ --ldap-bind-password Specify the bind password of Open LDAP
143
+ --ldap-user-object-class Specify the user object class of Open LDAP
144
+ --java-home Specify the JAVA_HOME path, default value is /usr/lib/jvm/java
145
+ --skip-install-mysql [true|false] Specify If skip mysql installing or not, default value is 'false'
146
+ --mysql-host Specify the mysql server hostname or IP, default value is current host IP
147
+ --mysql-root-password Specify the root password of mysql
148
+ --mysql-ranger-db-user-password Specify the ranger db user password of mysql
149
+ --solr-host Specify the solr server hostname or IP, default value is current host IP
150
+ --skip-install-solr [true|false] Specify If skip solr installing or not, default value is 'false'
151
+ --ranger-host Specify the ranger server hostname or IP, default value is current host IP
152
+ --ranger-version [2.1.0] Specify the ranger version, now only Ranger 2.1.0 is supported
153
+ --ranger-repo-url Specify the ranger repository url
154
+ --ranger-plugins [hdfs|hive|hbase] Specify what plugins will be installed(accept multiple comma-separated values), now support hdfs, hive and hbase
155
+ --emr-master-nodes Specify master nodes list of EMR cluster(accept multiple comma-separated values), i.e. 10.0.0.1,10.0.0.2,10.0.0.3
156
+ --emr-core-nodes Specify core nodes list of EMR cluster(accept multiple comma-separated values), i.e. 10.0.0.4,10.0.0.5,10.0.0.6
157
+ --emr-ssh-key Specify the path of ssh key to connect EMR nodes
158
+ --restart-interval Specify the restart interval
159
+
160
+ ```
161
+
162
+ This means the tool is ready to use.
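
Because the usage text states that `--ranger-plugins` accepts comma-separated values and currently supports only hdfs, hive and hbase, a wrapper script may want to validate that value before calling `setup.sh`. The sketch below is an illustration under that assumption; `validate_plugins` is a hypothetical helper, not something the installer provides, and the installer may perform its own checks.

```bash
# Hypothetical pre-flight check; not part of ranger-emr-cli-installer.
# Succeeds only if every comma-separated entry is hdfs, hive or hbase.
validate_plugins() {
    rest=$1
    if [ -z "$rest" ]; then
        echo "no plugins given" >&2
        return 1
    fi
    while [ -n "$rest" ]; do
        plugin=${rest%%,*}               # first entry of the remaining list
        case "$plugin" in
            hdfs|hive|hbase) ;;
            *) echo "unsupported plugin: '$plugin'" >&2; return 1 ;;
        esac
        case "$rest" in
            *,*)
                rest=${rest#*,}          # drop the first entry
                if [ -z "$rest" ]; then  # reject a trailing comma
                    echo "empty plugin entry" >&2
                    return 1
                fi
                ;;
            *) rest= ;;
        esac
    done
    return 0
}
```

A wrapper could then run `validate_plugins "hdfs,hive,hbase" || exit 1` before invoking the installer.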
163
+
164
+ ## 5. Examples
165
+
166
+ To explain how to use this CLI tool, assume we have the following environment:
167
+
168
+ **A Windows AD Server:**
169
+
170
+ Key|Value
171
+ ---------:|:-----
172
+ &emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;IP|10.0.0.194
173
+ Domain Name|corp.emr.local
174
+ Base DN|cn=users,dc=corp,dc=emr,dc=local
175
+ Bind DN|cn=ranger,ou=service accounts,dc=corp,dc=emr,dc=local
176
+ Bind DN Password|Admin1234!
177
+ User Object Class|person
178
+
179
+ **An Open LDAP Server:**
180
+
181
+ Key|Value
182
+ ---------:|:-----
183
+ &emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;IP|10.0.0.41
184
+ Base DN|dc=example,dc=com
185
+ Bind DN|cn=ranger,ou=service accounts,dc=example,dc=com
186
+ Bind DN Password|Admin1234!
187
+ User DN Pattern|uid={0},dc=example,dc=com
188
+ Bind Group Search Filter|(member=uid={0},dc=example,dc=com)
189
+ User Object Class|inetOrgPerson
190
+
191
+
192
+ **A Multi-Master EMR Cluster:**
193
+
194
+ Node|IP
195
+ ---:|:---
196
+ &emsp;&emsp;&emsp;&emsp;&emsp;Master Nodes|10.0.0.177,10.0.0.199,10.0.0.21
197
+ Core Nodes|10.0.0.114,10.0.0.136
198
+
199
+
200
+ **A Normal EMR Cluster:**
201
+
202
+ Node|IP
203
+ ---:|:---
204
+ &emsp;&emsp;&emsp;&emsp;&emsp;Master Nodes|10.0.0.18
205
+ Core Nodes|10.0.0.69
206
+
207
+ ### 5.1. Install Ranger + Integrate a Windows AD Server + Integrate a Multi-Master EMR Cluster
208
+
209
+ The following diagram illustrates what this example will do:
210
+
211
+ ![example1](https://user-images.githubusercontent.com/5539582/99872053-fc157000-2c19-11eb-94c4-ee36ed30ce14.png)
212
+
213
+ The following command line will finish this job:
214
+
215
+ ``` bash
216
+ sudo sh ranger-emr-cli-installer/bin/setup.sh install \
217
+ --auth-type ad \
218
+ --ad-domain corp.emr.local \
219
+ --ad-url ldap://10.0.0.194 \
220
+ --ad-base-dn 'cn=users,dc=corp,dc=emr,dc=local' \
221
+ --ad-bind-dn 'cn=ranger,ou=service accounts,dc=corp,dc=emr,dc=local' \
222
+ --ad-bind-password 'Admin1234!' \
223
+ --ad-user-object-class person \
224
+ --ranger-plugins hdfs,hive,hbase \
225
+ --emr-master-nodes 10.0.0.177,10.0.0.199,10.0.0.21 \
226
+ --emr-core-nodes 10.0.0.114,10.0.0.136 \
227
+ --emr-ssh-key /home/ec2-user/key.pem
228
+ ```
229
+
230
+ This CLI tool follows the principle of "convention over configuration": most parameters have preset default values. The complete equivalent of the command above, with every default spelled out, is as follows:
231
+
232
+ ``` bash
233
+ sudo sh ranger-emr-cli-installer/bin/setup.sh install \
234
+ --ranger-host $(hostname -i) \
235
+ --java-home /usr/lib/jvm/java \
236
+ --skip-install-mysql false \
237
+ --mysql-host $(hostname -i) \
238
+ --mysql-root-password 'Admin1234!' \
239
+ --mysql-ranger-db-user-password 'Admin1234!' \
240
+ --skip-install-solr false \
241
+ --solr-host $(hostname -i) \
242
+ --auth-type ad \
243
+ --ad-domain corp.emr.local \
244
+ --ad-url ldap://10.0.0.194 \
245
+ --ad-base-dn 'cn=users,dc=corp,dc=emr,dc=local' \
246
+ --ad-bind-dn 'cn=ranger,ou=service accounts,dc=corp,dc=emr,dc=local' \
247
+ --ad-bind-password 'Admin1234!' \
248
+ --ad-user-object-class person \
249
+ --ranger-version 2.1.0 \
250
+ --ranger-repo-url 'http://52.81.173.97:7080/ranger-repo/' \
251
+ --ranger-plugins hdfs,hive,hbase \
252
+ --emr-master-nodes 10.0.0.177,10.0.0.199,10.0.0.21 \
253
+ --emr-core-nodes 10.0.0.114,10.0.0.136 \
254
+ --emr-ssh-key /home/ec2-user/key.pem \
255
+ --restart-interval 30
256
+ ```
257
+
258
+ You can adjust further parameters to suit your requirements or environment, starting from the command above.
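
To make the "convention over configuration" idea concrete, here is a minimal sketch of how defaults like those in the complete command could be applied in shell. It is an illustration only: `with_default` and the uppercase variable names are hypothetical, and the real `setup.sh` may resolve its defaults differently. The default values themselves are the ones shown in the usage text and the complete command.

```bash
# Hypothetical sketch of "convention over configuration"; the real setup.sh
# may resolve its defaults differently.

# Print the supplied value ($1), or the default ($2) when the value is empty.
with_default() {
    printf '%s\n' "${1:-$2}"
}

# Default values taken from the usage text and the complete command.
# The uppercase variable names are illustrative assumptions.
JAVA_HOME_OPT=$(with_default "${JAVA_HOME_OPT:-}" /usr/lib/jvm/java)
SKIP_INSTALL_MYSQL=$(with_default "${SKIP_INSTALL_MYSQL:-}" false)
RANGER_VERSION=$(with_default "${RANGER_VERSION:-}" 2.1.0)
RESTART_INTERVAL=$(with_default "${RESTART_INTERVAL:-}" 30)

echo "java-home=$JAVA_HOME_OPT skip-install-mysql=$SKIP_INSTALL_MYSQL" \
     "ranger-version=$RANGER_VERSION restart-interval=$RESTART_INTERVAL"
```

A value set by the caller wins over the default, which mirrors how passing an option such as `--restart-interval` overrides the preset value.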
259
+
260
+ ### 5.2. Integrate a Second, Normal EMR Cluster
261
+
262
+ The following diagram illustrates what this example will do:
263
+
264
+ ![example2](https://user-images.githubusercontent.com/5539582/99872056-0172ba80-2c1a-11eb-9087-ea8e5ef353b7.png)
265
+
266
+ The following command line will finish this job:
267
+
268
+ ``` bash
269
+ sudo sh ranger-emr-cli-installer/bin/setup.sh install-ranger-plugins \
270
+ --ranger-host $(hostname -i) \
271
+ --solr-host $(hostname -i) \
272
+ --ranger-version 2.1.0 \
273
+ --ranger-plugins hdfs,hive,hbase \
274
+ --emr-master-nodes 10.0.0.18 \
275
+ --emr-core-nodes 10.0.0.69 \
276
+ --emr-ssh-key /home/ec2-user/key.pem \
277
+ --restart-interval 30
278
+ ```
279
+
280
+ ### 5.3. Install Ranger + Integrate an Open LDAP Server + Integrate a Multi-Master EMR Cluster
281
+
282
+ The following diagram illustrates what this example will do:
283
+
284
+ ![example3](https://user-images.githubusercontent.com/5539582/99872059-059ed800-2c1a-11eb-82e7-da5e21949d44.png)
285
+
286
+ The following command line will finish this job:
287
+
288
+ ``` bash
289
+ sudo sh ranger-emr-cli-installer/bin/setup.sh install \
290
+ --auth-type ldap \
291
+ --ldap-url ldap://10.0.0.41 \
292
+ --ldap-base-dn 'dc=example,dc=com' \
293
+ --ldap-bind-dn 'cn=ranger,ou=service accounts,dc=example,dc=com' \
294
+ --ldap-bind-password 'Admin1234!' \
295
+ --ldap-user-dn-pattern 'uid={0},dc=example,dc=com' \
296
+ --ldap-group-search-filter '(member=uid={0},dc=example,dc=com)' \
297
+ --ldap-user-object-class inetOrgPerson \
298
+ --ranger-plugins hdfs,hive,hbase \
299
+ --emr-master-nodes 10.0.0.177,10.0.0.199,10.0.0.21 \
300
+ --emr-core-nodes 10.0.0.114,10.0.0.136 \
301
+ --emr-ssh-key /home/ec2-user/key.pem
302
+ ```
303
+
304
+ Again, the complete equivalent of the command above, with every default spelled out, is as follows:
305
+
306
+ ``` bash
307
+ sudo sh ranger-emr-cli-installer/bin/setup.sh install \
308
+ --ranger-host $(hostname -i) \
309
+ --java-home /usr/lib/jvm/java \
310
+ --skip-install-mysql false \
311
+ --mysql-host $(hostname -i) \
312
+ --mysql-root-password 'Admin1234!' \
313
+ --mysql-ranger-db-user-password 'Admin1234!' \
314
+ --skip-install-solr false \
315
+ --solr-host $(hostname -i) \
316
+ --auth-type ldap \
317
+ --ldap-url ldap://10.0.0.41 \
318
+ --ldap-base-dn 'dc=example,dc=com' \
319
+ --ldap-bind-dn 'cn=ranger,ou=service accounts,dc=example,dc=com' \
320
+ --ldap-bind-password 'Admin1234!' \
321
+ --ldap-user-dn-pattern 'uid={0},dc=example,dc=com' \
322
+ --ldap-group-search-filter '(member=uid={0},dc=example,dc=com)' \
323
+ --ldap-user-object-class inetOrgPerson \
324
+ --ranger-version 2.1.0 \
325
+ --ranger-repo-url 'http://52.81.173.97:7080/ranger-repo/' \
326
+ --ranger-plugins hdfs,hive,hbase \
327
+ --emr-master-nodes 10.0.0.177,10.0.0.199,10.0.0.21 \
328
+ --emr-core-nodes 10.0.0.114,10.0.0.136 \
329
+ --emr-ssh-key /home/ec2-user/key.pem \
330
+ --restart-interval 30
331
+ ```
332
+
333
+ You can adjust further parameters to suit your requirements or environment, starting from the command above.
334
+
335
+ ## 6. Versions & Compatibility
336
+
337
+ The following table shows Ranger and EMR version compatibility:
338
+
339
+ &nbsp;|Ranger 1.X|Ranger 2.X
340
+ ---|---|---
341
+ EMR 5.X|Y|N
342
+ EMR 6.X|N|Y
343
+
344
+ Ranger 1 works with Hadoop 2, and Ranger 2 works with Hadoop 3. **This project is developed against Ranger 2.1.0, so for now it can only integrate with EMR 6.X.** Support for Ranger 1.2 + EMR 5.X may be developed later, depending on demand.
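
The compatibility table can also be encoded as a small guard in a wrapper script. The following is a sketch under stated assumptions: `ranger_major_for_emr` is a hypothetical helper, not part of this project, and it accepts either a bare version such as `6.2.0` or an EMR release label such as `emr-6.2.0`.

```bash
# Hypothetical helper encoding the Ranger/EMR compatibility table.
# Prints the Ranger major version that matches an EMR release,
# or fails for releases the table does not cover.
ranger_major_for_emr() {
    case "${1#emr-}" in                  # strip an optional "emr-" prefix
        5.*) echo 1 ;;
        6.*) echo 2 ;;
        *)   echo "no known Ranger version for EMR '$1'" >&2; return 1 ;;
    esac
}
```

Since this project targets Ranger 2.1.0, a wrapper could refuse to continue unless `ranger_major_for_emr "$EMR_RELEASE"` prints `2`, where `EMR_RELEASE` is a variable name chosen here for illustration.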
345
+