Commit 9cfb603

author
buishglc
committed
initial import of project: ranger-emr-cli-installer
1 parent 4981133 commit 9cfb603

29 files changed, +5184 -0 lines changed

README.md

Lines changed: 310 additions & 0 deletions
@@ -33,3 +33,313 @@ Please open Git Issues if you would like to see updates/other plugin integration
- Apache Ranger: https://ranger.apache.org/
- Apache Ranger + Amazon EMR Blog: https://aws.amazon.com/blogs/big-data/implementing-authorization-and-auditing-using-apache-ranger-on-amazon-emr/
- Apache Ranger Presto Plugin: https://cwiki.apache.org/confluence/display/RANGER/Presto+Plugin

---

# Sub Project RANGER-EMR-CLI-INSTALLER: A CLI Tool for Installing Ranger and Integrating It with an AWS EMR Cluster and AD/LDAP

This is a command-line tool that installs Ranger and integrates it with an AWS EMR cluster and a Windows AD or Open LDAP server as the authentication channel. A closely related project, **[ranger-emr-cfn-installer](https://github.com/bluishglc/ranger-emr-cfn-installer)**, does the same job via AWS CloudFormation. The two projects are closely related but work independently, so you can pick either one.

## 1. Ranger Introduction

Let’s check out Ranger's architecture:

![ranger-architecture](https://user-images.githubusercontent.com/5539582/99872048-f0c24480-2c19-11eb-8c0f-43df2552837c.png)

Ranger has 5 parts:

1. Ranger Admin Service
2. Ranger UserSync Service
3. A Backend RDB for Storing User Authorizations
4. A Solr Server for Storing Audit Logs
5. A Series of Plugins for Big Data Components/Services

Besides the above, there are 2 external dependencies for Ranger to integrate with:

6. A Windows AD or Open LDAP Server as Authentication Channel
7. A Hadoop (AWS EMR) Cluster to Be Managed by Ranger

So, a full Ranger installation covers the following jobs:

1. Install JDK (Required by Ranger Admin and Solr)
2. Install MySQL (As Ranger Backend RDB)
3. Install Solr (As Ranger Audit Store)
4. Install Ranger Admin (and Integrate with AD/LDAP Server)
5. Install Ranger UserSync (and Integrate with AD/LDAP Server)
6. Install Ranger Plugins (i.e. HDFS, Hive, HBase and so on)
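
The six jobs above can be sketched as a sequence of individual installer actions. The snippet below is a dry run that only prints commands. The action names are taken from the tool's usage output (section 4); the `plan_install` helper, and the assumption that these six actions map one-to-one onto the six jobs in this order, are illustrative only. Each action would also need its own options, which are omitted here.

```bash
# Hypothetical helper: print (do not run) one setup.sh call per installation job.
# The action names come from the tool's usage output; the one-to-one mapping
# to the six jobs above is an assumption made for illustration.
plan_install() {
    local setup="ranger-emr-cli-installer/bin/setup.sh"
    local action
    for action in install-jdk install-mysql install-solr \
                  install-ranger-admin install-ranger-usersync \
                  install-ranger-plugins; do
        echo "sudo sh $setup $action"
    done
}

plan_install
```

In practice the single `install` action is the simpler route; a step-by-step plan like this is mainly useful when one component (for example MySQL or Solr) already exists.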

## 2. Prerequisites

Before installing, make sure the following items are ready or done:

1. Make sure the EMR cluster is in the waiting state, with no jobs running
2. Upload your private SSH key (the pem file) to the Ranger server, for example `/home/ec2-user/key.pem`
3. It's recommended to explore users and groups on Windows AD or Open LDAP with a GUI tool, for example LDAP Admin, so as to determine the AD/LDAP-related parameters
4. Check network connectivity among the Ranger server, the Windows AD or Open LDAP server, and the EMR nodes
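
Items 2 and 4 above can be checked by hand before running the installer. The following is a minimal sketch, not part of the tool: `check_key_file` and `check_port` are hypothetical helper names, GNU `stat` and `timeout` are assumed to be available (as on a typical Linux server), and the accepted key modes (400 or 600) are a deliberately strict choice.

```bash
# Hypothetical pre-flight helpers; neither is part of the installer itself.

# Succeeds if the key file exists and only its owner can access it.
check_key_file() {
    local key="$1" mode
    [ -f "$key" ] || { echo "missing: $key"; return 1; }
    mode=$(stat -c '%a' "$key")   # GNU stat, as found on Linux servers
    case "$mode" in
        400|600) echo "ok: $key ($mode)" ;;
        *)       echo "too open: $key ($mode)"; return 1 ;;
    esac
}

# Succeeds if a TCP connection to host:port opens within 3 seconds.
check_port() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}
```

For example, `check_key_file /home/ec2-user/key.pem` validates the key, and `check_port <host> 389` or `check_port <host> 22` probes an LDAP server or an EMR node (389 is the default port for `ldap://` URLs, 22 the default SSH port).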

## 3. Download

1. First of all, set up a clean Linux server, log in and switch to the `root` user.

2. Install git and check out this project.

```bash
yum -y install git
git clone https://github.com/bluishglc/ranger-emr-cli-installer.git /home/ec2-user/ranger-emr-cli-installer
```

## 4. Usage

After downloading, print the usage to check that the CLI tool is ready to use:

```bash
sh /home/ec2-user/ranger-emr-cli-installer/bin/setup.sh help
```

If all goes well, the console prints all actions and options supported by this CLI tool:

```
============================= RANGER-EMR-CLI-INSTALLER USAGE =============================

SYNOPSIS

sudo sh ranger-emr-cli-installer/bin/setup.sh [ACTION] [--OPTION1 VALUE1] [--OPTION2 VALUE2]...

ACTIONS:

install                                 Install all components
install-ranger                          Install ranger only
install-ranger-plugins                  Install ranger plugin only
test-emr-ssh-connectivity               Test EMR ssh connectivity
test-emr-namenode-connectivity          Test EMR namenode connectivity
test-ldap-connectivity                  Test LDAP connectivity
install-mysql                           Install MySQL
test-mysql-connectivity                 Test MySQL connectivity
install-mysql-jdbc-driver               Install MySQL JDBC driver
install-jdk                             Install JDK8
download-ranger                         Download ranger
install-solr                            Install solr
test-solr-connectivity                  Test solr connectivity
init-solr-as-ranger-audit-store         Test solr connectivity
init-ranger-admin-db                    Init ranger admin db
install-ranger-admin                    Install ranger admin
install-ranger-usersync                 Install ranger usersync
help                                    Print help

OPTIONS:

--auth-type [ad|ldap]                   Authentication type, optional value: ad or ldap
--ad-domain                             Specify the domain name of windows ad server
--ad-url                                Specify the ldap url of windows ad server, i.e. ldap://10.0.0.1
--ad-base-dn                            Specify the base dn of windows ad server
--ad-bind-dn                            Specify the bind dn of windows ad server
--ad-bind-password                      Specify the bind password of windows ad server
--ad-user-object-class                  Specify the user object class of windows ad server
--ldap-url                              Specify the ldap url of Open LDAP, i.e. ldap://10.0.0.1
--ldap-user-dn-pattern                  Specify the user dn pattern of Open LDAP
--ldap-group-search-filter              Specify the group search filter of Open LDAP
--ldap-base-dn                          Specify the base dn of Open LDAP
--ldap-bind-dn                          Specify the bind dn of Open LDAP
--ldap-bind-password                    Specify the bind password of Open LDAP
--ldap-user-object-class                Specify the user object class of Open LDAP
--java-home                             Specify the JAVA_HOME path, default value is /usr/lib/jvm/java
--skip-install-mysql [true|false]       Specify If skip mysql installing or not, default value is 'false'
--mysql-host                            Specify the mysql server hostname or IP, default value is current host IP
--mysql-root-password                   Specify the root password of mysql
--mysql-ranger-db-user-password         Specify the ranger db user password of mysql
--solr-host                             Specify the solr server hostname or IP, default value is current host IP
--skip-install-solr [true|false]        Specify If skip solr installing or not, default value is 'false'
--ranger-host                           Specify the ranger server hostname or IP, default value is current host IP
--ranger-version [2.1.0]                Specify the ranger version, now only Ranger 2.1.0 is supported
--ranger-repo-url                       Specify the ranger repository url
--ranger-plugins [hdfs|hive|hbase]      Specify what plugins will be installed(accept multiple comma-separated values), now support hdfs, hive and hbase
--emr-master-nodes                      Specify master nodes list of EMR cluster(accept multiple comma-separated values), i.e. 10.0.0.1,10.0.0.2,10.0.0.3
--emr-core-nodes                        Specify core nodes list of EMR cluster(accept multiple comma-separated values), i.e. 10.0.0.4,10.0.0.5,10.0.0.6
--emr-ssh-key                           Specify the path of ssh key to connect EMR nodes
--restart-interval                      Specify the restart interval
```

This means the tool is ready to use.
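
To make the `[ACTION] [--OPTION VALUE]...` synopsis concrete, here is a minimal sketch of how a shell script could parse such a command line. It is an illustration only and is not taken from the project's `setup.sh`; the `parse_args` function and the `OPT_<name>` variable naming are assumptions.

```bash
# Illustrative sketch only; this is NOT the project's actual setup.sh code.
# Parses "[ACTION] [--OPTION VALUE]..." into ACTION and OPT_<name> variables.
parse_args() {
    local name
    ACTION="${1:-help}"
    [ $# -gt 0 ] && shift
    while [ $# -gt 0 ]; do
        case "$1" in
            --*)
                [ $# -ge 2 ] || { echo "missing value for $1" >&2; return 1; }
                # --emr-ssh-key becomes OPT_emr_ssh_key
                name=$(printf '%s' "${1#--}" | tr '-' '_')
                case "$name" in
                    ''|*[!a-z0-9_]*) echo "bad option: $1" >&2; return 1 ;;
                esac
                eval "OPT_$name=\$2"
                shift 2
                ;;
            *)
                echo "unexpected argument: $1" >&2
                return 1
                ;;
        esac
    done
}
```

Calling `parse_args install --auth-type ad` would leave `ACTION=install` and `OPT_auth_type=ad`; every option is expected to be followed by exactly one value, which matches how the options are used in the examples of this README.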
163+
164+
## 5. Examples
165+
166+
To explain how to use this cli tool, assume we have following environment:
167+
168+
**A Windows AD Server:**
169+
170+
Key|Value
171+
---------:|:-----
172+
           IP|10.0.0.194
173+
Domain Name|corp.emr.local
174+
Base DN|cn=users,dc=corp,dc=emr,dc=local
175+
Bind DN|cn=ranger,ou=service accounts,dc=example,dc=com
176+
Bind DN Password|Admin1234!
177+
User Object Class|person
178+
179+
**An Open LDAP Server:**
180+
181+
Key|Value
182+
---------:|:-----
183+
           IP|10.0.0.41
184+
Base DN|dc=example,dc=com
185+
Bind DN|cn=ranger,ou=service accounts,dc=example,dc=com
186+
Bind DN Password|Admin1234!
187+
User DN Pattern|uid={0},dc=example,dc=com
188+
Bind Group Search Filter|(member=uid={0},dc=example,dc=com)
189+
User Object Class|inetOrgPerson
190+
191+
192+
**A Multi-Master EMR Cluster:**
193+
194+
Node|IP
195+
---:|:---
196+
     Master Nodes|10.0.0.177,10.0.0.199,10.0.0.21
197+
Core Nodes|10.0.0.114,10.0.0.136
198+
199+
200+
**A Normal EMR Cluster:**
201+
202+
Node|IP
203+
---:|:---
204+
     Master Nodes|10.0.0.177,10.0.0.199,10.0.0.21
205+
Core Nodes|10.0.0.114,10.0.0.136
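
The node options take comma-separated lists, so it can help to build and inspect those lists in the shell before passing them to the tool. A minimal sketch follows; `join_nodes` and `split_nodes` are hypothetical helper names, and the addresses are those of the multi-master cluster in this example environment.

```bash
# Hypothetical helpers for the comma-separated node lists used by
# --emr-master-nodes and --emr-core-nodes.

# Join all arguments with commas.
join_nodes() {
    local IFS=,
    echo "$*"
}

# Print one node per line from a comma-separated list.
split_nodes() {
    printf '%s\n' "$1" | tr ',' '\n'
}

MASTER_NODES=$(join_nodes 10.0.0.177 10.0.0.199 10.0.0.21)
CORE_NODES=$(join_nodes 10.0.0.114 10.0.0.136)
echo "masters: $MASTER_NODES"
echo "cores:   $CORE_NODES"
```

The resulting variables can then be passed as `--emr-master-nodes "$MASTER_NODES"` and `--emr-core-nodes "$CORE_NODES"`.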

### 5.1. Install Ranger + Integrate a Windows AD Server + Integrate a Multi-Master EMR Cluster

The following diagram illustrates what this example will do:

![example1](https://user-images.githubusercontent.com/5539582/99872053-fc157000-2c19-11eb-94c4-ee36ed30ce14.png)

The following command line will finish this job:

```bash
sudo sh ranger-emr-cli-installer/bin/setup.sh install \
    --auth-type ad \
    --ad-domain corp.emr.local \
    --ad-url ldap://10.0.0.194 \
    --ad-base-dn 'cn=users,dc=corp,dc=emr,dc=local' \
    --ad-bind-dn 'cn=ranger,ou=service accounts,dc=corp,dc=emr,dc=local' \
    --ad-bind-password 'Admin1234!' \
    --ad-user-object-class person \
    --ranger-plugins hdfs,hive,hbase \
    --emr-master-nodes 10.0.0.177,10.0.0.199,10.0.0.21 \
    --emr-core-nodes 10.0.0.114,10.0.0.136 \
    --emr-ssh-key /home/ec2-user/key.pem
```

This CLI tool follows the principle of "convention over configuration": most parameters have default values, so the complete equivalent of the command line above is as follows:

```bash
sudo sh ranger-emr-cli-installer/bin/setup.sh install \
    --ranger-host $(hostname -i) \
    --java-home /usr/lib/jvm/java \
    --skip-install-mysql false \
    --mysql-host $(hostname -i) \
    --mysql-root-password 'Admin1234!' \
    --mysql-ranger-db-user-password 'Admin1234!' \
    --skip-install-solr false \
    --solr-host $(hostname -i) \
    --auth-type ad \
    --ad-domain corp.emr.local \
    --ad-url ldap://10.0.0.194 \
    --ad-base-dn 'cn=users,dc=corp,dc=emr,dc=local' \
    --ad-bind-dn 'cn=ranger,ou=service accounts,dc=corp,dc=emr,dc=local' \
    --ad-bind-password 'Admin1234!' \
    --ad-user-object-class person \
    --ranger-version 2.1.0 \
    --ranger-repo-url 'http://52.81.173.97:7080/ranger-repo/' \
    --ranger-plugins hdfs,hive,hbase \
    --emr-master-nodes 10.0.0.177,10.0.0.199,10.0.0.21 \
    --emr-core-nodes 10.0.0.114,10.0.0.136 \
    --emr-ssh-key /home/ec2-user/key.pem \
    --restart-interval 30
```

You can adjust more parameters to suit your needs or environment, starting from the command line above.
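
The "convention over configuration" idea can be sketched with shell parameter expansion: a value supplied by the caller is kept, and anything left unset or empty falls back to a default. This is an illustration rather than the project's code; the variable names and the `apply_defaults` function are assumptions, while the default values are those shown in the usage output and the full command line above.

```bash
# Illustrative sketch of "convention over configuration"; not the project's code.
# Each setting keeps a caller-supplied value and otherwise falls back to the
# default shown in the usage output and the full command line above.
apply_defaults() {
    JAVA_HOME_DIR="${JAVA_HOME_DIR:-/usr/lib/jvm/java}"
    SKIP_INSTALL_MYSQL="${SKIP_INSTALL_MYSQL:-false}"
    SKIP_INSTALL_SOLR="${SKIP_INSTALL_SOLR:-false}"
    RANGER_VERSION="${RANGER_VERSION:-2.1.0}"
    RESTART_INTERVAL="${RESTART_INTERVAL:-30}"
}

RANGER_VERSION=""        # not supplied: the default applies
RESTART_INTERVAL=60      # supplied: kept as-is
apply_defaults
echo "ranger=$RANGER_VERSION restart=$RESTART_INTERVAL skip-mysql=$SKIP_INSTALL_MYSQL"
```

The `${VAR:-default}` form treats an empty value the same as an unset one, which is usually what is wanted for command-line options.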

### 5.2. Integrate The Second Normal EMR Cluster

The following diagram illustrates what this example will do:

![example2](https://user-images.githubusercontent.com/5539582/99872056-0172ba80-2c1a-11eb-9087-ea8e5ef353b7.png)

The following command line will finish this job:

```bash
sudo sh ranger-emr-cli-installer/bin/setup.sh install-ranger-plugins \
    --ranger-host $(hostname -i) \
    --solr-host $(hostname -i) \
    --ranger-version 2.1.0 \
    --ranger-plugins hdfs,hive,hbase \
    --emr-master-nodes 10.0.0.18 \
    --emr-core-nodes 10.0.0.69 \
    --emr-ssh-key /home/ec2-user/key.pem \
    --restart-interval 30
```

### 5.3. Install Ranger + Integrate an Open LDAP Server + Integrate a Multi-Master EMR Cluster

The following diagram illustrates what this example will do:

![example3](https://user-images.githubusercontent.com/5539582/99872059-059ed800-2c1a-11eb-82e7-da5e21949d44.png)

The following command line will finish this job:

```bash
sudo sh ranger-emr-cli-installer/bin/setup.sh install \
    --auth-type ldap \
    --ldap-url ldap://10.0.0.41 \
    --ldap-base-dn 'dc=example,dc=com' \
    --ldap-bind-dn 'cn=ranger,ou=service accounts,dc=example,dc=com' \
    --ldap-bind-password 'Admin1234!' \
    --ldap-user-dn-pattern 'uid={0},dc=example,dc=com' \
    --ldap-group-search-filter '(member=uid={0},dc=example,dc=com)' \
    --ldap-user-object-class inetOrgPerson \
    --ranger-plugins hdfs,hive,hbase \
    --emr-master-nodes 10.0.0.177,10.0.0.199,10.0.0.21 \
    --emr-core-nodes 10.0.0.114,10.0.0.136 \
    --emr-ssh-key /home/ec2-user/key.pem
```

Again, the complete equivalent of the command line above is as follows:

```bash
sudo sh ranger-emr-cli-installer/bin/setup.sh install \
    --ranger-host $(hostname -i) \
    --java-home /usr/lib/jvm/java \
    --skip-install-mysql false \
    --mysql-host $(hostname -i) \
    --mysql-root-password 'Admin1234!' \
    --mysql-ranger-db-user-password 'Admin1234!' \
    --skip-install-solr false \
    --solr-host $(hostname -i) \
    --auth-type ldap \
    --ldap-url ldap://10.0.0.41 \
    --ldap-base-dn 'dc=example,dc=com' \
    --ldap-bind-dn 'cn=ranger,ou=service accounts,dc=example,dc=com' \
    --ldap-bind-password 'Admin1234!' \
    --ldap-user-dn-pattern 'uid={0},dc=example,dc=com' \
    --ldap-group-search-filter '(member=uid={0},dc=example,dc=com)' \
    --ldap-user-object-class inetOrgPerson \
    --ranger-version 2.1.0 \
    --ranger-repo-url 'http://52.81.173.97:7080/ranger-repo/' \
    --ranger-plugins hdfs,hive,hbase \
    --emr-master-nodes 10.0.0.177,10.0.0.199,10.0.0.21 \
    --emr-core-nodes 10.0.0.114,10.0.0.136 \
    --emr-ssh-key /home/ec2-user/key.pem \
    --restart-interval 30
```

You can adjust more parameters to suit your needs or environment, starting from the command line above.
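
The `{0}` in `--ldap-user-dn-pattern` is a placeholder. As a rough illustration, and assuming `{0}` stands for the login name (the usual convention for LDAP DN patterns, not something this README states), the pattern expands as sketched below. `expand_user_dn` and the user name `alice` are hypothetical.

```bash
# Hypothetical helper: expand a user DN pattern for a given login name.
# Assumes {0} stands for the login name and that the name contains no
# characters that are special to sed (such as / or &).
expand_user_dn() {
    printf '%s\n' "$1" | sed "s/{0}/$2/"
}

expand_user_dn 'uid={0},dc=example,dc=com' alice
# prints: uid=alice,dc=example,dc=com
```

Expanding the pattern by hand for a known user and looking that DN up in a GUI tool such as LDAP Admin is a quick way to confirm the pattern matches the directory layout.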

## 6. Versions & Compatibility

The following table shows Ranger and EMR version compatibility:

 |Ranger 1.X|Ranger 2.X
---|---|---
EMR 5.X|Y|N
EMR 6.X|N|Y

Ranger 1 works with Hadoop 2, and Ranger 2 works with Hadoop 3. **This project is developed against Ranger 2.1.0, so for now it can only integrate with EMR 6.X.** Support for Ranger 1.2 + EMR 5.X may be developed next, depending on demand.
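
The compatibility table can be expressed as a small lookup, for example to fail early in a wrapper script. This is a sketch only; `ranger_major_for_emr` is a hypothetical helper, and it expects a plain release number such as `6.2.0`.

```bash
# Hypothetical helper: map an EMR release to the Ranger major version
# from the compatibility table above.
ranger_major_for_emr() {
    case "$1" in
        5.*) echo 1 ;;
        6.*) echo 2 ;;
        *)   echo "unsupported EMR release: $1" >&2; return 1 ;;
    esac
}

ranger_major_for_emr 6.2.0
# prints: 2
```

Since this project currently supports only Ranger 2.1.0, a result of `1` means the cluster cannot be integrated with this tool yet.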
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
*text=

ranger-emr-cli-installer/.gitignore

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
.idea
/*.iml
target
build.bat
