This repository was archived by the owner on Apr 18, 2024. It is now read-only.

Commit 33c790e

Moved v6 to top level, v5 removed (use release 1.0.0)
1 parent 1af00dd commit 33c790e

80 files changed

+110
-11506
lines changed

README.md

Lines changed: 110 additions & 3 deletions
@@ -1,5 +1,112 @@
11
# oci-cloudera
2-
These are Terraform modules for deploying Cloudera Enterprise Data Hub (EDH) on Oracle Cloud Infrastructure (OCI). This consists of two sub-modules, one for Cloudera EDH v5, and one for Cloudera EDH v6:
2+
This module deploys a cluster of arbitrary size using Cloudera Enterprise Data Hub v6 and Cloudera Manager v6.1.
3+
4+
Future development will include support for EDH v5 clusters. In the meantime, use the [1.0.0 release](https://github.com/oci-quickstart/oci-cloudera/releases/tag/1.0.0) for v5 deployments.
5+
6+
| | Worker Nodes | Bastion Instance | Utility and Master Instances |
7+
|-------------|----------------|------------------|------------------------------|
8+
| Recommended | BM.DenseIO2.52 | VM.Standard2.4 | VM.Standard2.16 |
9+
10+
Host types can be customized in the env-vars file referenced below. This template also includes an easy way to customize block volume quantity and size, which determine HDFS capacity. See the in-line comments in "variables.tf" for more information.
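
As a rough illustration of how block volume quantity and size relate to HDFS capacity, the sketch below multiplies out the raw storage and divides by the replication factor. The function and parameter names are hypothetical and are not taken from "variables.tf". It assumes the HDFS default replication factor of 3 and ignores space reserved for non-DFS use, so treat the result as an upper bound.

```python
def estimate_hdfs_capacity_tb(worker_count, volumes_per_worker, volume_size_gb, replication=3):
    """Return (raw_tb, usable_tb) for block-volume-backed HDFS storage.

    All parameter names are illustrative; they do not mirror variables.tf.
    """
    raw_gb = worker_count * volumes_per_worker * volume_size_gb
    # Each HDFS block is stored `replication` times, so usable space shrinks accordingly.
    usable_gb = raw_gb / float(replication)
    return raw_gb / 1024.0, usable_gb / 1024.0


# Example: 5 workers, each with 4 block volumes of 1024 GB.
raw_tb, usable_tb = estimate_hdfs_capacity_tb(worker_count=5, volumes_per_worker=4, volume_size_gb=1024)
print("raw: %.1f TB, usable at 3x replication: %.1f TB" % (raw_tb, usable_tb))
```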
11+
12+
## Prerequisites
13+
First, complete the pre-deployment setup. It is detailed [here](https://github.com/oci-quickstart/oci-prerequisites).
14+
15+
### Additional Python Dependencies
16+
This module depends on Python, pip, Paramiko, and cm_client. Install these on the host you are using to deploy the Terraform module.
17+
18+
On EL7 hosts, installation can be performed using the following commands:
19+
20+
sudo yum install python python-pip python-paramiko.noarch -y
21+
sudo pip install --upgrade pip
22+
sudo pip install cm_client
23+
24+
On Mac, installation can be performed using the following commands:
25+
26+
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
27+
sudo python get-pip.py
28+
sudo pip install --upgrade pip
29+
sudo pip install cm_client paramiko
30+
31+
### Clone the Module
32+
Now, you'll want a local copy of this repo. You can make that with the commands:
33+
34+
git clone https://github.com/oci-quickstart/oci-cloudera.git
35+
cd oci-cloudera/v6
36+
ls
37+
38+
## Python Deployment using cm_client
39+
The deployment script "deploy_on_oci.py" uses cm_client against Cloudera Manager API v31. It requires some customization before execution. See the header section of the script; modifying the following variables before deployment is strongly encouraged. ssh_keyfile is required, and deployment will fail without it:
40+
41+
admin_user_name
42+
admin_password
43+
cluster_name
44+
ssh_keyfile (REQUIRED)
45+
cluster_service_list
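
Because a missing ssh_keyfile causes deployment to fail, it can be worth checking the path before running anything. The helper below is a minimal sketch of such a check. It is not part of "deploy_on_oci.py", and the function name is hypothetical.

```python
import os


def check_ssh_keyfile(path):
    """Return the expanded key path, or raise if no file exists there.

    Hypothetical pre-flight helper; not a function from deploy_on_oci.py.
    """
    expanded = os.path.expanduser(path)
    if not os.path.isfile(expanded):
        raise IOError("ssh_keyfile not found: %s" % expanded)
    return expanded


# Usage (the path is an example only):
#   ssh_keyfile = check_ssh_keyfile("~/.ssh/id_rsa")
```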
46+
47+
Also, if you modify compute.tf in any way that changes hostname parameters, you will need to update these variables for pattern matching; otherwise host detection and cluster layout will fail:
48+
49+
worker_hosts_contain
50+
master_hosts_contain
51+
namenode_host_contains
52+
secondary_namenode_host_contains
53+
cloudera_manager_host_contains
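
To show why these variables must track the hostnames set in compute.tf, here is a minimal sketch of substring-based role detection. It assumes the script matches each pattern as a plain substring of the hostname, as the "_contains" variable names suggest. The hostnames and pattern values are invented for the example, and the script's actual logic may differ.

```python
def classify_hosts(hostnames, patterns):
    """Group hostnames by role using plain substring matching.

    `patterns` maps a role name to the substring a hostname must contain.
    """
    layout = {role: [] for role in patterns}
    for host in hostnames:
        for role, needle in patterns.items():
            if needle in host:
                layout[role].append(host)
    return layout


# Hypothetical values; the real ones live in the script header.
patterns = {
    "worker": "worker",
    "master": "master",
    "cloudera_manager": "utility",
}
hosts = ["cdh-worker-1", "cdh-worker-2", "cdh-master-1", "cdh-utility-1"]
layout = classify_hosts(hosts, patterns)
print(layout["worker"])
```

If compute.tf were changed to name workers "cdh-data-1", the "worker" pattern above would match nothing and the worker list would come back empty, which is the failure mode described above.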
54+
55+
In addition, further customization of the cluster deployment can be done by modification of the following functions:
56+
57+
setup_mgmt_rcg
58+
update_cluster_rcg_configuration
59+
60+
This does require some knowledge of Python - modify at your own risk. These functions contain Cloudera-specific tuning parameters as well as the host mapping for roles.
61+
62+
## Kerberos Secure Cluster by Default
63+
64+
This automation now defaults to using a local KDC deployed on the Cloudera Manager instance for secure cluster operation. Please read the scripts [README](../v6/scripts/README.md) for information regarding how to set these parameters prior to deployment.
65+
66+
Also, for cluster management, you will need to manually create, at a minimum, the HDFS Superuser Principal after deployment, as [detailed here](https://www.cloudera.com/documentation/enterprise/latest/topics/cm_sg_using_cm_sec_config.html#create-hdfs-superuser).
67+
68+
## Cloudera Manager and Cluster Metadata Database
69+
You are able to customize which database you want to use for Cloudera Manager and Cluster Metadata. In compute.tf you will see a "user_data" field for the Utility instance:
70+
71+
user_data = "${base64encode(file("scripts/cm_boot_mysql.sh"))}"
72+
73+
This is set to use MySQL for the database. If you want to use Postgres, you would change it:
74+
75+
user_data = "${base64encode(file("scripts/cm_boot_postgres.sh"))}"
76+
77+
You can customize the default root password for MySQL by editing the source script. For the various Cloudera databases, random passwords are generated and used. The same is true when using Postgres.
78+
79+
Note that you will also need to change "meta_db_port" in deploy_on_oci.py if you choose to run Postgres.
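
As a small illustration of the port change, the sketch below maps each database type to its standard default port (3306 for MySQL, 5432 for Postgres). The helper name and the dictionary are hypothetical. In "deploy_on_oci.py" the value is set directly in "meta_db_port", and the boot scripts may configure a non-default port, so verify against them.

```python
# Standard default listener ports; confirm against the boot script you use.
DEFAULT_DB_PORTS = {"mysql": 3306, "postgres": 5432}


def meta_db_port_for(db_type):
    """Return the default port for a database type (hypothetical helper)."""
    try:
        return DEFAULT_DB_PORTS[db_type.lower()]
    except KeyError:
        raise ValueError("unsupported database type: %r" % db_type)


meta_db_port = meta_db_port_for("postgres")
print(meta_db_port)
```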
80+
81+
## Deployment Syntax
82+
Deployment of the module is straightforward using the following Terraform commands:
83+
84+
terraform init
85+
terraform plan
86+
terraform apply
87+
88+
This will create all the required elements in a compartment in the target OCI tenancy, including VCN and Security List parameters. A security audit of these in network.tf is suggested.
89+
90+
After Terraform finishes deploying, the output shows the Python syntax to trigger cluster deployment. This command can be run immediately, as it has built-in checks that wait until the Cloudera Manager API is up and responding before it executes deployment. The syntax is as follows:
91+
92+
python scripts/deploy_on_oci.py -B -m <master_ip> -d <disk_count> -w <worker_shape>
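
The wait-until-responding behavior mentioned above can be pictured as a simple polling loop. This is a generic sketch, not the code in "deploy_on_oci.py". The function name, the retry counts, and the `probe` callable are all assumptions. In practice `probe` would be something like a request to the Cloudera Manager API that returns True once it answers.

```python
import time


def wait_until_ready(probe, attempts=30, delay_seconds=10, sleep=time.sleep):
    """Call `probe` until it returns True; return the attempt number that succeeded.

    Exceptions from `probe` (for example, connection refused) count as "not ready".
    """
    for attempt in range(1, attempts + 1):
        try:
            if probe():
                return attempt
        except Exception:
            pass  # Treat any probe failure as "not up yet" and retry.
        if attempt < attempts:
            sleep(delay_seconds)
    raise RuntimeError("service not ready after %d attempts" % attempts)


# Stand-in probe that reports ready on the third call.
responses = iter([False, False, True])
succeeded_on = wait_until_ready(lambda: next(responses), delay_seconds=0)
print(succeeded_on)
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays.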
93+
94+
It is also possible to destroy an existing cluster with this script using Cloudera Manager:
95+
96+
python scripts/deploy_on_oci.py -D -m <master_ip>
97+
98+
## Destroy the Deployment
99+
100+
When you no longer need the deployment, you can run this command to destroy it:
101+
102+
terraform destroy
103+
104+
## Deployment Caveats
105+
Currently this module requires the Cloudera Manager API to be on an edge host with a public IP address. This is used to trigger cluster deployment, and to SSH into the Cloudera Manager host to perform dynamic host discovery for mapping the cluster topology.
106+
107+
Future enhancements to this module are planned to support a completely Private (non-Internet exposed) cluster deployment.
108+
109+
110+
111+
3112

4-
* Cloudera EDH v5 uses cm_api python based deployment, which is currently deprecated (v19).
5-
* Cloudera EDH v6 uses cm_client python based deployment, which is current (v31).

v5/README.md

Lines changed: 0 additions & 32 deletions
This file was deleted.

v5/ad-spanning/README.md

Lines changed: 0 additions & 47 deletions
This file was deleted.

v5/ad-spanning/block.tf.NO

Lines changed: 0 additions & 110 deletions
This file was deleted.
