August 6, 2019 - John Park
Install Docker on your workstation
Validate Installation
docker version
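If you want to script this check, here is a minimal sketch. The helper names have_cmd and check_docker are my own and not part of Docker; the only Docker call is docker version, which exits with an error when the daemon cannot be reached.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the named command can be found.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: reports whether the Docker CLI and daemon are usable.
check_docker() {
    if ! have_cmd docker; then
        echo "docker CLI not found on PATH"
        return 1
    fi
    # 'docker version' queries the daemon and fails if it is not running.
    if ! docker version >/dev/null 2>&1; then
        echo "docker CLI found, but the daemon is not reachable"
        return 1
    fi
    echo "docker is installed and the daemon is reachable"
}

# Usage: check_docker
```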
docker pull cloudera/quickstart:latest
For details, see the Cloudera Quickstart page on Docker Hub
docker images
Configure Docker to run with 8 CPUs (minimum 4) and 8 GB of memory
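To confirm what the Docker daemon actually sees, a sketch like the one below may help. It reads the NCPU and MemTotal fields from docker info. The function names are hypothetical, and the 7 GiB threshold is my assumption: the daemon usually reports slightly less memory than the amount configured, so an 8 GB setting tends to show up as a bit under 8 GiB.

```shell
#!/bin/sh
# Minimum CPU count from the text above.
MIN_CPUS=4
# Assumption: accept 7 GiB, since the daemon reports a little less than configured.
MIN_MEM_BYTES=$((7 * 1024 * 1024 * 1024))

# Hypothetical helper: succeeds if CPU count ($1) and memory in bytes ($2)
# meet the minimums.
meets_minimum() {
    [ "$1" -ge "$MIN_CPUS" ] && [ "$2" -ge "$MIN_MEM_BYTES" ]
}

# Hypothetical helper: reads the values reported by the Docker daemon.
check_docker_resources() {
    cpus=$(docker info --format '{{.NCPU}}') || return 1
    mem=$(docker info --format '{{.MemTotal}}') || return 1
    if meets_minimum "$cpus" "$mem"; then
        echo "ok: $cpus CPUs, $mem bytes of memory"
    else
        echo "too small: $cpus CPUs, $mem bytes of memory"
        return 1
    fi
}

# Usage: check_docker_resources
```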
Run the Docker image with the following command
docker run --name qst.cdh --hostname=quickstart.cloudera --privileged=true -ti -d -v /Users/jrp/Documents/Cloudera:/src --publish-all=true -p 8888:8888 -p 80:80 -p 7180:7180 -p 10000:10000 -p 21050:21050 cloudera/quickstart /usr/bin/docker-quickstart
To run an image with the hostname of quickstart.cloudera, the command uses these options:
- /usr/bin/docker-quickstart: entry point that starts all CDH services. Provided by Cloudera.
- --hostname=quickstart.cloudera: Required. The pseudo-distributed configuration assumes this hostname.
- --name=qst.cdh: forces Docker to name the container for easy recognition.
- --privileged=true: Required for HBase, MySQL-backed Hive metastore, Hue, Oozie, Sentry, and Cloudera Manager, and possibly others.
- -t: Required. Once services are started, a Bash shell takes over and will die without this.
- -i: Required if you want to use the terminal, either immediately or by attaching later.
- -v: shares a volume with the container, so anything that I put in the /Users/jrp/Documents/Cloudera directory shows up in the Docker container under the /src directory.
- -d: Optional. Runs the container in the background. I recommend this option if you plan to keep the container running in the background.
- --publish-all=true: publishes every port the image exposes to a random host port, so you can access Cloudera Manager, Hue, Hive, Impala, etc.
- -p 8888: Recommended. Maps the Hue port in the guest (8888) to a port on the host.
- -p 7180: Recommended. Maps the Cloudera Manager port in the guest (7180) to a port on the host.
- -p 80, -p 21050, -p 10000: needed for the Tutorial, Impala, and Hive.
Available Ports:
- 8888 Hue
- 7180 Cloudera Manager
- 80 Tutorial
- 8983 Solr
- 8088 Hadoop MapReduce UI
- 11000 Oozie
- 9092 Kafka
- 2181 Zookeeper
- 10000 Hive (Thrift)
- 21050 Impala Thrift
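If you want to pin more of these ports to the same number on the host, the -p flags can be generated from a list. This is only a sketch: port_flags and print_run_command are hypothetical helper names, and the command is printed rather than executed so you can review it first.

```shell
#!/bin/sh
# Ports pinned in the run command above; add more from the list as needed.
PORTS="8888 80 7180 10000 21050"

# Hypothetical helper: turns "8888 80" into "-p 8888:8888 -p 80:80".
port_flags() {
    flags=""
    for port in $1; do
        flags="$flags -p $port:$port"
    done
    # Strip the leading space left by the loop.
    echo "${flags# }"
}

# Hypothetical helper: prints the docker run command for a host directory ($1)
# and a space-separated port list ($2).
print_run_command() {
    echo "docker run --name qst.cdh --hostname=quickstart.cloudera --privileged=true -ti -d -v $1:/src --publish-all=true $(port_flags "$2") cloudera/quickstart /usr/bin/docker-quickstart"
}

# Usage: print_run_command /Users/jrp/Documents/Cloudera "$PORTS"
```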
Validate that the container is running by executing
docker ps
Inspect the Docker Network by typing
docker inspect qst.cdh
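Ports that are only published through --publish-all end up on random host ports, so it helps to look them up. The sketch below uses docker port, which prints addresses such as 0.0.0.0:32768. The helper names host_port and published_port are my own. For the full mapping you can also run docker inspect --format '{{json .NetworkSettings.Ports}}' qst.cdh.

```shell
#!/bin/sh
# Hypothetical helper: extracts the port from an address like "0.0.0.0:32768".
host_port() {
    echo "${1##*:}"
}

# Hypothetical helper: prints the host port mapped to a container port.
# $1 is the container name, $2 is the port inside the container.
published_port() {
    # docker port may print one line per address family; use the first.
    host_port "$(docker port "$1" "$2" | head -n 1)"
}

# Usage: published_port qst.cdh 8983
```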
There are multiple ways to start Cloudera Manager. I will use docker exec; you can also attach to the container using docker attach
docker exec -ti qst.cdh /home/cloudera/cloudera-manager --express
You can also execute
docker attach qst.cdh
and then run the script that enables Cloudera Manager: /home/cloudera/cloudera-manager --express
The first service it tries to start is Kafka. Since we do not have Kafka enabled, we will need to wait 30 to 45 seconds for the CDH services to start
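Instead of watching the clock, you can poll until Cloudera Manager answers. Here is a minimal sketch of a generic retry loop. retry_until is a hypothetical helper, and the attempt count and delay in the usage line are guesses based on the 30 to 45 second wait, not measured values.

```shell
#!/bin/sh
# Hypothetical helper: runs a command until it succeeds or attempts run out.
# $1 = maximum attempts, $2 = seconds between attempts, rest = command to run.
retry_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Sleep only when another attempt will follow.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage (assumes Cloudera Manager is published on host port 7180):
# retry_until 12 5 curl -sf -o /dev/null http://localhost:7180/
```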
Use cloudera/cloudera to log in
The first thing you will see is that the server is not working and has stopped due to clock offset issues
This can be resolved with the following commands
docker exec -ti qst.cdh /etc/init.d/ntpd stop
and
docker exec -ti qst.cdh /etc/init.d/ntpd start
Validate that Cloudera Manager shows the cluster health as green
Restart Cloudera Quickstart Services using
curl -X POST -u "admin:admin" -i http://localhost:7180/api/v18/clusters/Cloudera%20QuickStart/commands/start
or using the GUI
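The cluster name in that URL contains a space, which has to be sent as %20. As a sketch, the URL can be assembled like this. encode_spaces and cluster_command_url are hypothetical helpers, and I am assuming the same URL shape applies to other cluster commands such as stop.

```shell
#!/bin/sh
# Hypothetical helper: percent-encodes spaces only (enough for this cluster name).
encode_spaces() {
    printf '%s\n' "$1" | sed 's/ /%20/g'
}

# Hypothetical helper: builds a Cloudera Manager cluster command URL.
# $1 = base URL, $2 = API version, $3 = cluster name, $4 = command
cluster_command_url() {
    echo "$1/api/$2/clusters/$(encode_spaces "$3")/commands/$4"
}

# Usage:
# curl -X POST -u "admin:admin" -i \
#   "$(cluster_command_url http://localhost:7180 v18 'Cloudera QuickStart' start)"
```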
Load some data into your cluster using Hue and you are ready to test.
For this exercise I used data from ourairports.com. Please take a look
From Cloudera Manager, use Hue or Hive to import a CSV into Hive and Impala
Stopping the Docker Container
docker stop --time 60 qst.cdh
If your container is stopped by accident
docker restart qst.cdh
Force Cloudera Manager to restart
docker exec -ti qst.cdh /home/cloudera/cloudera-manager --express --force
Tip: to stop all running Docker containers
docker stop --time=60 $(docker ps -q)
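A slightly safer variant is sketched below. docker ps -q prints only the IDs of running containers, and docker stop complains when it is given no arguments, so the sketch checks for an empty list first. stop_all_running is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: stops every running container, if there are any.
stop_all_running() {
    # -q prints only container IDs, one per line.
    ids=$(docker ps -q)
    if [ -z "$ids" ]; then
        echo "no running containers"
        return 0
    fi
    # Intentionally unquoted so each ID becomes its own argument.
    docker stop --time=60 $ids
}

# Usage: stop_all_running
```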
Cleaning up Docker
docker system prune