Commit 177643f

chore(demo/hbase-hdfs-load-cycling-data-demo): Update demo (#129)
* chore(demo/hbase-hdfs-load-cycling-data): Tidy up job scripts
* fix(demo/hbase-hdfs-load-cycling-data): Revert to 24.3 image for distcp (stackabletech/docker-images#793)
* docs(demo/hbase-hdfs-load-cycling-data): Add hints about the active namenode for browsing files.
1 parent 9765d51 commit 177643f

3 files changed, +30 -8 lines changed


demos/hbase-hdfs-load-cycling-data/create-hfile-and-import-to-hbase.yaml

Lines changed: 12 additions & 4 deletions
@@ -28,16 +28,24 @@ spec:
             - mountPath: /stackable/conf/hbase-env.sh
               name: config-volume-hbase
               subPath: hbase-env.sh
-          command: [ "bash", "-c", "/stackable/hbase/bin/hbase \
+          command:
+            - bash
+            - -euo
+            - pipefail
+            - -c
+            - |
+              # https://hbase.apache.org/book.html#tools
+              /stackable/hbase/bin/hbase \
                 org.apache.hadoop.hbase.mapreduce.ImportTsv \
                 -Dimporttsv.separator=, \
                 -Dimporttsv.columns=HBASE_ROW_KEY,rideable_type,started_at,ended_at,start_station_name,start_station_id,end_station_name,end_station_id,start_lat,start_lng,end_lat,end_lng,member_casual \
                 -Dimporttsv.bulk.output=hdfs://hdfs/data/hfile \
-                cycling-tripdata hdfs://hdfs/data/raw/demo-cycling-tripdata.csv.gz \
-              && /stackable/hbase/bin/hbase \
+                cycling-tripdata hdfs://hdfs/data/raw/demo-cycling-tripdata.csv.gz
+
+              /stackable/hbase/bin/hbase \
                 org.apache.hadoop.hbase.tool.LoadIncrementalHFiles \
                 hdfs://hdfs/data/hfile \
-                cycling-tripdata" ] # https://hbase.apache.org/book.html#tools
+                cycling-tripdata
       volumes:
         - name: config-volume-hbase
           configMap:
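
The rewritten job scripts drop the explicit `&&` chain and rely on `bash -euo pipefail` to stop at the first failure. The following is a minimal sketch of what each flag contributes. It is an illustration, not part of the commit, and `false`, `echo` and `cat` are hypothetical stand-ins for the real `hbase` and `hadoop` invocations.

```shell
#!/usr/bin/env bash
# Sketch only: `false` and `echo` stand in for the two hbase calls (assumption).

# 1) -e: the script aborts at the first failing command, so step 2 never runs.
strict_status=0
strict_output=$(bash -euo pipefail -c '
echo "step 1: ImportTsv (simulated failure)"
false
echo "step 2: LoadIncrementalHFiles"
') || strict_status=$?

# 2) pipefail: a failure on the left side of a pipe is no longer masked.
lenient_status=0
bash -c 'false | cat' || lenient_status=$?
pipefail_status=0
bash -o pipefail -c 'false | cat' || pipefail_status=$?

# 3) -u: expanding an unset variable is an error instead of an empty string.
unset_status=0
bash -u -c 'echo "$DEFINITELY_NOT_SET_12345"' 2>/dev/null || unset_status=$?

echo "strict=$strict_status lenient=$lenient_status pipefail=$pipefail_status unset=$unset_status"
```

Because `-e` already halts the script on a non-zero exit, the two commands can sit on separate lines of the YAML block scalar without `&&`, and the quoting needed for the single-string form goes away.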

demos/hbase-hdfs-load-cycling-data/distcp-cycling-data.yaml

Lines changed: 12 additions & 3 deletions
@@ -9,9 +9,9 @@ spec:
       containers:
         - name: distcp-cycling-data
           # We use 24.3.0 here which contains the distcp MapReduce components
-          # This is not included in the 24.7 images and will fail.
+          # This is not included in the 24.7 and 24.11 images and will fail.
           # See: https://github.com/stackabletech/docker-images/issues/793
-          image: docker.stackable.tech/stackable/hadoop:3.4.0-stackable0.0.0-dev
+          image: docker.stackable.tech/stackable/hadoop:3.3.6-stackable24.3.0
           env:
             - name: HADOOP_USER_NAME
               value: stackable
@@ -20,7 +20,16 @@ spec:
             - name: HADOOP_CLASSPATH
               value: "/stackable/hadoop/share/hadoop/tools/lib/*.jar"
           # yamllint disable-line rule:line-length
-          command: ["bash", "-c", "bin/hdfs dfs -mkdir -p /data/raw && bin/hadoop distcp -D fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider s3a://public-backup-nyc-tlc/cycling-tripdata/demo-cycling-tripdata.csv.gz hdfs://hdfs/data/raw"]
+          command:
+            - bash
+            - -euo
+            - pipefail
+            - -c
+            - |
+              bin/hdfs dfs -mkdir -p /data/raw
+              bin/hadoop distcp -D fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider \
+                s3a://public-backup-nyc-tlc/cycling-tripdata/demo-cycling-tripdata.csv.gz \
+                hdfs://hdfs/data/raw
           volumeMounts:
             - name: config-volume-hdfs
               mountPath: /stackable/conf/hdfs

docs/modules/demos/pages/hbase-hdfs-load-cycling-data.adoc

Lines changed: 6 additions & 1 deletion
@@ -206,7 +206,7 @@ image::hbase-hdfs-load-cycling-data/hbase-table-ui.png[]
 
 == Accessing the HDFS web interface
 
-You can also see HDFS details via a UI by running `stackablectl stacklet list` and following the link next to one of the namenodes.
+You can also see HDFS details via a UI by running `stackablectl stacklet list` and following the http links next to the namenodes.
 
 Below you will see the overview of your HDFS cluster.
 
@@ -218,6 +218,11 @@ image::hbase-hdfs-load-cycling-data/hdfs-datanode.png[]
 
 You can also browse the file system by clicking on the `Utilities` tab and selecting `Browse the file system`.
 
+[TIP]
+====
+Check that the namenode you browse to is the _active_ namenode in the Overview page. Otherwise you will not be able to browse files.
+====
+
 image::hbase-hdfs-load-cycling-data/hdfs-data.png[]
 
 Navigate in the file system to the folder `data` and then the `raw` folder.
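
The active-namenode hint added to the docs can also be checked from a terminal instead of the Overview page. This is a hedged sketch and not part of the commit. It assumes the namenode web endpoint exposes the standard Hadoop JMX servlet with the `NameNodeStatus` bean and is reachable without authentication. The helper names and the sample JSON are made up for illustration.

```shell
#!/usr/bin/env bash
# Extract the HA state ("active" or "standby") from NameNodeStatus JMX JSON on stdin.
namenode_state_from_jmx() {
  sed -n 's/.*"State"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Hypothetical helper: query one namenode, given its base URL
# (for example one of the http links printed by `stackablectl stacklet list`).
# Assumes the JMX servlet is enabled and needs no authentication.
print_namenode_state() {
  curl -s "$1/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus" | namenode_state_from_jmx
}

# Offline demonstration with a hand-written sample response (not real cluster output).
sample='{ "beans" : [ { "name" : "Hadoop:service=NameNode,name=NameNodeStatus", "State" : "standby" } ] }'
state=$(printf '%s\n' "$sample" | namenode_state_from_jmx)
echo "sample namenode state: $state"
```

Against a running demo, calling `print_namenode_state` once per namenode link should show which one reports `active`. That is the one whose `Browse the file system` page will list files.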
