
Data Pipeline for Analyzing Application Crashes and Performance

Setting up the environment

Deploy the nifi-hdfs-spark-hive-superset cluster

docker-compose up
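Once the containers are up, it is worth confirming that every service started before moving on. A minimal check, assuming the container names used later in this README:

# List the cluster's containers and their state
docker-compose ps

# The NiFi UI should return HTTP 200 once its container is ready
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8091/nifi/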

Put data from the local filesystem into HDFS using Apache NiFi

Open the NiFi UI at http://localhost:8091/nifi/ and run the flow that writes the local data to HDFS
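After the flow has run, you can confirm from the command line that the files landed in HDFS. A sketch, in which the namenode container name and the target directory /user/data are assumptions; substitute the values from your NiFi flow:

# List the ingested files in HDFS (container name and path are assumptions)
docker exec -it namenode hdfs dfs -ls /user/data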

Run the Spark job

Open a shell in the spark-master container and start spark-shell

docker exec -it spark-master bash

spark/bin/spark-shell --master spark://spark-master:7077

Inside spark-shell, load and run the Spark job

:load /spark/job/src/main/scala/Main.scala

Main.main()
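The same two REPL steps can also be run non-interactively by piping them into spark-shell, which reads commands from stdin. A sketch, assuming Main.main() takes no arguments as above:

# Load the job source and invoke it in one shot
printf ':load /spark/job/src/main/scala/Main.scala\nMain.main()\n' | \
  docker exec -i spark-master spark/bin/spark-shell --master spark://spark-master:7077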

Building the Data Warehouse with Hive

(See hive_schema.png for the warehouse schema.)

Open a shell in the hive-server container and run the Hive script

docker exec -it hive-server bash

hive -f /hive/scripts/hive.hql
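Once the script finishes, a quick sanity check is to run an ad-hoc statement with hive -e; SHOW TABLES makes no assumption about which tables hive.hql actually created:

# List the tables built by hive.hql
docker exec -it hive-server hive -e 'SHOW TABLES;'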

Create reports using Superset

Run docker network inspect on the cluster network (e.g. docker-hadoop-spark-hive_default) to get the hostname of the Apache Hive server, which is needed when connecting Superset to Hive, as sketched below.
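A sketch of that lookup, using docker network inspect's --format template to print each container's name and address; the hive-server hostname and port 10000 (HiveServer2's default) in the connection URI are assumptions based on the container name used above:

# Print every container on the network with its IPv4 address
docker network inspect docker-hadoop-spark-hive_default \
  --format '{{range .Containers}}{{.Name}} {{.IPv4Address}}{{println}}{{end}}'

# In Superset, register the database with a URI of this form (assumed values):
#   hive://hive-server:10000/default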

Result: https://drive.google.com/file/d/1MrmS3WJZs1UoUKuLZeEbBPnV73nbZn11/view
