mbhavik91/United-States-Census-Data-Analysis-Using-MapReduce_P1
# READ ME

Dataset used: "unsorted-2" (dataset size: 19.9 GB).

Number of reducers used: 16. To use 32 reducers, change the code in the partitioner and the main job.

Main class name: MainJob

Package structure: cs455.gigasort.hadoop

How to run? Type the following command:

$HADOOP_HOME/bin/hadoop jar workspace/Assignment3/dist/gigasort.jar cs455.gigasort.hadoop.MainJob /data/gigasort/unsorted-2 /home/output/final

A shell script was developed to get the final results. The following command was used to generate the hash value of each output file:

"openssl sha1 part-r-00006 | awk '{print $2}' >> hashvalue"

After that, Generatehash.java was used to generate the final hash recursively, using the Merkle tree algorithm.
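The recursive hashing step can be illustrated with a minimal sketch. This is not the contents of Generatehash.java; it only assumes that the `hashvalue` file holds one SHA-1 hex string per line, that adjacent hashes are concatenated as text and hashed again with SHA-1, and that an unpaired last hash is combined with itself. The class name `MerkleSketch` and its method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

public class MerkleSketch {

    // SHA-1 of a string, returned as lowercase hex (the format openssl prints).
    static String sha1Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Recursively hash adjacent pairs until a single root hash remains.
    static String merkleRoot(List<String> hashes) {
        if (hashes.isEmpty()) {
            throw new IllegalArgumentException("no hashes given");
        }
        if (hashes.size() == 1) {
            return hashes.get(0);
        }
        List<String> next = new ArrayList<>();
        for (int i = 0; i < hashes.size(); i += 2) {
            String left = hashes.get(i);
            // Assumption: an unpaired last hash is combined with itself.
            String right = (i + 1 < hashes.size()) ? hashes.get(i + 1) : left;
            next.add(sha1Hex(left + right));
        }
        return merkleRoot(next);
    }

    public static void main(String[] args) throws Exception {
        // Default to the "hashvalue" file written by the openssl command.
        String path = args.length > 0 ? args[0] : "hashvalue";
        List<String> leaves = new ArrayList<>();
        for (String line : Files.readAllLines(Paths.get(path))) {
            if (!line.trim().isEmpty()) {
                leaves.add(line.trim());
            }
        }
        System.out.println(merkleRoot(leaves));
    }
}
```

With 16 reducers there are 16 leaf hashes, so the tree has four levels and the self-pairing case never triggers. The order of lines in `hashvalue` matters: hashing the part files in a different order gives a different root.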
About
Giga Sort