adityamuralidaran/Connected-Components-Using-MapReduce

This project implements an algorithm that uses large-star and small-star operations to find the connected component label of every vertex in a large undirected graph, using Python and Apache Spark.
The undirected graph is assumed to be too large to fit in the memory of a single compute node.
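
To make the two operations concrete, here is a minimal single-machine sketch in plain Python. It is not the code in a2.py; the function names are hypothetical, vertex IDs are assumed to be comparable integers, and each component is assumed to be labelled with its smallest vertex ID. In Spark, the grouping steps would presumably be expressed with RDD operations such as flatMap and groupByKey rather than in-memory dictionaries.

```python
from collections import defaultdict


def large_star(edges):
    """Link every strictly larger neighbour of u to the minimum of u's neighbourhood."""
    nbrs = defaultdict(set)
    for u, v in edges:
        # "Map": emit the edge in both directions so u sees all its neighbours.
        nbrs[u].add(v)
        nbrs[v].add(u)
    out = set()
    for u, ns in nbrs.items():
        # "Reduce": m is the smallest vertex among u and its neighbours.
        m = min(ns | {u})
        out.update((v, m) for v in ns if v > u)
    return out


def small_star(edges):
    """Link u and its smaller neighbours to the minimum of those neighbours."""
    nbrs = defaultdict(set)
    for u, v in edges:
        # "Map": key each edge by its larger endpoint.
        hi, lo = (u, v) if u > v else (v, u)
        nbrs[hi].add(lo)
    out = set()
    for u, ns in nbrs.items():
        # "Reduce": every vertex in ns is smaller than u, so the minimum is in ns.
        m = min(ns)
        out.update((v, m) for v in ns | {u} if v != m)
    return out


def connected_components(edges):
    """Alternate the two operations until the edge set stops changing."""
    edges = {(u, v) for u, v in edges if u != v}  # drop self-loops
    while True:
        new_edges = small_star(large_star(edges))
        if new_edges == edges:
            break
        edges = new_edges
    # At the fixed point every edge is (vertex, component label).
    labels = {}
    for v, m in edges:
        labels[v] = m
        labels.setdefault(m, m)
    return labels


if __name__ == "__main__":
    sample = [(1, 2), (2, 3), (3, 4), (4, 5), (7, 9), (9, 8)]
    for vertex, label in sorted(connected_components(sample).items()):
        print(vertex, label)
```

Vertices that appear in no edge (or only in self-loops) get no label in this sketch, because the graph is represented purely as an edge list.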

Run using the following commands (the first adds Spark's bin directory to PATH; adjust it to match your Spark installation):
	PATH=$PATH:/opt/spark-2.2.0-bin-hadoop2.7/bin
	spark-submit a2.py input.txt output


A sample input file is provided as 'input.txt'.
