K-Means is an unsupervised learning algorithm used for clustering.
K-Means Algorithm -
- Read the K value (number of clusters), MaxIteration, and the distance metric to use from the user.
- Load the training data set.
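- Generate a random number to decide how many elements go into each cluster, then generate random numbers to select those elements from the input file. Calculate the centroid of each initial cluster. The chosen metric (Euclidean or Manhattan) is used for the distance calculations in the following steps.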
- Select one cluster at a time, calculate the distance from its centroid to each element in the input file, and store the distances.
- Repeat step 4 for all clusters.
- Compare each element's distances to all cluster centroids and assign the element to the cluster with the lowest distance.
- Create the new clusters and re-assign the elements to them.
- Calculate the centroids of the new clusters.
- Repeat steps 4 to 8 until the maximum number of iterations is reached.
- The clusters after the last iteration are the final clusters.
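
The steps above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the names `k_means`, `centroid`, `euclidean`, and `manhattan` are hypothetical, the data is assumed to be a list of numeric tuples already loaded into memory, and the random initial clusters are built by shuffling and dealing the elements (so every cluster starts non-empty) rather than by drawing a random size for each cluster.

```python
import random


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    """Manhattan distance between two points of equal dimension."""
    return sum(abs(x - y) for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(column) / n for column in zip(*points))


def k_means(data, k, max_iterations, distance=euclidean, seed=None):
    """Hypothetical sketch of the steps listed above; assumes 1 <= k <= len(data)."""
    rng = random.Random(seed)

    # Step 3 (simplified assumption): random initial clusters, then their centroids.
    shuffled = list(data)
    rng.shuffle(shuffled)
    clusters = [shuffled[i::k] for i in range(k)]
    centroids = [centroid(c) for c in clusters]

    # Step 9: repeat steps 4 to 8 for a fixed number of iterations.
    for _ in range(max_iterations):
        new_clusters = [[] for _ in range(k)]
        for point in data:
            # Steps 4-5: distance from this element to every cluster centroid.
            distances = [distance(point, c) for c in centroids]
            # Steps 6-7: assign the element to the cluster with the lowest distance.
            nearest = distances.index(min(distances))
            new_clusters[nearest].append(point)
        # Step 8: recompute centroids. Assumption: an empty cluster keeps its old centroid.
        centroids = [
            centroid(c) if c else centroids[i]
            for i, c in enumerate(new_clusters)
        ]
        clusters = new_clusters

    # Step 10: the clusters after the last iteration are the final clusters.
    return clusters, centroids


if __name__ == "__main__":
    sample = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
    final_clusters, final_centroids = k_means(sample, k=2, max_iterations=10, seed=0)
    for members, center in zip(final_clusters, final_centroids):
        print(center, members)
```

Passing `distance=manhattan` switches the metric used for assignment. The centroid is still computed as the coordinate-wise mean in this sketch, which is one possible choice and may differ from what a given implementation does.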