Optimal clustering methods
Ewoud Ewing edited this page Sep 16, 2022 · 1 revision
There are many ways to determine the best separation of the clusters. In this package, the function OptimalGeneSets helps with this choice.
The function can plot 3 different statistics for this purpose: Gap, Elbow and Silhouette.
- Gap: Compares the total intra-cluster variation for different values of k with its expected value under a null reference distribution of the data.
- Elbow: For each k, calculates the total within-cluster sum of squares.
- Silhouette: Measures how well each object lies within its cluster. Higher is better.
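To make the Elbow and Silhouette statistics concrete, here is a minimal Python sketch that computes both for a range of k on toy data. It is not the implementation behind OptimalGeneSets. The helper names (`kmeans`, `total_wss`, `mean_silhouette`), the toy points and the deterministic farthest-point initialisation are all assumptions made for illustration. The Gap statistic is left out because it additionally needs clusterings of simulated reference data.

```python
import math
import random


def farthest_point_init(points, k):
    """Deterministic start: first point, then repeatedly the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def total_wss(points, labels):
    """Elbow statistic: total within-cluster sum of squared distances to the cluster mean."""
    total = 0.0
    for c in set(labels):
        members = [p for p, l in zip(points, labels) if l == c]
        center = tuple(sum(x) / len(members) for x in zip(*members))
        total += sum(math.dist(p, center) ** 2 for p in members)
    return total


def mean_silhouette(points, labels):
    """Silhouette statistic: mean of (b - a) / max(a, b) over all points (needs >= 2 clusters)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:  # singleton clusters score 0 by convention
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy data (an assumption for illustration): three well-separated 2-D blobs of 20 points each.
rng = random.Random(42)
blobs = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in blobs for _ in range(20)]

wss, sil = {}, {}
for k in range(2, 7):
    labels = kmeans(points, k)
    wss[k] = total_wss(points, labels)
    sil[k] = mean_silhouette(points, labels)
    print(f"k={k}  WSS={wss[k]:.2f}  silhouette={sil[k]:.3f}")

best_k = max(sil, key=sil.get)  # highest mean silhouette
print("k with the highest mean silhouette:", best_k)
```

On data like this, the within-cluster sum of squares drops sharply up to the true number of groups and flattens afterwards (the "elbow"), while the mean silhouette peaks at that same k.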
The computational time differs between methods.
Mean time R took to calculate each statistic on the example data, averaged over 10 iterations:
- Gap: 23.35 ± 2.88
- Elbow: 0.76 ± 0.174
- Silhouette: 0.75 ± 0.158