ClusterGeneSets: User supplied clustering function
The ClusterGeneSets function provides two types of clustering: kmeans and hierarchical clustering. Either can be run group by group if you don't want gene sets from different groups to be mixed. For most applications these options are enough to group the gene sets together. If you want another clustering method, ClusterGeneSets also lets you supply your own clustering function.
In the code, the user-supplied function is called as follows:
> canonical.df$cluster <- user_function(Object@Data.RR)
It takes the calculated distance (RR by default) and uses the distances between gene sets to cluster them. This lets you use any function you define yourself; the only requirement is that its output contains, for every pathway, a number that corresponds to the cluster it belongs to.
If your clustering method returns something other than cluster numbers, for example an output of 1:29, it needs to be cut so that it gives one cluster number per gene set, such as 1 1 1 1 2 2 2 2 3 3 3 3 3 3 4 4 4 5 5 5 5 5 6 6 6 6 6 6
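user.cluster <- function(data)
{
  # Hierarchical clustering (Ward's method) on the distances between the columns of data
  x <- hclust(dist(t(data)), method = "ward.D2")
  # Cut the tree into 5 clusters, giving one cluster number per gene set
  x <- cutree(x, k = 5)
  return(x)
}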
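Before handing a function to ClusterGeneSets, it can help to check that it returns one cluster number per gene set. The sketch below does this on a small made-up matrix that stands in for Object@Data.RR. The toy values, the choice of 3 clusters, and the helper check.cluster.labels are all assumptions for illustration and are not part of GeneSetCluster.

```r
# Toy symmetric matrix standing in for Object@Data.RR (values are made up)
set.seed(1)
toy <- matrix(runif(36), nrow = 6)
toy <- (toy + t(toy)) / 2
diag(toy) <- 0
dimnames(toy) <- list(paste0("GeneSet", 1:6), paste0("GeneSet", 1:6))

# Same idea as the function above, with k = 3 because the toy data is small
user.cluster <- function(data)
{
  x <- hclust(dist(t(data)), method = "ward.D2")
  x <- cutree(x, k = 3)
  return(x)
}

# Hypothetical helper: stops with an error if the output is not
# one whole-number cluster label per column (gene set) of data
check.cluster.labels <- function(labels, data)
{
  stopifnot(length(labels) == ncol(data))
  stopifnot(is.numeric(labels), !anyNA(labels))
  stopifnot(all(labels == round(labels)))
  invisible(TRUE)
}

labels <- user.cluster(toy)
check.cluster.labels(labels, toy)
table(labels)  # how many gene sets ended up in each cluster
```

If the check passes on a small example like this, the same function is more likely to behave as expected on the real distance matrix.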
Then to run the ClusterGeneSets function:
Object <- ClusterGeneSets(Object,
                          clusters = 5,
                          method = "User_supplied",
                          order = "group",
                          molecular.signature = "All",
                          user_function = user.cluster)
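Because the user function is called with only the distance matrix, the number of clusters has to be fixed inside the function itself. One way to avoid hard-coding it is a small function factory, sketched below. The name make.user.cluster, its arguments, and the toy data are assumptions for illustration; only hclust, dist and cutree from base R are used.

```r
# Hypothetical factory: returns a one-argument clustering function
# with the number of clusters (k) and the linkage method baked in
make.user.cluster <- function(k, linkage = "ward.D2")
{
  function(data)
  {
    tree <- hclust(dist(t(data)), method = linkage)
    cutree(tree, k = k)
  }
}

user.cluster.4 <- make.user.cluster(k = 4)

# Quick run on made-up data standing in for the distance matrix
set.seed(42)
toy <- matrix(runif(64), nrow = 8)
colnames(toy) <- paste0("GeneSet", 1:8)
toy.labels <- user.cluster.4(toy)
```

The resulting function can then be passed as user_function = user.cluster.4. How ClusterGeneSets uses the clusters argument in this mode is not covered here, so setting clusters to the same value as k is probably the safest choice.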