Part 1B Creating an Object
Creating object with custom table
The approach still works when the data were not produced by GREAT or IPA. In fact, the entries do not need to be proper pathways. What the code needs is a group, a name, and the content behind that name (the genes or molecules), plus some meta information. We can also simulate a dataset.
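As a rough illustration of the simulated-dataset idea, the sketch below builds a small table holding the three required pieces (group, name, and content) using base R only. The gene symbols, set names, group labels, and the `sim.table` object are hypothetical and not part of the package.

```r
# Minimal sketch: simulate a custom table with a name, a group and the
# comma-separated content for each gene set (all values are made up).
set.seed(1)
gene.pool <- paste0("Gene", 1:50)   # hypothetical gene symbols
n.sets <- 6

sim.table <- data.frame(
  Pathway = paste0("SimulatedSet", seq_len(n.sets)),  # the name
  Group   = rep(c("A", "B"), each = n.sets / 2),      # the group
  stringsAsFactors = FALSE
)

# The content: 10 randomly drawn genes per set, collapsed into one string
sim.table$Molecules <- vapply(seq_len(n.sets), function(i) {
  paste(sample(gene.pool, size = 10), collapse = ",")
}, character(1))

head(sim.table)
```

The three columns correspond to the Pathways, Groups and Molecules arguments of ObjectCreator, with sep = "," matching the separator used here.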
Loading IPA data
As an example, we show that a PathwayObject can be created as long as this simple information is present.
library(GeneSetCluster)
library(readxl) #provides read_excel()
library(limma) #provides strsplit2()
IPA.files <- c(system.file("extdata", "MM10.IPA.KO.uGvsMac.Canonical_pathways.xls", package = "GeneSetCluster"),
system.file("extdata", "MM10.IPA.WT.uGvsMac.Canonical_pathways.xls", package = "GeneSetCluster"),
system.file("extdata", "MM10.IPA.KO.uGvsMac.Functional_annotations.xls", package = "GeneSetCluster"),
system.file("extdata", "MM10.IPA.WT.uGvsMac.Functional_annotations.xls", package = "GeneSetCluster"))
canonical.files <- IPA.files[grep("Canonical", IPA.files)]
##################
#Loading the data as a table
MM10.IPA.KO.uGvsMac.Canonical <- read_excel(path = system.file("extdata", "MM10.IPA.KO.uGvsMac.Canonical_pathways.xls", package = "GeneSetCluster"),
skip=1, sheet = 1)
MM10.IPA.WT.uGvsMac.Canonical <- read_excel(path = system.file("extdata", "MM10.IPA.WT.uGvsMac.Canonical_pathways.xls", package = "GeneSetCluster"),
skip=1, sheet = 1)
MM10.IPA.KO.uGvsMac.Canonical <- as.data.frame(MM10.IPA.KO.uGvsMac.Canonical)
MM10.IPA.WT.uGvsMac.Canonical <- as.data.frame(MM10.IPA.WT.uGvsMac.Canonical)
head(MM10.IPA.KO.uGvsMac.Canonical)
Making R objects
IPA exports a lot of data, but we are only interested in Gene-Sets with a p-value < 0.05 (i.e. -log10(p-value) > 1.31) and more than 5 molecules. When using ObjectCreator, the user has to filter for the relevant pathways beforehand.
#Calculating the number of molecules:
#We can see that the molecules are stored as a comma-separated string:
MM10.IPA.KO.uGvsMac.Canonical$MoleculesCount <- NA
for(can.i in 1:nrow(MM10.IPA.KO.uGvsMac.Canonical))
{
mol.i <- as.vector(strsplit2(as.character(MM10.IPA.KO.uGvsMac.Canonical[can.i,"Molecules"]), split=","))
MM10.IPA.KO.uGvsMac.Canonical[can.i,"MoleculesCount"]<- length(mol.i)
}
head(MM10.IPA.KO.uGvsMac.Canonical)
MM10.IPA.KO.uGvsMac.Canonical.filtered <- MM10.IPA.KO.uGvsMac.Canonical[MM10.IPA.KO.uGvsMac.Canonical$`-log(p-value)` > 1.31 &
MM10.IPA.KO.uGvsMac.Canonical$MoleculesCount > 5,]
nrow(MM10.IPA.KO.uGvsMac.Canonical.filtered)
#We can see that we have 53 Gene-Sets which are significant according to our definition.
#Repeat for WT
MM10.IPA.WT.uGvsMac.Canonical$MoleculesCount <- NA
for(can.i in 1:nrow(MM10.IPA.WT.uGvsMac.Canonical))
{
mol.i <- as.vector(strsplit2(as.character(MM10.IPA.WT.uGvsMac.Canonical[can.i,"Molecules"]), split=","))
MM10.IPA.WT.uGvsMac.Canonical[can.i,"MoleculesCount"]<- length(mol.i)
}
MM10.IPA.WT.uGvsMac.Canonical.filtered <- MM10.IPA.WT.uGvsMac.Canonical[MM10.IPA.WT.uGvsMac.Canonical$`-log(p-value)` > 1.31 &
MM10.IPA.WT.uGvsMac.Canonical$MoleculesCount > 5,]
nrow(MM10.IPA.KO.uGvsMac.Canonical.filtered)
nrow(MM10.IPA.WT.uGvsMac.Canonical.filtered)
We can see that 53 (KO) and 281 (WT) Gene-Sets are significant according to our definition.
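The counting and filtering steps above can also be written without a loop. The sketch below uses a toy table (the `toy` data frame and its values are hypothetical) and base R's `strsplit()` with `lengths()` in place of `strsplit2()`. It is an alternative formulation, not the method used by the package. Note that -log10(0.05) is roughly 1.301, so the 1.31 cutoff used above is marginally stricter.

```r
# Toy table mimicking the relevant IPA columns (values are made up)
toy <- data.frame(
  Pathway   = c("Set1", "Set2", "Set3"),
  Molecules = c("A,B,C,D,E,F", "A,B", "A,B,C,D,E,F,G"),
  stringsAsFactors = FALSE
)
toy$`-log(p-value)` <- c(2.5, 3.0, 0.8)

# Count the molecules per gene set by splitting each string on commas
toy$MoleculesCount <- lengths(strsplit(toy$Molecules, split = ",", fixed = TRUE))

# Keep gene sets with p-value < 0.05 and more than 5 molecules
threshold <- -log10(0.05)
toy.filtered <- toy[toy$`-log(p-value)` > threshold & toy$MoleculesCount > 5, ]
toy.filtered
```

Only Set1 passes both criteria here: Set2 has too few molecules and Set3 is not significant.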
Creating the combined object
Now we combine the KO and WT tables into a single object. The arguments are:
- Pathways: the pathway names of both tables, simply concatenated.
- Molecules: the molecules (i.e. the genes) of both tables, simply concatenated.
- Groups: a character vector with the same length as the combined pathways, repeating the group label for each pathway.
- Source: how the data was generated (metadata only, not necessary to add).
- Type: what kind of data it is (metadata only, not necessary to add).
- structure: how the genes are represented. This is only important if you want to combine gene sets: the genes have to match, so the program needs to know they use the same identifier type.
- organism: like structure, only important for combining gene sets (optional).
- sep: how the genes in the Molecules strings are separated. This is important for reading the individual genes.
IPA.KOvsWT.PathwayObject <- ObjectCreator(Pathways = c(MM10.IPA.KO.uGvsMac.Canonical.filtered$`Ingenuity Canonical Pathways`,
MM10.IPA.WT.uGvsMac.Canonical.filtered$`Ingenuity Canonical Pathways`),
Molecules = c(MM10.IPA.KO.uGvsMac.Canonical.filtered$Molecules,
MM10.IPA.WT.uGvsMac.Canonical.filtered$Molecules),
Groups = c(rep("KO", times = nrow(MM10.IPA.KO.uGvsMac.Canonical.filtered)),
rep("WT", times = nrow(MM10.IPA.WT.uGvsMac.Canonical.filtered))),
Source = "IPA",
Type = "Canonical_Pathways",#Optional
structure = "SYMBOL",
organism ="org.Mm.eg.db",
sep = ",")
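Because Pathways, Molecules and Groups are matched by position, it can help to check that the three vectors line up before calling ObjectCreator. The following is a minimal sketch with two toy tables. The `ko` and `wt` data frames and their contents are hypothetical stand-ins for the filtered IPA tables.

```r
# Toy stand-ins for the filtered KO and WT tables (values are made up)
ko <- data.frame(Pathway = c("P1", "P2"),
                 Molecules = c("A,B,C", "D,E"),
                 stringsAsFactors = FALSE)
wt <- data.frame(Pathway = c("P1", "P3", "P4"),
                 Molecules = c("A,B", "F,G", "H,I,J"),
                 stringsAsFactors = FALSE)

# Concatenate names and contents, and build one group label per pathway
pathways  <- c(ko$Pathway, wt$Pathway)
molecules <- c(ko$Molecules, wt$Molecules)
groups    <- rep(c("KO", "WT"), times = c(nrow(ko), nrow(wt)))

# All three vectors must have the same length and order
stopifnot(length(pathways) == length(molecules),
          length(pathways) == length(groups))

# Number of gene sets contributed by each group
table(groups)
```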
https://github.com/TranslationalBioinformaticsUnit/GeneSetCluster/wiki/ObjectCreator