Clarification on Family vs Branch-Level Significance in CAFE5 Outputs #232
Hi folks, I'm using CAFE5 to analyze gene family evolution across several hymenopteran species. After going through multiple discussions on GitHub and the Google Group, I noticed references to two types of significance values: family-level p-values and branch-level p-values.
I assume that family_results.txt provides the p-values indicating whether a gene family is rapidly evolving somewhere in the tree. Is that correct? My understanding is that the significant families listed there are rapidly evolving somewhere across the species included in the analysis, but not necessarily in a species-specific way.
If I want to say that Species A has XX significantly (i.e., rapidly) evolving gene families compared to other species, do I need to further filter these gene families by the branch-level p-value in the result_summary.tsv file produced by CafePlotter?
I'm also confused about the gamma_branch_probabilities.tab file. In result_summary.tsv from CafePlotter, the gamma_branch_probabilities values appear to be labeled or interpreted as p-values for each gene family per species. However, the values in gamma_branch_probabilities.tab look more like posterior probabilities than true statistical p-values. Are these values actually being treated as branch-specific p-values, or is this a mislabeling?
I'd really appreciate some clarification to ensure I'm interpreting the outputs correctly when reporting species-specific gene family evolution.
Cheers,
Replies: 1 comment
This question was also asked on the mailing list. Posting the answers here for completeness:
Yes, that’s correct. Family-level results tell you about the tree as a whole, but not which branch(es) may be individually rapidly evolving.
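For the family-level filter, here is a minimal sketch of pulling out the families whose family-level p-value falls below a threshold. It assumes a tab-separated layout with a `#`-prefixed header line, the family ID in the first column and the p-value in the second; that layout, the function name `significant_families`, and the example rows are assumptions for illustration, so check them against your own family_results.txt before relying on it.

```python
import csv
import io

def significant_families(lines, alpha=0.05):
    """Return IDs of families whose family-level p-value is below alpha.

    Assumed layout (not confirmed by this thread): tab-separated,
    comment/header lines start with '#', column 0 = family ID,
    column 1 = family-level p-value.
    """
    hits = []
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and the header
        family_id, pvalue = row[0], float(row[1])
        if pvalue < alpha:
            hits.append(family_id)
    return hits

# Made-up example rows standing in for a real results file.
example = io.StringIO(
    "#FamilyID\tpvalue\tSignificant at 0.05\n"
    "fam1\t0.001\ty\n"
    "fam2\t0.40\tn\n"
)
print(significant_families(example))
```

With a real file, pass an open file handle instead of the `io.StringIO` object. Note that this only tells you a family is significant somewhere in the tree, not on which branch.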
The values in gamma_branch_probabilities.tab are indeed posteriors, and should not be interpreted as p-values.
As for the species-specific question, I think it depends on exactly what you want to know. If you want to test whether one lineage evolves faster than another, the best approach is to fit a model with different lambda values on the two branches and compare it to a single-lambda model (you get a p-value by carrying out a likelihood ratio test). You can also compare the number of significant families along each branch, but that is a less direct test of this hypothesis.
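As a rough illustration of the likelihood ratio test mentioned above, the sketch below compares a one-lambda (null) model against a two-lambda (alternative) model. It assumes you have each model's final negative log-likelihood (-lnL), that the models are nested and differ by exactly one free parameter, and that the chi-square approximation with one degree of freedom is acceptable. The function name `lrt_pvalue_df1` and the likelihood values are made up for the example.

```python
import math

def lrt_pvalue_df1(neg_lnl_null, neg_lnl_alt):
    """Likelihood ratio test for nested models differing by one parameter.

    Inputs are negative log-likelihoods (-lnL), so smaller is a better fit.
    Returns (test statistic, p-value) using a chi-square distribution
    with 1 degree of freedom.
    """
    # 2 * (lnL_alt - lnL_null), written in terms of -lnL values.
    # Clamp at 0 in case the richer model converged to a worse optimum.
    stat = max(0.0, 2.0 * (neg_lnl_null - neg_lnl_alt))
    # For df = 1 the chi-square survival function is erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p

# Hypothetical -lnL values, not taken from any real run.
stat, p = lrt_pvalue_df1(neg_lnl_null=10250.0, neg_lnl_alt=10245.0)
print(f"LRT statistic = {stat:.2f}, p = {p:.4g}")
```

If the two models differ by more than one lambda, the degrees of freedom change and the closed form above no longer applies; a general chi-square survival function would be needed instead.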