For details on the method see our preprint. If you use the package partialLDSC
please cite:
Mounier et al. (2024) Genetics identifies obesity as a shared risk factor for co-occurring multiple long-term conditions. medRxiv https://doi.org/10.1101/2024.07.10.24309772
partialLDSC
is an R-package to estimate partial genetic correlations
from GWAS summary statistics, and compare them to their unadjusted
counterparts, to quantify the contribution of a given confounder in
explaining genetic similarity between conditions. The partial genetic
correlations between two conditions correspond to their genetic
correlation, holding the genetic effects of a potential confounder
constant. Differences between unadjusted and partial estimates are not
necessarily due to a causal effect of the potential confounder on both
conditions and further (causal inference) analyses might be needed to
better describe the relationship between the conditions and the
potential confounder.
It relies on cross-trait LD-score regression (LDSC), as first described by Bulik-Sullivan, B. et al. - “An atlas of genetic correlations across human diseases and traits.”.
Our implementation of LDSC is based on the one from
GenomicSEM
. Moreover, the
pre-processing of the GWAS summary statistics prior to analysis should
be done using the munge
function they provide.
There are two main functions available:
-
partial_ldsc()
main function to estimate unadjusted and partial genetic correlations (as well as heritabilities, on the observed scale only), and compare them to each other to assess if adjusting for the potential confounder’s genetic significantly affect the pairwise genetic correlation estimates. -
forest_plot()
main function to visualise the results.
More details about their usage can be found in the manual.
You can install the current version of partialLDSC
with:
# Directly install the package from github
# install.packages("remotes")
remotes::install_github("GEMINI-multimorbidity/partialLDSC")
library(partialLDSC)
To run the analysis with partialLDSC
different inputs are needed:
More information about how to munge the summary statistics can be found
in the GenomicSEM
R-package documentaion,
here.
LD scores are needed, these are the same as the ones needed by the
GenomicSEM
R-package and
can be directly downloaded from the link they provide.
Expects LD scores formated as required by the original LD score regression software. Weights for the european population can be obtained by downloading the eur_w_ld_chr folder in the link below (Note that these are the same weights provided by the original developers of LDSC): https://utexas.box.com/s/vkd36n197m8klbaio3yzoxsee6sxo11v
Before running the examples, please make sure to have downloaded the
ld-scores files. You may also need to modify the ld
parameters to
indicate the correct path. Note that when running the analysis with your
own GWAS summary statistics, you will first need to properly munge them.
- Example A
# Using GEMINI GWAS summary statistics for three conditions
# (osteoarthitis: OA, type 2 diabetes: T2D, benign hyperplasia of prostate: BPH)
# + GIANT GWAS summary statistics for the confounder (BMI)
# (1,150,000 SNPs - stored in gzipped files)
OA_file <- system.file("Data/", "OA_GEMINI.sumstats.gz", package="partialLDSC")
T2D_file <- system.file("Data/", "diabetes_type_2_GEMINI.sumstats.gz", package="partialLDSC")
BPH_file <- system.file("Data/", "BPH_GEMINI.sumstats.gz", package="partialLDSC")
BMI_file <- system.file("Data/", "BMI_Yengo_2018.txt.sumstats.gz", package="partialLDSC")
# launch analysis (using default number of blocks)
A = partial_ldsc(conditions = c(OA_file, T2D_file, BPH_file),
confounder = BMI_file,
condition.names = c("OA", "T2D", "BPH"),
confounder.name = "BMI",
ld = "~/eur_w_ld_chr",
log.name = "Example_A")
Show log
## Multivariate ld-score regression of 4 traits (3 conditions: OA, T2D, BPH + confounder: BMI) began at: 15:13
## Number of blocks used to perform the block jacknife used to estimate the sampling covariance matrix (V) is 200
## Reading in LD scores from: C:\Users\nm572\OneDrive - University of Exeter\Documents\Exeter\Projects\Data\eur_w_ld_chr
## Read in summary statistics [1/4] (OA) from: C:\Users\nm572\OneDrive - University of Exeter\Documents\R\win-library\4.1\partialLDSC\Data\OA_GEMINI.sumstats.gz
## Out of 1168143 SNPs, 1155445 remain after merging with LD-score files
## Removing 0 SNPs with Chi^2 > 766.622; 1155445 remain
## Read in summary statistics [2/4] (T2D) from: C:\Users\nm572\OneDrive - University of Exeter\Documents\R\win-library\4.1\partialLDSC\Data\diabetes_type_2_GEMINI.sumstats.gz
## Out of 1161441 SNPs, 1153850 remain after merging with LD-score files
## Removing 15 SNPs with Chi^2 > 446.142; 1153835 remain
## Read in summary statistics [3/4] (BPH) from: C:\Users\nm572\OneDrive - University of Exeter\Documents\R\win-library\4.1\partialLDSC\Data\BPH_GEMINI.sumstats.gz
## Out of 1166954 SNPs, 1160509 remain after merging with LD-score files
## Removing 1 SNPs with Chi^2 > 203.231; 1160508 remain
## Read in summary statistics [4/4] (BMI) from: C:\Users\nm572\OneDrive - University of Exeter\Documents\R\win-library\4.1\partialLDSC\Data\BMI_Yengo_2018.txt.sumstats.gz
## Out of 1018130 SNPs, 1014995 remain after merging with LD-score files
## Removing 6 SNPs with Chi^2 > 795.64; 1014989 remain
##
## Estimating heritability [1/10] for: OA
## Heritability Results for trait: OA
## Mean Chi^2 across remaining SNPs: 1.4953
## Lambda GC: 1.22
## Intercept: 0.8991 (0.0113)
## Ratio: -0.2036 (0.0229)
## Total Observed Scale h2: 0.039 (0.0014)
## h2 Z: 28
##
## Calculating genetic covariance [2/10] for traits: OA and T2D
## 1145694 SNPs remain after merging OA and T2D summary statistics
## Results for genetic covariance between: OA and T2D
## Mean Z*Z: 0.2186
## Cross trait Intercept: 0.0299 (0.0076)
## Total Observed Scale Genetic Covariance (g_cov): 0.0158 (0.0016)
## g_cov Z: 10.2
## g_cov P-value: 3.0229e-24
##
## Calculating genetic covariance [3/10] for traits: OA and BPH
## 1151912 SNPs remain after merging OA and BPH summary statistics
## Results for genetic covariance between: OA and BPH
## Mean Z*Z: 0.0934
## Cross trait Intercept: 0.0418 (0.0058)
## Total Observed Scale Genetic Covariance (g_cov): 0.0064 (0.0016)
## g_cov Z: 3.96
## g_cov P-value: 7.3987e-05
##
## Calculating genetic covariance [4/10] for traits: OA and BMI
## 996930 SNPs remain after merging OA and BMI summary statistics
## Results for genetic covariance between: OA and BMI
## Mean Z*Z: 0.6241
## Cross trait Intercept: 0.0446 (0.0112)
## Total Observed Scale Genetic Covariance (g_cov): 0.0394 (0.0019)
## g_cov Z: 20.5
## g_cov P-value: 8.6266e-94
##
## Estimating heritability [5/10] for: T2D
## Heritability Results for trait: T2D
## Mean Chi^2 across remaining SNPs: 2.1342
## Lambda GC: 1.4961
## Intercept: 1.0054 (0.0259)
## Ratio: 0.0048 (0.0229)
## Total Observed Scale h2: 0.1307 (0.0055)
## h2 Z: 23.8
##
## Calculating genetic covariance [6/10] for traits: T2D and BPH
## 1151699 SNPs remain after merging T2D and BPH summary statistics
## Results for genetic covariance between: T2D and BPH
## Mean Z*Z: 0.0113
## Cross trait Intercept: -0.0058 (0.0069)
## Total Observed Scale Genetic Covariance (g_cov): 0.0042 (0.0027)
## g_cov Z: 1.55
## g_cov P-value: 0.12162
##
## Calculating genetic covariance [7/10] for traits: T2D and BMI
## 996408 SNPs remain after merging T2D and BMI summary statistics
## Results for genetic covariance between: T2D and BMI
## Mean Z*Z: 1.0387
## Cross trait Intercept: 0.0487 (0.0152)
## Total Observed Scale Genetic Covariance (g_cov): 0.0894 (0.0038)
## g_cov Z: 23.4
## g_cov P-value: 7.1286e-121
##
## Estimating heritability [8/10] for: BPH
## Heritability Results for trait: BPH
## Mean Chi^2 across remaining SNPs: 1.1591
## Lambda GC: 0.9517
## Intercept: 0.9102 (0.0108)
## Ratio: -0.5646 (0.0676)
## Total Observed Scale h2: 0.0623 (0.0051)
## h2 Z: 12.2
##
## Calculating genetic covariance [9/10] for traits: BPH and BMI
## 1001610 SNPs remain after merging BPH and BMI summary statistics
## Results for genetic covariance between: BPH and BMI
## Mean Z*Z: 0.0049
## Cross trait Intercept: 0.01 (0.0101)
## Total Observed Scale Genetic Covariance (g_cov): -6e-04 (0.0033)
## g_cov Z: -0.175
## g_cov P-value: 0.86117
##
## Estimating heritability [10/10] for: BMI
## Heritability Results for trait: BMI
## Mean Chi^2 across remaining SNPs: 3.9344
## Lambda GC: 2.7869
## Intercept: 1.0199 (0.0277)
## Ratio: 0.0068 (0.0094)
## Total Observed Scale h2: 0.2091 (0.0063)
## h2 Z: 33.3
##
##
## Genetic Correlation Results
## Genetic Correlation between OA and T2D: 0.2216 (0.0208)
## Genetic Correlation between OA and BPH: 0.1295 (0.0333)
## Genetic Correlation between OA and BMI: 0.4361 (0.0182)
## Genetic Correlation between T2D and BPH: 0.0462 (0.0303)
## Genetic Correlation between T2D and BMI: 0.5408 (0.022)
## Genetic Correlation between BPH and BMI: -0.005 (0.0288)
##
##
## Partial genetic Correlation Results
## Partial genetic Correlation between OA and T2D: -0.0189 (0.0296)
## Partial genetic Correlation between OA and BPH: 0.1463 (0.0344)
## Partial genetic Correlation between T2D and BPH: 0.0581 (0.0315)
##
## Analysis finished running at 15:14
## Runtime: 0 minute(s) and 55 second(s)
##
ldsc_partial()
returns a named list containing the following
results:
res_diff
(pairwise results)
condition.1
: first condition in the pair,
condition.2
: second condition in the pair,
rg
: unadjusted genetic correlation between the two conditions,
rg.SE
: standard error of the unadjusted genetic correlation between
the two conditions,
partial_rg
: partial genetic correlation between the two conditions,
partial_rg.SE
: standard error of the partial genetic correlation
between the two conditions,
rg_cov
: covariance between the unadjusted and the partial correlation
estimates for the pair,
diff.T
: test statistic used to test for the difference between the
unadjusted and the partial correlation estimates for the pair,
diff.P
: p-value corresponding to the test statistic used to test the
difference between the unadjusted and the partial correlation estimates
for the pair.
-
S
: estimated genetic covariance matrix for all conditions + confounder. -
V
: variance covariance matrix of the parameter estimates inS
. -
S_Stand
: estimated genetic correlation matrix for all conditions + confounder. -
V_Stand
: variance covariance matrix of the parameter estimates inS_Stand
. -
partial.S
: estimated partial genetic covariance matrix for all conditions. -
partial.V
: variance covariance matrix of the parameter estimates inpartial.S
. -
partial.S_Stand
: estimated partial genetic correlation matrix for all conditions. -
partial.V_Stand
: variance covariance matrix of the parameter estimates inpartial.S_Stand
. -
I
: matrix containing the cross-trait intercepts.
-
**<log.name>_ldsc.log** - log file
-
**<log.name>_difference.tsv** - results file
-
Example A
### structure of the results
str(A)
## List of 10
## $ res_diff :'data.frame': 3 obs. of 9 variables:
## ..$ condition.1 : chr [1:3] "OA" "OA" "T2D"
## ..$ condition.2 : chr [1:3] "T2D" "BPH" "BPH"
## ..$ rg : num [1:3] 0.2216 0.1295 0.0462
## ..$ rg.SE : num [1:3] 0.0208 0.0333 0.0303
## ..$ partial_rg : num [1:3] -0.0189 0.1463 0.0581
## ..$ partial_rg.SE: num [1:3] 0.0296 0.0344 0.0315
## ..$ rg_cov : num [1:3] 0.000451 0.001059 0.000819
## ..$ diff.T : num [1:3] 11.92 -1.29 -0.72
## ..$ diff.P : num [1:3] 9.03e-33 1.96e-01 4.71e-01
## $ S : num [1:4, 1:4] 0.03902 0.01582 0.00639 0.0394 0.01582 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:4] "OA" "T2D" "BPH" "BMI"
## .. ..$ : chr [1:4] "OA" "T2D" "BPH" "BMI"
## $ V : num [1:10, 1:10] 1.94e-06 6.56e-07 -5.20e-09 8.45e-07 2.01e-06 ...
## $ S_Stand : num [1:4, 1:4] 1 0.222 0.129 0.436 0.222 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:4] "OA" "T2D" "BPH" "BMI"
## .. ..$ : chr [1:4] "OA" "T2D" "BPH" "BMI"
## $ V_Stand : num [1:10, 1:10] 1.93e-30 2.08e-18 -5.98e-19 1.07e-18 7.29e-33 ...
## $ partial.S : num [1:3, 1:3] 0.0316 -0.00102 0.00649 -0.00102 0.09248 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:3] "OA" "T2D" "BPH"
## .. ..$ : chr [1:3] "OA" "T2D" "BPH"
## $ partial.V : num [1:6, 1:6] 1.83e-06 5.95e-07 3.32e-07 1.97e-06 2.00e-07 ...
## $ partial.S_Stand: num [1:3, 1:3] 1 -0.0189 0.1463 -0.0189 1 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:3] "OA" "T2D" "BPH"
## .. ..$ : chr [1:3] "OA" "T2D" "BPH"
## $ partial.V_Stand: num [1:6, 1:6] 0 0 0 0 0 ...
## $ I : num [1:4, 1:4] 0.8991 0.0299 0.0418 0.0446 0.0299 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:4] "OA" "T2D" "BPH" "BMI"
## .. ..$ : chr [1:4] "OA" "T2D" "BPH" "BMI"
### matrix of unadjusted genetic correlations
A$S_Stand
## OA T2D BPH BMI
## OA 1.0000000 0.22155702 0.129499318 0.436123972
## T2D 0.2215570 1.00000000 0.046169361 0.540761961
## BPH 0.1294993 0.04616936 1.000000000 -0.005027793
## BMI 0.4361240 0.54076196 -0.005027793 1.000000000
### pairwise results
# in this case, we observed a significant difference between the unadjusted and the partial
# genetic correlation estimates only for OA and T2D.
# This does make sense since the genetic correlation between BMI and BPH is very low,
# explaining why adjusting for BMI does not affect the genetic correlation between BPH
# and the other conditions.
A$res_diff
## condition.1 condition.2 rg rg.SE partial_rg partial_rg.SE
## 1 OA T2D 0.22155702 0.02076908 -0.01886781 0.02961074
## 2 OA BPH 0.12949932 0.03327598 0.14634480 0.03435472
## 3 T2D BPH 0.04616936 0.03030614 0.05811963 0.03154903
## rg_cov diff.T diff.P
## 1 0.0004507507 11.9225545 9.029680e-33
## 2 0.0010588452 -1.2925710 1.961595e-01
## 3 0.0008192339 -0.7201883 4.714090e-01
### functions to list conditions / pairs
get_conditions(A)
## [1] "OA" "T2D" "BPH"
get_pairs(A)
## [1] "OA-T2D" "OA-BPH" "T2D-BPH"
### forest plot
forest_plot(A)
Example A ~ 0 minute(s) and 55 second(s)
Runtime can be influenced by the number of traits.
Results from analyses performed on a Windows 10 laptop - Processor : Intel(R) Core(TM) i5-10310U CPU @ 1.70GHz 2.21 GHz - Memory : 16.0 GB.
If you use the partialLDSC
package, please cite:
Mounier et al. (2024) Genetics identifies obesity as a shared risk factor for co-occurring multiple long-term conditions. medRxiv https://doi.org/10.1101/2024.07.10.24309772
…