🌉 Multi-omics bridged Kernel Association test (MOKA) Pipeline

MOKA is a Snakemake pipeline that automates multi-omics bridged, kernel-based association tests. It supports GWAS analysis and visualization with different multi-omics, variant-specific weights. Publication available at: https://www.medrxiv.org/content/10.1101/2025.07.06.25330974v1

🚀 Usage

To run the moka pipeline:

1. Minimal data inputs

- GWAS genotype files in PLINK format (bed, bim and fam)

- Variant-specific weights for each SNP (SNP_ID, CHROMOSOME, POSITION, WEIGHT)
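
Before starting a run, it can help to confirm that the PLINK fileset is complete. The sketch below is not part of MOKA; it only checks that the three files exist and that the .bed file begins with the standard PLINK 1 header bytes. The function name and the example prefix are hypothetical.

```python
from pathlib import Path

# First three bytes of a PLINK 1 .bed file: two magic bytes, then 0x01 for the
# variant-major layout that PLINK 1.9 writes.
PLINK_BED_MAGIC = bytes([0x6C, 0x1B, 0x01])


def check_plink_fileset(prefix):
    """Return a list of problems for <prefix>.bed/.bim/.fam (empty if none found)."""
    problems = []
    for ext in (".bed", ".bim", ".fam"):
        path = Path(f"{prefix}{ext}")
        if not path.is_file():
            problems.append(f"missing file: {path}")

    bed = Path(f"{prefix}.bed")
    if bed.is_file():
        with bed.open("rb") as handle:
            if handle.read(3) != PLINK_BED_MAGIC:
                problems.append(f"{bed} does not start with the PLINK .bed header")
    return problems
```

For example, check_plink_fileset("genotype_data/my_study") returns an empty list when my_study.bed, my_study.bim and my_study.fam are all present and the .bed header is valid. The prefix here is a placeholder, not a file shipped with the repository.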

2. Install Snakemake

See the Snakemake Installation Guide, then run:

conda install -n base -c conda-forge mamba
mamba create -c conda-forge -c bioconda -n snakemake snakemake
mamba activate snakemake
snakemake --help

3. Install PLINK, Python and R (with Rscript configured). Important: check the Dependencies section.
4. Download and install moka

git clone https://github.com/davidenoma/moka
cd moka

5. Configure the pipeline parameters in the config.yaml file.
6. Execute the pipeline using the rules described below.

📚 Rules

Rule: moka association_test

Input: Preprocessed genotype data and weight files.
Output: Results of association tests.

snakemake --cores <num_cores>

If you do not have all the Python and R dependencies installed, conda can configure them for you:

snakemake --cores <num_cores> --use-conda

However, some R packages are not available through conda and are best installed with the R package manager.

Rule: merge_results

Input: Individual association test results.
Output: Merged association test results.

snakemake --cores 1 merge_moka_results

Rule: annotate_results

Input: Merged association test results.
Output: Association test results annotated with the DisGeNET database.

snakemake --cores 1 disgenet_annotation_005

Rule: visualize_results

Input: Merged association test results.
Output: Manhattan plots with visual representations of association test results.

snakemake --cores 1 manhattan_plots

Rule: go_analysis

Input: Merged association test results.
Output: GO analysis results.

snakemake --cores 1 go_analysis

Rule: kegg_pathway_analysis

Input: Merged association test results.
Output: KEGG pathway analysis results.

snakemake --cores 1 kegg_pathway_analysis

Rule: SKAT test with linear kernel

Input: Genotype data.
Output: Association mapping results, written to the output_association/ folder.

snakemake --cores 22 skat

Dependencies

Software

The following must be available on your PATH:

Snakemake (8.0.1+)
R (4.2.0+)
Python (3.9+)
PLINK (1.9+): https://www.cog-genomics.org/plink/1.9/
Rscript
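
A quick way to see which of these tools are visible on your PATH is sketched below. It is not part of MOKA, it does not check version numbers, and the executable names are assumptions (your PLINK binary may be called plink1.9, for instance).

```python
import shutil

# Executable names are assumptions; adjust them if your installs use other
# names (for example "plink1.9" or "python3").
REQUIRED_TOOLS = ("snakemake", "R", "Rscript", "python", "plink")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = find_missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All listed tools were found on PATH.")
```

Version requirements still need to be confirmed by hand, for example with snakemake --version and R --version.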

Python Packages

FaST-LMM: Factored Spectrally Transformed Linear Mixed Models, a program for performing genome-wide association studies (GWAS) on datasets of all sizes.
PySnpTools: a library for reading and manipulating genetic data.

pip install pysnptools fastlmm

R Packages

manhattan: R package for creating Manhattan plots, commonly used in genome-wide association studies (GWAS).
SKAT: R package for the Sequence Kernel Association Test, a SNP-set (for example gene-based) association test.
qqman: R package for creating QQ (quantile-quantile) plots, commonly used in GWAS to assess whether observed p-values deviate from the distribution expected under the null hypothesis.
ggplot2: R package for creating highly customizable plots and graphics.
gprofiler2: R package for gene set enrichment analysis (GO analysis).
pathfindR: R package for pathway analysis, including KEGG pathway analysis.

Installation steps:

install.packages(c("BiocManager","SKAT","ggplot2"))
BiocManager::install(c( "gprofiler2", "pathfindR","manhattan","qqman"))

Other Software

GNU Parallel: https://www.gnu.org/software/parallel/

apt install parallel   # Linux or WSL on Windows
brew install parallel  # macOS

Input file format

Data files: genotype BED, BIM and FAM files in PLINK format (https://www.cog-genomics.org/plink/1.9/) [required]
Multi-omics bridge weights.csv file with columns SNP_ID, Chromosome, Position, Weight [required for moka]
Gene regions file, provided in GRCh38 (hg38) coordinates (Genome Reference Consortium Human Build 38)
DisGeNET gene-disease database reference file (only if external disease validation is needed)
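
The weights file layout described above can be checked before a run with a short script. This is a minimal sketch, not MOKA's own loader: the function name is hypothetical, header matching is case-insensitive because the exact header spelling is an assumption, and the validity rules (integer position, finite numeric weight) are assumptions as well.

```python
import csv
import math

# Column names follow the weights.csv description; exact spelling and case in
# a real file are assumptions, so headers are compared case-insensitively.
REQUIRED_COLUMNS = ("snp_id", "chromosome", "position", "weight")


def load_weights(path):
    """Read a weights CSV into a list of (snp_id, chromosome, position, weight).

    Raises ValueError if a required column is missing or a row is malformed.
    """
    rows = []
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        header = {name.strip().lower(): name for name in (reader.fieldnames or [])}
        missing = [col for col in REQUIRED_COLUMNS if col not in header]
        if missing:
            raise ValueError("missing columns: " + ", ".join(missing))
        for row in reader:
            line_no = reader.line_num
            try:
                snp_id = row[header["snp_id"]].strip()
                chromosome = row[header["chromosome"]].strip()
                position = int(row[header["position"]])
                weight = float(row[header["weight"]])
            except (AttributeError, TypeError, ValueError):
                raise ValueError(f"line {line_no}: malformed row") from None
            if not snp_id or position < 0 or not math.isfinite(weight):
                raise ValueError(f"line {line_no}: invalid SNP_ID, position or weight")
            rows.append((snp_id, chromosome, position, weight))
    return rows
```

Chromosome values are kept as strings so that labels such as X or MT pass through unchanged.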

Liftover protocol

You must lift your data over to GRCh38. See Liftover GWAS: https://github.com/davidenoma/LiftOver

📋 Configuration

genotype_prefix: Prefix for genotype data files.
weights_type: Text string for type of bridge weights to be used e.g. "eqtl", "imaging"
genotype_file_path: Path to genotype data files.
weight_file: Path to weight files used for association tests.
disgenet_reference_file: Reference file of gene-disease associations from the external disease database at https://disgenet.org (needed for gene-disease associations only).
spectral decomposition: Flag for performing spectral decomposition and transformation of genotype and phenotype, default: TRUE
is_binary: Flag for a binary (TRUE) or quantitative (FALSE) trait, default: TRUE
Plink: Path to the PLINK installation, e.g. "~/software/plink"
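
As an illustration of how these settings fit together, the sketch below checks a configuration mapping for missing or empty entries. The key names are copied from the list above, but whether MOKA treats each one as mandatory is an assumption, and the example values are placeholders. The function works on a plain dictionary; one way to obtain it from config.yaml is a YAML parser such as PyYAML.

```python
# Key names are copied from the Configuration list; treating all of them as
# required is an assumption made for this sketch.
REQUIRED_KEYS = (
    "genotype_prefix",
    "weights_type",
    "genotype_file_path",
    "weight_file",
    "is_binary",
    "Plink",
)


def check_config(config):
    """Return a list of problems (missing or empty keys) found in `config`."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in config:
            problems.append(f"missing key: {key}")
        elif config[key] is None or config[key] == "":
            problems.append(f"empty value for key: {key}")
    return problems


# Placeholder values for illustration only; they are not MOKA defaults.
example_config = {
    "genotype_prefix": "my_study",
    "weights_type": "eqtl",
    "genotype_file_path": "genotype_data/",
    "weight_file": "weights/weights.csv",
    "is_binary": True,
    "Plink": "~/software/plink",
}

for problem in check_config(example_config):
    print(problem)
```

With the placeholder values above, check_config finds nothing to report and the loop prints no lines.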

📖 Additional Information

For more information on the MOKA pipeline and its usage, refer to the documentation provided in the repository or contact the project maintainers at david.enoma@ucalgary.ca.

Publication reference

MOKA: A pipeline for multi-omics bridged SNP-set kernel association test https://www.medrxiv.org/content/10.1101/2025.07.06.25330974v1
