GitHub - lntran26/ConfuseNN: Haplotype matrix permutation for interpreting pop gen ML models

Interpreting supervised machine learning inferences in population genomics using haplotype matrix permutations

Supervised machine learning methods, such as convolutional neural networks (CNNs), that use haplotype matrices as input data have become powerful tools for population genomics inference. However, these methods often lack interpretability, making it difficult to understand which population genetic features drive their predictions. This is a critical limitation for both method development and biological interpretation. Here we introduce a systematic permutation approach that progressively disrupts population genetic features within input test haplotype matrices, including linkage disequilibrium, haplotype structure, and allele frequencies. Measuring the performance degradation after each permutation lets us assess the importance of each feature. We applied our approach to three published CNNs for positive selection and demographic history inference.

In this repository, we use the term "ConfuseNN" to refer to our permutation approach, since we are attempting to "confuse" the networks by testing them on disrupted data.
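The idea of testing a network on disrupted data and measuring the performance drop can be sketched as follows. This is a minimal illustration, not the code used in the paper: `performance_drop`, `toy_predict`, the toy test set, and the two permutations are all hypothetical, and the sketch assumes rows are haplotypes, columns are sites, and the model is a classifier scored by accuracy.

```python
import numpy as np

def accuracy(predict, X, y):
    """Fraction of test matrices whose predicted label matches the truth."""
    return float(np.mean(predict(X) == y))

def performance_drop(predict, X, y, permutations, seed=0):
    """Return baseline accuracy and the accuracy lost under each permutation.

    X has shape (n_matrices, n_haplotypes, n_sites). Each function in
    `permutations` maps one haplotype matrix and an rng to a permuted copy.
    """
    rng = np.random.default_rng(seed)
    baseline = accuracy(predict, X, y)
    drops = {}
    for name, permute in permutations.items():
        X_perm = np.stack([permute(hap, rng) for hap in X])
        drops[name] = baseline - accuracy(predict, X_perm, y)
    return baseline, drops

# Hypothetical stand-in for a trained network: predicts class 1 when
# derived alleles are more common in the left half of the matrix.
def toy_predict(X):
    half = X.shape[2] // 2
    left = X[:, :, :half].sum(axis=(1, 2))
    right = X[:, :, half:].sum(axis=(1, 2))
    return (left > right).astype(int)

# Toy test set: class 1 carries derived alleles on the left, class 0 on the right.
y = np.array([1, 0] * 10)
X = np.zeros((20, 8, 10), dtype=np.int8)
X[y == 1, :, :5] = 1
X[y == 0, :, 5:] = 1

permutations = {
    # Shuffle alleles among haplotypes at each site (per-site frequencies kept).
    "within_columns": lambda hap, rng: rng.permuted(hap, axis=0),
    # Shuffle the order of sites (each column kept intact).
    "column_order": lambda hap, rng: hap[:, rng.permutation(hap.shape[1])],
}

baseline, drops = performance_drop(toy_predict, X, y, permutations)
print(baseline, drops)
```

In this toy setup the stand-in model depends only on where derived alleles sit along the matrix, so shuffling within columns costs it nothing while shuffling column order can hurt. A real network would be evaluated the same way, with its own test set and performance metric.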

Preprint

bioRxiv

Reproduce

To reproduce the results for each of the three CNNs evaluated in this work, refer to the three respective subdirectories (ImaGene, demography, and disc-pg-gan), each with its own specifications.

Example code for haplotype matrix permutations

To make the approach easy to adopt, we provide code for all permutations described in our paper, with visualization, in minimal_example.ipynb. How these permutations are applied in practice will likely vary with the simulation and training procedure. For examples of how we customized our permutation approach for each CNN, refer to the corresponding subdirectory, which contains further description.
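For orientation, a few generic haplotype matrix permutations can be written in NumPy as below. This is a hedged sketch rather than the contents of minimal_example.ipynb: the function names are hypothetical, the set of permutations may differ from those defined in the paper, and the matrix is assumed to have haplotypes as rows and sites as columns.

```python
import numpy as np

def shuffle_site_order(hap, rng):
    """Permute whole columns: sites change position, each column stays intact."""
    return hap[:, rng.permutation(hap.shape[1])]

def shuffle_haplotype_order(hap, rng):
    """Permute whole rows: haplotypes change position, each row stays intact."""
    return hap[rng.permutation(hap.shape[0]), :]

def shuffle_within_sites(hap, rng):
    """Shuffle alleles among haplotypes independently at each site.

    Per-site allele counts are kept; associations between sites are broken.
    """
    return rng.permuted(hap, axis=0)

def shuffle_within_haplotypes(hap, rng):
    """Shuffle alleles along each haplotype independently.

    Per-haplotype allele counts are kept; per-site allele counts are not.
    """
    return rng.permuted(hap, axis=1)

def shuffle_all_entries(hap, rng):
    """Shuffle every entry of the matrix; only the total allele count is kept."""
    return rng.permutation(hap.ravel()).reshape(hap.shape)

# Random 0/1 matrix standing in for a simulated haplotype matrix.
rng = np.random.default_rng(42)
hap = (rng.random((6, 12)) < 0.3).astype(np.int8)

within_sites = shuffle_within_sites(hap, rng)
within_haps = shuffle_within_haplotypes(hap, rng)
site_order = shuffle_site_order(hap, rng)
hap_order = shuffle_haplotype_order(hap, rng)
everything = shuffle_all_entries(hap, rng)

# Check which summaries each permutation preserves.
print(np.array_equal(within_sites.sum(axis=0), hap.sum(axis=0)))
print(np.array_equal(within_haps.sum(axis=1), hap.sum(axis=1)))
print(everything.sum() == hap.sum())
```

Each function returns a new array and leaves the input untouched, so the same test matrix can be passed through several permutations in turn.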

Folders and files

ImaGene
demography
disc-pg-gan
.gitignore
README.md
minimal_example.ipynb
