Code for a variety of pollen recognition tasks, including classification, quantification of morphological diversity within Poaceae pollen communities, and identification of the photosynthetic pathway (C3 or C4) based solely on morphological traits.

paleopollen/Pollen_Diversity_Dynamics

Reconstructing the diversity dynamics of Late Quaternary grasslands using deep learning on superresolution images of fossil Poaceae pollen

Abstract

Despite its abundance in the fossil record, grass pollen is largely overlooked as a source of ecological and evolutionary data because species and genera cannot be easily discriminated visually. However, superresolution imaging and deep learning can identify morphological differences among grass taxa by focusing on small variations in grain morphology and surface ornamentation. Using a semi-supervised learning strategy, we trained convolutional neural networks (CNNs) on image data from extant Poaceae species and unlabeled fossil samples. Semi-supervised learning improved the CNN models' capability to generalize feature recognition in fossil pollen specimens. Our models successfully capture morphological diversity among our 60 modern species and between C3 and C4 grasses. We applied our trained models to fossil pollen images from a 25,000-year sediment core from Lake Rutundu, Mt Kenya, and identified a strong correlation between shifts in grass diversity and atmospheric CO2 levels, temperature, precipitation, and fire frequency as measured by charcoal abundance. We quantified grass diversity by focusing on morphotypic variability, calculating Shannon entropy and morphotypic counts from the probability density function of specimens’ CNN features for each core depth. Predicted C3 : C4 abundances suggest a gradual increase in C4 grass species correlated with rising temperature and fire activity. Our results demonstrate that machine-learned morphological features can significantly advance palynological analysis, enabling the estimation of biodiversity and distinction between C3 and C4 grass pollen using morphology alone.

Significance Statement

Deep learning and superresolution imaging can solve some of the most intractable identification problems in fossil pollen analysis. The pollen grains of different grass species are difficult to discriminate under traditional light microscopy. However, superresolution imaging and deep learning successfully distinguish the pollen of grass species. Features derived from convolutional neural networks quantify the biological and physiological diversity of grass pollen assemblages and can be applied without a priori knowledge of the species present, allowing reconstructions of changes in grass diversity and C4 abundance. This approach unlocks new ecological information preserved in the abundant grass pollen record.

Main Structure

There are three main folders in this repository:

  1. Training and Classification: Scripts for training the two classification models described in the paper using two modalities: maximum intensity projection (MIP) images, and patches.
  2. Diversity Estimation: Scripts for running the ecological simulations described in the paper and for applying Shannon entropy to calculate morphological diversity along the Lake Rutundu sediment core over the past 25,000 years.
  3. Photosynthetic Pathway Analysis: Scripts for detecting morphological differences between C3 and C4 grass pollen while accounting for phylogenetic relatedness, and for developing a random forest classifier to identify the photosynthetic pathway (C3 or C4) of grass fossil pollen based on morphology alone.
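
As a rough illustration of the diversity step in item 2, the sketch below computes Shannon entropy and a morphotypic count for each core depth. It is not the repository's code. It assumes each specimen has already been assigned a discrete morphotype label, whereas the paper derives these quantities from the probability density function of the specimens' CNN features. The names `shannon_entropy`, `samples_by_depth`, and the depth keys are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(morphotype_labels):
    """Return H = sum(p_i * ln(1 / p_i)) over morphotype proportions (in nats)."""
    counts = Counter(morphotype_labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty sample carries no diversity information
    return sum((n / total) * math.log(total / n) for n in counts.values())


# Hypothetical morphotype assignments for two core depths (illustrative only).
samples_by_depth = {
    "depth_a": ["m1", "m1", "m2", "m3"],
    "depth_b": ["m1", "m1", "m1", "m1"],
}

# For each depth, store (Shannon entropy, morphotypic count).
diversity = {
    depth: (shannon_entropy(labels), len(set(labels)))
    for depth, labels in samples_by_depth.items()
}

for depth, (entropy, n_morphotypes) in diversity.items():
    print(f"{depth}: H = {entropy:.3f} nats, morphotypes = {n_morphotypes}")
```

A sample dominated by a single morphotype yields an entropy of zero, while a more even spread across morphotypes yields a higher value, so the two measures together capture both richness and evenness.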

Hardware Specifications

Experiments were conducted on an NVIDIA GeForce RTX 3090 GPU with 24 GB of memory and an NVIDIA A100 SXM4 GPU with 40 GB of memory. Neural networks were trained with PyTorch. Some analyses were performed in R on a standard CPU.
