GitHub - Saddeekhan/Flowering-time-genes-: This project provides a Python-based workflow for analyzing and visualizing flowering time gene expression from RNA-seq data. It includes preprocessing steps such as missing value imputation and normalization, followed by clustering and statistical summaries of expression patterns across developmental stages (M1–M8).

Flowering Time Gene Expression Analysis

This repository contains a Python script for analyzing and visualizing flowering time gene expression data from RNA-seq experiments. The workflow includes data preprocessing, normalization, clustering, and visualization of gene expression patterns across developmental stages.

Features

The script performs the following steps:

1. Import Libraries

Uses pandas for data handling, numpy for numerical operations, matplotlib and seaborn for visualization, and scikit-learn for preprocessing and clustering.

2. Load Data

Reads the gene expression table (genes.fpkm_tracking, the tab-delimited FPKM tracking format written by Cufflinks) into a pandas DataFrame.

3. Select Expression Data

Extracts the gene_id column and FPKM values across developmental stages (M1–M8).
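The loading and selection steps might look roughly like the sketch below. It is not the repository's code: the column names M1_FPKM to M8_FPKM are an assumption about how the stage columns are labeled, the helper load_expression is hypothetical, and the inline table is made-up data standing in for genes.fpkm_tracking.

```python
import io

import pandas as pd

# Assumed naming for the stage columns; adjust to match the real file header.
STAGES = [f"M{i}" for i in range(1, 9)]
FPKM_COLS = [f"{stage}_FPKM" for stage in STAGES]

# Tiny made-up stand-in for genes.fpkm_tracking (tab-delimited).
header = "\t".join(["gene_id", "locus"] + FPKM_COLS)
rows = [
    "\t".join(["geneA", "chr1:1-100"] + [str(float(i)) for i in range(1, 9)]),
    "\t".join(["geneB", "chr2:5-900"] + [str(float(i * 2)) for i in range(1, 9)]),
]
raw = io.StringIO("\n".join([header] + rows))


def load_expression(source, fpkm_cols=FPKM_COLS):
    """Read a tab-delimited tracking table and keep gene_id plus the FPKM columns."""
    df = pd.read_csv(source, sep="\t")
    return df[["gene_id"] + list(fpkm_cols)]


expr = load_expression(raw)
print(expr.head())
```

With a real file, the same helper could be called as load_expression("genes.fpkm_tracking"), provided the column names match.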

4. Handle Missing Values

Uses SimpleImputer with the median strategy to replace missing values in the dataset.
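A minimal sketch of median imputation with SimpleImputer is shown below, using a made-up table with short column names M1 and M2. Note that SimpleImputer works column by column, so a missing value is replaced by the median of its stage across genes, not by the median of that gene's own profile.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Made-up FPKM values with two gaps, for illustration only.
fpkm = pd.DataFrame(
    {"M1": [1.0, np.nan, 3.0], "M2": [10.0, 20.0, np.nan]},
    index=["geneA", "geneB", "geneC"],
)

# Median strategy: each NaN becomes the median of the observed values in its column.
imputer = SimpleImputer(strategy="median")
imputed = pd.DataFrame(
    imputer.fit_transform(fpkm),
    index=fpkm.index,
    columns=fpkm.columns,
)
print(imputed)
```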

5. Normalize Data

Standardizes FPKM values using StandardScaler, which rescales each column to mean 0 and standard deviation 1.
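As a sketch of what this step does, the example below standardizes a small made-up table. StandardScaler computes z = (x - column mean) / column standard deviation, using the population standard deviation (ddof = 0), so the scaling is applied per stage column.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up FPKM values, for illustration only.
fpkm = pd.DataFrame(
    {"M1": [1.0, 2.0, 3.0], "M2": [10.0, 20.0, 60.0]},
    index=["geneA", "geneB", "geneC"],
)

scaler = StandardScaler()
normalized = pd.DataFrame(
    scaler.fit_transform(fpkm),
    index=fpkm.index,
    columns=fpkm.columns,
)

# Each column now has mean 0 and population standard deviation 1.
print(normalized.mean(axis=0).round(6))
print(normalized.std(axis=0, ddof=0).round(6))
```

If the goal is instead to compare the shape of each gene's profile across stages, scaling per gene (row-wise) is a different choice; the README does not say which orientation the script uses.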

6. Summarize Expression Profiles

Computes mean normalized FPKM values across all stages.
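The summary can be read in two ways, and the sketch below shows both on made-up normalized values: a per-gene mean over stages (one number per gene) and a per-stage mean over genes (one number per stage). Which of the two the script computes is an assumption here.

```python
import pandas as pd

# Made-up normalized values, for illustration only.
normalized = pd.DataFrame(
    {"M1": [-1.0, 0.0, 1.0], "M2": [1.0, 0.0, -1.0], "M3": [3.0, 0.0, -3.0]},
    index=["geneA", "geneB", "geneC"],
)

# One value per gene: average over the stage columns.
gene_means = normalized.mean(axis=1)

# One value per stage: average over all genes.
stage_means = normalized.mean(axis=0)

print(gene_means)
print(stage_means)
```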

7. Data Visualization

I. KDE plot of mean normalized expression values.

II. Line plot showing mean expression across stages.

III. Scatter plot of normalized values.

8. Focus on Flowering Time Genes

Defines a list of genes of interest related to flowering time.

Filters the dataset to extract and visualize their expression patterns.

9. Expression Profile Plot

Generates a line plot of flowering time gene expression across stages M1–M8, including markers, labels, and legends for clear interpretation.
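The filtering and profile plot in steps 8 and 9 can be sketched as follows. The gene identifiers and expression values are placeholders chosen for illustration, not the list or data used in the script, and the stage columns are assumed to be named M1 to M8.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd

stages = [f"M{i}" for i in range(1, 9)]

# Made-up expression table; identifiers and values are placeholders.
expr = pd.DataFrame(
    [
        ["FT"] + [0.1 * i for i in range(1, 9)],
        ["FLC"] + [1.0 - 0.1 * i for i in range(1, 9)],
        ["SOC1"] + [0.05 * i * i for i in range(1, 9)],
    ],
    columns=["gene_id"] + stages,
)

# Genes to highlight; one entry is deliberately absent from the table.
genes_of_interest = ["FT", "SOC1", "NOT_IN_TABLE"]

selected = expr[expr["gene_id"].isin(genes_of_interest)].set_index("gene_id")

# Report requested genes that were not found instead of silently dropping them.
missing = sorted(set(genes_of_interest) - set(selected.index))
print("Not found:", missing)

fig, ax = plt.subplots(figsize=(8, 4))
for gene_id, profile in selected.iterrows():
    ax.plot(stages, profile.to_numpy(dtype=float), marker="o", label=gene_id)

ax.set_xlabel("Developmental stage")
ax.set_ylabel("Normalized FPKM")
ax.set_title("Flowering time gene expression profiles")
ax.legend(title="Gene")
fig.tight_layout()
```

Checking for missing identifiers is worth keeping when adapting the list, because isin quietly ignores names that do not match the gene_id column exactly.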

Example Plots

I. KDE plot of normalized FPKM distribution

II. Line plot of mean gene expression across stages

III. Expression profile plot of selected flowering time genes

Requirements

Python 3.x
pandas
numpy
matplotlib
seaborn
scikit-learn

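Install dependencies with:

pip install -r requirements.txt

If no requirements.txt is present, install the packages directly:

pip install pandas numpy matplotlib seaborn scikit-learn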

Usage

1. Place your genes.fpkm_tracking file in the working directory.

2. Update the list of genes_of_interest in the script.

3. Run the script:

python flowering_time_analysis.py

4. View the generated plots for insights into flowering time gene expression patterns.

Summary

This project provides a reproducible pipeline to:

Clean and normalize RNA-seq expression data

Explore global expression trends

Visualize specific gene expression profiles across developmental stages

It can be easily adapted to study other gene sets or biological processes.

