Omics data analysis project repository template

This template repository provides an organizational structure for quickly setting up projects that analyze omics data. The directory structure accommodates projects that work mainly with semi-processed omics data (e.g., transcript quantifications derived from RNA-seq reads) rather than starting from raw data from omics assays, though the latter is still feasible. For new repositories created from this template, replace this README with README_template.md.

# Directory tree

{project name}
├── LICENSE
├── README.md
├── README_template.md
├── task
│   ├── 0_examples
│   ├── 0_templates
│   │   └── r_analysis_notebook_template.ipynb
│   ├── 1_munging
│   │   ├── 0000
│   │   │   ├── 0001
│   │   │   │   ├── figures
│   │   │   │   ├── code
│   │   │   │   ├── serialized
│   │   │   │   └── tables
│   │   │   └── README.md
│   │   ├── 0001
│   │   │   ├── 0001
│   │   │   │   ├── figures
│   │   │   │   ├── code
│   │   │   │   ├── serialized
│   │   │   │   └── tables
│   │   │   └── README.md
│   │   └── README.md
│   ├── 2_eda
│   │   ├── 0000
│   │   │   ├── 0001
│   │   │   │   ├── figures
│   │   │   │   ├── code
│   │   │   │   ├── serialized
│   │   │   │   └── tables
│   │   │   └── README.md
│   │   ├── 0001
│   │   │   ├── 0001
│   │   │   │   ├── figures
│   │   │   │   ├── code
│   │   │   │   ├── serialized
│   │   │   │   └── tables
│   │   │   └── README.md
│   │   └── README.md
│   ├── 3_modeling_and_inference
│   │   ├── 0000
│   │   │   ├── 0001
│   │   │   │   ├── figures
│   │   │   │   ├── code
│   │   │   │   ├── serialized
│   │   │   │   └── tables
│   │   │   └── README.md
│   │   ├── 0001
│   │   │   ├── 0001
│   │   │   │   ├── figures
│   │   │   │   ├── code
│   │   │   │   ├── serialized
│   │   │   │   └── tables
│   │   │   └── README.md
│   │   └── README.md
│   ├── 4_reports
│   │   ├── 0001
│   │   │   ├── 0001
│   │   │   └── README.md
│   │   └── README.md
│   └── README.md
├── data
│   ├── processed
│   │   ├── README.md
│   │   └── dataset_0001
│   │       ├── 0001
│   │       └── README.md
│   ├── raw
│   │   ├── README.md
│   │   ├── annotations
│   │   └── dataset_0001
│   └── tmp
│       └── dataset_0001
├── scratch
├── src
│   ├── bash
│   ├── docker
│   │   ├── README.md
│   │   └── jupyter_settings
│   │       └── overrides.json
│   ├── python
│   └── r
└── tests
    └── code_linting
        ├── python3
        │   └── pylint
        │       └── pylintrc
        └── r
            └── lintr
                └── lint.R

# Subdirectory descriptions

data: Contains data used for exploratory data analysis, model fitting, and statistical inference. Original data files are retrieved and stored in the raw folder. Intermediate files produced during processing are held in tmp, and the final files are stored in processed. Analyses used to generate reports should not depend on files in tmp, so that those files can be deleted as needed.
task: Contains interactive notebooks (e.g., Jupyter) or code, visualizations, tables, and serialized output files for various tasks of data analysis projects. The subdirectories further divide results or output by types of processing/analysis tasks and task instances, using a two-tier sequential numbering system.
src: Contains reusable source code and utility scripts used in data retrieval, preparation, and analysis tasks.
tests: Contains code for unit testing the code maintained in the src folder. Also includes code linting tools for style checking Python and R code.
scratch: Scratch space for temporary local files.
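
The two-tier layout under task can be scaffolded with a short script. The following is a minimal sketch, not part of the template itself: the function name `scaffold_task_iteration` and its parameters are hypothetical, and it assumes the first tier is a task ID and the second tier an iteration ID, as the directory tree suggests.

```python
from pathlib import Path

# Output subfolders shown under each task iteration in the directory tree.
OUTPUT_SUBDIRS = ("code", "figures", "serialized", "tables")


def scaffold_task_iteration(project_root, task_type, task_id, iteration_id):
    """Create task/<task_type>/<task_id>/<iteration_id>/ with its output subfolders.

    Hypothetical helper: IDs are passed as integers and zero-padded to 4 digits.
    """
    task_dir = Path(project_root) / "task" / task_type / f"{task_id:04d}"
    iteration_dir = task_dir / f"{iteration_id:04d}"
    for name in OUTPUT_SUBDIRS:
        (iteration_dir / name).mkdir(parents=True, exist_ok=True)
    # The tree shows a README.md at the task level; create an empty one if missing.
    (task_dir / "README.md").touch(exist_ok=True)
    return iteration_dir
```

For example, `scaffold_task_iteration(".", "2_eda", 0, 1)` would create `task/2_eda/0000/0001/` with the four output subfolders and an empty `task/2_eda/0000/README.md`.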

Note: Omics analyses typically generate multiple iterations of a processed dataset or analysis task. The directory structure accounts for this with numeric subdirectories that group files relevant to a specific dataset, task type, or task iteration. In the directory tree, subdirectories are numbered incrementally using a 4-digit ID format that can accommodate 9999 tasks/variations.
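
The incremental numbering can be automated. The sketch below is an illustration under stated assumptions rather than tooling shipped with the template: `next_numeric_id` is a hypothetical helper, it only considers subdirectories whose names are exactly four ASCII digits, and it assumes an empty parent directory should start at 0001.

```python
import re
from pathlib import Path

MAX_ID = 9999  # largest value that fits the 4-digit ID format
ID_PATTERN = re.compile(r"[0-9]{4}")


def next_numeric_id(parent_dir):
    """Return the next unused 4-digit ID among numeric subdirectories of parent_dir."""
    existing = [
        int(p.name)
        for p in Path(parent_dir).iterdir()
        if p.is_dir() and ID_PATTERN.fullmatch(p.name)
    ]
    # Assumption: numbering starts at 0001 when no numeric subdirectory exists yet.
    candidate = max(existing, default=0) + 1
    if candidate > MAX_ID:
        raise ValueError(f"No 4-digit IDs left under {parent_dir}")
    return f"{candidate:04d}"
```

Non-numeric entries such as README.md are ignored, so the function can be pointed at a task-type folder (e.g., `task/1_munging`) or at an individual task folder to number a new iteration.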
