I am a bioinformatician, scalable operations architect, scientist, artist, software engineer, and systems thinker working at the intersection of biology, clinical genomics, data science, and cutting-edge computational solutions.
🔬 Scientist: Passionate about unraveling biological complexity.
💻 Open Source Advocate: Building tools to accelerate discovery and collaboration.
🌱 Innovator: Driving sustainable, impactful solutions in informatics and beyond.
🏗️ Builder: Contributor to the 🧬 Human Genome Project and several successful 🏥 clinical diagnostic startup companies.
- 🌐 Daylily Informatics Consulting: dyly.bio
- 💼 LinkedIn: john--major
- 🧪 Consulting Projects: Daylily-Informatics GitHub
- 🔧 Personal Repositories: iamh2o GitHub
🧬 Bioinformatics: Clinical WGS, RNA-seq, and variant analysis.
📊 Data Science: Python, R, machine learning.
☁️ HPC & Cloud Computing: AWS, Slurm, high-performance computing.
🌐 Open Source Development: FastAPI, Snakemake, and more.
🏥 Clinical Diagnostic Operations: designing and running scalable diagnostic workflows.
⚖️ Clinical Diagnostic Regulation & Compliance: expertise in CLIA/CAP standards and certification processes.
daylily-ephemeral-cluster: Infrastructure as code allows on-demand creation of arbitrarily large self-scaling clusters.
Features:
- Built using AWS ParallelCluster and ParallelCluster UI.
- Scans AWS Regions and AZs to determine best spot market pricing, and creates clusters where spot pricing is most competitive.
- Highly performant, globally shared filesystem via FSx Lustre, mirroring reference and other data from S3.
- Reproducible and predictable runtimes and costs.
- Automatable.
- Fine-grained budget tracking of jobs and resources.
- Real-time cost reporting and decision gating capabilities.
- Runs any Slurm-based workflow manager (Snakemake, Cromwell, Nextflow, ...).
- Tight coupling of reference data allows highly performant and nimble ephemeral cluster lifecycles.
- Designed to be ephemeral: packaged tools facilitate rapid creation, monitoring, updating, archiving, and deleting of ephemeral clusters.
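To illustrate the spot-pricing scan mentioned in the features above, here is a minimal sketch. It is not the code daylily-ephemeral-cluster uses. It assumes price records shaped like the `SpotPriceHistory` entries returned by boto3's `describe_spot_price_history` (`AvailabilityZone`, `InstanceType`, `SpotPrice`, `Timestamp`). The function name `cheapest_zone`, the selection rule (lowest mean of the latest prices), and the sample prices are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Made-up illustrative records, NOT real prices. Timestamps are plain
# integers here; boto3 would return datetime objects, which compare the same way.
SAMPLE_RECORDS = [
    {"AvailabilityZone": "us-west-2a", "InstanceType": "c6i.32xlarge", "SpotPrice": "1.90", "Timestamp": 1},
    {"AvailabilityZone": "us-west-2a", "InstanceType": "c6i.32xlarge", "SpotPrice": "2.10", "Timestamp": 2},
    {"AvailabilityZone": "us-west-2b", "InstanceType": "c6i.32xlarge", "SpotPrice": "2.00", "Timestamp": 2},
]


def cheapest_zone(price_records, instance_types):
    """Return (zone, mean_price) for the zone with the lowest mean latest spot price.

    Hypothetical helper: only zones that report a price for every
    requested instance type are considered.
    """
    wanted = set(instance_types)

    # Keep only the most recent record per (zone, instance type).
    latest = {}
    for rec in price_records:
        if rec["InstanceType"] not in wanted:
            continue
        key = (rec["AvailabilityZone"], rec["InstanceType"])
        if key not in latest or rec["Timestamp"] > latest[key]["Timestamp"]:
            latest[key] = rec

    # Group the latest prices by zone; SpotPrice arrives as a string.
    by_zone = defaultdict(dict)
    for (zone, itype), rec in latest.items():
        by_zone[zone][itype] = float(rec["SpotPrice"])

    candidates = {
        zone: mean(prices.values())
        for zone, prices in by_zone.items()
        if set(prices) == wanted
    }
    if not candidates:
        raise ValueError("no zone offers all requested instance types")

    best = min(candidates, key=candidates.get)
    return best, candidates[best]
```

Calling `cheapest_zone(SAMPLE_RECORDS, ["c6i.32xlarge"])` picks `us-west-2b`, because the newer `us-west-2a` record supersedes the older, cheaper one. A real scan would repeat this per region and would likely weigh interruption rates and capacity as well as price.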
daylily-omics-analysis: Achieving ~$2–$5 per 30x no-amp WGS from FASTQ to VCF.
Features:
- Optimized to run within a daylily-ephemeral-cluster framework.
- Industry-leading accuracy, speed, cost, auditability, scalability, QC views & observability.
- Reproducible, sustainable, growing & open-source omics analysis workflows.
- Automated infrastructure management with predictive and real-time cost visibility for storage, data transfer, and compute.
- Open source & free: deploy daylily in ~1 hour and begin returning completed WGS analyses shortly thereafter.
snakemake-executor-plugin-pcluster-slurm
A plugin designed to integrate Snakemake workflows with AWS ParallelCluster’s Slurm workload manager.
bloom:
A templated, abstract, polymorphic, and opinionated LIMS for efficient laboratory data management, with real-time COGS monitoring and operational decision gating.
zebra_day:
A library and API for network-connected Zebra printers, managing ZPL label templates and numerous printers with ease.
- xcures_day: Python wrapper for the Xcures REST API.
- fedex_tracking_day: Python module for programmatic shipment tracking.
- img_stitcher_day: Tool for merging images of tube exteriors for analysis.
- rgbw_colorspace_converter: Utility for RGB to RGBW conversion, supporting HSV, HSI, HSL, and HEX, focused on LED-based projects.
- gravity_well: Imports text, markdown, and PDF files into Obsidian with NLP-derived tags and enhanced metadata tracking.
- pyramidtriangles: Software for artistic LED installations, derived from the grgbrn baaahs2014 codebase.
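For readers unfamiliar with the RGB to RGBW conversion that rgbw_colorspace_converter addresses, here is a minimal sketch of one common approach: move the gray component shared by all three channels onto the white LED. This is an assumption for illustration, not necessarily the method the library implements, and the function name `rgb_to_rgbw` is hypothetical.

```python
def rgb_to_rgbw(r, g, b):
    """Convert an 8-bit RGB color to RGBW by extracting the shared gray level.

    Hypothetical helper: the white channel takes the minimum of the three
    inputs, and that amount is subtracted from each color channel.
    """
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("channel values must be in the range 0-255")
    w = min(r, g, b)
    return r - w, g - w, b - w, w
```

For example, `rgb_to_rgbw(255, 200, 100)` returns `(155, 100, 0, 100)`, and pure white `(255, 255, 255)` maps entirely onto the white channel. Real LED hardware usually needs further correction for the white emitter's color temperature and brightness.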
👀 Explore more in my repositories or get in touch!
I’m committed to contributing to the global scientific community by creating tools and sharing knowledge. Let’s collaborate to push the boundaries of what’s possible in science and technology.
👀 Interested in what I’m building? Follow, star, or get in touch.