Step-by-step interactive guides for RNA-Seq, single-cell, GWAS, population genomics, CNV, epigenomics, and 25 more topics. Every threshold justified. Every formula shown live. Guided by Byte.
Every category links to a curated set of tutorials. Pick your domain and dive in.
High-quality, end-to-end guides with downloadable scripts and live interactive spreadsheets.
DESeq2, edgeR, and limma-voom end-to-end. padj thresholds, MA plots, volcano plots, and GO enrichment — every decision justified.
Start Tutorial
Cell Ranger output to annotated clusters: QC, normalization, PCA, UMAP, marker discovery, and cell-type annotation with Seurat v5.
Start Tutorial
BQSR, HaplotypeCaller, VQSR, and hard-filter thresholds. Covers germline and multi-sample joint genotyping with full parameter rationale.
Start Tutorial
Every formula shown in a live interactive spreadsheet. Windowed Fst, nucleotide diversity, Tajima's D, and iHS/XP-EHH selection scans.
Start Tutorial
Site models, Bayesian MCMC, gene vs species trees, divergence dating, ancestral state reconstruction, biogeography, and macroevolution.
Start Series
Tn5 shift, TSS enrichment, IDR reproducibility, MACS3 peak calling, motif analysis, DESeq2 differential accessibility, and deepTools heatmaps.
Start Tutorial
Many tutorials embed fully interactive Excel-like spreadsheets that recalculate every formula in real time as you change inputs. No more black-box statistics: see exactly how Fst, LOD scores, Tajima's D, CNV depth ratios, and GWAS lambda inflation are computed.
| Population | p (allele freq) | 2pq (heterozygosity) | Spreadsheet formula |
|---|---|---|---|
| Pop A | 0.72 | 0.403 | =2*B2*(1-B2) |
| Pop B | 0.31 | 0.428 | =2*B3*(1-B3) |
| Fst = (Ht-Hs)/Ht | | 0.142 | =(D5-D6)/D5 |
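The spreadsheet logic in the table can be sketched in a few lines of Python. This is a minimal illustration, not the site's own implementation: the function names are hypothetical, and it assumes two equally weighted populations, with Hs taken as the mean of the per-population 2pq values and Ht computed from the mean allele frequency. Under those assumptions the two allele frequencies shown give an Fst of about 0.168 rather than the 0.142 displayed, so the table presumably uses a weighting or input cells (D5, D6) that are not visible in this excerpt.

```python
def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2pq for a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_pops(p_a: float, p_b: float) -> float:
    """Fst = (Ht - Hs) / Ht for two populations.

    Assumption: both populations carry equal weight (equal sample sizes).
    """
    # Hs: mean within-population expected heterozygosity
    hs = (expected_heterozygosity(p_a) + expected_heterozygosity(p_b)) / 2.0
    # Ht: expected heterozygosity of the pooled population
    p_bar = (p_a + p_b) / 2.0
    ht = expected_heterozygosity(p_bar)
    if ht == 0.0:
        # Locus is fixed for the same allele in both populations; Fst is undefined
        raise ValueError("Ht is zero: Fst is undefined for a monomorphic locus")
    return (ht - hs) / ht


h_a = expected_heterozygosity(0.72)
h_b = expected_heterozygosity(0.31)
fst = fst_two_pops(0.72, 0.31)
print(round(h_a, 3), round(h_b, 3), round(fst, 3))  # 0.403 0.428 0.168
```

Changing either allele frequency and re-running the last lines mirrors what the live spreadsheet does when an input cell is edited.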
Your pocket-sized bioinformatics scientist who appears throughout every tutorial — explaining, warning, and waiting with you through the long compute jobs.
We believe bioinformatics should be accessible, enjoyable, and transparent. Every tutorial is written from scratch with updated tools, detailed parameter explanations, and real threshold-selection guidance — so you understand why, not just how.
From a biology student doing their first RNA-Seq to a researcher exploring GWAS fine-mapping or long-read haplotype phasing, these guides meet you where you are and take you further.
News, featured content, new tutorial additions, and the community spotlight — all in one place.
This month's spotlight recognises the launch of 6 new Nature Methods 2024 tutorial pages plus the Vibe Science guide — bringing the total catalogue to 37+ end-to-end tutorials. Every page was hand-written, formatted, and tested to ensure accuracy and reproducibility.
Want to be featured here?
Submit a tutorial, fix a bug, or help with documentation. Every contribution counts.
View Contributors
2024
Seurat v5 introduces sketch-based analysis for datasets of millions of cells without loading everything into memory, plus Bridge Integration for linking scRNA-seq to ATAC/multiome. BPCells backend enables 10M+ cell analyses on a laptop.
Seurat v5 docs
Oct 2024
Bioconductor 3.20 (R 4.4) added 73 new software packages including major updates to SingleCellExperiment, scran, DESeq2 1.46, and new spatial transcriptomics packages. The BiocManager install workflow is now the standard for all R genomics work.
2024–2025
The nf-core community now maintains 100+ production-ready Nextflow DSL2 pipelines covering RNA-seq, scRNA-seq, ATAC-seq, variant calling, amplicon sequencing, and more. Every pipeline is containerised, versioned, and benchmarked. The single-cell pipeline nf-core/scrnaseq supports STARsolo and alevin-fry for ultra-fast quantification.
Early 2026
DeepMind's AlphaFold 3 predicts joint structures of proteins, nucleic acids, and ligands in one model, a major leap for drug target ID and RNA structural biology. Predictions are available for non-commercial research via the AlphaFold Server, and model weights can be requested for academic use.
Read in Nature
2024
Scanpy 1.10 lands with faster UMAP via RAPIDS-singlecell GPU acceleration, improved AnnData sparse handling, and tighter integration with the scverse ecosystem (Squidpy, Muon, scvi-tools). The anndata 0.10 zarr backend enables lazy loading of 10M+ cell datasets.
2024–2025
The HPRC released a pangenome graph from 94 haplotype-resolved assemblies. vg and Minigraph-Cactus enable alignment to the pangenome, reducing reference bias in variant calling. GRCh38 is no longer the only reference option for human genomics.
HPRC Portal
2025
With Dorado v0.7+ and R10.4.1 chemistry, ONT long reads now routinely hit Q30 (99.9% per-base accuracy), matching Illumina short reads for SNP and indel detection. Long-read-only WGS pipelines are clinically viable for the first time.
Oxford Nanopore
2025
Arc Institute's Evo 2 is trained on 9.3 trillion nucleotides from genomes spanning all domains of life. It generates functional DNA, predicts variant effects, and performs zero-shot gene design, effectively the GPT-4 moment for genomics. Weights are open.
Arc Institute
2024
Nextflow 24 introduces the Fusion virtual file system (direct S3/GCS access without staging), Wave on-demand container builds, and the Seqera Platform for enterprise pipeline monitoring. Running cloud-scale pipelines is now simpler than ever.
Nextflow blog
2025–2026
The HCA consortium has mapped over 50 million cells across 35 tissue types, creating the most comprehensive human cell reference ever. Data is freely available via the HCA portal and CZ CELLxGENE Discover — a goldmine for annotation and benchmarking.
Explore HCA
Nature Methods 2024
Spatial proteomics, represented by methods such as CODEX, IMC, PhenoCycler, and MIBI, was named Method of the Year 2024 by Nature Methods. Sub-cellular resolution protein mapping in intact tissue is now routine, enabling cell-type and spatial niche deconvolution in tumour microenvironments and beyond.
Nature Methods
2024–2025
GATK 4.6 ships improved HaplotypeCaller for short tandem repeats, updated Mutect2 with better artifact filtering, and a streamlined CNVSomaticPairWorkflow. The DRAGEN-GATK hybrid mode is now production-ready for clinical sequencing.
A curated selection of high-impact preprints in computational biology and genomics. Updated regularly. Click titles to read on the preprint server.
2024
The scvi-tools v1.0 paper formalises the probabilistic framework for VAE-based scRNA-seq integration, deconvolution, and differential expression. Covers SCVI, SCANVI, totalVI, and MULTIVI for multi-modal data. Now the go-to for large-scale atlas integration.
Nature Methods
2024–2025
The latest Hifiasm integrates ultra-long ONT reads (UL-ONT) with HiFi to produce fully phased, T2T-quality assemblies of diploid human genomes in a single run. The T2T-CHM13 assembly demonstrated the method; Hifiasm now brings it to population scale.
GitHub / Paper
2025
Geneformer (Theodoris et al., Nature 2023) is a context-aware transformer pre-trained on 30M single-cell transcriptomes. It has been fine-tuned for chromatin dynamics prediction, network inference, and in-silico perturbation, demonstrating that transfer learning works in gene expression space.
Nature paper
2024
Two tools compete for the differential transcript usage (DTU) gold standard: satuRn (quasi-binomial GLM, fast, handles complex designs) and DRIMSeq (Dirichlet-multinomial). Head-to-head benchmarks show satuRn has superior FDR control on simulated and real data, and it is now recommended over DEXSeq for transcript usage.
bioRxiv search
2025
Enformer (DeepMind) and Borzoi (Linder et al.) use large transformer models to predict RNA-seq, ATAC-seq, and ChIP-seq tracks directly from long genomic sequence windows (196 kb for Enformer, 524 kb for Borzoi). This enables in-silico dissection of regulatory variants without any experimental assay.
Enformer paper
2024–2025
GLIMPSE2 imputes low-coverage WGS (0.1–1×) to near-WGS quality using a haplotype reference panel. Orders of magnitude faster than BEAGLE for large cohorts and supports the HPRC pangenome reference. Makes large-scale population studies affordable on a sequencing budget.
GLIMPSE2 docs
Ask questions, share your results, get help with your analysis, and connect with other bioinformatics learners. Free forever.