Tool updates, field breakthroughs, and curated preprints from bioRxiv and arXiv — everything happening in computational biology right now.
Major releases and version updates across the bioinformatics stack
2024
Seurat v5 rewrites the internals to support sketch-based analysis (analyse millions of cells without full loading via BPCells), Bridge Integration for multiome/ATAC-RNA linking, and a unified JoinLayers() API. The biggest architectural change since v3.
SCTransform v2 as default normalisation
Oct 2024
The October 2024 release adds 73 new software packages and ships updated versions of core frameworks.
2024–2025
The nf-core community now maintains over 100 peer-reviewed, CI-tested Nextflow DSL2 pipelines with Docker/Singularity support.
2024
Nextflow 24 makes cloud-scale workflows dramatically simpler with three headline features.
2024
Scanpy 1.10 and the broader scverse ecosystem reach maturity with performance and interoperability improvements.
2024–2025
The Broad's latest GATK release ships significant improvements across the variant calling stack.
2025
ONT's R10.4.1 flow cell with Dorado v0.7+ basecalling now routinely achieves Q30 (99.9%) single-read accuracy on long reads, matching Illumina for SNP/indel detection.
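The Q30 figure quoted above follows from the standard Phred definition, Q = -10 · log10(p), where p is the per-base error probability. A minimal sketch of the conversion; the function names are illustrative and not part of Dorado or any ONT tool:

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q into a per-base error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert a per-base error probability back into a Phred quality score."""
    return -10 * math.log10(p)


# Print the accuracy implied by a few commonly quoted quality thresholds.
for q in (10, 20, 30, 40):
    p = phred_to_error_prob(q)
    print(f"Q{q}: error probability {p:.4%}, accuracy {1 - p:.2%}")
```

Each additional 10 Phred units means a tenfold drop in error rate, so the step from Q20 to Q30 takes reads from one error per 100 bases to one per 1,000.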
2024–2025
Hifiasm now jointly assembles PacBio HiFi and ultra-long ONT reads to produce fully phased T2T-quality diploid assemblies — bringing T2T to population scale.
2024
STAR 2.7.11 consolidates its position as the standard RNA-seq aligner with enhanced STARsolo for single-cell quantification.
Major breakthroughs, consortia milestones, and AI advances in genomics
2024
DeepMind's AlphaFold 3 (Nature 2024) extends structure prediction to nucleic acids and small molecules jointly with proteins — enabling drug-target complex prediction, RNA structure modelling, and protein-DNA binding prediction in one forward pass. Model weights available for non-commercial research.
Nature paper
2025
Arc Institute's Evo 2 is trained on 9.3 trillion nucleotides from genomes spanning bacteria, archaea, and eukaryotes. It generates functional DNA sequences de novo, predicts variant effect scores, and enables zero-shot gene design. A GPT-4 moment for genomics, released with open weights.
Arc Institute
2025–2026
The HCA consortium has mapped over 50 million cells across 35 tissue types — the most comprehensive cellular atlas of the human body ever assembled. Data is freely available via the HCA Data Portal and CZ CELLxGENE Discover. Used as the gold-standard annotation reference for dozens of single-cell tools.
HCA Portal
2024–2025
The Human Pangenome Reference Consortium released HPRCv2, a pangenome graph from 94 haplotype-resolved assemblies representing global diversity. Tools like vg and Minigraph-Cactus support alignment to the graph, reducing reference bias in variant calling and GWAS — especially for non-European populations.
HPRC Portal
Nature Methods 2024
Spatial proteomics, including CODEX, IMC, PhenoCycler, and MIBI-TOF, was named Method of the Year 2024 by Nature Methods, reflecting how routine sub-cellular resolution protein mapping in intact tissue has become. Combined with spatial transcriptomics (Xenium, Visium HD), spatial multi-omics is now a mature field.
Nature Methods editorial
2024
Multiple groups released DNA sequence transformers (Nucleotide Transformer by InstaDeep, HyenaDNA by Stanford) that learn long-range genomic context. These models predict regulatory element activity, variant effects, and chromatin states directly from sequence, with no new ChIP-seq or ATAC-seq experiment needed at prediction time.
bioRxiv preprint
Curated high-impact preprints in computational biology and genomics
2023–2024
Formalises the VAE-based framework for scRNA-seq integration, deconvolution, and differential expression (DE). Covers scVI, scANVI (semi-supervised), totalVI (CITE-seq), and MultiVI (multi-modal). Now the standard for large atlas integration (>100K cells).
Nature Methods
2023
Context-aware transformer pre-trained on 30M single-cell transcriptomes (Theodoris et al.). Fine-tuned for chromatin dynamics, gene network inference, and in-silico perturbation response prediction — demonstrating that transfer learning works in transcriptomics as powerfully as in NLP.
Nature paper
2023
DeepMind's Enformer predicts RNA-seq, ATAC-seq, and ChIP-seq tracks directly from 196kb of raw DNA sequence. Enables in-silico regulatory variant effect prediction without any experimental assay — a major advance for variant annotation and functional genomics.
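In-silico variant effect prediction generally means scoring the reference and alternate sequence with the same model and comparing the outputs. The sketch below shows only that comparison. `toy_predict` is a hypothetical stand-in that counts TATA motifs; it is not Enformer and does not use Enformer's API, and all function names are assumptions made for illustration.

```python
from typing import Callable


def apply_snv(sequence: str, position: int, alt: str) -> str:
    """Return a copy of `sequence` with the base at 0-based `position` replaced by `alt`."""
    if len(alt) != 1 or alt not in "ACGT":
        raise ValueError("alt must be a single base from ACGT")
    if not 0 <= position < len(sequence):
        raise ValueError("position is outside the sequence window")
    return sequence[:position] + alt + sequence[position + 1:]


def toy_predict(sequence: str) -> float:
    """Hypothetical stand-in for a trained sequence model: counts TATA motifs."""
    return float(sum(sequence[i:i + 4] == "TATA" for i in range(len(sequence) - 3)))


def variant_effect(
    sequence: str, position: int, alt: str, predict: Callable[[str], float]
) -> float:
    """Score a variant as prediction(alternate) minus prediction(reference)."""
    return predict(apply_snv(sequence, position, alt)) - predict(sequence)


# A T>C change that disrupts the only TATA motif in this toy window.
window = "GCGCTATAAGGC"
print(variant_effect(window, 6, "C", toy_predict))
```

With a real model, `predict` would return one value per output track and genomic bin, and the difference would typically be summarised per track around the variant.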
Nature Methods
2024–2025
GLIMPSE2 imputes 0.1–1× low-coverage WGS to near-WGS genotype quality using a haplotype reference panel. It is orders of magnitude faster than BEAGLE for biobank-scale cohorts and supports the HPRC pangenome reference, which makes large population studies affordable on limited sequencing budgets.
GLIMPSE2 docs
2024
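To see why 0.1–1× data needs a reference panel at all, it helps to estimate how many sites are covered by any read. The sketch below assumes per-site read depth is Poisson-distributed with mean equal to the average coverage, which is a common simplification and not something GLIMPSE2 itself computes; the function names are illustrative.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of observing exactly k reads at a site with the given mean depth."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def fraction_with_at_least(k_min: int, coverage: float) -> float:
    """Expected fraction of sites covered by at least k_min reads."""
    return 1.0 - sum(poisson_pmf(k, coverage) for k in range(k_min))


# Compare low-coverage designs with a conventional 30x genome.
for coverage in (0.1, 0.5, 1.0, 30.0):
    uncovered = poisson_pmf(0, coverage)
    two_plus = fraction_with_at_least(2, coverage)
    print(f"{coverage:>4}x: {uncovered:.1%} of sites unread, {two_plus:.1%} with 2+ reads")
```

Under this model roughly 90% of sites receive no read at 0.1× and about 37% receive none at 1×, so most genotypes have to be inferred from haplotypes shared with the panel.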
ArchR 2.0 (Granja et al.) rewrites the HDF5-backed ATAC analysis framework with an improved Arrow file format, faster iterative LSI, and tighter integration with Seurat and Signac. It handles 1M+ cells, provides trajectory analysis, and links peaks to genes via co-accessibility and co-expression.
ArchR docs
2024–2025
EvolutionaryScale's ESM3 jointly models protein sequence, 3D structure, and functional annotations in a single generative model. Demonstrated de novo generation of fluorescent proteins distant from all known sequences — the first AI-designed protein with confirmed novel function. Open weights released.
ESM3 blog
2024
Head-to-head benchmarks (Gilis et al.) show satuRn's quasi-binomial GLM approach has superior FDR control and speed compared to DEXSeq for differential transcript usage analysis. Now recommended as the primary DTU tool in the Bioconductor RNA-seq workflow.
satuRn Bioconductor
2024
The UHGG v2 catalogue (Almeida et al.) compiles 286,000 non-redundant gut microbial genomes from 42,000 human samples — the most comprehensive human gut reference to date. Enables species-level strain tracking, resistome profiling, and metabolic pathway reconstruction from shotgun metagenomics.
UHGG v2 Portal
2024–2025
PolyFun (Weissbrod et al.) uses functional annotations to improve GWAS fine-mapping priors, while SuSiE (Wang et al.) delivers 95% credible sets via sum of single effects. Together they are now the recommended post-GWAS fine-mapping pipeline — replacing FINEMAP for most applications.
PolyFun GitHub
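A 95% credible set is the smallest group of variants whose posterior probabilities of being the causal variant for one signal add up to at least 0.95. The sketch below builds such a set from a single probability vector. It is an illustration of the idea only: the function name and example values are made up, and it leaves out steps that real fine-mapping software applies, such as filtering sets by LD purity.

```python
from typing import List, Sequence


def credible_set(probs: Sequence[float], coverage: float = 0.95) -> List[int]:
    """Return indices of the smallest set of variants reaching the requested coverage.

    `probs` holds one posterior probability per variant for a single causal signal.
    Values are normalised to sum to 1 before accumulation.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Rank variants from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen: List[int] = []
    cumulative = 0.0
    for i in ranked:
        chosen.append(i)
        cumulative += probs[i] / total
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical posterior probabilities for six variants at one locus.
alpha = [0.02, 0.55, 0.30, 0.08, 0.03, 0.02]
print(credible_set(alpha))
```

Functional priors of the kind PolyFun supplies would change the probabilities going in; the set construction itself stays the same.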