Metagenomics & Microbiome Analysis with QIIME2 & Kraken2

? Amplicon vs. Shotgun Metagenomics

Metagenomics studies the genetic material from entire microbial communities without culturing individual organisms. Two main approaches exist:

16S rRNA Amplicon

Sequences a single marker gene (16S rRNA for bacteria, ITS for fungi)
Cheaper, lower computational demand
Reveals who is there (taxonomy) and relative abundances
Cannot tell you what they're doing (functional potential)
Tool of choice: QIIME2

Shotgun Metagenomics

Sequences all DNA in the sample (every organism, every gene)
More expensive, computationally intensive
Reveals who is there AND what they can do (functional pathways)
Better species/strain-level resolution
Tools: Kraken2, MetaPhlAn4, HUMAnN3

Which Should You Use? If your question is purely "what microbes are present and how do they differ between groups?" — 16S is cost-effective and well-validated. If you need species-level resolution, strain tracking, antibiotic resistance genes, or metabolic pathway profiling — invest in shotgun metagenomics.

16S rRNA Analysis with QIIME2

1 Install QIIME2 & Import Data

Bash

# Install QIIME2 via conda (check qiime2.org for latest version)
wget https://data.qiime2.org/distro/amplicon/qiime2-amplicon-2024.5-py39-linux-conda.yml
conda env create -n qiime2 --file qiime2-amplicon-2024.5-py39-linux-conda.yml
conda activate qiime2

# Verify
qiime --version

QIIME2 uses artifacts (.qza files) and visualizations (.qzv files). Data must be imported into this artifact system before processing.

Bash

# Import paired-end demultiplexed FASTQ files
# Your files should be in Casava 1.8 format in a directory
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path raw_reads/ \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path demux-paired.qza

# Visualize read quality
qiime demux summarize \
  --i-data demux-paired.qza \
  --o-visualization demux-summary.qzv

# View the visualization at https://view.qiime2.org or:
qiime tools view demux-summary.qzv

Quality Profile is Critical: The demux-summary.qzv shows per-base quality scores for your forward and reverse reads. You'll use this to decide where to truncate reads in the denoising step. Look for where the median quality drops below Q20 — that's your truncation point.

2 Denoising with DADA2

DADA2 corrects sequencing errors and produces Amplicon Sequence Variants (ASVs) — exact sequences resolved to single-nucleotide differences. ASVs have largely replaced OTUs (97% similarity clusters) as the modern standard.

Bash

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux-paired.qza \
  --p-trunc-len-f 240 \
  --p-trunc-len-r 200 \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-max-ee-f 2.0 \
  --p-max-ee-r 2.0 \
  --p-n-threads 8 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats denoising-stats.qza

# Check how many reads passed each step
qiime metadata tabulate \
  --m-input-file denoising-stats.qza \
  --o-visualization denoising-stats.qzv

How to Pick DADA2 Truncation Parameters

Parameter	What It Does	How to Choose
`--p-trunc-len-f`	Truncate forward reads at this position	From the quality plot: where does median quality drop below ~Q25? Truncate there. For 300bp V3-V4 reads, typically 240–280.
`--p-trunc-len-r`	Truncate reverse reads at this position	Reverse reads are lower quality. Usually truncate 20–50bp shorter than forward. Must retain enough overlap for merging (~20bp minimum).
`--p-trim-left-f/r`	Trim bases from the start of reads	Use to remove primers if they weren't already removed. Set to primer length (e.g., 17–21bp for common 16S primers).
`--p-max-ee`	Maximum expected errors — reads exceeding this are discarded	2.0 is standard. Decrease to 1.0 for stricter quality; increase to 3.0 if you're losing too many reads.

Critical: After truncation, forward + reverse reads must still overlap by ≥ 20bp for DADA2 to merge them. Calculate: trunc_f + trunc_r - amplicon_length ≥ 20. For the V3-V4 region (~460bp), truncating at 240+200 = 440 gives about no overlap — you may need to adjust.

DADA2 is slow! For a typical 16S dataset with millions of reads, expect 30 minutes to several hours. Using --p-n-threads 8 helps significantly. If you have many samples, consider running on a cluster.

3 Taxonomic Classification

Bash

# Download a pre-trained classifier (Silva 138, V3-V4 region)
# Check https://docs.qiime2.org for the latest classifiers
wget https://data.qiime2.org/2024.5/common/silva-138-99-nb-classifier.qza

# Classify ASVs
qiime feature-classifier classify-sklearn \
  --i-classifier silva-138-99-nb-classifier.qza \
  --i-reads rep-seqs.qza \
  --p-n-jobs 4 \
  --o-classification taxonomy.qza

# Visualize taxonomy
qiime metadata tabulate \
  --m-input-file taxonomy.qza \
  --o-visualization taxonomy.qzv

# Create a barplot of taxonomic composition
qiime taxa barplot \
  --i-table table.qza \
  --i-taxonomy taxonomy.qza \
  --m-metadata-file sample-metadata.tsv \
  --o-visualization taxa-bar-plots.qzv

Classifier Choice Matters:

Silva 138 — most comprehensive for bacteria and archaea; standard choice for 16S
Greengenes2 — newer, phylogenetically consistent; gaining popularity
UNITE — for ITS (fungal) amplicon data
Always use a classifier trained on your specific primer region (e.g., V3-V4, V4 only). A full-length 16S classifier on V4-only reads will perform worse.

Filter Out Non-Target Sequences

Bash

# Remove mitochondria, chloroplast, and unassigned sequences
qiime taxa filter-table \
  --i-table table.qza \
  --i-taxonomy taxonomy.qza \
  --p-exclude "mitochondria,chloroplast,Unassigned" \
  --o-filtered-table table-filtered.qza

4 Diversity Analysis

Diversity analysis answers two fundamental ecological questions:

Alpha diversity — how diverse is each sample? (within-sample richness & evenness)
Beta diversity — how different are communities between samples? (between-sample distance)

Rarefaction & Core Metrics

Bash

# First, check the sampling depth distribution
qiime feature-table summarize \
  --i-table table-filtered.qza \
  --o-visualization table-summary.qzv

# Run core diversity metrics (rarefies to even depth)
qiime diversity core-metrics-phylogenetic \
  --i-phylogeny rooted-tree.qza \
  --i-table table-filtered.qza \
  --p-sampling-depth 10000 \
  --m-metadata-file sample-metadata.tsv \
  --output-dir core-metrics-results

How to Choose Rarefaction Depth

Look at the table-summary.qzv — it shows the read count per sample
Goal: pick a depth that retains the most samples while capturing most of the diversity
Rule of thumb: use the lowest sample's read count, but drop samples below ~1,000 reads (they're unreliable)
Typical range: 5,000–50,000 depending on sequencing depth
Use a rarefaction curve (qiime diversity alpha-rarefaction) to confirm your chosen depth captures the diversity plateau

Statistical Tests

Bash

# Alpha diversity: compare richness between groups
qiime diversity alpha-group-significance \
  --i-alpha-diversity core-metrics-results/shannon_vector.qza \
  --m-metadata-file sample-metadata.tsv \
  --o-visualization shannon-significance.qzv

# Beta diversity: PERMANOVA test (are communities different between groups?)
qiime diversity beta-group-significance \
  --i-distance-matrix core-metrics-results/weighted_unifrac_distance_matrix.qza \
  --m-metadata-column treatment \
  --m-metadata-file sample-metadata.tsv \
  --p-method permanova \
  --p-pairwise \
  --o-visualization permanova-results.qzv

Alpha Diversity Metrics Explained:

Observed Features — simple richness: how many unique ASVs are present
Shannon Index — accounts for both richness and evenness; higher = more diverse
Faith's PD — phylogenetic diversity; sums branch lengths on the phylogenetic tree; captures evolutionary breadth

Beta Diversity Metrics:

Bray-Curtis — abundance-based, non-phylogenetic; most common for general use
Weighted UniFrac — incorporates phylogeny and abundance; best for detecting changes in dominant taxa
Unweighted UniFrac — phylogenetic but presence/absence only; sensitive to rare taxa

Shotgun Metagenomics with Kraken2

5 Taxonomic Classification with Kraken2

Bash

# Install
conda install -c bioconda kraken2 bracken

# Download a standard database (requires ~100GB disk for standard, or ~8GB for minikraken)
kraken2-build --standard --db kraken2_standard_db --threads 16

# Or use a smaller pre-built database
# wget https://genome-idx.s3.amazonaws.com/kraken/k2_standard_08gb_20240605.tar.gz

# Run Kraken2 classification
kraken2 \
  --db kraken2_standard_db \
  --paired \
  --threads 12 \
  --output sample1.kraken.out \
  --report sample1.kraken.report \
  --minimum-hit-groups 3 \
  --confidence 0.1 \
  trimmed_R1.fastq.gz trimmed_R2.fastq.gz

Parameter	Purpose	How to Choose
`--confidence 0.1`	Minimum fraction of k-mers that must map to a taxon	0 = classify everything (more false positives). 0.1 = moderate. 0.2–0.5 = stringent. Start at 0.1 and increase if you see too many low-abundance false hits.
`--minimum-hit-groups 3`	Minimum distinct k-mer groups matching a taxon	Higher values reduce false positives at the cost of sensitivity. 2–3 is standard. Use 5 for high-confidence results.

Refine Abundances with Bracken

Kraken2 assigns reads to the lowest common ancestor, which inflates higher taxonomic levels. Bracken redistributes reads to species level using a Bayesian approach.

Bash

# Build Bracken database (one-time, needs the Kraken2 DB)
bracken-build -d kraken2_standard_db -t 16 -l 150

# Run Bracken to estimate species-level abundances
bracken \
  -d kraken2_standard_db \
  -i sample1.kraken.report \
  -o sample1.bracken.out \
  -w sample1.bracken.report \
  -r 150 \
  -l S \
  -t 10

Bracken Parameters

-r 150 — read length (match to your actual read length)
-l S — level to estimate: S = species, G = genus, F = family. Species-level requires sufficient database coverage.
-t 10 — minimum reads at the taxonomic level to report. Increase to 50–100 to remove very low-abundance noise.

6 Visualization with Krona & R

Bash

# Interactive Krona chart
conda install -c bioconda krona
ktUpdateTaxonomy.sh

# Convert Kraken report to Krona format
cut -f2,3 sample1.kraken.out > sample1.krona.input
ktImportTaxonomy sample1.krona.input -o sample1.krona.html

# Open the interactive HTML in a browser

Stacked Bar Chart in R

R

library(tidyverse)

# Read Bracken output for multiple samples (combine into one table)
files <- list.files(pattern = "*.bracken.out", full.names = TRUE)
bracken_df <- map_dfr(files, ~ read_tsv(.x) %>%
  mutate(sample = gsub(".bracken.out", "", basename(.x))))

# Top 15 species by mean abundance
top_species <- bracken_df %>%
  group_by(name) %>%
  summarize(mean_frac = mean(fraction_total_reads)) %>%
  slice_max(mean_frac, n = 15) %>%
  pull(name)

bracken_df %>%
  mutate(taxon = ifelse(name %in% top_species, name, "Other")) %>%
  group_by(sample, taxon) %>%
  summarize(abundance = sum(fraction_total_reads), .groups = "drop") %>%
  ggplot(aes(x = sample, y = abundance, fill = taxon)) +
  geom_col() +
  scale_fill_brewer(palette = "Set3") +
  theme_minimal(base_size = 14) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "Taxonomic Composition (Species Level)",
       x = NULL, y = "Relative Abundance", fill = "Species")

7 Differential Abundance Testing

Which taxa are significantly enriched or depleted between your conditions? This requires compositional-aware methods because microbiome data is inherently relative (proportions that sum to 1).

R

# ANCOM-BC2 (recommended for microbiome differential abundance)
BiocManager::install("ANCOMBC")
library(ANCOMBC)
library(phyloseq)

# Build a phyloseq object from your data
# (OTU/ASV table, taxonomy table, sample metadata)
ps <- phyloseq(
  otu_table(count_matrix, taxa_are_rows = TRUE),
  tax_table(taxonomy_matrix),
  sample_data(sample_metadata)
)

# Run ANCOM-BC2
result <- ancombc2(
  data     = ps,
  fix_formula = "treatment",
  p_adj_method = "BH",
  alpha    = 0.05,
  group    = "treatment",
  struc_zero = TRUE,     # handle structural zeros
  neg_lb   = TRUE
)

# View significant taxa
sig_taxa <- result$res %>%
  filter(diff_treatment == TRUE)
print(sig_taxa)

Avoid These Common Mistakes:

Don't use standard t-tests or DESeq2 directly on relative abundance data — microbiome data is compositional; an increase in one taxon forces a decrease in others.
Use compositional methods: ANCOM-BC2, ALDEx2, MaAsLin2, or LinDA — all handle the compositionality issue.
Don't over-interpret rare taxa (< 0.1% abundance) — they're below the noise floor of most sequencing experiments.

Summary

What You Learned:

The difference between 16S amplicon and shotgun metagenomics approaches
Import, denoise (DADA2), and classify 16S data with QIIME2
Choose DADA2 truncation parameters from quality profiles
Perform alpha and beta diversity analyses with appropriate metrics
Run Kraken2 + Bracken for shotgun metagenomic classification
Visualize taxonomic composition with Krona and ggplot2
Perform differential abundance testing with ANCOM-BC2
Avoid common pitfalls like using non-compositional statistics on microbiome data

Prev: Variant Calling Back to All Tutorials

Metagenomics & Microbiome Analysis

On this page

? Amplicon vs. Shotgun Metagenomics

16S rRNA Amplicon

Shotgun Metagenomics

16S rRNA Analysis with QIIME2

1 Install QIIME2 & Import Data

2 Denoising with DADA2

How to Pick DADA2 Truncation Parameters

3 Taxonomic Classification

Filter Out Non-Target Sequences

4 Diversity Analysis

Rarefaction & Core Metrics

How to Choose Rarefaction Depth

Statistical Tests

Shotgun Metagenomics with Kraken2

5 Taxonomic Classification with Kraken2

Refine Abundances with Bracken

Bracken Parameters

6 Visualization with Krona & R

Stacked Bar Chart in R

7 Differential Abundance Testing

Summary