Phylogenetic Inference Deep Dive — Module 2: Gene/Species Trees, Networks, Topology Testing, LBA

1.4 — Gene Tree / Species Tree Discordance

Sources of Discordance

Source	Mechanism	Signature	Detection Tool
ILS	Deep coalescence in ancestral populations	Random discordance, no directional bias	ASTRAL, concordance factors
Hybridization/Introgression	Gene flow between lineages	Directional excess of one discordant topology	D-statistics, PhyloNet, HyDe
Gene duplication/loss	Paralogs mistaken for orthologs	Unexpected deep divergences within a gene tree	ASTRAL-Pro, Notung, ALE
Horizontal gene transfer	Lateral acquisition (prokaryotes)	Gene in unexpected clade	Ranger-DTL, ALE
Gene tree estimation error	Short loci, poor alignment	Low bootstrap, short branches	Collapse low-support branches

Concordance Factors in IQ-TREE

Concordance factors measure how much support the data actually provides for each branch, beyond what bootstrap says. gCF = fraction of gene trees supporting a branch. sCF = fraction of informative sites supporting a branch.

Bash

# Step 1: Build individual gene trees
# (One FASTA per locus in a directory)
iqtree2 -S loci_dir/ -m MFP -T AUTO --prefix loci_trees

# Step 2: Compute concordance factors on the species tree
iqtree2 -t species_tree.nwk --gcf loci_trees.treefile \
  -s concat_alignment.phy --scf 100 \
  -T AUTO --prefix concord

# Output: concord.cf.tree
# Each branch annotated with: gCF/sCF
# Example: (A,(B,C)95/78/45);
#   95 = bootstrap, 78 = gCF, 45 = sCF

# Interpretation:
# gCF = 78% means 78% of gene trees contain this branch
# sCF = 45% means 45% of decisive sites support this branch
# For a 3-taxon split, sCF of 33% = random (no signal)
# sCF >> 33% = genuine signal supporting the branch

Download:

Thumb Rule — Reading Concordance Factors:
gCF ≥ 60% = strongly supported by gene trees
gCF 33–60% = conflicted (ILS or hybridization zone)
gCF ~ 33% = no resolution (three topologies equally frequent = ILS at short internodes)
sCF ~ 33% = genuinely uninformative (too few sites)

DensiTree & DiscoVista for Discordance Visualization

Bash

# DensiTree: overlay all gene trees semi-transparently
# Install: included with BEAST2
densitree -b 0 gene_trees.nwk
# Areas of agreement = solid dark lines
# Areas of discordance = fuzzy, spread-out regions

# DiscoVista: quantify and visualize discordance per clade
# https://github.com/esayyari/DiscoVista
python discoVista.py -a annotation_file.txt \
  -t gene_trees_dir/ -m 75 \
  -o discovista_output/
# -m 75: collapse branches with <75% bootstrap
# Produces bar charts showing concordant vs discordant vs missing

Mnemonic

"ILS is Random, Introgression is Directional" — IRID

Under ILS alone, the two discordant topologies are equally frequent (each ~⅓ at short internodes). If one discordant topology is significantly more common than the other, suspect introgression/hybridization. Test with D-statistics (ABBA-BABA).

1.5 — Species Tree Methods

Summary vs Co-Estimation Methods

Approach	Tools	Input	Speed	Strengths
Summary (2-step)	ASTRAL-III, MP-EST	Pre-estimated gene trees	Very fast	Scalable to 1000s of loci, statistically consistent under MSC
Co-estimation	StarBEAST2, BEAST2+MSC	Sequence alignments	Very slow	Jointly estimates gene trees + species tree, uses full data
SNP-based	SVDquartets, SNAPP	SNP matrix	Moderate	No gene tree estimation step, works with unlinked SNPs
Concatenation	IQ-TREE, RAxML-NG	Concatenated alignment	Moderate	High power, but not statistically consistent under MSC

ASTRAL-III & ASTRAL-Pro

Bash

# ASTRAL-III — standard species tree from single-copy orthologs
java -jar astral.5.7.8.jar \
  -i gene_trees.nwk \
  -o species_tree.nwk \
  -t 2 2> astral.log
# -t 2: annotate with local posterior probability AND quartet support

# Collapsing low-support branches in gene trees (recommended!)
# Reduces gene tree estimation error propagation
nw_ed gene_trees.nwk 'i & b<10' o > collapsed_trees.nwk
java -jar astral.5.7.8.jar -i collapsed_trees.nwk -o species_tree.nwk

# ASTRAL-Pro — handles multi-copy gene families (paralogs)
java -jar astral-pro.jar \
  -i gene_family_trees.nwk \
  -o species_tree.nwk \
  -u 2  # -u for multi-individual/multi-copy

# Weighted ASTRAL (wASTRAL) — weights quartets by branch lengths
java -jar wastral.jar -i gene_trees.nwk -o species_tree.nwk

Download:

StarBEAST2 — Co-Estimation Under the MSC

Workflow

# StarBEAST2 workflow (BEAST2 package):
# 1. Install: in BEAST2 Package Manager → StarBEAST2
# 2. Open BEAUti, load multiple gene alignments
# 3. Set up:
#    - Partitions tab: link/unlink trees per gene
#    - Taxon sets: map individuals → species
#    - Clock: relaxed uncorrelated lognormal (per gene)
#    - Tree prior: multispecies coalescent (species tree birth-death)
#    - Species tree prior: Yule or Birth-Death
# 4. MCMC: ngen=500000000 (yes, 500M), logging every 50000
# 5. Run on HPC:
beast -threads 16 -beagle starbeast2.xml
# 6. Check convergence in Tracer (ESS > 200)
# 7. Summarize: TreeAnnotator on species tree log

SVDquartets — SNP-Based Species Tree

PAUP*

# SVDquartets in PAUP*
execute alignment.nex;

[Define species/populations]
taxpartition species =
  SpeciesA: TaxonA1 TaxonA2 TaxonA3,
  SpeciesB: TaxonB1 TaxonB2,
  SpeciesC: TaxonC1 TaxonC2 TaxonC3;

svdquartets evalQuartets=all taxpartition=species
  nthreads=16 bootstrap=standard nreps=100;
savetrees file=svdq_tree.tre format=newick;

Download:

When to Concatenate vs Use Coalescent Methods

Concatenation works when: internodes are long (low ILS), species are closely related, loci have similar rates
Coalescent methods needed when: rapid radiation (short internodes), known hybridization, gene trees show substantial conflict
Both agree: great, high confidence in the result
They disagree: investigate further — compute concordance factors, test for introgression

Thumb Rule — Gene Trees for ASTRAL: Collapse branches with bootstrap <10% in gene trees before running ASTRAL. This prevents erroneous gene tree resolutions from biasing the species tree. Use nw_ed from Newick Utilities or IQ-TREE's --con-tree.

1.6 — Phylogenetic Networks

When reticulate evolution (hybridization, introgression, HGT) is present, a tree is insufficient. Networks represent both vertical and horizontal inheritance.

HyDe — Hybridization Detection

Bash / Python

# HyDe: fast test for hybridization using site pattern frequencies
# Tests if a taxon (Hybrid) is an admixture of P1 and P2
pip install phyde

# Command line
run_hyde.py -i alignment.phy -m map_file.txt -o outgroup \
  -n num_individuals -t num_taxa -s num_sites

# Python API for all triplets
import phyde as gp
dat = gp.HydeData("alignment.phy", "map_file.txt", "outgroup",
                    num_ind, num_taxa, num_sites)
res = dat.bootstrap_triple("P1", "Hybrid", "P2", reps=1000)
# gamma (admixture proportion): 0 = pure P1, 1 = pure P2
# Significant p-value + gamma ∈ (0,1) = hybridization detected

Download:

SNaQ (PhyloNetworks in Julia)

Julia

using PhyloNetworks

# Read gene trees (concordance factor table from IQ-TREE or BUCKy)
cf_table = readTableCF("concordance_factors.csv")

# Start from a tree (e.g., ASTRAL species tree)
start_tree = readTopology("species_tree.nwk")

# Estimate network with 0 hybridizations (= tree)
net0 = snaq!(start_tree, cf_table, hmax=0, runs=10)

# Estimate with 1 hybridization
net1 = snaq!(net0, cf_table, hmax=1, runs=10)

# Estimate with 2 hybridizations
net2 = snaq!(net1, cf_table, hmax=2, runs=10)

# Compare models using log-pseudolikelihood
# Use the "elbow" method: when adding another h doesn't improve much
println("h=0: ", net0.loglik)
println("h=1: ", net1.loglik)
println("h=2: ", net2.loglik)

# Plot network
using PhyloPlots
plot(net1, showGamma=true)  # gamma = inheritance proportion

Download:

PhyloNet — Network Inference

NEXUS

# PhyloNet: InferNetwork_MPL (maximum pseudo-likelihood)
#NEXUS
BEGIN TREES;
Tree gt1 = (A,(B,(C,D)));
Tree gt2 = (A,(C,(B,D)));
Tree gt3 = (A,(B,(C,D)));
[... more gene trees ...]
END;

BEGIN PHYLONET;
InferNetwork_MPL (gt1,gt2,gt3) 1 -x 10 -pl 16;
[1 = max number of reticulations]
[-x 10 = 10 independent runs]
[-pl 16 = 16 threads]
END;

SplitsTree — Exploratory Network Visualization

Bash

# SplitsTree: quick visual check for reticulation
# Input: alignment or distance matrix
# GUI tool — load alignment → Distances → NeighborNet
# Non-tree-like structure = possible hybridization/recombination

# Command line (SplitsTree5):
SplitsTree5 -i alignment.fasta -o network.nex

Mnemonic

"Networks for Non-tree-like evolution"

If gene trees are discordant and the discordance is directional (not random ILS), you need a network, not a tree. Start with SplitsTree for visualization, HyDe for quick tests, SNaQ for formal network estimation.

1.7 — Topology Testing & Hypothesis Comparison

Available Tests

Test	Null Hypothesis	One-sided?	Recommended?
AU test	Tree is as good as the ML tree	No (multi-scale bootstrap)	Yes — gold standard
SH test	Tree is as good as the ML tree	No	Conservative, may fail to reject bad trees
KH test	Two trees are equally good	Only for pre-specified trees	Only for a priori hypotheses
SOWH test	Constrained tree is true	Parametric bootstrap	Most powerful, but slow
Bayes factors	Model comparison	N/A	Best in Bayesian framework

Bash — IQ-TREE Topology Tests

# Step 1: Collect candidate trees (ML tree + alternatives)
cat ml_tree.nwk alternative1.nwk alternative2.nwk > candidate_trees.nwk

# Step 2: Run all topology tests
iqtree2 -s alignment.phy -m GTR+G \
  -z candidate_trees.nwk \
  -zb 10000 -zw -au \
  -T AUTO --prefix topotest

# -zb 10000: 10000 RELL bootstrap replicates
# -zw: weighted SH and KH tests
# -au: AU test

# Output: topotest.iqtree contains a table:
# Tree  logL     deltaL   bp-RELL  p-KH   p-SH   p-WKH  p-WSH  c-ELW  p-AU
# 1     -45678   0        0.89     0.93   0.98   0.92   0.97   0.88   0.95
# 2     -45712   34       0.08     0.07   0.12   0.08   0.13   0.09   0.04*
# 3     -45745   67       0.03     0.02*  0.04*  0.02*  0.04*  0.03   0.01*
# * = rejected at p < 0.05
# Trees not rejected by AU test cannot be statistically distinguished

Download:

Four-Cluster Likelihood Mapping

Bash

# IQ-TREE likelihood mapping (test phylogenetic signal)
iqtree2 -s alignment.phy -m MFP -lmap 10000 -T AUTO

# Output: .lmap.svg and .iqtree
# Triangle plot: corners = fully resolved, center = unresolved
# >90% in corners = strong signal
# >30% in center = insufficient signal for this dataset

# Four-cluster likelihood mapping (custom taxon groups):
# Create a file defining 4 clusters:
echo "ClusterA: TaxA1 TaxA2 TaxA3
ClusterB: TaxB1 TaxB2
ClusterC: TaxC1 TaxC2
ClusterD: TaxD1 TaxD2 TaxD3" > clusters.txt

iqtree2 -s alignment.phy -m MFP -lmap 10000 \
  -lmclust clusters.txt -T AUTO

Thumb Rule — Topology Testing: The AU test is the standard. If a tree is NOT rejected by the AU test, it cannot be statistically distinguished from the ML tree — even if its likelihood is lower. Report AU p-values for all alternative hypotheses in your paper.

1.8 — Long Branch Attraction & Systematic Error

Identifying LBA and Compositional Bias

Long Branch Attraction (LBA) = fast-evolving taxa artificially group together regardless of their true relationship. Compositional bias = taxa with similar base composition cluster together. Both are systematic errors that more data makes worse, not better.

Diagnostic Signs of Systematic Error

Two long-branched taxa group together, but removing one changes the topology
Adding more loci increases bootstrap but doesn't change the (suspect) topology
Amino acid composition differs dramatically between clades (chi-squared test)
Posterior predictive simulations reject the model for composition
Simpler models (GTR+G) give a different topology than complex models (CAT-GTR)

Amino Acid Recoding

Recoding amino acids into fewer categories reduces compositional signal and saturation at the cost of losing some phylogenetic information.

Bash / Python

# Dayhoff6 recoding: 20 amino acids → 6 categories
# Group 1 (ASTGP): Small
# Group 2 (DNEQ): Acidic/amide
# Group 3 (RKH): Basic
# Group 4 (MVIL): Hydrophobic
# Group 5 (FYW): Aromatic
# Group 6 (C): Cysteine

# Python script for Dayhoff6 recoding:
python3 -c "
import sys
dayhoff6 = {
    'A':'0','S':'0','T':'0','G':'0','P':'0',
    'D':'1','N':'1','E':'1','Q':'1',
    'R':'2','K':'2','H':'2',
    'M':'3','V':'3','I':'3','L':'3',
    'F':'4','Y':'4','W':'4',
    'C':'5',
    '-':'-','X':'?','?':'?'
}
for line in open(sys.argv[1]):
    if line.startswith('>'):
        print(line.strip())
    else:
        print(''.join(dayhoff6.get(c.upper(),'?') for c in line.strip()))
" alignment.fasta > alignment_dayhoff6.fasta

# SR4 recoding: 20 → 4 categories (even more aggressive)
# A/G/N/P/S/T → 0
# C/H/W/Y → 1
# D/E/K/Q/R → 2
# F/I/L/M/V → 3

# Run IQ-TREE on recoded data with appropriate model
iqtree2 -s alignment_dayhoff6.fasta -m GTR+G \
  -st MORPH -T AUTO -bb 1000

Download:

Removing Fast-Evolving Sites or Taxa

Bash

# IQ-TREE: site rates estimation
iqtree2 -s alignment.phy -m LG+G -T AUTO --rate
# Output: .rate file with per-site rates

# Remove fastest 10%, 20%, 30% of sites progressively
# and re-run the tree each time
# If the topology changes = fast sites were misleading

# Slow-fast analysis in R:
# Read rates, sort, remove top quantile, subset alignment

# BaCoCa: batch composition checker
# Tests each taxon's amino acid composition against average
# https://github.com/simonwhelan/BaCoCa
python BaCoCa.py -i alignment.fasta
# Chi-squared p-value < 0.05 for a taxon = compositional outlier

Site-Heterogeneous Models as the Solution

The best defense against LBA and compositional bias:

Use CAT-GTR (PhyloBayes) — accounts for site-specific amino acid profiles
Use C60/PMSF (IQ-TREE) — fast approximation of site heterogeneity
Recode amino acids (Dayhoff6/SR4) to reduce compositional signal
Remove saturated/fast sites and check if topology is stable
Remove suspect long-branch taxa one at a time and check topology stability
Run both approaches and report whether they agree

Mnemonic

"More data + Wrong model = More Wrong"

Systematic error is the opposite of stochastic error. Adding more genes with a misspecified model increases bootstrap for the WRONG tree. The fix is a better model, not more data.

Thumb Rule

LBA sensitivity analysis

If removing one long-branch taxon changes where another long-branch taxon falls in the tree, LBA is operating. Report both topologies and discuss. Use site-heterogeneous models and amino acid recoding as cross-checks.

AI Prompt Guide

Discordance & Network Analysis

I have [NUMBER] gene trees from [NUMBER] taxa spanning [CLADE]. I suspect [ILS / hybridization / both] based on preliminary results. My concatenation tree disagrees with ASTRAL for [SPECIFIC NODE]. Please write a pipeline that: 1. Computes gene and site concordance factors (gCF/sCF) in IQ-TREE 2. Runs ABBA-BABA / D-statistics to test for introgression at that node 3. Estimates a phylogenetic network with SNaQ (PhyloNetworks, Julia) testing h=0,1,2,3 4. Runs HyDe for all possible hybridization triplets 5. Tests alternative topologies with the AU test in IQ-TREE 6. Performs amino acid recoding (Dayhoff6) and re-runs the tree to check for LBA 7. Visualizes discordance with DensiTree 8. Provides interpretation guidelines for each result

⚠ Troubleshooting

ASTRAL: "Input gene trees have polytomies"

ASTRAL handles polytomies natively (treats them as uncertainty). But if you want to resolve them first: collapse branches with low support (nw_ed trees.nwk 'i & b<10' o) to create genuine polytomies rather than poorly-resolved bifurcations.

SNaQ: Pseudo-likelihood doesn't improve with more reticulations

The data may genuinely be tree-like (no hybridization). Or the signal-to-noise ratio is too low. Try: (1) use more gene trees, (2) ensure concordance factor table is computed correctly, (3) check that your starting tree is reasonable.

AU test: All alternative trees rejected

This means the ML tree is significantly better than all tested alternatives. This is a strong result — but verify your alternative trees are correctly specified (check rooting, taxon names, Newick format).

Dayhoff6 recoding gives different topology

This is informative. It suggests that the original topology was influenced by compositional bias or saturation. The recoded result is typically more reliable for deep divergences, though it has less power (fewer character states). Report both and discuss which biological scenario each supports.

Mnemonics & Thumb Rules

Mnemonic

"Concatenate for Confidence, Coalescent for Correctness"

Concatenation gives high bootstrap (because of sheer data volume) but can be positively misleading under ILS. Coalescent methods (ASTRAL) are statistically consistent under the MSC. Always run both and compare.

Mnemonic

"AU for All Uses" — the best topology test

The Approximately Unbiased test is the recommended topology test. SH is too conservative (rarely rejects). KH requires pre-specified hypotheses. The AU test correctly handles selection bias and works for any set of candidate trees.

Thumb Rule — Network Complexity: Start with h=0 (tree), then h=1, h=2. If log-pseudolikelihood doesn't improve by >5 units when adding another reticulation, stop. Most real datasets have 0–2 reticulations. Networks with h>3 are rarely well-supported.

Thumb Rule — Species Trees: Need ≥50 gene trees for a reliable ASTRAL analysis. 200+ is ideal. Each gene tree should have ≥200 informative sites. Very short loci (< 300 bp) contribute mostly noise — either concatenate them or exclude them.

Module 1: Models & Bayesian Module 3: Divergence Dating

Phylogenetic Inference — Deep Dive Module 2

On this page

1.4 — Gene Tree / Species Tree Discordance

Sources of Discordance

Concordance Factors in IQ-TREE

DensiTree & DiscoVista for Discordance Visualization

1.5 — Species Tree Methods

Summary vs Co-Estimation Methods

ASTRAL-III & ASTRAL-Pro

StarBEAST2 — Co-Estimation Under the MSC

SVDquartets — SNP-Based Species Tree

When to Concatenate vs Use Coalescent Methods

1.6 — Phylogenetic Networks

HyDe — Hybridization Detection

SNaQ (PhyloNetworks in Julia)

PhyloNet — Network Inference

SplitsTree — Exploratory Network Visualization

1.7 — Topology Testing & Hypothesis Comparison

Available Tests

Four-Cluster Likelihood Mapping

1.8 — Long Branch Attraction & Systematic Error

Identifying LBA and Compositional Bias

Diagnostic Signs of Systematic Error

Amino Acid Recoding

Removing Fast-Evolving Sites or Taxa

Site-Heterogeneous Models as the Solution

AI Prompt Guide

Discordance & Network Analysis

⚠ Troubleshooting

ASTRAL: "Input gene trees have polytomies"

SNaQ: Pseudo-likelihood doesn't improve with more reticulations

AU test: All alternative trees rejected

Dayhoff6 recoding gives different topology

Mnemonics & Thumb Rules