BPP: Bayesian phylogenetics and phylogeography
The BPP program implements a series of Bayesian inference methods under the multispecies coalescent model with and without gene flow, including estimation of population sizes and species divergence times, inference of the species phylogeny despite conflicting gene trees, species delimitation, and inference of interspecific introgression.
The source code, installation guidelines, and tutorial for the latest version of BPP can be found in the links below:
- BPP version 4, GitHub repository for source code.
- BPP version 4, manual.
- BPP version 4, installation instructions.
- Tutorial: Flouri, T., et al. (2018). Species tree inference with BPP using genomic sequences and the multispecies coalescent. Mol. Biol. Evol. 35(10): 2585-2593. Please also read the 2015 tutorial on BPP v3.4.
The source code and tutorial for BPP v3.4 are available in the links below:
- BPP version 3, source code.
- BPP version 3, source code and executables for Mac OS X 7 and later.
- Tutorial: Yang, Z. (2015). A tutorial of BPP for species tree estimation and species delimitation. Current Zoology 61:854-865.
- The archive bpp3.4a.tgz includes the source code for all platforms, as well as executables for Windows.
- If your browser changes the extension of the archive above to .gz during the download, rename it back to .tgz before double-clicking it.
- Download bpp3.4.macosx.tgz (source code and executables for Mac OS X 7 and later).
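The rename-and-unpack step mentioned in the list above can also be scripted. The following is a minimal sketch in Python, assuming the browser saved the archive as `bpp3.4a.gz`; the function name `restore_tgz_and_extract` and the destination folder are hypothetical and not part of BPP.

```python
import tarfile
from pathlib import Path


def restore_tgz_and_extract(downloaded: Path, dest: Path) -> Path:
    """Rename a browser-renamed .gz archive back to .tgz and unpack it into dest.

    Hypothetical helper for illustration only; it is not shipped with BPP.
    Only use it on archives you trust, since it extracts every member.
    """
    archive = downloaded
    # A browser may turn "bpp3.4a.tgz" into "bpp3.4a.gz"; restore the extension.
    if downloaded.suffix == ".gz" and not downloaded.name.endswith(".tar.gz"):
        archive = downloaded.with_suffix(".tgz")
        downloaded.rename(archive)
    dest.mkdir(parents=True, exist_ok=True)
    # A .tgz file is a gzip-compressed tar archive, hence the "r:gz" mode.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    return archive
```

For example, `restore_tgz_and_extract(Path("bpp3.4a.gz"), Path("bpp3.4a"))` would rename the download and unpack it into a folder called `bpp3.4a` in the current directory.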
If you have questions, please post them on the BPP Google group here.
BPP replaces the old program MCMCcoal, which implements the Bayesian method of Rannala & Yang (2003) and Burgess & Yang (2008).
If you use BPP, please cite:
Also cite the original papers that describe the methods and algorithms, such as the following.
- Flouri, T., et al. (2018). Species tree inference with BPP using genomic sequences and the multispecies coalescent. Mol. Biol. Evol. 35(10): 2585-2593.
- Flouri T, Rannala B, Yang Z. (2020) A tutorial on the use of BPP for species tree estimation and species delimitation. Phylogenetics in the Genomic Era, 5.6:1–5.6:16. No commercial publisher, Authors open access book.
- Please read the 2015 tutorial on BPP v3.4 too.
Mario dos Reis has written the bppr R package, which can use the BPP output to do the following:
- Calibrate a BPP phylogeny to geological time: check the tutorial.
- Calculate Bayes factors using the stepping-stones approach in BPP: check the tutorial.
- The bppr R package is discussed in this tutorial.
Phylogenetic Analysis by Maximum Likelihood (PAML)
PAML is a package of C programs for phylogenetic analyses of DNA or protein sequences using maximum likelihood. It is maintained by Ziheng Yang and distributed under the GNU GPL v3. PAML is not good for tree making. Instead, use it to estimate parameters and test hypotheses about the evolutionary process once you have reconstructed trees with other programs such as RAxML-NG or IQ-TREE.
Here is the PAML manual in PDF format. Please visit the PAML Wiki on GitHub for more details about how to install and run PAML programs, as well as for interactive documentation for all PAML programs.
If you have any questions, please check the FAQs document first. If you cannot find the answer there, post your questions at the PAML discussion group.
- Inferring timetrees with bacteria: dos Reis M. (2022). In: Haiwei Luo (ed.) Environmental Microbial Evolution: Methods and Protocols. Methods in Molecular Biology, vol 2569. Humana, New York, NY. (in press). The GitHub repository with this protocol can be found via this link.
- Using the Bayesian sequential-subtree (BSS) approach to date phylogenomic datasets: Álvarez-Carretero S et al. (2022). A species-level timeline of mammal evolution integrating phylogenomic data. Nature 602, 263–267. A tutorial to reproduce these analyses can be found via this link.
- Using MCMCtree to date genome-scale datasets: dos Reis and Yang (2019) Bayesian molecular clock dating using genome-scale datasets. In: Anisimova (ed.) Evolutionary Genomics. Methods in Molecular Biology, vol 1910. Humana, New York, NY.
- You can use the mcmc3r R package to prepare your datasets before running MCMCtree to estimate divergence times using continuous morphological characters: check the tutorial.
- Run a Bayesian model selection analysis. You can find a specific tutorial to prepare your input files and then use the MCMCtree output to estimate the marginal likelihood and compute Bayes factors. This tutorial also describes how to perform parametric bootstrap of posterior probabilities. Check the tutorial.
- Compilation of MCMCtree tutorials. Please note that they were written with an old version of MCMCtree and hence some options might be out of date. Nevertheless, the tutorials can be consulted for theoretical and technical information about the following analyses:
  - How to run MCMCtree using the exact likelihood calculation.
  - How to run MCMCtree using the approximate likelihood calculation with both nucleotide and protein data.
  - How to change the prior on the rates if you change the time scale.
  - How to run the program infinitesites to estimate divergence times with infinitely many sites.
- Positive selection analyses: Álvarez-Carretero, S., Kapli, P., Yang, Z. Beginner’s guide on the use of PAML to detect positive selection, Mol Biol Evol. (submitted). The data and code used to run this protocol can be found in the positive-selection GitHub repository here.
- Estimating the omega ratio of protein-coding genes: Jeffares DC et al. (2015). A beginner’s guide to estimating the non-synonymous to synonymous rate ratio of all protein-coding genes in a genome. Parasite Genomics Protocols, Methods in Molecular Biology 1201, 65–90.
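The Bayesian model selection item in the list above ends with computing Bayes factors from marginal likelihoods. The arithmetic of that last step can be sketched as follows, assuming you already have a log marginal likelihood estimate for each model and that all models have equal prior probability. The function names and the model labels are hypothetical, the numbers are made up for illustration, and this sketch is not the code used by MCMCtree or the mcmc3r package.

```python
import math


def log_bayes_factor(log_ml_a: float, log_ml_b: float) -> float:
    """Log Bayes factor of model A over model B: log m(A) - log m(B)."""
    return log_ml_a - log_ml_b


def posterior_model_probs(log_mls: dict[str, float]) -> dict[str, float]:
    """Posterior model probabilities, assuming equal prior model probabilities."""
    best = max(log_mls.values())
    # Subtract the largest value before exponentiating to avoid numerical underflow.
    weights = {model: math.exp(value - best) for model, value in log_mls.items()}
    total = sum(weights.values())
    return {model: weight / total for model, weight in weights.items()}


# Made-up log marginal likelihoods for two hypothetical models.
example = {"model_A": -1000.0, "model_B": -998.0}
probs = posterior_model_probs(example)
print(log_bayes_factor(example["model_B"], example["model_A"]), probs)
```

Working on the log scale matters here because marginal likelihoods for sequence alignments are usually far too small to exponentiate directly.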
If you use PAML, please cite the following:
- Yang, Z. (1997). PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13, 555-556.
- Yang, Z. (2007). PAML 4: Phylogenetic Analysis by Maximum Likelihood. Mol. Biol. Evol. 24, 1586-1591.
If you use MCMCtree, please cite the following papers if you have used/run…
- … the approximate likelihood calculation to speed up analyses with phylogenomic datasets (calculating Hessian and gradient):
- … the models for continuous morphological characters implemented in MCMCtree:
- Álvarez-Carretero et al. (2019). Bayesian estimation of species divergence times using correlated quantitative characters. Syst. Biol. 68, 967–986. Also, please cite the mcmc3r R package if you use it (check the tutorial).
- … Bayesian model selection analysis for relaxed-clock models:
- … the Bayesian sequential-subtree approach and/or fitting skew-t distributions to fossil calibrations:
- Álvarez-Carretero et al. (2022) A species-level timeline of mammal evolution integrating phylogenomic data. Nature 602, 263–267. A tutorial to reproduce these analyses and step-by-step tutorials to run BASEML and MCMCtree can be found via this link.
Got questions? Get in touch.
Contact us if you have any questions about the Yang Lab.
Yang Lab
Email: z.yang@ucl.ac.uk