Standard Article

Phylogenetic Likelihood

  1. Leonardo de Oliveira Martins,
  2. Diego Mallo,
  3. David Posada

Published Online: 19 SEP 2013

DOI: 10.1002/9780470015902.a0005141

eLS

How to Cite

de Oliveira Martins, L., Mallo, D. and Posada, D. 2013. Phylogenetic Likelihood. eLS.

Author Information

  1. University of Vigo, Vigo, Spain

Abstract

The input data for any phylogenetic analysis is a set of characters that belong to different individuals or loci and are assumed to share a common ancestor. Given a set of aligned deoxyribonucleic acid (DNA) or protein sequences, the likelihood of a phylogenetic tree depicting their ancestral relationships is proportional to the probability that the alignment was generated along this tree. The likelihood can be used not only as an objective criterion to find the optimal phylogenetic tree, but also to compare trees and evolutionary models within a probabilistic framework. The likelihood is the central ingredient of any statistical phylogenetic analysis, because it connects the data (the alignment) to the model, which includes the tree, the branch lengths and other evolutionary assumptions.

Key Concepts:

  • The phylogenetic likelihood is the probability of the DNA sequence alignment X given a model of nucleotide substitution with parameters θ and phylogenetic tree τ (topology plus branch lengths).

  • The phylogenetic likelihood can be calculated similarly for amino acid or coding sequences; in every case the calculation is based on the instantaneous probability of a change of state.

  • The phylogenetic likelihood is the basis for any probabilistic phylogenetic inference, for both classical and Bayesian analyses.

  • There are many substitution models available, and the likelihood allows us to compare them and to find the best model, as well as the best phylogenetic tree.

  • In maximum likelihood phylogenetic estimation, the objective is to find the parameter values, particularly the tree topology and branch lengths, that maximize the likelihood function.

  • In a Bayesian setting, the posterior probability of a particular set of phylogenetic parameter values (e.g. topology and branch lengths) is proportional to their likelihood multiplied by the prior probability of these values. The objective is then to describe these values with their associated posterior probabilities.
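
As an illustration of the first key concept, the following is a minimal sketch of how the likelihood of an alignment X given a tree τ could be computed. It is not the authors' implementation. It assumes the Jukes-Cantor (JC69) model, chosen here only because it is the simplest nucleotide substitution model and has no free parameters θ, and it uses the standard pruning recursion over a rooted tree. The function names (`jc69_prob`, `partials`, `log_likelihood`), the nested-tuple tree encoding and the toy alignment are all hypothetical. Gaps, ambiguity codes and rate variation among sites are ignored.

```python
import math

STATES = "ACGT"

def jc69_prob(i, j, t):
    """Probability of ending in state j after time t, starting in state i,
    under JC69. t is measured in expected substitutions per site."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e

def partials(node, alignment, site):
    """Conditional likelihoods of the data below `node`, one per state.
    A leaf is a taxon name (str); an internal node is a tuple of
    (child, branch_length) pairs. This encoding is an assumption."""
    if isinstance(node, str):
        observed = alignment[node][site]
        return [1.0 if s == observed else 0.0 for s in STATES]
    result = [1.0] * len(STATES)
    for child, t in node:
        below = partials(child, alignment, site)
        for a, s in enumerate(STATES):
            # Sum over the unknown state at the child end of the branch.
            result[a] *= sum(
                jc69_prob(s, STATES[b], t) * below[b]
                for b in range(len(STATES))
            )
    return result

def log_likelihood(tree, alignment):
    """log P(X | tree) under JC69, treating sites as independent."""
    n_sites = len(next(iter(alignment.values())))
    total = 0.0
    for site in range(n_sites):
        root = partials(tree, alignment, site)
        # JC69 has equal stationary frequencies (1/4 per state).
        total += math.log(sum(0.25 * p for p in root))
    return total

# Toy data: three taxa, four aligned sites (invented for illustration).
alignment = {"human": "ACGT", "chimp": "ACGT", "gorilla": "ACGA"}

def make_tree(t):
    """((human, chimp), gorilla) with terminal branches of length t."""
    return (((("human", t), ("chimp", t)), 0.05), ("gorilla", t))

# Compare a few candidate branch lengths by their log-likelihood.
for t in (0.01, 0.1, 1.0):
    print(t, log_likelihood(make_tree(t), alignment))
```

In a maximum likelihood analysis, a numerical optimizer would replace the loop over candidate branch lengths and the search would also cover tree topologies. In a Bayesian analysis, the same log-likelihood would be added to the log prior to obtain the unnormalized log posterior.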

Keywords:

  • likelihood;
  • tree inference;
  • substitution models;
  • dating;
  • molecular adaptation;
  • phylogenetic analysis;
  • probability