Increasing the number of discrete character states for continuous characters generates well-resolved trees that do not reflect phylogeny

Authors

  • Jérémie BARDIN,

    Corresponding author
    1. Sorbonne University, Pierre-and-Marie-Curie University (UPMC-P6), Paris, France
    2. National Museum of Natural History (MNHN), Paris, France
    3. The National Center for Scientific Research (CNRS), Paris, France
    • Correspondence: Jérémie Bardin, Museum National d'Histoire Naturelle, Centre de Recherche sur la Paleodiversité et les Paléoenvironnements UPMC, CNRS UMR 7207, UPMC, Case 104, 4 Place Jussieu, 75252 Paris Cedex 05, France. Email: jeremiebardin@yahoo.fr

  • Isabelle ROUGET,

    1. Sorbonne University, Pierre-and-Marie-Curie University (UPMC-P6), Paris, France
    2. National Museum of Natural History (MNHN), Paris, France
    3. The National Center for Scientific Research (CNRS), Paris, France
  • Margaret Mary YACOBUCCI,

    1. Department of Geology, Bowling Green State University, Bowling Green, Ohio, USA
  • Fabrizio CECCA

    1. Sorbonne University, Pierre-and-Marie-Curie University (UPMC-P6), Paris, France
    2. National Museum of Natural History (MNHN), Paris, France
    3. The National Center for Scientific Research (CNRS), Paris, France

Abstract

Since the introduction of the cladistic method in systematics, continuous characters have been integrated into analyses, but no method for treating them has received unanimous support. Some methods discretise continuous characters into a large number of character states in order to retain as much information as possible about differences among taxa within the coding scheme. Our objective was to assess the impact of increasing the number of character states on the outcomes of phylogenetic analyses. Analysis of a variety of simulated datasets shows that these methods for coding continuous characters can generate well-resolved trees that do not reflect a phylogenetic signal. We call this phenomenon the flattening of the tree-length distribution; it is influenced both by the quantity of continuous characters relative to discrete characters and by the number of characters relative to the number of taxa. Bootstrap tests provide a way to avoid this potential bias.
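
To make the idea of raising the number of character states concrete, the following is a minimal sketch of one possible discretisation scheme, equal-width binning. It is an assumption chosen for illustration and is not necessarily one of the coding methods evaluated in the study; the function name `discretise` and the measurement values are hypothetical.

```python
def discretise(values, n_states):
    """Code a continuous character into ordered states 0..n_states-1.

    Illustrative equal-width binning (an assumed scheme, not taken
    from the paper): the observed range is cut into n_states bins
    of equal width and each taxon receives the index of its bin.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # No variation among taxa: every taxon gets the same state.
        return [0] * len(values)
    width = (hi - lo) / n_states
    states = []
    for v in values:
        s = int((v - lo) / width)
        # The maximum value falls on the upper boundary; keep it in the last bin.
        states.append(min(s, n_states - 1))
    return states


# Hypothetical measurements of one continuous character in six taxa.
measurements = [1.2, 1.9, 2.4, 3.1, 3.3, 4.8]
for k in (2, 4, 8):
    coded = discretise(measurements, k)
    # Print the coding and how many distinct states the taxa actually occupy.
    print(k, coded, len(set(coded)))
```

With few states, many taxa share a state; as the number of states grows, small differences in the measurements are coded as distinct states whether or not they carry phylogenetic signal.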
