THE NATURE OF CLADISTIC DATA

Authors


Abstract

Cladistic data are the characters of organisms. A character is defined as a feature that can be evaluated as a variable with two or more mutually exclusive and ordered states. Cladistic characters must be treated as multistate variables and coded either as sequential numbers or in additive binary fashion. Any other interpretation and handling of cladistic data will introduce error into the analysis. Character states cannot be treated independently as present or absent, i.e., as nominal variables, because doing so introduces redundancy into the data and sacrifices information content. Non-additive binary coding demonstrates that treating cladistic variables as nominal data leads to multiple, equally parsimonious solutions. Defining characters that are found universally in a group of organisms but are unknown outside those organisms have no alternative state that can be designated as absent. Absence, however, is valid as a character state if it can be shown to be apomorphic. When two or more character states occur within a taxon, that taxon must be coded as having an unknown state for that character, or the taxon must be split into two or more taxa. Continuously varying quantitative data are not suitable for cladistic analysis because there is no justifiable basis for recognizing discrete states among them. Quantitative data are questionable even when they exhibit mutually exclusive states because the states can be interpreted only in reference to an archetype, i.e., as implied homologies not subject to test.
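
The contrast between additive and non-additive binary coding can be illustrated with a minimal sketch. It assumes a single character whose states form a linear ordered series (0, 1, 2, 3); the function names are hypothetical and are not taken from the paper or from any cladistics package. Under additive coding the number of binary columns that differ between two taxa equals the number of steps between their states, whereas presence/absence (nominal) coding makes every pair of distinct states look equally different, so the ordering is lost.

```python
def additive_binary(state, n_states):
    """Additive binary coding of an ordered state (assumes a linear series).

    State k is coded in n_states - 1 columns: k ones followed by zeros.
    """
    if not 0 <= state < n_states:
        raise ValueError("state must lie in 0..n_states-1")
    return [1 if state > i else 0 for i in range(n_states - 1)]


def nonadditive_binary(state, n_states):
    """Presence/absence coding: each state becomes its own nominal column."""
    if not 0 <= state < n_states:
        raise ValueError("state must lie in 0..n_states-1")
    return [1 if state == i else 0 for i in range(n_states)]


def differing_columns(code_a, code_b):
    """Count the binary columns in which two coded taxa differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


if __name__ == "__main__":
    n = 4  # hypothetical character with four ordered states
    for other in range(1, n):
        additive = differing_columns(additive_binary(0, n),
                                     additive_binary(other, n))
        nominal = differing_columns(nonadditive_binary(0, n),
                                    nonadditive_binary(other, n))
        print(f"state 0 vs state {other}: additive={additive}, nominal={nominal}")
```

A branching character-state tree would need a different additive scheme than the linear one assumed here.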
