SIMMONS, MARK P.1*, HELGA OCHOTERENA2, and JOHN V. FREUDENSTEIN1. 1The Ohio State University Herbarium, Ohio State University, 1315 Kinnear Road, Columbus, Ohio 43212; 2Instituto de Biologia, UNAM, Apdo. Postal 70-367, CP 04510, Mexico. - Amino acid vs. nucleotide characters: challenging preconceived notions.
The Soltis et al. (2000) 567-terminal simultaneous analysis of
atpB, rbcL, and 18S rDNA was used as an empirical
example to test the use of amino acid vs. nucleotide characters for
protein-coding genes at deeper taxonomic levels. Nucleotide characters
for atpB and rbcL have 6.5 times as much possible
synapomorphy as amino acid characters. The nucleotide-based jackknife
tree is much more resolved than the amino acid-based tree, for both
large and small clades. Nearly twice the percentage of well supported
clades resolved in the 18S rDNA tree are resolved using nucleotide
characters (88.5%) relative to amino acid characters (47.5%). The well
supported clades resolved by both character types are much better
supported by nucleotide characters (98.6% vs. 83.3% average jackknife
support). Nucleotide characters outperform amino acid characters even
when both matrices are reduced to the same amount of possible
synapomorphy (236 randomly selected informative nucleotide characters
vs. all 411 informative amino acid characters). For the reduced
nucleotide-based matrix, 72.1% of the well supported clades are
resolved, and the well supported clades resolved by both character
types are better supported by nucleotide characters (92.7% vs. 85.9%
average jackknife support). Although the performance of nucleotide
characters decreased with reduced sampling of terminals, that of amino acid
characters did not improve. Nucleotide characters outperformed amino
acid characters even when 90% of the terminals were deleted to
increase genetic distance between clades. Of the 14 cases of
conflicting resolution between the amino acid and nucleotide-based
jackknife trees, there is independent evidence for the phylogeny of 11 of
these groups. For 10 of the 11 cases, the independent evidence
supports the nucleotide-based topology. For the amino acid characters
supporting eight of these contradictory clades, there is evidence of
convergence to the same amino acid specified by different codons
and/or of artifacts caused by the use of composite characters.
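The convergence described here, in which the same amino acid is specified by different codons, can be illustrated with a minimal sketch. The taxon names and codons below are hypothetical and are not taken from the atpB or rbcL matrices; the codon table is a small excerpt of the standard genetic code, and the helper names are assumptions made for illustration only.

```python
# Partial standard genetic code: only the codons used in this illustration.
CODON_TO_AA = {
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S",  # serine, TCN family
    "AGT": "S", "AGC": "S",                          # serine, AGY family
    "ACT": "T", "ACC": "T",                          # threonine
}


def translate(codon):
    """Return the one-letter amino acid for a codon in the partial table."""
    return CODON_TO_AA[codon.upper()]


def nucleotide_differences(codon_a, codon_b):
    """Count the codon positions at which two codons differ."""
    return sum(1 for a, b in zip(codon_a, codon_b) if a != b)


# Hypothetical terminals (assumed for illustration, not real data).
codons = {"taxon_A": "TCT", "taxon_B": "AGC", "taxon_C": "ACT"}

amino_acids = {name: translate(codon) for name, codon in codons.items()}

# taxon_A and taxon_B share serine, so amino acid coding scores them
# as having the same state at this position.
same_amino_acid = amino_acids["taxon_A"] == amino_acids["taxon_B"]

# At the nucleotide level the two serine codons differ at every position,
# whereas taxon_A and taxon_C (serine vs. threonine) differ at only one.
diffs_ab = nucleotide_differences(codons["taxon_A"], codons["taxon_B"])
diffs_ac = nucleotide_differences(codons["taxon_A"], codons["taxon_C"])

print(amino_acids, same_amino_acid, diffs_ab, diffs_ac)
```

In this toy case the amino acid character groups taxon_A with taxon_B, while the nucleotide characters place taxon_A closer to taxon_C. It shows only how such a conflict can arise, not how the conflicts reported above were diagnosed.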
Key words: amino acid, angiosperm phylogeny, atpB, character coding, phylogenetic information, rbcL