Metagenomic data, especially sequence data from large-insert clones, are most useful when reasonable inferences can be made about the phylogenetic origins of the inserts. Often, clones that bear phylotypic markers (usually ribosomal RNA (rRNA) genes) are sought, but sometimes phylogenetic assignments have been based on the preponderance of BLAST hits obtained with predicted protein-coding sequences (CDSs). Here we use a cloning method that greatly enriches for rRNA-bearing fosmid clones to ask two questions: (i) how reliably can we judge the phylogenetic origin of a clone (that is, its rRNA phylotype) from the sequences of its CDSs? and (ii) how much lateral gene transfer (LGT) do we see, as assessed by CDSs of different phylogenetic origins on the same fosmid? We sequenced 12 rRNA-containing fosmid clones obtained from libraries constructed using DNA isolated from Baltimore harbour sediments. Three of the clones are from bacterial candidate divisions for which no cultured representatives are available, and thus represent the first protein-coding sequences from these major bacterial lineages. The amount of LGT was assessed by constructing phylogenetic trees for all the CDSs in the fosmid clones and comparing the phylogenetic position of each CDS with the rRNA phylotype of its fosmid. We find that the majority of CDSs in each fosmid, 57–96%, agree with their respective rRNA genes. However, we also find that a significant fraction of the CDSs in each fosmid, 7–44%, has been acquired by LGT. In several cases, we can infer co-transfer of functionally related genes and generate hypotheses about the mechanism and ecological significance of transfer.