Comprehensive EST analysis of tomato and comparative genomics of fruit ripening




A large tomato expressed sequence tag (EST) dataset (152 635 ESTs in total) was analyzed to gain insight into differential gene expression among diverse plant tissues representing a range of developmental programs and biological responses. These ESTs were clustered and assembled into 31 012 unique gene sequences. To better understand tomato gene expression at the whole-plant level and to identify differentially expressed and tissue-specific genes, we developed and implemented a digital expression analysis protocol. Genes were clustered according to their relative abundance in the various EST libraries, which yielded expression patterns across tissues and allowed genes with similar patterns to be grouped. In addition, the tissues themselves were clustered for relatedness based on relative gene expression, as a means of validating that the EST data are representative of relative gene expression. Arabidopsis and grape EST collections were also characterized to facilitate cross-species comparisons where possible. Tomato fruit digital expression data were specifically compared with publicly available grape EST data to gain insight into the molecular manifestation of ripening processes across diverse taxa. This comparison identified common transcription factors not previously associated with ripening.
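The idea of grouping genes by their relative abundance across EST libraries can be illustrated with a minimal sketch. The abstract does not specify the normalization or clustering method, so everything below is an assumption made for illustration: the gene names, EST counts, and library sizes are invented toy values, abundance is taken as EST count divided by library size, similarity is measured by Pearson correlation, and genes are grouped greedily with a hypothetical threshold of 0.9.

```python
from math import sqrt

# Hypothetical toy data: EST counts per gene in three libraries
# (leaf, root, ripe fruit) and the total ESTs sequenced per library.
library_sizes = [1000, 2000, 1500]
counts = {
    "gene_a": [0, 2, 30],
    "gene_b": [1, 0, 45],
    "gene_c": [20, 4, 0],
    "gene_d": [10, 2, 1],
}

def relative_abundance(counts, library_sizes):
    """Convert raw EST counts into the fraction of each library."""
    return {
        gene: [c / n for c, n in zip(row, library_sizes)]
        for gene, row in counts.items()
    }

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if flat)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom

def group_genes(profiles, min_corr=0.9):
    """Greedy grouping: a gene joins the first existing group that holds
    a member correlated with it at min_corr or above; otherwise it
    starts a new group. Groups are never merged afterwards."""
    groups = []
    for gene, profile in profiles.items():
        for group in groups:
            if any(pearson(profile, profiles[g]) >= min_corr for g in group):
                group.append(gene)
                break
        else:
            groups.append([gene])
    return groups

profiles = relative_abundance(counts, library_sizes)
groups = group_genes(profiles)
print(groups)
```

With these toy values, the two fruit-enriched genes fall into one group and the two leaf-enriched genes into another. Transposing the table so that each library is a profile over genes would give the analogous tissue-relatedness comparison. A real analysis would also need a statistical treatment of low EST counts, which this sketch omits.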