Molecular cloning and characterization of four novel LMW glutenin subunit genes from Aegilops longissima, Triticum dicoccoides and T. zhukovskyi

Hereditas 145: 92–98 (2008)

This paper reports the cloning and characterisation of four novel low-molecular-weight glutenin subunit (LMW-GS) genes (designated TzLMW-m2, TzLMW-m1, TdLMW-m1 and AlLMW-m2) from the genomic DNA of Triticum dicoccoides, T. zhukovskyi and Aegilops longissima. The coding regions of TzLMW-m2, TzLMW-m1, TdLMW-m1 and AlLMW-m2 were 1056 bp, 903 bp, 1056 bp and 1050 bp in length, encoding 350, 300, 350 and 348 amino acid residues, respectively. The deduced amino acid sequences classified the four novel genes as LMW-m types, and sequence comparison indicated that they were more similar in structure, and had a higher level of homology, to the LMW-m genes than to the LMW-s and LMW-i type genes. However, the positions of the first cysteine residue in TzLMW-m2, TdLMW-m1 and AlLMW-m2 differed from those of the others. Moreover, AlLMW-m2, TdLMW-m1 and TzLMW-m2 all possessed a longer repetitive domain, a feature considered to be associated with good wheat quality. Secondary structure prediction revealed that the β-strand content of AlLMW-m2 and TdLMW-m1 exceeded that of the positive control, suggesting that AlLMW-m2 and TdLMW-m1 should be considered candidate genes that may have a positive effect on dough quality. To investigate the evolutionary relationship of the novel genes with other LMW-GSs, a phylogenetic tree was constructed. The results lead to the speculation that AlLMW-m2, TdLMW-m1 and TzLMW-m2 may be intermediate types in the evolution of LMW-m and LMW-s.

Wheat bread-making quality is largely determined by seed storage proteins present in the endosperm of the grain (SHEWRY and HALFORD 2002). The gliadins and glutenins are the major seed storage proteins, which determine dough extensibility and elasticity (PAYNE 1987). Glutenins consist of high molecular weight (HMW) and low molecular weight (LMW) glutenin subunits (GS), which are held together by inter- and intra-molecular disulphide bonds to form the glutenin macropolymer. The HMW-GS represent approximately 10% of the total seed storage proteins and their functions have been well established (SHEWRY et al. 1992, 1995). The LMW-GS account for 40% of wheat gluten and 60% of glutenins. Increasing evidence shows that these proteins significantly affect dough-quality characteristics (GUPTA et al. 1989, 1991; POGNA et al. 1990; NIETO-TALADRIZ et al. 1994; SISSONS et al. 1998; TANAKA et al. 2005), which are important for bread-making.
The genes encoding LMW-GS are located at the Glu-A3, Glu-B3 and Glu-D3 loci on the short arms of the group 1 and group 6 chromosomes (D'OVIDIO and MASCI 2004). According to their electrophoretic mobility in SDS-PAGE and their isoelectric points (JACKSON et al. 1983), LMW-GS are classically subdivided into B, C and D groups. So far, three types of typical LMW glutenin subunits have been distinguished by the first amino acid residue of the N-terminal sequence, viz. the LMW-m, LMW-s and LMW-i types, with methionine, serine and isoleucine, respectively, as the first N-terminal residue (D'OVIDIO and MASCI 2004). The LMW-s type subunits seem to be predominant (SHEWRY et al. 1983; LEE et al. 1999). Two allelic groups detected in durum wheat, designated LMW-1 and LMW-2, were found to be related to poor and superior quality characteristics, respectively (POGNA et al. 1990). The relationship between LMW-GS and quality attributes in wheat has been studied and used for quality improvement of bread wheat (MA et al. 2005).
In the past fifteen years, different LMW-GS genes have been isolated and characterized in bread wheat (D'OVIDIO et al. 1992; VAN CAMPENHOUT et al. 1995; MASCI et al. 1998; CASSIDY et al. 1998; CLOUTIER et al. 2001; IKEDA et al. 2002; ZHAO et al. 2006, 2007) and some related species (D'OVIDIO and MASCI 2004). The first complete LMW glutenin subunit gene was isolated by PCR from the durum cultivar Lira (D'OVIDIO et al. 1992). IKEDA et al. (2002) isolated several LMW-GS genes from a soft wheat cultivar and classified them into 12 groups based on the N- and C-terminal sequences. More recently, some studies have focused on the characterization of LMW-GS genes from related species, including Ae. tauschii (JOHAL et al. 2004; PEI et al. 2007), cultivated einkorn (LEE et al. 1999; AN et al. 2006), Agropyron elongatum (LUO et al. 2005), Triticum dicoccoides (LI et al. 2007) and durum wheat (D'OVIDIO et al. 1997, 1999). In this paper, we isolated and characterized four novel LMW-GS genes: one each from Aegilops longissima (2n = 2x = 14, SˡSˡ) and Triticum dicoccoides (2n = 4x = 28, AABB), and two from Triticum zhukovskyi (2n = 6x = 42, AᵐAᵐAAGG). These genes provide new gluten gene resources and insights into the origin and evolution of the LMW-GS gene family in Triticum species.

Genomic DNA extraction and PCR amplification
Genomic DNA was isolated from single dry seeds according to the procedure of YAN et al. (2004) with slight modification. LMW-GS gene-specific degenerate primers were designed based on published LMW-GS gene sequences (D'OVIDIO et al. 1999; ANDERSON et al. 2001; IKEDA et al. 2002): LMW-1 (5′-ATCATCACAAgCACAAgCATCg-3′) and LMW-2 (5′-TTCTTATCAgTAggCACCAACg-3′). PCR amplifications were performed in a 50 μl reaction volume containing 2.5 U of LA Taq polymerase (TaKaRa, Japan), 100 ng of template DNA, 25 μl of 2× GC buffer II (MgCl2 plus), 0.8 mM of dNTPs and 0.5 μM of each primer. The reactions were carried out in a PTC-100 thermocycler (MJ Research) using the following protocol: 94 °C for 2 min to denature the DNA; 35 cycles of 94 °C for 45 s, 58 °C for 1 min and 72 °C for 2 min; and a final extension at 72 °C for 10 min. The PCR products were separated in 1% agarose gels and the expected fragments were purified from the gels using the Quick DNA extraction kit (TaKaRa).
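The primer and cycling parameters above lend themselves to a quick arithmetic check. The following is a minimal sketch, not part of the published method: the primer strings assume that the hyphens and the space inside the printed sequences are line-break artifacts, the Wallace rule (2 °C per A/T, 4 °C per G/C) is only a rule of thumb that the paper does not use, and all function names are hypothetical.

```python
# Primer sequences as read from the text, uppercased, with the presumed
# line-break hyphens and stray space removed (an assumption).
PRIMERS = {
    "LMW-1": "ATCATCACAAGCACAAGCATCG",
    "LMW-2": "TTCTTATCAGTAGGCACCAACG",
}

# (step, seconds, repeats) taken from the cycling protocol in the text.
PROGRAM = [
    ("initial denaturation, 94 C", 120, 1),
    ("denaturation, 94 C", 45, 35),
    ("annealing, 58 C", 60, 35),
    ("extension, 72 C", 120, 35),
    ("final extension, 72 C", 600, 1),
]


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a primer."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb melting temperature: 2*(A+T) + 4*(G+C), in degrees C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def hold_time_minutes(program) -> float:
    """Sum of programmed hold times; ramping between steps is ignored."""
    return sum(seconds * repeats for _, seconds, repeats in program) / 60.0


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), round(gc_percent(seq), 1), wallace_tm(seq))
    print("programmed hold time (min):", hold_time_minutes(PROGRAM))
```

Under these assumptions both primers are 22 nt long with the same base composition, and the programmed hold time comes to a little under two and a half hours, excluding ramp times.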

DNA cloning and sequencing of LMW-GS genes
All recovered PCR products were cloned into pGEM-T vectors (Promega, USA) and transformed into Escherichia coli strain DH5α. DNA sequencing was performed by TaKaRa Biotech Inc., Japan. Each clone was sequenced three times to avoid possible errors.

Sequence comparison, secondary structure prediction and phylogenetic analysis
BioEdit 7.0 was used for the multiple alignment of different LMW glutenin subunits based on the complete amino acid sequences. DNAMAN 5.2.2 was used to construct the phylogenetic tree. The PSIPRED method was used for secondary structure prediction, and the predictions were obtained from the PSIPRED server at http://bioinf.cs.ucl.ac.uk/psipred/psiform.html (JONES 1999; MCGUFFIN et al. 2000; BRYSON et al. 2005). The previously reported XY-GluD3-LMWGS1 (GenBank accession AY263369), which is believed to have a positive effect on dough quality, was used as the control (ZHAO et al. 2004).

PCR amplification and cloning of LMW-GS genes
When the primers LMW-1 and LMW-2 were used to amplify LMW-GS genes from Triticum dicoccoides Y19 and Y21, Triticum zhukovskyi PI355707 and Aegilops longissima PI604108, a single band of about 1000 bp was obtained from PI355707 and PI604108, whereas two bands were obtained from Y19 and three from Y21 (Fig. 1). After cloning and sequencing, four novel LMW glutenin subunit genes were obtained and designated TzLMW-m1, TzLMW-m2, TdLMW-m1 and AlLMW-m2. The ORFs of TzLMW-m2, TzLMW-m1, TdLMW-m1 and AlLMW-m2 were 1056 bp, 903 bp, 1056 bp and 1050 bp, encoding 350, 300, 350 and 348 amino acid residues, respectively. These four novel LMW-GS genes were deposited in the GenBank database under the respective accession numbers EF188292, EF188291, EF188290 and EF188289.

Molecular characterization and comparison analysis
LMW-GS genes usually contain eight conserved cysteine residues. The relative positions of the cysteines are also conserved, except those of the first and seventh cysteines. The first N-terminal amino acid was methionine in all four genes cloned in our study (Fig. 2), indicating that the LMW proteins encoded by the four genes belong to the LMW-m type glutenin subunits. The first cysteine of m-type LMW-GSs has previously been found at the 5th position within the N-terminal conserved domain or in the repetitive domain (LEW et al. 1992; MASCI et al. 1998). In our study, the first cysteine residue was found at the 65th position in the deduced amino acid sequences of TzLMW-m2 and TdLMW-m1, and at the 70th position in that of AlLMW-m2. Although the deduced protein sequences of the four genes all had a primary structure similar to that of previously reported LMW-m type subunits, there were differences in the N- and C-terminal domains and in the repetitive sequences, including site mutations, base substitutions and deletions/insertions. A number of insertions were found when TzLMW-m2, TdLMW-m1 and AlLMW-m2 were compared with other LMW-m type subunits. Compared with TzLMW-m1, TzLMW-m2 and TdLMW-m1 carried insertions of one 15-residue peptide, two 5-residue peptides, one 7-residue peptide and one 17-residue peptide. Owing to these insertions, TzLMW-m2 and TdLMW-m1 possessed longer repetitive domains than other LMW-m type subunits. This suggests that TzLMW-m2, TdLMW-m1 and AlLMW-m2 may be associated with good wheat quality, since a longer N-terminal repetitive domain has been speculated to have a positive influence on the quality of wheat flour (MASCI et al. 1998, 2000). LEW et al. (1992) pointed out that LMW-m type subunits were mainly composed of polypeptides with the N-terminal amino acid sequences METSH- or METSC-, the former group being more abundant than the latter.
According to this, TzLMW-m2 and TdLMW-m1 were typical LMW-m type subunits with the N-terminal amino acid sequence METSH-, whereas TzLMW-m1 possessed a METSC- sequence and AlLMW-m2 a METNH- sequence. IKEDA et al. (2002) further classified LMW-GS into 12 groups based on the alignment of the N- and C-terminal conserved domains of the deduced amino acid sequences of mature proteins. On this basis, TzLMW-m2 and TdLMW-m1 were classified into group 2, whereas AlLMW-m2 and TzLMW-m1 did not fall into any of these groups, suggesting that there may be more allelic variation of LMW-GS in nature than previously reported. The repetitive domain of the LMW-m type subunits was shorter than those of the LMW-i and LMW-s type subunits (Fig. 2), mainly because more deletions had occurred in this domain. However, AB062876 carries a deletion of 107 amino acids in the repetitive domain, which makes its sequence shorter than those of the other LMW-i type subunits.

Table 1. The secondary structure prediction of the four deduced LMW-GSs.
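The sequence-level checks described in this section (subunit type from the first N-terminal residue, the N-terminal motif, and the positions of cysteine residues) can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the test sequence is a toy string rather than any of the deposited genes, and whether positions are counted with or without a signal peptide is an assumption the reader must settle for real data.

```python
from typing import List

# Subunit type by first N-terminal residue: methionine, serine or
# isoleucine, following the m/s/i convention described in the text.
TYPE_BY_FIRST_RESIDUE = {"M": "LMW-m", "S": "LMW-s", "I": "LMW-i"}


def lmw_type(protein_seq: str) -> str:
    """Classify a subunit by the first residue of its N-terminal sequence."""
    if not protein_seq:
        raise ValueError("empty sequence")
    return TYPE_BY_FIRST_RESIDUE.get(protein_seq[0].upper(), "unclassified")


def n_terminal_motif(protein_seq: str, length: int = 5) -> str:
    """Return the first `length` residues, e.g. METSH, METSC or METNH."""
    return protein_seq[:length].upper()


def cysteine_positions(protein_seq: str) -> List[int]:
    """1-based positions of every cysteine (C) in the sequence."""
    return [i for i, aa in enumerate(protein_seq.upper(), start=1) if aa == "C"]


if __name__ == "__main__":
    # Toy sequence for illustration only, not a real LMW-GS.
    toy = "METSC" + "Q" * 10 + "C"
    print(lmw_type(toy), n_terminal_motif(toy), cysteine_positions(toy))
```

Applied to a full deduced sequence, `cysteine_positions` would give both the number of cysteines and the position of the first one, which are the two quantities compared in the text.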

Prediction of secondary structure
Protein secondary structure is the basis of the more complex spatial conformation and is composed of four structural motifs: α-helix, β-sheet, β-turn and random coil. The PSIPRED method does not distinguish between β-sheet and β-turn (JONES 1999) and designates both as β-strand. In general, β-strand is considered to endow HMW-GS with high elasticity and to improve their capability to resist distortion. Given that the conformation of LMW-GS has the same characteristics as that of HMW-GS, the β-strand content of LMW-GS may have a positive effect on dough quality (TATHAM et al. 1985, 1987, 1990). Until now, secondary structure predictions of glutenin subunits have not been widely reported because of their specific characteristics. The prediction results in our study (Table 1) showed that TzLMW-m2 and TdLMW-m1 each had two β-strands, equal to the control (AY263369), whereas AlLMW-m2 had four. AlLMW-m2 had the highest percentage of β-strand and TdLMW-m1 the second highest, both exceeding the control. These results imply that AlLMW-m2 and TdLMW-m1 are likely to have positive effects on dough properties. As for the distribution of each motif, α-helix was dispersed broadly in all domains except the glutamine-rich region (IV). The cysteine-rich region (III) and the C-terminal conserved region (V) contained the most α-helix. In all of the predicted subunits, β-strand occurred in regions III and V, with AlLMW-m2 having an extra β-strand in region IV.
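The two quantities compared above, the number of β-strands and the percentage of residues in β-strand, can be derived from a three-state prediction string of the kind PSIPRED produces (H for helix, E for strand, C for coil). The sketch below is illustrative: the function name is hypothetical and the input is a toy string, not the prediction for any subunit in Table 1.

```python
import re
from typing import List, Tuple


def strand_summary(ss: str) -> Tuple[List[Tuple[int, int]], float]:
    """Summarise strand content of a three-state (H/E/C) prediction.

    Returns the strand segments as 1-based inclusive (start, end) pairs
    and the percentage of residues predicted as strand (E).
    """
    ss = ss.upper()
    if not ss:
        raise ValueError("empty prediction string")
    # Each maximal run of E is counted as one strand.
    segments = [(m.start() + 1, m.end()) for m in re.finditer(r"E+", ss)]
    percent = 100.0 * ss.count("E") / len(ss)
    return segments, percent


if __name__ == "__main__":
    # Toy prediction string for illustration only.
    toy_prediction = "CCCHHHHCCEEEECCCCEECC"
    segments, percent = strand_summary(toy_prediction)
    print(len(segments), segments, round(percent, 2))
```

Mapping each segment to regions I to V would additionally require the domain boundaries of the subunit, which are not given numerically in the text.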

Phylogenetic analysis of four novel genes and other LMW-GS genes
The four novel LMW-m type genes (this study), together with 10 other LMW-m type genes, four LMW-s type genes and seven LMW-i type genes from GenBank, originating from different genomes of diploid, tetraploid and hexaploid species, were used to construct a homology tree to analyse the evolutionary relationships among different LMW-GS (Fig. 3). The homology tree clustered into two clear branches, representing the LMW-i and the LMW-m/LMW-s type genes. The LMW-i genes diverged most from the LMW-s and LMW-m genes, with a homology of 70%, suggesting that the LMW-i genes form a more distant lineage whereas the LMW-s and LMW-m genes are adjacent lineages. Interestingly, within the LMW-m/LMW-s clade, TzLMW-m2, TdLMW-m1 and AlLMW-m2 grouped outside the other LMW-m and LMW-s genes and shared a homology of 80% with them, while TzLMW-m1 clustered closely with two LMW-m subunits (EMBL accessions AY296753 and DQ681082), sharing a similarity of 94%; this subclade in turn grouped with the other LMW-m and LMW-s genes. This implies that TzLMW-m2, TdLMW-m1 and AlLMW-m2 may be intermediate types in the independent evolution of LMW-m and LMW-s. Two LMW-m type subunit genes (EMBL accessions AY994362 and DQ681082) from Triticum aestivum clustered with the LMW-s type subunit genes and shared a similarity of 88% with them, indicating that these two genes may be structurally more similar to the LMW-s type subunit genes. As characterized in this work, the four LMW-GS genes isolated from wheat-related species are expected to serve as new genetic resources for wheat quality improvement.
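The homology percentages quoted above are pairwise similarity values between aligned sequences. The text does not state how DNAMAN computes them, so the following is only a sketch of one common convention: percent identity over aligned columns, with a gap opposite a residue counted as a mismatch and columns that are gaps in both sequences skipped. The function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two rows of an alignment ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment for illustration only, not real LMW-GS sequences.
    print(percent_identity("METSHIPGLE", "METSCIPG-E"))
```

A tree such as the one in Fig. 3 would then be built by clustering a matrix of these pairwise values; the clustering method itself is not specified in the text and is not sketched here.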