pmic7265-sup-0001-TableS1.doc 3385K | Supplementary Table 1: Comparison of quantitative precision with different normalization methods in IDEAL-Q. The SD distributions for method combinations A-H were calculated using the replication dataset. The combinations differ only marginally in quantitative precision, and the relatively better combination C was selected.
Supplementary Figure 1: User interface of LFQuantViewer. LFQuantViewer is divided into four parts: Protein Table, Peptide Table, Peptide Information and Statistic Analysis. Protein Table shows the quantification information of all identified proteins, including protein accession number, ratios, number of unique peptides and estimated abundance. Peptide Table shows the quantification information of the peptides belonging to the protein selected in Protein Table, including peptide ID, sequence, peptide molecular mass, charge state, estimated abundance and spectral counts. For the peptide selected in Peptide Table, Peptide Information shows its quantification information in the different experiments. In this part, the peptide XIC of the selected experiment can be displayed by clicking the "XIC Viewer" button. In Statistic Analysis, statistical charts for peptides and proteins, including "Abundance vs. Abundance", "Ratio histogram plot", "Ratio scattered plot" and "RT vs. RT", can be viewed by clicking the "Peptide Analysis" or "Protein Analysis" button.
Supplementary Figure 2: An example of the rough XIC extraction step for the peptide (sequence: HSTIFENLANK, charge=2, m/z=637.3304) in the UPS1 standard dataset. The step consists of three main parts. 1) The intensities of the peptide and its isotope ions in MS1, within a pre-defined window around its seed time, are extracted using a pre-defined mass tolerance. 
2) For each time point, if both the S/N and the goodness of fit between the theoretical and observed isotopic distributions exceed pre-defined thresholds, the intensity of the peptide at this point is set to the fitting constant C; otherwise it is set to null. 3) If a time point lies outside the retention time range, or follows a retention time gap longer than a pre-defined threshold, the raw XIC endpoint is considered to have been found.
Supplementary Figure 3: The scheme of the isotope distribution pattern criterion. The theoretical relative abundances of the isotopic peaks are $I_{i}^{T}$, $i = 1, 2, \ldots, n$, calculated from the natural isotope distribution of the peptide's elements with the polynomial expansion method [4]. The observed isotopic intensities are $I_{i}^{E}$, $i = 1, 2, \ldots, n$, extracted from the centroid MS1 spectra by searching for each isotopic peak at its theoretical position within a pre-defined error tolerance. The relation between them can be expressed as $I_{i}^{E} = C\,I_{i}^{T} + \varepsilon_{i}$, $i = 1, 2, \ldots, n$.
Supplementary Figure 4: An example of the accurate peak picking step for the peptide (sequence: HSTIFENLANK, charge=2, m/z=637.3304) in the UPS1 standard dataset. All raw data points (raw XIC, black line in Fig. A) are smoothed by a Savitzky-Golay filter. For accurate peak picking, LFQuant first obtains a so-called structuring XIC (thick black line in Fig. B) using a Savitzky-Golay filter with a large frame size (frame = [the number of data points/2]). Peak positions are then detected on the structuring XIC by locating its local minima. The peak boundaries are the two adjacent minima that enclose the seed time, and the quantification value is computed on the rough XIC (red line in Fig. A and Fig. B).
Supplementary Figure 5: Correlation of peptide retention times in 2 replicates under different LC parameters. The raw data come from the data set available at http://www.marcottelab.org/users/MSdata/Data_02/ [5]. 
The data set consists of eight runs performed on an LTQ-Orbitrap, with a range of parameters varied for optimization. Each run contains four raw files acquired with different SCX salt steps. The first two raw files in run 1 were selected to illustrate the performance of the retention time prediction model. The retention times of the 565 peptides identified in both raw files were used to construct the reversible model. The figure shows that the reversible model captures the nonlinear retention time shift and is robust to outliers.
Supplementary Figure 6: Parameter (XIC retention time range) selection for the replication dataset in IDEAL-Q. To save time, only 2 of the 10 replicate runs were imported into IDEAL-Q. R-square, calculated from the peptide abundances in the two runs, is used to estimate peptide quantitative precision. An XIC retention time range of 1.5 clearly outperforms the other settings in terms of precision.
Supplementary Figure 7: Parameter (XIC slope) selection for the UPS1 standard dataset in IDEAL-Q. To save time, only five runs, covering sample A, sample B, sample C, sample D and sample E, were imported into IDEAL-Q. The box plots of peptide ratios at XIC Slope values of 20000, 8000 and 28000 are shown in Fig. A-C, respectively. An XIC Slope of 20000 clearly outperforms the other settings in terms of quantitative accuracy.
Supplementary Figure 8: Parameter (XIC retention time range) selection for the UPS1 standard dataset in IDEAL-Q. To save time, only five runs, covering sample A, sample B, sample C, sample D and sample E, were imported into IDEAL-Q. The box plots of peptide ratios at XIC retention time ranges of 0.2, 0.5, 1.0, 1.5, 2.0 and 2.5 are shown in Fig. A-F, respectively. An XIC retention time range of 0.2 clearly outperforms the other settings in terms of quantitative accuracy. 
Supplementary Figure 9: Comparison of quantitative accuracy with different normalization methods in IDEAL-Q. The protein ratio distributions for method combinations A-C and E-G were calculated using the UPS1 standard dataset. Method A (i.e. no run-level normalization and no peptide ratio normalization), whose estimated ratios are closest to the expected ratios, outperforms the other method combinations in terms of quantitative accuracy.
Supplementary Figure 10: Comparison of quantitative precision at the peptide level among different quantitative software tools using the replication dataset. Only peptides whose quantitative values are greater than zero in all 10 replicate runs were used for this comparison. Precision was assessed by calculating the coefficient of variation of the peptide abundances. The box plots indicate that IDEAL-Q is the least precise at the peptide level, while the coefficients of variation are relatively low for the other three tools. In addition, LFQuant performs slightly better than MassChroQ and MaxQuant. For MassChroQ quantification see Suppl. Text 4.
Supplementary Figure 11: Comparison of quantitative precision at the protein level with different protein ratio calculation methods using the replication dataset. In the LFQuant-Intensity method, which is the same as the protein ratio calculation method of MaxQuant, protein ratios are calculated from peptide raw intensities and normalized by the number of peptide identifications, whereas LFQuant estimates a protein ratio as a weighted mean of the ratios of its identified unique peptides. The box plots of standard deviation show that calculating protein ratios from peptide ratios gives better precision than calculating them from peptide raw intensities. 
Supplementary Figure 12: Box plots of the log 3 UPS1 protein ratio between sample E/D, D/C, C/B, and B/A with lower UPS1 protein amounts (20+6.7 fmole, 6.7+2.2 fmole, 2.2+0.74 fmole, 0.74+0.24 fmole). The expected ratio of UPS1 proteins between sample E/D, D/C, C/B, and B/A is 3. The box plots indicate that quantitative accuracy decreases with lower UPS1 protein amounts, and that LFQuant slightly outperforms MaxQuant and clearly outperforms IDEAL-Q in terms of accuracy.
Supplementary Figure 13: Comparison of quantitative accuracy at the peptide level among different quantitative tools using the UPS1 standard dataset. These figures show box plots of the peptide ratios of UPS1 proteins for the different quantitative tools in the case of 3-fold (Fig. A), 9-fold (Fig. B), 27-fold (Fig. C) and 81-fold (Fig. D) change, respectively. The number at the top of each figure is the number of peptides quantified by each tool. On the whole, LFQuant and MassChroQ produced similar results, except that MassChroQ produced a larger interquartile distance in the case of 81-fold change and LFQuant quantified more peptides than MassChroQ. For MaxQuant, the medians were similar to those of LFQuant and MassChroQ, but the interquartile distances were always larger. In addition, the interquartile distance increased with higher ratios, indicating poorer quantitative accuracy. This effect is even more pronounced for IDEAL-Q. For MassChroQ quantification see Suppl. Text 4. Note that the peptide ratios were all overestimated by LFQuant, MassChroQ and MaxQuant in the case of 3-fold, 9-fold and 27-fold change. This overestimation can be corrected to some extent by the normalization method in LFQuant (as shown in Suppl. Fig. 14).
Supplementary Figure 14: Box plots of peptide ratios before and after normalization in LFQuant in the case of 3-fold (Fig. A), 9-fold (Fig. B) and 27-fold (Fig. C) change. 
In LFQuant, peptide ratios are normalized with a linear regression normalization method before protein ratios are calculated. The box plots show that the overestimation in the case of 3-fold, 9-fold and 27-fold change can be corrected to some extent by this normalization.
Supplementary Figure 15: Comparison of quantitative accuracy at the protein level with different protein ratio calculation methods using the UPS1 standard dataset. The interquartile distance of the method using peptide ratios is a little smaller than that of the method using peptide intensities, indicating better accuracy at the protein level. This is more pronounced in the case of 81-fold change.
Supplementary Figure 16: The XICs of the peptide (sequence: SLDFLNQSFIQQK, charge=2, m/z=784.4132) extracted from the replicate dataset (1percent_yeast20ul05.mzXML and 1percent_yeast20ul09.mzXML) using IDEAL-Q (Fig. A) and LFQuantViewer (Fig. B). The expected log ratio of this peptide between 1percent_yeast20ul05 and 1percent_yeast20ul09 is 0. However, the log ratio calculated by IDEAL-Q is 0.5590, while that of LFQuant is 0.0123, which is much closer to the expected value. Fig. A shows that the right endpoint of the XIC detected by IDEAL-Q in 1percent_yeast20ul09 is wrong; the proper one should be around 56.5 minutes.
Supplementary Figure 17: The XICs of the peptide (sequence: AAFDEDGNISNVK, charge=3, m/z=373.8615) extracted from the UPS1 standard dataset (6D003.mzXML and 6E003.mzXML) using IDEAL-Q (Fig. A) and LFQuantViewer (Fig. B). The expected log 3 ratio of this peptide between 6D003 and 6E003 is -1, because it is a unique peptide of a UPS1 standard protein (P04040ups). However, the log 3 ratio calculated by IDEAL-Q is 0.4048, while that of LFQuant is -1.1881, which is much closer to the expected value. Fig. A shows that the XIC around 44 minutes detected by IDEAL-Q in 6E003 is wrong; the proper one should be around 42.3 minutes. |
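
The isotope-pattern relation in Supplementary Figure 3 (observed intensity = C × theoretical abundance + error) can be illustrated with a short sketch. This is not LFQuant's code. It assumes an ordinary least-squares fit without intercept for C, and it uses cosine similarity as the goodness-of-fit measure, which the captions do not specify. The function names and the 0.9 threshold are hypothetical, and the S/N check described in Supplementary Figure 2 is omitted.

```python
import math


def fit_isotope_pattern(theoretical, observed):
    """Fit observed = C * theoretical + error by least squares (no intercept).

    Returns (C, similarity). The similarity is the cosine between the two
    patterns; this is an assumed goodness-of-fit measure, not one taken
    from the source.
    """
    if len(theoretical) != len(observed) or not theoretical:
        raise ValueError("patterns must be non-empty and of equal length")
    sum_tt = sum(t * t for t in theoretical)
    sum_oo = sum(o * o for o in observed)
    sum_to = sum(t * o for t, o in zip(theoretical, observed))
    if sum_tt == 0 or sum_oo == 0:
        return 0.0, 0.0  # nothing to fit
    c = sum_to / sum_tt  # closed-form least-squares estimate of C
    similarity = sum_to / math.sqrt(sum_tt * sum_oo)
    return c, similarity


def xic_point(theoretical, observed, min_similarity=0.9):
    """Return C if the fit passes the (hypothetical) threshold, else None."""
    c, similarity = fit_isotope_pattern(theoretical, observed)
    return c if similarity >= min_similarity else None
```

When the observed pattern is an exact multiple of the theoretical one, the fitted C equals that multiple and the similarity is 1. A pattern dominated by a single unexpected peak gives a low similarity, so `xic_point` returns `None`, mirroring the "otherwise null" rule in Supplementary Figure 2.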