Label‐free single cell proteomics utilizing ultrafast LC and MS instrumentation: A valuable complementary technique to multiplexing

Abstract The ability to map a proteomic fingerprint to transcriptomic data would greatly advance the understanding of how gene expression translates into actual phenotype. In contrast to nucleic acid sequencing, in vitro protein amplification is impossible and no single cell proteomic workflow has been established as gold standard yet. Advances in microfluidic sample preparation, multi‐dimensional sample separation, sophisticated data acquisition strategies, and intelligent data analysis algorithms have resulted in major improvements that allow such tiny sample amounts to be analyzed successfully with steadily improving performance. However, among the broad variety of published approaches, it is commonly accepted that highest possible sensitivity, robustness, and throughput are still the most urgent needs for the field. While many labs have focused on multiplexing to achieve these goals, label‐free single cell proteomics (SCP) is a highly promising strategy as well whenever high dynamic range and unbiased accurate quantification are needed. We here focus on recent advances in label‐free single‐cell mass spectrometry workflows and try to guide our readers to choose the best method or combinations of methods for their specific applications. We further highlight which techniques are most promising for the future and which applications but also limitations we foresee for the field.


INTRODUCTION
In contrast to the already well-established single cell genomics and transcriptomics techniques, single cell proteomics (SCP) is still in its infancy. As a young and emerging field, SCP attracts tremendous attention because the proteome defines cellular identity and function. SCP therefore has enormous potential not only for understanding cellular diversity but also for investigating pathogenesis. As such, it is highly relevant in diagnostics and for the development of therapies for many diseases such as cancers.
Single cell sequencing would be unthinkable without nucleotide amplification. Polymerase chain reaction (PCR) catapulted nucleotide analysis into a new era, enabling technological breakthroughs in science and paving the way for society-transforming applications such as COVID-19 diagnostic tests. In proteomics and particularly in SCP a PCR for proteins would certainly be a gamechanger and lead to similar transformations in the field. Unfortunately, an innovation to amplify proteins is not in sight and it is quite doubtful whether it ever will be.
Instead, the field has to deal with the analysis of infinitesimally low quantities and relies on lossless sample preparation combined with mass spectrometers of the highest sensitivity.
Initially limited to very large cells such as blastomeres or oocytes [6,7], SCP has matured substantially and now allows the identification and quantification of more than a thousand proteins for many cell types [8][9][10][11][12]. While the low abundance of protein sample material still poses a substantial challenge, major technical advancements in loss-free sample preparation, nanoflow liquid chromatography (LC), and mass spectrometry (MS) as well as data analysis algorithms have advanced the field of SCP considerably. Initially, most SCP studies utilized a classical label-free data dependent analysis (DDA) workflow. To achieve highest throughput and increase sensitivity, the field is now shifting towards sample multiplexing employing isobaric labeling such as tandem mass tag (TMT)-labeling as well as non-isobaric labeling [13][14][15]. By combining up to 18 cells using TMTpro into a single sample, sensitivity or rather protein abundance per analytical run is substantially improved by a factor of 18. By including a carrier proteome in TMT-labeling workflows, one can further increase peptide signal intensities. To this end, one TMT channel, termed the carrier channel, is populated with up to 200 single cells, with the adjacent one or two channels being left empty to avoid contamination spilling over from the carrier channel, while the remaining channels are populated by single cells. This results in the triggering of more and richer fragmentation spectra, leading to a significant increase in peptide and protein identifications, while still allowing relative quantification based on the reporter ions of the respective TMT channel [16]. This method, originally termed single cell proteomics by mass spectrometry workflow (SCoPE)-MS, was introduced by Budnik et al. in 2018 [16], and the use of multiplexing and carrier proteomes in SCP has since been adopted widely [17][18][19].

FIGURE 1 Typical steps of a label-free single cell proteomics workflow and its potential applications.
Still, TMT-labeling workflows suffer from several disadvantages.  [20,21]. Second, MS2-based quantitation after TMT-labeling is well-known to suffer from ratio distortion due to unintended co-isolation of multiple peptides [22][23][24]. While the extent of ratio distortion is larger the wider the isolation window is set, even at narrow isolation windows, ratio distortion from co-isolation cannot be completely avoided. Thirdly, the dynamic range is very limited, and quantification might be negatively affected by highly abundant carrier channels. The majority of the peptide ions collected before fragmentation will arise from the carrier proteome whereas only few ions are collected from the actual single cells resulting in quantification inaccuracy [14,25,26]. Fourthly, data independent analysis (DIA) is becoming more and more popular in the field but is challenging to utilize in combination with isobaric labeling [27].
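The carrier effect described above can be illustrated with a small back-of-the-envelope calculation. The sketch below estimates which share of all sampled ions stems from one single-cell channel. It rests on two simplifying assumptions that are not taken from the cited studies: every cell contributes the same peptide amount, and ions are sampled in proportion to the loaded amount. The function name and defaults (18 channels, one carrier channel, two empty channels) are illustrative only.

```python
def single_cell_ion_share(carrier_cells: int, total_channels: int = 18,
                          empty_channels: int = 2) -> float:
    """Fraction of all sampled ions that stems from ONE single-cell channel.

    Illustrative assumptions: equal peptide amount per cell and ion sampling
    proportional to the loaded amount.
    """
    # One channel holds the carrier, some are left empty, the rest hold single cells.
    single_cell_channels = total_channels - 1 - empty_channels
    if single_cell_channels < 1:
        raise ValueError("no channel left for single cells")
    if carrier_cells < 0:
        raise ValueError("carrier_cells must not be negative")
    total_cell_equivalents = carrier_cells + single_cell_channels
    return 1.0 / total_cell_equivalents


if __name__ == "__main__":
    for carrier in (20, 200):
        share = single_cell_ion_share(carrier)
        print(f"carrier of {carrier:>3} cells: {share:.2%} of ions per single cell")
```

Under these assumptions, a 200-cell carrier leaves well below 1% of the sampled ions for each individual cell, which is the intuition behind the quantification inaccuracy discussed above.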
Hence, label-free single cell proteomics (LF-SCP) presents a viable and complementary option that avoids labeling- and carrier-related bias, since neither labels nor a carrier are used. Label-free approaches can thereby outcompete multiplexed approaches in terms of quantitative accuracy. In this review, we focus on label-free approaches to investigate single cells and discuss strengths and limitations, current challenges, and future perspectives for the field. Typical workflow steps and applications are summarized in Figure 1.

SAMPLE PREPARATION FOR LF-SCP
Proper sample preparation is likely the most crucial step of single cell workflows as all further steps rely on its sensitivity, robustness, and reproducibility. Efforts made to process single cells or ultra-low inputs label-free are summarized in Table 1.
Preparation of ultra-low input samples is limited by their extremely low peptide concentrations. Peptide losses by adsorption to surfaces are among the biggest concerns. In addition, a high excess of protease over protein has to be used since protease activity is also concentration dependent [28]. This, however, leads to intense background signals of the used protease, resulting in signal suppression that limits the reachable dynamic range. Poor reproducibility is another key limitation of SCP workflows. Random errors or differences in sample transfer steps or storage that would not be of concern when analyzing higher-concentration bulk samples yield significant changes in the results of SCP workflows.

FIGURE 2 Number of publications bearing the term "Single Cell Proteomics" listed on PubMed for each year since it was first mentioned in 2004.
To alleviate these challenges, researchers minimize the volumes used during sample preparation. This reduces the surface area available for adsorption processes and increases peptide and protease concentration at the same time. Most protocols therefore process their samples in volumes ranging from a few microliters down to nanoliters (Table 1). As a tradeoff, this introduces reproducibility issues due to sample drying and inaccurate pipetting of low volumes. Both can be largely avoided by workflow automation including specific measures to reduce sample drying, such as working under high humidity and artificial hydration [29] or working under a protective oil layer [12,30].
The reader is referred to another excellent review for a comprehensive overview of the degree of automation in current workflows [31].
Another crucial point to reduce sample losses is to omit sample transfer or cleanup steps whenever possible. Even when handling protein amounts of 50 and 2 μg, sample loss upon multiple transfers ranges from 15% to 89%, respectively [32]. A single HeLa cell is estimated to contain far less material, only 150-250 pg of protein [33], which highlights how critical avoiding sample loss is for the success of any single cell experiment. As a result, most ultra-low input workflows complete all processing steps within a single pot or even container-less [34].
Cleanup steps are required for most bulk proteomic workflows to remove MS incompatible reagents such as urea or SDS, or cellular debris. In contrast, label-free workflows with only a single cell as input in particular usually cannot afford to risk sample losses or to introduce sample variability through a cleanup step. This leads to the usage of MS compatible detergents, such as n-dodecyl β-D-maltoside (DDM) [35,36], or to the skipping of otherwise common steps such as reduction and alkylation. … apply online sample cleanup and concentration using loading columns [29,34,37,38] or StageTips [39].

LIQUID CHROMATOGRAPHIC SEPARATION
The tremendous importance of chromatographic separation performance for (complex) proteomic samples was already demonstrated back in 2016 by Shishkova et al. [46]. Indeed, nanoflow liquid chromatography is most commonly used in proteomics since peptide ionization by electrospray ionization is improved at low flow rates, resulting in improved sensitivity [47,48]. While long gradients at low flow rates show the best separation power, their bottleneck is speed. In the case of very long gradients, peak broadening occurs, causing less intense peaks. This limits sensitivity, which is crucial for ultra-low inputs (Figure 3A). To keep pace with multiplexed approaches, the shortest possible run-to-run times are pursued in LF-SCP. In conclusion, the shortest possible gradients employing low flow rates would be ideal to achieve the best sensitivity and throughput. In line with these considerations, there indeed seems to be a sweet spot for gradient length.
Furthermore, scan speed and required fill times of the MS are also factors that need to be considered when choosing the minimal gradient length. Gradients that are too long, on the other hand, suffer from peak broadening and loss of sensitivity. Our data show that for a single cell level input of 250 pg using a DDA method, 808 protein groups are identified with a 10 min active gradient, which increases to 1485 protein groups when using a 30 min gradient and drops again to 1320 protein groups for a 50 min active gradient [49].
This again highlights that sharp chromatographic peaks are desirable. Sharp peaks were further shown to be triggered closer to their apex in data dependent acquisition (DDA), owing to a lower median offset between the actual peak apex and the MS2 triggering time [50].
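The sweet spot can be read directly from the three reported numbers. The following minimal sketch uses only the protein group counts quoted above [49]; the function names are illustrative, and identifications per minute of active gradient is merely one possible way to express the trade-off between depth and throughput.

```python
# Protein groups identified from 250 pg input with DDA, as quoted in the text [49].
PROTEIN_GROUPS_BY_GRADIENT_MIN = {10: 808, 30: 1485, 50: 1320}


def best_gradient(ids_by_gradient: dict) -> int:
    """Return the active gradient length (min) with the most identifications."""
    return max(ids_by_gradient, key=ids_by_gradient.get)


def ids_per_minute(ids_by_gradient: dict) -> dict:
    """Identifications per minute of active gradient (crude efficiency measure)."""
    return {gradient: n / gradient for gradient, n in ids_by_gradient.items()}


if __name__ == "__main__":
    print("deepest coverage at", best_gradient(PROTEIN_GROUPS_BY_GRADIENT_MIN), "min")
    for gradient, rate in ids_per_minute(PROTEIN_GROUPS_BY_GRADIENT_MIN).items():
        print(f"{gradient} min gradient: {rate:.1f} protein groups per minute")
```

The 30 min gradient gives the deepest coverage, whereas the 10 min gradient yields the most identifications per minute, so the preferred setting depends on whether depth or throughput is prioritized.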
Highly resolved peaks additionally reduce the chance for co-isolation of precursors within their isolation window [51] (if not intended by broad m/z windows, see description of "WWA" in section "Ultrafast mass spectrometry for label-free (LF) SCP"). While this is less important for label-free approaches, minimizing co-isolation is essential for achieving high accuracy for reporter ion quantification in multiplexed workflows. A reduced overlap across (sharp) peaks further leads to lowered competition between peptides in the ion source which augments their intensity in the resulting MS spectrum.
Flow rates as low as 790 pL/min using 2 μm inner diameter open tubular columns have enabled identification of more than 100 and ∼1000 proteins from sub-single-cell amounts of 0.75 and 75 pg, respectively, in a 30 min active gradient [52]. Although these results are impressive, such low flow rates could only be achieved by splitting the solvent flow, and further improvements will be needed to allow for a loss-free sample injection of a single cell [50]. As a trade-off, low flow rates require long time periods for column loading, washing, and re-equilibration. This has a negative effect on throughput, since the actual run-to-run time is much longer than the active gradient.
A large-scale plasma study demonstrated that the chromatographic system is indeed responsible for 80% of MS idle times. Loading, washing, and equilibration may take 20 min or more, which makes short gradients for high throughput less attractive [53,54]. Increasing flow rates before and after the active gradient largely addresses this issue (Figure 3D). With this, run-to-run times were shown to be successfully shortened to 20 min for a 10 min gradient [45,49]. In a recent preprint from our lab, we evaluated the use of ultrashort gradients, down to 5 min with a 7.4 min final run-to-run time for single cell inputs. Here, we are limited by the Orbitrap scan speed, which leads to a loss of proteomic depth when using such short gradients.
MS1-only data acquisition is an exciting yet immature alternative that enabled identification of 750 proteins from single cell input in 5 min [55]. In addition, quantitative accuracy was reported to be superior to standard DDA for the MS1-only approach [56].
Another chromatographic method designed for high throughput SCP is Whisper, implemented by Evosep [57]. Using low flow rates of only 100 nL/min and short gradients, 20-40 samples per day (SPD) can be processed, and this was already successfully used to quantify up to 2000 proteins from a single cell [39].
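The relation between run-to-run time and throughput is simple arithmetic, sketched below for the two settings mentioned above. The helper names are illustrative, and the calculation assumes uninterrupted operation without blanks, quality control runs, or maintenance, so real throughput will be somewhat lower.

```python
MINUTES_PER_DAY = 24 * 60


def samples_per_day(run_to_run_min: float) -> float:
    """Theoretical samples per day (SPD) for a given run-to-run time in minutes."""
    if run_to_run_min <= 0:
        raise ValueError("run-to-run time must be positive")
    return MINUTES_PER_DAY / run_to_run_min


def ms_utilization(active_gradient_min: float, run_to_run_min: float) -> float:
    """Fraction of the run-to-run time spent on the active gradient."""
    if not 0 < active_gradient_min <= run_to_run_min:
        raise ValueError("active gradient must be positive and not exceed run-to-run time")
    return active_gradient_min / run_to_run_min


if __name__ == "__main__":
    # (active gradient, run-to-run time) in minutes, as quoted in the text.
    for gradient, run_to_run in ((10, 20), (5, 7.4)):
        print(f"{gradient} min gradient, {run_to_run} min run-to-run: "
              f"{samples_per_day(run_to_run):.0f} SPD, "
              f"{ms_utilization(gradient, run_to_run):.0%} MS utilization")
```

A 20 min run-to-run time corresponds to 72 samples per day, whereas the 7.4 min run-to-run time approaches 200 samples per day.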
The usage of multiple columns (Figure 3C) is another smart strategy to improve throughput. While one analytical column is used for peptide separation, the other is washed and re-equilibrated to keep MS downtime low. Such a setup was already successfully used to analyze 200 single cell proteomes a day using a label-free approach [58].
Such tricks should, however, be applied with care to maintain retention time (RT) stability across runs. Carefully adapted systems allow RT variation between both columns to be kept below 2% [59].
However, especially for very short gradients, this might already be problematic for match between runs (MBR) algorithms or for software tools relying on RT prediction to score peptides, even more so in case of flipped instead of linear RT shifts (Figure 3B). Arguably, excellent alignment across replicates is also of greatest importance for reducing missing values in biological studies with large sample numbers, as it is for targeted approaches with scheduled triggering of precursors of interest.
Trap-and-elute setups are commonly used to deliver samples onto the column. Here, the analyte is first loaded onto a trapping column between two ports of a switching valve at high flow rate. After that, the analyte is eluted onto the analytical column either via the gradient or by reversing the flow direction on the trapping column (backflushing) (Figure 3C) [60]. … shows excellent compatibility with current LF-SCP workflows [39,61].
The disadvantage of trapping columns and StageTips is that sample retention can be incomplete, leading to sample losses. SCP injection volumes are usually small enough to be applied directly onto the analytical column in a reasonably short time. In conclusion, there is no consensus whether or not trap and elute setups should be used in SCP workflows (see Table 1).
Analytical columns with the highest possible chromatographic resolution are used to separate peptides in SCP workflows. Packed bed type columns are commonly used. Increasing column length and decreasing silica particle diameter have historically led to improved performance and increased operating pressures [61][62][63]. Two decades ago, the introduction of highly ordered porous pillar arrays revolutionized the field due to their superior chromatographic resolution at significantly reduced backpressure, allowing for higher flow rates and throughputs [64,65]. The highly ordered layout reduces dispersion due to flow path variability, which is a significant cause of peak broadening in conventional columns [66]. Subsequently, superficially porous and non-porous materials were introduced. Although the superficially porous particle architecture has been known for more than 50 years [67], its application in μ-pillar array columns (μPACs) for low and ultralow input samples was demonstrated only very recently [44,68]. μPAC columns bearing nonporous pillars were further developed for ultralow input amounts and showed superior performance compared to conventional porous particle packed columns [69].

ULTRAFAST MASS SPECTROMETRY FOR LF-SCP
Based on the requirements of ultra-low-input samples and compatible optimal chromatographic settings (see section "liquid chromatographic … [20,71] and cannot be resolved using state of the art TOF devices, this is of less importance for label-free workflows. Speed therefore seems to be the greater advantage here, but the sensitivity of TOFs was still a hurdle in early single cell approaches. Recently, Mann and coworkers [39] modified the geometry, glass capillary, and ion optics of their timsTOF Pro instrument to improve its robustness against contamination and to increase ion transmission by a factor of 4.7. With this, the updated device, now called timsTOF SCP, is sensitive enough to perform single cell level studies. Using the timsTOF SCP and benefiting from the high TOF scan speed, they demonstrated the successful and reproducible quantification of >3900 proteins from 1 ng of HeLa input. Using data independent acquisition (DIA, see also below), the coefficient of variation (CV) was <10% and data completeness was 92% across five replicates.
In contrast, the lower scan speed of Orbitraps results in fewer datapoints per peak and higher CVs, especially for DIA methods with sometimes long duty cycles. For DIA, quantification on MS2 level is preferred as it is believed to be more accurate by better dealing with co-elution bias [72,73]. There are, however, tricks to reduce the cycle time and eventually end up with more datapoints per peak for quantification. Using high resolution mass spectrometry data independent acquisition (HRMS1-DIA) [74], the m/z range of interest is segmented and intermediate MS1 scans are scheduled, thereby increasing the number of available MS1 scans at the cost of MS/MS scans. This potentially improves quantification on MS1 level but hampers it on MS2 level. In a recent preprint [45] from Erwin Schoof's group, this strategy was shown to yield not only more identifications but also, and more importantly, more datapoints per peak, hence lowering CVs and improving quantitative precision. Another adaptation of this trick, termed wide isolation window high-resolution MS1-DIA (WISH-DIA), combines HRMS1-DIA with isolation windows widened to 10-100 m/z. WISH-DIA improves protein identifications most at a 40 m/z isolation window and allows for longer injection times and higher resolution without affecting total cycle time.
Another alternative to Orbitraps and TOF analyzers are linear ion traps (LITs). Although rarely used, their benefit for low input samples has been recently demonstrated [75]. Similar to TOFs, they allow for very fast scanning rates of up to 125,000 Da/s [75], making them especially exciting for DIA methods and for ultrafast chromatographic separations. It was again the lab of Erwin Schoof that presented an optimized DIA method using an Orbitrap for MS1 and an LIT for MS2, which clearly outperforms measurements using the Orbitrap for sample inputs <10 ng in terms of both peptide identifications and robust quantification. It seems that in addition to speed, the main advantage of LITs is sensitivity, which is superior to that of Orbitraps. However, this comes with a high noise level as a trade-off and results in difficulties in data analysis. While highly multiplexed (i.e., TMTpro) samples cannot be resolved using the LIT due to its limited resolution, the signal to noise ratio can be significantly improved when combining it with a FAIMSPro interface to successfully quantify representative proteomes from ultra-low inputs in label-free approaches with high throughput [61].
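The trade-off behind interleaved MS1 scans can be made tangible with a simple cycle time model. The sketch below is not taken from the cited work: all scan times, the peak width, and the number of windows are hypothetical values chosen for illustration, and the model assumes that MS1 scans are evenly spaced within a cycle while each MS2 window is sampled once per cycle.

```python
def cycle_time_s(n_ms1: int, ms1_scan_s: float, n_ms2: int, ms2_scan_s: float) -> float:
    """Total duration of one acquisition cycle in seconds."""
    return n_ms1 * ms1_scan_s + n_ms2 * ms2_scan_s


def points_per_peak(peak_width_s: float, sampling_interval_s: float) -> float:
    """Average number of datapoints recorded across one chromatographic peak."""
    if sampling_interval_s <= 0:
        raise ValueError("sampling interval must be positive")
    return peak_width_s / sampling_interval_s


def compare(peak_width_s: float, n_ms1: int, ms1_scan_s: float,
            n_ms2: int, ms2_scan_s: float) -> dict:
    """Datapoints per peak on MS1 and MS2 level for one method layout."""
    cycle = cycle_time_s(n_ms1, ms1_scan_s, n_ms2, ms2_scan_s)
    return {
        "cycle_s": cycle,
        # MS1 scans are assumed to be evenly interleaved within the cycle.
        "ms1_points": points_per_peak(peak_width_s, cycle / n_ms1),
        # Each MS2 window is assumed to be revisited once per cycle.
        "ms2_points": points_per_peak(peak_width_s, cycle),
    }


if __name__ == "__main__":
    # Hypothetical values: 6 s wide peak, 0.25 s per MS1 scan, 20 windows at 0.1 s.
    print("one MS1 per cycle:  ", compare(6.0, 1, 0.25, 20, 0.1))
    print("three MS1 per cycle:", compare(6.0, 3, 0.25, 20, 0.1))
```

In this toy model, adding MS1 scans lengthens the cycle, so MS1 sampling becomes denser while each MS2 window is revisited less often, which mirrors the statement that the approach favors MS1 level quantification at the cost of MS2 level quantification.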
Besides mass detectors, ion mobility as an additional separation dimension can decrease chemical background noise to enhance sensitivity.
Compatible with Orbitrap instruments, the FAIMSPro prevents neutral species from entering the mass spectrometer, improving robustness and sensitivity for proteomics experiments [76,77]. The resulting increase in signal over noise was shown to improve proteome coverage, especially for low input samples. Furthermore, high-field asymmetric waveform ion mobility spectrometry (FAIMS) improves quantitative precision using both data dependent and data independent acquisition [78,79]. As an alternative to FAIMS, the timsTOF devices from Bruker introduce a so-called trapped ion mobility funnel that adds the collisional cross section of analytes as an additional dimension to LC-MS, enabling improved proteome coverage at reduced analysis time [80,81]. Similar to FAIMS, the signal to noise ratio can be increased by excluding singly charged ions from analysis. These devices have recently been used very successfully for low inputs down to individual cells. Brunner et al. [39] identified more than 2000 proteins from one cell (see also Table 1), and in a recent technote from Bruker [82], more than 1500 protein groups were quantified using a label-free workflow within the proteoCHIP.
The choice of proper data acquisition parameters is at least as influential as the choice of instrumentation. As highlighted before, pushing sensitivity and throughput is among the most important aspects for ultra-low input samples. This results in methods with low target intensities and long injection times to enable deeper proteome coverage. Currently, DDA is arguably the most prominent acquisition strategy in proteomics. Here, an MS1 full scan is acquired, and the top n precursor ions are selected for fragmentation based on pre-defined parameters. This is done repeatedly throughout the entire analytical gradient and enables a straightforward identification and quantification of peptides.
However, DDA adds a high degree of stochasticity to the data, which might lead to lowered run-to-run reproducibility, especially for single cell level inputs where low abundant precursors might be (randomly) selected or not. In DIA, the entire mass range of interest is divided into bins of either fixed or variable sizes, usually ranging from 10 to 30 m/z. As mentioned above, even wider windows of up to 100 m/z have been tested, and wider than usual bins were shown to be beneficial for SCP [45]. In theory, all precursor ions in the entire mass range are fragmented and scanned sequentially. This promises improved proteome coverage and reproducibility across runs, but due to increased spectrum complexity, data analysis is more challenging [83]. In recent years, DIA has experienced a renaissance as its advantages predominate and data analysis is nowadays powerful enough to handle complex chimeric spectra (see Section 5). Of note, two-way communication-based strategies have recently gained popularity.
Advanced algorithms thereby steer a complex decision process on collection time, isolation window, or fragmentation energy on-the-fly. This can also be used to enable a close matching of the time windows used for monitoring specific peptides in targeted approaches. This, however, requires real-time searching and spectral matching. Thanks to more and more powerful yet affordable computers and advanced software, such approaches will likely dominate the field in the near future [84][85][86].
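The division of the mass range into fixed-size DIA bins, as described above, can be sketched as follows. The m/z range of 400-1000 and the optional overlap are hypothetical choices for illustration and do not correspond to any specific published method; variable window schemes are not covered.

```python
def fixed_dia_windows(mz_start: float, mz_end: float, width: float,
                      overlap: float = 0.0) -> list:
    """Return (lower, upper) isolation windows of fixed width covering the m/z range."""
    if mz_end <= mz_start:
        raise ValueError("mz_end must be larger than mz_start")
    if width <= 0 or overlap < 0 or overlap >= width:
        raise ValueError("need width > 0 and 0 <= overlap < width")
    windows = []
    lower = mz_start
    while lower < mz_end:
        upper = min(lower + width, mz_end)  # the last window is clipped to the range
        windows.append((lower, upper))
        if upper >= mz_end:
            break
        lower = upper - overlap  # the next window starts inside the previous one if overlap > 0
    return windows


if __name__ == "__main__":
    for width in (20, 40, 100):
        scheme = fixed_dia_windows(400, 1000, width)
        print(f"{width} m/z windows: {len(scheme)} MS2 scans per cycle")
```

Wider bins reduce the number of MS2 scans needed per cycle, which leaves room for longer injection times per scan at the price of more complex chimeric spectra.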
In line with the trend towards more complex spectra, our lab also pioneered the testing of ultra-wide windows in DDA workflows (termed wide window acquisition [WWA]) to deliberately co-fragment more than one precursor.
Using WWA, the number of peptides identified from an individual spectrum was boosted to up to 10, which improves the overall proteome coverage without the need to lengthen the gradient. By using WWA, more than 1000 proteins were identified from single cell input compared to a maximum of 700 using a conventional isolation window of 1 Thomson [69].
Another simple but innovative and clever strategy to improve coverage and speed while at the same time removing stochasticity is to record data without any MS/MS fragmentation. Here, identifications are based exclusively on high resolution precursor masses and RT prediction. Skipping fragmentation and the recording of the resulting MS2 spectra frees up measurement time by a factor of 10-20 and allows for proteome-wide analyses within 5 min. Its quantification efficiency was recently demonstrated to be comparable to multiplexed TMT-based approaches and to DIA workflows [55,56,87,88]. The enormously short gradients are of high potential for label-free proteomics as they allow throughputs of roughly 200 SPD to be reached. In our lab, we successfully tested MS/MS free data acquisition to identify close to 750 proteins from a single cell level input [11]. This approach remains to be evaluated by more labs to gain confidence in the reliability of identifications not based on fragment spectra, but the authors of this review believe it has the potential to move the single cell field forward.

FIGURE 4 Degree of data completeness (1 − fraction of missing values) is dependent on data acquisition. 250 pg of tryptic HeLa digest were repeatedly injected from the same vial, creating a dataset without any biological variability. The fraction of peptides found in n replicates is plotted when using either data dependent analysis (DDA) or data independent analysis (DIA) to acquire data. All data were recorded on the same Orbitrap Exploris 480 (Thermo) using the same 5.5 cm μ-pillar array column (μPAC) as analytical column and a 20 min active gradient. Raw data can be accessed free of charge via the ProteomeXchange Consortium in the PRIDE [90] partner repository with the dataset identifier PXD039208. For DDA, data were analyzed using CHIMERYS and quantified with apQuant [91] with match between runs (MBR) enabled. For DIA, Spectronaut v16 was used in directDIA mode including spectral matching across all files.

DATA ANALYSIS PLATFORMS FOR LF SCP
MS2-based peptide identification algorithms have improved tremendously over the last 20 years, resulting in numerous excellent software solutions from both commercial and academic developers (e.g., Sequest HT [91], Spectronaut [92], DIA-NN [71], MaxQuant [93,94], PEAKS Studio [95], and MSFragger [96]). However, all these algorithms have been developed and optimized for proteomic bulk samples with a number of assumptions, including that (i) the number of fragment ion peaks matches between theoretical and observed spectrum, (ii) fragment ion peak intensities are substantially greater than background signals, and (iii) repeated spectra of the same precursor will look nearly identical. A recent study has demonstrated that these assumptions might not be fully correct for single cell spectra, for which Boekweg et al. found a loss of annotated fragment ions, blurring between signal and background noise due to reduced ion intensity, as well as distinct fragmentation patterns as compared to bulk spectra [97].
Compared to typical bulk proteomics samples, the greatest challenge for SCP is that considerably fewer proteins can be identified and quantified. While around 6000 proteins can be identified from HeLa whole cell lysates with relative ease, most single cell studies fail to identify more than 1000 proteins from a single cell, as shown in Table 1.
Unsurprisingly, even sensitive instruments such as the Orbitrap require long ion accumulation times of >100 ms to collect sufficient ions for interpretable fragmentation spectra. This prolongs cycle times and hence reduces the number of datapoints per peak and, consequently, the quality of quantification. While this is not such a massive limitation in DIA methods, in DDA, which aims to fragment only a single precursor at any given time, it critically reduces the number of precursors that can be triggered. The AI-driven search algorithm CHIMERYS, in contrast, can identify >10 peptide sequences from a single MS2 spectrum, thereby increasing the number of identified peptides and proteins per sample typically by 100% and 50%, respectively, when using WWA with 4-12 m/z precursor isolation widths [69]. CHIMERYS does this by predicting peptide properties such as RT and peptide fragment intensities using the INFERYS 2.0 deep learning framework, which was trained on millions of spectra from tryptic and non-tryptic peptides [99]. This has allowed us and others to achieve proteomic depths in single cell samples of over 1000 proteins at a false discovery rate (FDR) of 1% without the use of libraries or identification transfer between samples [11].
As described in the previous section, missing values are a substantial challenge, especially when using traditional DDA-based approaches. Already at the bulk level for small DDA studies with only 18 samples, the typical fraction of missing peptide values is ≥17%, which becomes substantially worse in single cell studies (see Figure 4 and recent literature [100,101]). MaxQuant, as one of the most popular data analysis tools, manages to reduce missing values by peptide identity propagation (PIP) via accurate mass and RT mapping on the MS1 level using the match-between-runs (MBR) algorithm [94,102].
Recently, however, more sophisticated approaches to reduce missing values have been published, including the IonStar [103] and the IceR [100] algorithms. IonStar completely avoids the need to detect isotope peak patterns, which would limit sensitivity. Instead, it solely uses ion-based PIP by applying direct ion current extraction [103].
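The principle of identity propagation by accurate mass and RT can be sketched in a few lines. This is a deliberately simplified toy version and not the MaxQuant implementation: it assumes that retention times of both runs are already aligned, ignores charge states and isotope patterns, and uses hypothetical tolerances (5 ppm, 0.5 min), peptide names, and feature values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """An MS1 feature without identification in the acceptor run."""
    mz: float
    rt_min: float
    intensity: float


def ppm_error(observed_mz: float, reference_mz: float) -> float:
    """Mass deviation of an observed m/z from a reference m/z in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def propagate_identities(identified: dict, unidentified: list,
                         ppm_tol: float = 5.0, rt_tol_min: float = 0.5) -> dict:
    """Assign donor-run peptide identities to matching acceptor-run features.

    identified: peptide -> (mz, rt_min) from a run in which the peptide was identified.
    unidentified: features of a run in which the peptide was NOT identified by MS2.
    """
    transferred = {}
    for peptide, (mz, rt) in identified.items():
        candidates = [
            f for f in unidentified
            if abs(ppm_error(f.mz, mz)) <= ppm_tol and abs(f.rt_min - rt) <= rt_tol_min
        ]
        if candidates:
            # Keep the candidate closest in retention time.
            transferred[peptide] = min(candidates, key=lambda f: abs(f.rt_min - rt))
    return transferred


if __name__ == "__main__":
    donor = {"PEPTIDEK": (500.2500, 12.30), "ELVISK": (350.2000, 8.00)}
    acceptor = [
        Feature(500.2510, 12.45, 1.0e5),  # about 2 ppm and 0.15 min away: accepted
        Feature(500.2600, 12.31, 2.0e5),  # about 20 ppm away: rejected
        Feature(350.2001, 9.50, 5.0e4),   # mass fits, but RT is 1.5 min off: rejected
    ]
    print(propagate_identities(donor, acceptor))
```

The example also shows why RT stability matters: a feature with a matching mass is discarded as soon as its retention time drifts outside the tolerance.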
In contrast, IceR combines the use of feature-based and ion-based PIP in a hybrid strategy, reaching very high degrees of sensitivity and data completeness. This reduces the fraction of missing values drastically to a few percent in most cases, whereas missing values for traditional tools such as MaxQuant even using MBR are in the range of >15% for bulk analyses and >50% for single cell studies [100].
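Missing value fractions such as those quoted above can be computed from a peptide-by-run quantification table. The following minimal sketch uses a made-up table in which `None` marks a missing value; it reports the overall data completeness and the fraction of peptides quantified in exactly n runs. The function names and the data layout are illustrative assumptions, not the export format of any of the tools mentioned.

```python
from collections import Counter


def data_completeness(table: dict) -> float:
    """Fraction of non-missing values in a peptide -> [intensity or None, ...] table."""
    total = sum(len(values) for values in table.values())
    if total == 0:
        raise ValueError("empty table")
    observed = sum(value is not None for values in table.values() for value in values)
    return observed / total


def fraction_found_in_n_runs(table: dict) -> dict:
    """Return {n: fraction of peptides quantified in exactly n runs}."""
    if not table:
        raise ValueError("empty table")
    counts = Counter(sum(value is not None for value in values)
                     for values in table.values())
    return {n: counts[n] / len(table) for n in sorted(counts)}


if __name__ == "__main__":
    # Made-up intensities for four peptides across three replicate runs.
    example = {
        "peptide_A": [1.0, 2.0, 3.0],
        "peptide_B": [1.0, None, 3.0],
        "peptide_C": [None, None, 2.0],
        "peptide_D": [5.0, 4.0, None],
    }
    completeness = data_completeness(example)
    print(f"data completeness: {completeness:.1%}, missing: {1 - completeness:.1%}")
    print("peptides found in n runs:", fraction_found_in_n_runs(example))
```

The fraction of missing values is simply one minus the data completeness.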
Another option to improve data completeness is the use of DIA, which avoids DDA's inherent stochasticity (Figure 4). DIA typically reduces missing values substantially to less than 10% when 20 samples are analyzed and is therefore becoming increasingly popular both in regular bulk proteomics and in low input and single cell studies [61,104,105].
Data analysis for DIA runs is typically performed with either Spectronaut [93] or DIA-NN [72]. As the two leaders in the field of DIA data analysis, both deliver excellent sensitivity and data completeness.
While Spectronaut is available as stand-alone software tool, DIA-NN can be used either as a stand-alone version or integrated into the FragPipe platform [106].
With the availability of the additional ion mobility dimension on timsTOF instruments, the classical DIA approach has been devel-

APPLICATIONS OF LF-SCP AND RECENT STUDIES
Initial research efforts in single cell method development have mostly aimed towards pushing technical boundaries to enable deeper proteomic coverage, improved data completeness and quantification to lay the foundation for relevant and reproducible biological and clinical studies. Indeed, these efforts are bearing fruit and label-free proteomics has now matured enough to allow exploring more applied studies to investigate biological questions of high importance to society such as cellular development from stem cells to highly differentiated and specialized cells with major implications in tissue regeneration and in vitro fertilization (IVF) as well as many other fields. In oncogenesis, SCP could help researchers track the formation of drug resistance and highlight potential ways to circumvent it. SCP could also enhance understanding of the onset of various types of cancer to not just help treat cancer but even prevent it in the first place. Particularly for larger and medium sized cells such as oocytes, skin cells or hepatocytes LF-SCP should be highly relevant, while very small cells such as lymphocytes or erythrocytes might dictate the use of multiplexing workflows due to the ultralow amount of initial protein available. Several recent manuscripts have applied LF-SCP to biological questions and report encouraging results with first interesting insights into biology (Woo et al. [9]).
One of these works describes the heterogeneity of human oocytes at the single cell proteome level upon in vitro and in vivo maturation [109]. This is of particular relevance for IVF approaches. Guo et al. [109] sampled 36 human oocytes from 13 different donors and identified 2382 proteins, of which 2094 could be quantified. The oocytes belonged to one of three conditions: immature germinal vesicle (GV) oocytes, in vitro matured (IVM) oocytes, and in vivo matured (IVO) oocytes. A total of 176 proteins were found to be differentially abundant between GV and IVO oocytes and 45 between IVM and IVO oocytes. Among these proteins, maternal effector proteins were identified that are potentially related to the decreased fertilization, implantation, and birth rates observed with IVM oocytes. In addition to the single cell proteome, the single cell transcriptome was evaluated, which showed low correlation with the SCP data, with correlation coefficients of <0.2 when comparing fold changes. This again highlights the added value of SCP data over single cell transcriptomics data. Interestingly, IVM oocytes displayed higher proteomic inter-cell variability than GV and IVO oocytes, potentially suggesting homogeneous oocyte states in vivo. This could also suggest a higher variability of IVM oocyte quality compared with IVO oocytes, which would be consistent with the reduced biochemical and clinical pregnancy rates, as well as live birth rates, observed when using IVM oocytes during IVF [110].
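The comparison of proteomic and transcriptomic fold changes mentioned above can be illustrated with a short example. This is a minimal sketch, assuming that log2 fold changes are computed from group means and that a Pearson coefficient is used; the text does not state which correlation measure Guo et al. applied, and all names and values below are hypothetical.

```python
import math


def log2_fold_change(group_a, group_b):
    """log2 ratio of the mean abundance in group_a over group_b (positive values only)."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return math.log2(mean_a / mean_b)


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-gene abundances: (condition 1 cells, condition 2 cells).
protein = {
    "gene_a": ([4.0, 4.0], [1.0, 1.0]),
    "gene_b": ([2.0, 2.0], [2.0, 2.0]),
    "gene_c": ([1.0, 1.0], [2.0, 2.0]),
}
rna = {
    "gene_a": ([1.0, 1.0], [1.0, 1.0]),
    "gene_b": ([2.0, 2.0], [1.0, 1.0]),
    "gene_c": ([3.0, 3.0], [3.0, 3.0]),
}

# Compare only genes measured on both levels, in a fixed order.
genes = sorted(set(protein) & set(rna))
protein_fc = [log2_fold_change(*protein[g]) for g in genes]
rna_fc = [log2_fold_change(*rna[g]) for g in genes]

print("protein log2FC:", protein_fc)
print("RNA log2FC:", rna_fc)
print("Pearson r of fold changes:", round(pearson(protein_fc, rna_fc), 3))
```

A coefficient near zero, as reported for the oocyte data, would indicate that transcript-level changes are a poor predictor of protein-level changes between the same conditions.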
The authors investigated the potential origin of the heterogeneity among IVM oocytes and found a significant correlation between the estradiol/follicle ratio (E2/fol) and the Euclidean distance from each oocyte to the median of its group. This suggests that fluctuations in E2/fol levels may contribute to the heterogeneity of in vitro-matured oocytes. It has also been shown previously that low levels of E2/fol are associated with low implantation rates as well as higher risks of single and triple pronucleus formation and abortion [111].
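The heterogeneity measure described above, the Euclidean distance from each oocyte to the median of its group, can be sketched as follows. This is a minimal illustration, assuming complete abundance profiles (no missing values) on a common scale; the function names and toy numbers are hypothetical and do not reproduce the authors' actual pipeline.

```python
import math
import statistics


def median_profile(profiles):
    """Per-protein median across all cells of one group."""
    return [statistics.median(column) for column in zip(*profiles)]


def distances_to_median(profiles):
    """Euclidean distance of each cell's profile to the group median profile."""
    center = median_profile(profiles)
    return [math.dist(profile, center) for profile in profiles]


# Hypothetical toy group: 3 cells (rows) x 2 proteins (columns).
toy_group = [
    [0.0, 1.0],
    [3.0, 1.0],
    [6.0, 5.0],
]

print("median profile:", median_profile(toy_group))
print("distances:", distances_to_median(toy_group))
```

For the toy group the median profile is [3.0, 1.0] and the distances are 3.0, 0.0, and 5.0. Such per-cell distances could then be correlated with a covariate such as E2/fol, or compared between groups as a simple measure of inter-cell variability.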
In another study of interest, Li et al. used the mass-adaptive coating-assisted single-cell proteomics (Mad-CASP) methodology to investigate CD34+ peripheral blood mononuclear cells (PBMCs) in the arterial blood of chronic total occlusion (CTO) patients in comparison to healthy donors [112]. CTO is characterized by a (near) complete blockage of one or more coronary arteries for 3 months or longer, leading to restricted blood flow to the heart and serious health complications such as a heart attack. Circulating CD34+ progenitor cells play important roles in vascular repair, thereby affecting cardiovascular health and longevity, which renders them highly interesting targets for CTO research [113]. The authors' Mad-CASP approach includes coating the sample containers with a synthetic peptide to minimize adhesive sample losses. The synthetic peptide contains mostly hydrophobic amino acids with a tryptic cleavage site every five amino acids, so that digestion liberates short tryptic peptides. These peptides also saturate LC surfaces such as tubing and columns, thereby preventing losses of sample peptides during chromatographic separation. Owing to their short length, these peptides are not detectable in the classical m/z range of tryptic sample peptides (350+ m/z), which avoids dynamic range issues in the MS.
This system allowed the authors to identify 23%-63% more proteins than with uncoated vials. For the analysis of the CD34+ blood cells, the authors first generated 2000-cell libraries for each of the six donors (three healthy, three CTO patients) from which they could

CONCLUSIONS, PERSPECTIVES, AND FUTURE CHALLENGES
SCP, and particularly LF-SCP, has matured and is now becoming widely available to an increasing number of proteomics labs that had previously shied away from the technical obstacles. The ease of implementation of LF-SCP allows scientists to enter the single cell field in a straightforward manner. Early studies of LF-SCP in human fertility research and cardiovascular disease have resulted in relevant and encouraging data and should foreshadow the widespread application of LF-SCP to many other burning biological and pathological questions in areas such as tumor heterogeneity, hematopoiesis, neurobiology, or cellular differentiation and organ development. LF-SCP is expected to be most relevant to the investigation of larger cells with diameters >15 μm such as fibroblasts, macrophages, cardiomyocytes, or oocytes, while smaller cells would benefit greatly from multiplexed approaches. However, LF-SCP offers a high degree of quantitative accuracy and data completeness when utilizing DIA, which is not always the case for multiplexed approaches. In comparison to cytometry by time-of-flight (CyTOF), LF-SCP offers a more comprehensive view of the cellular proteome and represents a hypothesis-generating approach, whereas the targeted approach of CyTOF facilitates hypothesis-driven research by monitoring dozens of pre-defined proteins rather than hundreds.
Diligent planning and avoidance of sample losses throughout the entire workflow, including loss-less sample preparation, powerful and rapid peptide separation, sensitive and ultrafast MS, and sophisticated data analysis strategies, are essential to successful LF-SCP analyses. By utilizing and combining these high-end methodologies, up to 2000 proteins have been identified from HeLa cells, a number that is expected to improve even further. Current limitations and challenges for LF-SCP are associated with sample throughput and sensitivity, which are currently being tackled with great success by ultrafast LC gradients and by MBR against libraries of hundreds to thousands of cells.
Including MBR, more than 2500 quantified proteins were reported in a recent technical note from Evosep [113]. With several single cell omics technologies coming of age, the prospect of multi-omics analyses from individual cells becomes extremely attractive and has already been realized for transcriptomics and proteomics using the nanodroplet splitting for linked-multimodal investigations of trace samples (nanoSPLITS) technology [114]. In a very innovative work by Mahdessian et al. [115], single cell proteogenomics, although not yet from the very same cell, was successfully applied to generate a spatiotemporal map of human proteomic heterogeneity at subcellular resolution. The data were correlated with single-cell transcriptomics and cell cycle state information. The authors found that cell cycle progression explains less than half of all cellular heterogeneity and concluded that post-translational regulation is predominant over regulation by transcriptomic cycling. Deep visual proteomics, as demonstrated by Mann and coworkers [43], is another potentially groundbreaking approach to improve our understanding of spatial proteomics in the future.
AI-driven image analysis of different cell phenotypes is combined with single cell or even single nucleus laser microdissection for analysis by MS. This preserves spatial context and allows proteome abundances to be mapped to (sub-)cellular phenotypes. The authors of that study justifiably claim that this might be a key technology for future research on cancer progression, diagnostics, and drug development. Arguably, deep visual proteomics could be extended to any system that can be microscopically imaged. Combining SCP with other omics techniques such as metabolomics or lipidomics could potentially be an even more relevant approach to comprehensively map phenotype-defining molecular signatures. Furthermore, classical MS-based SCP might in the future be combined with high-potential techniques such as mass cytometry and imaging mass cytometry, as already impressively demonstrated in linking individual breast epithelial cells to age, parity, and BRCA2 status [116] and as summarized in an excellent review elsewhere [117]. We therefore believe that LF-SCP represents an indispensable approach in the SCP field for assessing cellular heterogeneity and identifying rare subpopulations with high relevance in biology, offering specific benefits over multiplexed SCP or CyTOF and an exciting outlook for future applications.