A practical guide to interpreting and generating bottom‐up proteomics data visualizations

Mass-spectrometry-based bottom-up proteomics is the main method to analyze proteomes comprehensively, and the rapid evolution of instrumentation and data analysis has made the technology widely available. Data visualization is an integral part of the analysis process and is crucial for the communication of results. The immense complexity of MS data makes this a major challenge. In this review, we provide an overview of commonly used visualizations, starting with raw data of traditional and novel MS technologies, then basic peptide and protein level analyses, and finally the visualization of highly complex datasets and networks. We specifically provide guidance on how to critically interpret and discuss the multitude of different proteomics data visualizations. Furthermore, we highlight Python-based libraries and other open science tools that can be applied for the independent and transparent generation of customized visualizations. To further encourage programmatic data visualization, we provide the Python code used to generate all data figures in this review on GitHub (https://github.com/MannLabs/ProteomicsVisualization).

often an afterthought. Consequently, data assessment, interpretation and visualization often remain exclusive abilities of experts familiar with the data and capable of handling it programmatically. This drastically slows down method dissemination and knowledge transfer to a broader audience from different research fields. Due to this required expertise, communication with non-experts in proteomics is often suboptimal. While several reviews either focus on standalone software tools [31,32] or cover computational aspects of the visualization process by providing an overview of available R libraries [33], they do not necessarily give non-experts in proteomics insight into why certain visualizations are important or how to interpret them.
In this review, we provide an overview of several common types of visualizations, focusing on their use and interpretation rather than the software. We also demonstrate how such visualizations can be created interactively with Python, one of the most common programming languages in science, with a low threshold to learn and use. Following the main steps of proteomics data analysis, we first describe the visualization of raw data and peptide identification with a special focus on novel MS instrument types and data acquisition modes. Next, we cover

VISUALIZATION OF PROTEOMICS DATA
In brief, a standard MS-based bottom-up proteomics workflow can be described as follows (see fig. 1 in [6]). Proteins are enzymatically digested into short, MS-accessible peptides and separated using a liquid chromatography (LC) setup that is directly coupled to a mass spectrometer (LC/MS setup).
The MS then measures both intact peptide masses and the corresponding masses of peptide fragment ions that are generated on the fly, which is called tandem mass spectrometry (LC-MS/MS setup). The resulting peptide and fragment ion spectra are then used to identify which peptides were present in the sample based on a reference proteome, commonly provided as a species-specific protein FASTA file. With many strategies available, identified peptides are then quantified and their information is aggregated to the protein level by protein inference.
Strategies for peptide and protein quantification vary from absolute quantification within samples to relative quantification across samples. A more detailed introduction to bottom-up proteomics is available elsewhere [34]. In Table 1 we provide an overview of the analysis steps, visualizations and most important pitfalls/best practices covered in this review. Many of the recommendations we make apply beyond the proteomics field, and many statistical aspects are beautifully explained in the ''Points of significance'' series in Nature Methods.

Statement of significance
We review data visualizations used to evaluate and communicate bottom-up proteomics data. Critical aspects are explicitly explained by presenting concrete use-cases of raw and processed proteomics data. As practical guidance, we highlight publicly available Python-based tools and provide our own codebase for the data visualizations presented herein. This should help the interdisciplinary use of bottom-up proteomics by ensuring a common ground for data communication and by enabling independent data exploration and visualization.

Raw data visualization
At the heart of all proteomics projects is the raw data acquired by the MS [35], and unsatisfactory analysis results can often be traced back to low data quality. Evaluating the raw MS data quality is therefore a critical first step during data analysis, yet it is often neglected. Data quality is commonly assessed by visual exploration of the raw MS data, as it can reveal a variety of flaws of samples and instrumentation alike [31]. Alternatively, various computational quality control methods are available in the field and are extensively covered in the literature [36].
In this section, we cover standard visualizations of raw MS data on the precursor and fragment level.
Total ion chromatograms. The total ion chromatogram (TIC) displays the summed intensity of all detected ions against the retention time (blue line in Figure 1A). Problems that can be revealed by inspecting the TIC are poor peak separation (very broad peaks), unstable spray or MS failure (intensity drops) and mistakes in sample preparation (low intensity, few peaks, unexpected overall shape) [38,39]. Another major issue is saturation of the whole LC-MS system, for example, by overloading or contamination. This can be revealed by the base peak intensity (BPI) plot, which shows the intensity of the most abundant ion detected over time (red line in Figure 1A). If the system is saturated, one can see plateaus in the BPI trace. It is generally advisable to have a reference TIC and BPI plot for the sample type and instrument setup used to be able to detect anomalies.
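The TIC and BPI traces can be sketched in a few lines of Python. The scan data below is synthetic (in practice it would be parsed from an mzML file, e.g., with pymzML or Pyteomics), and all variable names are our own illustration:

```python
# Minimal sketch: TIC and BPI traces from MS1 scans (synthetic data).
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend for scripted figure generation
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
retention_time = np.linspace(0, 120, 600)            # minutes
# each "scan" holds the intensities of all ions detected at that time
scans = [rng.gamma(2.0, 1e5, size=500) for _ in retention_time]

tic = np.array([s.sum() for s in scans])             # summed intensity per scan
bpi = np.array([s.max() for s in scans])             # most intense ion per scan

fig, ax = plt.subplots()
ax.plot(retention_time, tic, color="blue", label="TIC")
ax.plot(retention_time, bpi, color="red", label="BPI")
ax.set_xlabel("retention time (min)")
ax.set_ylabel("intensity")
ax.legend()
fig.savefig("tic_bpi.png")
```

With real data, deviations from a reference trace (intensity drops, plateaus in the BPI) become immediately visible in such a plot.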
It can further be important to follow up on individual detected ions or groups of ions, to evaluate, for example, the spread of contaminants, the peak shape of quality control ions or the quality of identified peptide features. To this end, extracted ion chromatograms (XICs) are commonly used (Figure 1B). The desired mass and charge range is extracted from the raw data and its intensity is plotted against the retention time. In doing so, it is important to set adequate boundaries for the mass range (m/z tolerance), accounting for mass errors and coeluting ions.
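The core of an XIC is a per-scan intensity sum within an m/z tolerance window. A minimal sketch on toy data (the function name `xic` and the ppm-based tolerance handling are our own illustration):

```python
# Sketch of an extracted ion chromatogram (XIC): for each scan, sum the
# intensity of all peaks whose m/z falls within a tolerance window
# around the target mass. Data is synthetic.
import numpy as np

def xic(mz_arrays, intensity_arrays, target_mz, ppm_tol=10.0):
    tol = target_mz * ppm_tol * 1e-6          # ppm tolerance -> absolute m/z
    trace = []
    for mz, inten in zip(mz_arrays, intensity_arrays):
        window = np.abs(mz - target_mz) <= tol
        trace.append(inten[window].sum())
    return np.array(trace)

# two toy scans: (m/z array, intensity array)
mzs = [np.array([500.000, 500.004, 600.0]), np.array([499.9, 500.001])]
ints = [np.array([10.0, 20.0, 99.0]), np.array([5.0, 7.0])]
trace = xic(mzs, ints, target_mz=500.0, ppm_tol=10.0)
# scan 1: 500.000 and 500.004 fall within 10 ppm (0.005 Th) of the target
# scan 2: only 500.001 matches
```

Plotting `trace` against retention time yields the XIC; widening `ppm_tol` illustrates how coeluting ions start to contaminate the trace.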

Precursor maps.
To get an overview of the whole range of precursor masses detected along the retention time, a two-dimensional MS1 map can be used [40,41]. It shows the intensity (color) of observed precursor masses (x-axis) across the chromatographic retention time (y-axis) as a heatmap (Figure 1C). As for the TIC, it is advisable to have a reference image to be able to spot anomalies, as they could again hint at technical issues with the instrument.
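Underneath such a map is an intensity-weighted two-dimensional binning of all precursor signals. A sketch on synthetic data (bin counts and ranges are arbitrary choices):

```python
# Sketch of a two-dimensional MS1 precursor map: bin detected precursors
# by m/z and retention time and color each bin by summed intensity.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
mz = rng.uniform(300, 1500, n)                 # precursor m/z
rt = rng.uniform(0, 120, n)                    # retention time (min)
intensity = rng.gamma(2.0, 1e4, n)

heat, mz_edges, rt_edges = np.histogram2d(
    mz, rt, bins=(200, 100), weights=intensity  # intensity-weighted bins
)
# `heat` can then be displayed with plt.imshow or a plotly heatmap,
# typically on a log color scale
```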

Recent developments in MS instrumentation introduced ion mobility as an additional separation dimension [9,11,13], which should be evaluated in a similar way as the m/z dimension. Akin to the two-dimensional MS1 map, precursor signal intensities can be visualized in the ion mobility dimension against the m/z dimension (Figure 1D).
This heatmap would be even more informative if it showed the intensity across all three dimensions (retention time, ion mobility and m/z). While this is in principle possible, the resulting visualizations are hard to interpret intuitively and improving them is one of the remaining challenges in proteomics data visualization [42].

Visualizations at the fragment level
The first principal step of aggregating raw MS spectra into proteomic data is the identification of analyzed peptide sequences. The two required elements for sequence identification are the measured peptide fragment (MS2) spectra and the sequence search space, both of which depend on the acquisition mode and to a lesser extent the quantification strategy used [43]. We cover label-free data-dependent acquisition (DDA) and data-independent acquisition (DIA) here.

DDA.
In the classical DDA approach, the MS instrument isolates and fragments individual selected peptide ions from the precursor scan (MS1), most commonly the top-N most intense ones. The spectra are then searched against a sequence database that contains masses, and sometimes also intensities, of peptide fragments from in silico protein digestion and fragmentation [44-46].
It can be important to manually evaluate the MS2 spectra and the identifications based on them, particularly when follow-up experiments hinge on a single or few proteins or even peptides. To do so, one can look at individual MS2 spectra, highlighting the N-terminal and C-terminal fragment ions of the single selected precursor (Figure 2A).
Underneath the spectrum itself, the sequence of the identified peptide and the position of identified N/C terminal fragment ions are indicated.
Depending on the exact fragmentation method used, the peptide bond breaks at different positions, yielding different pairs of ions, most commonly b/y ions. Issues that can become apparent here are co-fragmentation of several peptides (many more fragments visible) or of other isotopes of the same peptide (isotopic clusters for fragments), or poor fragmentation (very few ions and an intense precursor peak). To check the quality of the peptide-spectrum match against the library, mirrored spectra are commonly used (Figure 2B). Here the theoretical fragment masses are shown on a mirrored y-axis, which makes it immediately apparent which fragments are missing from or correctly identified in the measured spectrum.
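The theoretical side of such a mirror plot is derived from the peptide sequence. A minimal sketch computing singly charged b/y ion masses from monoisotopic residue masses (real tools such as Pyteomics or spectrum_utils additionally handle modifications and higher charge states; the function name `by_ions` is our own):

```python
# Sketch: theoretical b/y fragment masses for an unmodified peptide,
# as used for the library side of a mirror plot.
PROTON, WATER = 1.007276, 18.010565          # monoisotopic masses
MONO = {"P": 97.05276, "E": 129.04259, "T": 101.04768,
        "I": 113.08406, "D": 115.02694}      # residue masses used below

def by_ions(peptide):
    masses = [MONO[aa] for aa in peptide]
    # b ions grow from the N-terminus, y ions from the C-terminus
    b = [sum(masses[:i]) + PROTON for i in range(1, len(masses))]
    y = [sum(masses[i:]) + WATER + PROTON for i in range(1, len(masses))]
    return b, y

b, y = by_ions("PEPTIDE")
# smallest y ion (C-terminal E): 129.04259 + 18.010565 + 1.007276
```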

DIA.
In DIA mode, instead of isolating a single precursor mass, mass ranges containing multiple precursors are isolated and fragmented for every MS1 scan, covering more precursors but yielding more complex MS2 spectra. For a general introduction to DIA, we suggest this review [49].
Due to the increased complexity, the simple MS2 spectrum visualizations lose most of their relevance, and a spectral library containing only masses and intensities is no longer sufficient for identification. DIA libraries therefore additionally contain the retention time and, if applicable, the ion mobility of the precursor ions to narrow the search space at each time point [20,48-50]. On top of the fragment masses, the exact coelution of fragments and their precursor is now crucial for scoring candidate identifications. To assess the quality of DIA identifications, it is therefore most common to look at the elution profiles of all fragments associated with a specific precursor. Ideally, they should form a single sharp peak together with the precursor (Figure 2C). Indicators of peak misassignment are shifted peaks or additional peaks blending into individual fragment traces. Here, measuring ion mobility can lead to higher confidence, as fragments should correlate along this dimension as well.
Both dimensions together can be visualized in heatmaps for the precursor and all its fragments in retention time and ion mobility space, colored by intensity ( Figure 2D).
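The coelution idea can be sketched numerically: assuming synthetic Gaussian elution profiles, the Pearson correlation between fragment and precursor traces separates a coeluting fragment from a shifted, misassigned one. All names and data here are illustrative, not taken from any specific DIA tool:

```python
# Sketch: scoring fragment/precursor coelution in DIA by correlating
# elution profiles across retention time.
import numpy as np

rt = np.linspace(0, 1, 50)
precursor = np.exp(-((rt - 0.5) ** 2) / 0.005)         # sharp Gaussian peak
good_fragment = 0.6 * precursor                        # coelutes perfectly
shifted_fragment = np.exp(-((rt - 0.7) ** 2) / 0.005)  # misassigned peak

def coelution(a, b):
    return np.corrcoef(a, b)[0, 1]

good = coelution(precursor, good_fragment)      # close to 1
bad = coelution(precursor, shifted_fragment)    # much lower
```

DIA software applies this kind of correlation (often alongside the ion mobility dimension) across all fragments of a candidate precursor.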
Additional complexity. Independent of the acquisition mode, MS spectra can be complicated by peptide modifications, but the same visual techniques apply. Modifications can be biologically generated PTMs (e.g., phosphorylation) [51,52], artifacts introduced during sample preparation (e.g., oxidation) [53] or sample labelling techniques (e.g., TMT [54] or EASI-tag [55]). Depending on the exact type, modifications lead to additional peaks for neutral losses or reporter ion series in MS2 spectra, or even require an additional level of fragmentation (MS3) to acquire additional fragments. Interpreting these complex spectra requires more specialized background knowledge that goes beyond the scope of this review.

Peptide and PTM visualization
When moving from raw data to aggregated peptide and protein quantifications, it is often informative to inspect which parts of a protein sequence were covered by identified peptides. To evaluate differential sequence coverage between samples, overlapping peptides should be collapsed to a single line per sample to avoid clutter (Figure 3B).
If PTMs are measured, their position, intensity and localization probability can be visualized per modification site. If only the position needs to be visualized in the context of identified peptides, it can simply be added to these peptide views (start mark in Figure 3B). If a PTM's intensity and/or site probability are of interest, a lollipop plot can be used (Figure 3C). These can, for example, be found on PhosphoSitePlus [57]. Here, the size of the markers reflects the site probability and their vertical position reflects the intensity. For any of these visualizations it can be very informative to include additional annotation traces, for example, showing tryptic cleavage sites and protein domains. This is, for example, possible using AlphaMap [58], which was also used here to create Figure 3B. With these visualizations in hand, various aspects of observed peptide and PTM signals associated with a protein of interest can be visualized and easily compared with data available in external databases. In doing so it is important to keep in mind that not all peptides are unique to a single protein [59,60].

Protein quantity visualization and basic analysis
Aggregating peptide quantifications into protein quantifications is anything but a trivial task and highly depends on the inference strategy and quantification method used [61]. Agnostic to the quantification method, the assignment of peptides to proteins is not always uniquely possible, and therefore proteomics studies often talk about protein groups [59,60]. These usually consist of any number of proteins that could be contained in the sample based on a set of shared non-unique, or ''razor'', peptides identified. Most protein groups consist of genetically closely related proteins, like isoforms or paralogs. From here on, we focus on the analysis of protein groups independent of the inference and quantification method used, but want to point out that each quantification method comes with individual parameters and visualizations used for quality control. All following visualizations can in principle also be applied on the peptide level, but are mostly used on the protein level. We start with the evaluation of single condition samples and simple two-condition comparisons, using the example of a knock-out versus wildtype experiment [62], and then move on to more complex experimental designs and protein networks in the following sections.
Range and reproducibility. Once protein groups are quantified, the first thing to look at is the distribution of their intensities. This is frequently done using log-intensity histograms (Figure 4A) or boxplots.
These can indicate if certain samples have different intensity distributions, which might necessitate normalization, or a significantly reduced depth. They can further be used to assess the distribution of certain protein categories relevant to the downstream analysis, like imputed values or reverse database hits as in Figure 4A.
The dynamic range of a dataset is another important parameter, as the measurement of low-abundance proteins is a major limitation in untargeted bottom-up proteomics. To display it, a protein rank plot can be used (Figure 4B). Depending on the quantification method and the downstream processing, the y-axis can represent either raw intensity units or estimates of absolute protein quantities (e.g., iBAQ [63], proteomic ruler [64]). In full proteome studies, the most abundant proteins typically include cytoskeletal and ribosomal proteins and, depending on the proteomic depth, the lower tail includes, for example, signaling proteins and transcription factors.
Differential abundance. To compare two conditions, volcano plots are commonly used, displaying each protein group's fold-change against its statistical significance (Figure 4E). Two strategies are common for defining which protein groups are considered significantly changed:
(1) For square cutoffs, the horizontal threshold is selected based on a desired multiple hypothesis testing corrected p-value (or FDR). The vertical fold-change cutoff is set with regard to the experimental power, which is the probability of detecting an effect of a certain size, given it actually exists. When using square cutoffs, the power should always be indicated as in Figure 4E, regardless of whether a fixed power is used to calculate the fold-change cutoff or the other way around [67].
(2) For nonlinear volcano lines, an s0 parameter is set instead of a specific fold-change cutoff [68]. The s0 parameter is added as a constant to all standard deviations used in the t-tests and can roughly be interpreted as the assumed systematic error of the measurements, thereby setting a lower bound on the fold-change as a function of the measured standard deviation.
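A minimal numerical sketch of the s0 idea (following the SAM-style moderated statistic [68]; the Welch-type standard error and all names are our simplification, and real implementations such as Perseus differ in detail):

```python
# Sketch of the s0-modified t-statistic behind nonlinear volcano lines:
# a constant s0 added to the standard error damps the significance of
# tiny fold-changes that happen to have near-zero variance.
import numpy as np

def moderated_t(group_a, group_b, s0=0.1):
    a, b = np.asarray(group_a), np.asarray(group_b)
    diff = a.mean() - b.mean()
    # Welch-style standard error of the mean difference
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return diff / (se + s0)

# tiny fold-change with near-zero variance: huge plain t, modest s0-t
a, b = [10.00, 10.01, 10.02], [10.05, 10.06, 10.07]
plain = moderated_t(a, b, s0=0.0)
damped = moderated_t(a, b, s0=0.1)
```

The example shows why s0 acts as an implicit fold-change bound: a minute difference measured with almost no variance is highly "significant" for s0 = 0, but not once the assumed systematic error is added.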
In both methods, the boundaries on the fold-change ensure that the biological variability exceeds the numerical variability introduced by measurement noise or imperfect normalization. Both methods are valid if applied correctly, but yield slightly different hitlists and are both highly dependent on the arbitrarily selected parameters. It should also be kept in mind that either method still has a false discovery rate and protein groups can end up on either side of the boundaries by mistake. The boundaries rather serve the purpose of generating a statistically sound list for further downstream analysis. Importantly, multiple hypothesis correction always has to be performed and documented. Usually this is done either by Benjamini-Hochberg correction or by performing a permutation test. For square cutoffs, the y-axis usually shows the corrected p-value (not done here to ease comparison).
Enrichment analysis. One common analysis downstream of a volcano analysis is to look at the overrepresentation of biologically relevant groups of proteins (e.g., biological pathways or cellular compartments) in the hitlist compared to the overall proteome (methods reviewed in [69]). This is usually done by a Fisher's exact test [70] or gene set enrichment analysis (GSEA, [71]) based on systematic annotations available, for example, through gene ontology [72,73]. Often this is done using online tools that use the whole theoretical proteome as background. However, bottom-up proteomics is not able to quantify all proteins, and unidentified proteins should not be included in the background for an enrichment analysis [74]. Thus, only tools that can consider the specific background should be used (e.g., STRING [75] or Panther [76]). The three main values resulting from an enrichment analysis per candidate group are the enrichment factor, group size and multiple hypothesis testing corrected p-value, which can be visualized together (Figure 4G).
From this one can draw biologically relevant conclusions, linking the prior difference between the compared samples to enriched sets of protein groups. If differential enrichment in several samples is displayed, the x-axis can be used to display the different samples and the marker size can be switched from group size to p-value. Perseus is a common tool to generate many of the aforementioned visualizations and to run most underlying analyses, including the enrichment analysis [29]. However, given the output of the statistical analysis, almost any comprehensive visualization tool can create these figures.
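The overrepresentation test against the measured background can be sketched with the standard library alone (one-sided Fisher's exact test = hypergeometric upper tail; in practice one would use e.g. scipy.stats.fisher_exact, and the numbers below are invented):

```python
# Sketch of an overrepresentation test: hypergeometric upper-tail
# p-value plus the enrichment factor, against the *measured* proteome
# as background (never the whole theoretical proteome).
from math import comb

def enrichment(hits_in_group, hits_total, group_size, background_size):
    k, n, K, N = hits_in_group, hits_total, group_size, background_size
    factor = (k / n) / (K / N)                 # observed vs expected fraction
    # P(X >= k) for X ~ Hypergeometric(N, K, n)
    p = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    p /= comb(N, n)
    return factor, p

# e.g., 20 of 100 hits annotated to a pathway covering 200 of the
# 4000 quantified proteins
factor, p = enrichment(20, 100, 200, 4000)
```

Shrinking `background_size` toward the quantified proteome (rather than the ~20,000 theoretical proteins) is exactly the correction discussed above, and it can change both the factor and the p-value substantially.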

Multi-conditional and multidimensional experimental designs
With increasing throughput, thanks to improvements in MS instrumentation, more complex experimental designs have become practical. Common multi-conditional designs include time course experiments [77] and profiling experiments across subcellular compartments [78] or protein complex fractionation [79]. Two- and multi-conditional designs can further be combined into multidimensional experiments with each other (e.g., measuring subcellular profiles over time [80] or in different genetic backgrounds [78]) and with additional variables (e.g., demographic parameters in clinical sample cohorts [81]). In this section we use a comparative spatial proteomics dataset [78] for demonstration purposes.
Dimensionality reduction. While the full scope of a two-condition experiment can easily be displayed in two dimensions, higher dimensional experiments require dimensionality reduction for visualization.
Just selecting two dimensions can be useful if a direct comparison is needed, but this will always disregard biological variability added by other dimensions. This is problematic because it can mask correlated or orthogonal effects.
One universal tool to incorporate these effects into dimensionality reduction is PCA [82]: the data is usually scaled and log-transformed and then linearly transformed onto a new coordinate system, such that successive orthogonal axes, the principal components (PCs), capture the maximum possible variance of the data. The fraction of the overall variability explained by each PC can be displayed alongside the PC scatter plots (Figure 5C). If many PCs have a similar contribution to the overall variability, this indicates independent underlying variables. In contrast, a single high variability PC often indicates that several of the underlying variables are at least partially dependent.
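PCA and the per-component explained variance can be sketched directly via SVD (synthetic data; in practice one would use e.g. scikit-learn's PCA on the scaled, log-transformed matrix):

```python
# Sketch of PCA on a [protein x sample] intensity matrix via SVD,
# including the explained variance fractions shown in scree plots.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 6))              # e.g., 500 proteins, 6 samples
X[:, :3] += rng.normal(size=(500, 1)) * 3  # shared structure in 3 samples

Xc = X - X.mean(axis=0)                    # center each sample column
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U * S                             # protein coordinates in PC space
explained = S**2 / np.sum(S**2)            # variance fraction per PC
```

Here the shared structure injected into three samples produces one dominant PC, the pattern described above for partially dependent underlying variables.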
Other dimensionality reduction algorithms are tSNE [83] (Figure 5D) and UMAP [84] (Figure 5E). The major difference between PCA and tSNE/UMAP is that the latter perform non-linear transformations, whereby distances between individual proteins become incomparable. In return, their advantage is that they usually achieve a visually more obvious separation of protein clusters and can provide performance benefits for two-dimensional clustering algorithms. In principle, these techniques can also be applied to a [sample x protein] rather than a [protein x sample] matrix to look at the data from a different perspective.
Mixing data types. If other data types (e.g., clinical parameters, additional ''omics'' data, quality parameters) are integrated with proteomic data, it is likely that none of the visualizations above can be applied. In that case one can turn to dimension plots with either parallel coordinates or parallel categories. These have multiple parallel axes that can each represent a different data type with individual ranges. Here, every line represents a dataset (e.g., protein or sample) and connects the data points across the parallel dimensions (Figure 5H).
If all dimensions are categorical, the group sizes and membership combinations are displayed instead.

Network representations of proteomic data
Many extensive proteomics studies, such as interactomics [86,87], proteome profiling-based studies [88,104] or extensive clinical studies [89], focus on networks between proteins or could be mined for them. Any experiment that yields enough data to identify or quantify the physical or phenotypic relation between several pairs of proteins is sufficient to construct a network representation. Known interaction networks can be generated using STRING [76] and biological pathway graphs are provided by Reactome [90]. For networks based on quantifica-

CUSTOM PROGRAMMATIC DATA VISUALIZATION
In the previous sections we have described several commonly used visualizations in the proteomics field, along with available software tools to create them. However, depending on the experimental design and specific focus of a study, it might still be challenging to find a fitting visualization in one of these tools. A scientist might want to create something entirely novel, or just customize a figure beyond the capabilities of the tool being used. Besides these practical limitations, the data visualization process can also contribute to low transparency and reproducibility in scientific papers through the use of closed source software and lack of documentation [97]. These challenges

Proteomics data visualization in Python
For this review we chose Python as a programming language, because it is widely known for its readability and versatility, as well as its shallow learning curve for new developers and its very active, supportive and collaborative community. The latter is particularly useful considering that ''open code'' and community engagement can benefit researchers by saving time and funding resources [98]. As a primer for proteomics visualization in R, we recommend [33]. Similar to R, Python already has a large variety of well-documented and well-maintained libraries for scientific computing [99]. Libraries dedicated to MS data include:
pymzML [110,111]: an mzML data parser for fast access and handling of the data with integrated data visualization.
Pyteomics [112,113]: a framework for proteomics data analysis, supporting different data formats.
pyOpenMS [114]: a library for the analysis of proteomics and metabolomics data.
multiplierz [115]: a scriptable framework for access to manufacturers' formats via mzAPI.
PaDuA [116]: a Python package optimized for the processing and analysis of quantified (phospho)proteomics data.
AlphaTims [117]: a Python package for efficient access and visualization of Bruker timsTOF raw data.
AlphaMap [58]: a Python package for the visual annotation of proteomics data on the peptide level with sequence-specific knowledge.
spectrum_utils [118]: a Python package for processing and visualization of MS/MS spectra.
One overall challenge of data visualization is how to efficiently handle big data. Big data is particularly challenging because the simultaneous display of thousands of data points usually leads to occlusion of information (as can be seen in Figure 5A) and oftentimes to misinterpretation. Common workarounds are downsampling, reduced opacity (as in Figure 5E), replacement by summary statistics (as in Figure 5F) and more. While these methods can often improve data display, they cannot replace an evaluation of the full data scope. An easy way to visualize the full data without occlusion is offered by the Datashader library (https://datashader.org). It rasterizes the data space similar to a histogram, but in two dimensions, and encodes the number of points per two-dimensional bin by color (Figure 1C, Figure 3C). This facilitates quick visualization of patterns or structures in big data sets.
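The rasterization idea can be illustrated with plain numpy (Datashader itself adds fast out-of-core aggregation and interactive re-rendering; this stand-in only shows the principle):

```python
# Sketch of rasterization: instead of drawing thousands of overlapping
# markers, count the points per two-dimensional bin and map counts to
# color. 100,000 points collapse into a 256 x 256 image.
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(0, 1, 100_000)
y = rng.normal(0, 1, 100_000)

counts, x_edges, y_edges = np.histogram2d(x, y, bins=256)
img = np.log1p(counts)    # log scaling keeps sparse regions visible
# `img` is then rendered as an image (e.g., plt.imshow), never as markers
```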
Due to the amount of data contained in most proteomic studies, there are usually more biological insights to be gained than can be described in a single publication. While uploading datasets to repositories is generally mandatory nowadays, data can be made even more accessible by providing a dedicated online resource or even an analy-

Open science tools
To enable full accessibility, transparency and reusability of custom visualizations, several practices and tools have proven useful. Firstly, it is important to fully document what any code is doing and to provide the necessary context, akin to wet-lab protocols and documentation. A modern software development tool supporting this is Jupyter (https://jupyter.org), which is compatible with Python, R and Julia. It integrates code, execution output (e.g., visualizations) and static documentation in a single interactive, but freezable file format. The documentation is written in the very simple Markdown syntax, which allows standard text formatting and the inclusion of complex elements like images and formulas. In recent MS-based proteomics publications, one already sees links to study-specific code provided in Jupyter [98,105]. Given a suitable Python environment and access to the data, anybody can thereby reproduce results transparently. In case local hardware limits code execution, community resources can be used. Specifically, Google provides a free but powerful Jupyter notebook environment called Google Colab [106].
Secondly, it is important to share code publicly and since code usually continues evolving after publication it is crucial to transparently keep track of code versions, dependencies and contributions. The community standard tool for version control is Git, complemented by the public hosting service GitHub [107], which is free to use for scientific projects. Beyond sharing versioned code, it is also a social coding platform that enables community contributions like peer-review and ensures transparent attribution of code contributions to authors.
For code that requires interactive execution, or creates interactive elements, GitHub provides integration with online hosting solutions like Binder

CONCLUSION
In this review we have summarized data visualizations specific to the proteomics field, from raw data to complex experimental designs. As this field is rapidly progressing and highly translational, we decided to not only cite existing tools for visualization, but to further provide guidance towards creating common data visualizations programmatically and interpreting them critically and correctly. As the options for experimental design are constantly evolving we could not cover all flavors of proteomics data visualization herein. It will be exciting to see how interactive web technologies and virtual reality will improve the way we visually explore proteomics data in the years to come, especially with regard to current limitations on three-dimensional visualization.
Lastly, we want to encourage our readers to try out different visualization types and visual channels interactively for the data they have at hand and to view data visualization as a creative, yet crucial step of science and science communication.