Protocol for parallel proteomic and metabolomic analysis of mouse intervertebral disc tissues

Abstract The comprehensiveness of data collected by “omics” modalities has demonstrated the ability to drastically transform our understanding of the molecular mechanisms of chronic, complex diseases such as musculoskeletal pathologies, how biomarkers are identified, and how therapeutic targets are developed. Standardization of protocols will enable comparisons between findings reported by multiple research groups and move the application of these technologies forward. Herein, we describe a protocol for parallel proteomic and metabolomic analysis of mouse intervertebral disc (IVD) tissues, building from the combined expertise of our collaborative team. This protocol covers dissection of murine IVD tissues, sample isolation, and data analysis for both proteomics and metabolomics applications. The protocol presented below was optimized to maximize the utility of a mouse model for “omics” applications, accounting for the challenges associated with the small starting quantity of sample due to small tissue size as well as the extracellular matrix‐rich nature of the tissue.


| INTRODUCTION
The advent of "omics" technologies has transformed how biological systems are investigated, and consequently how diseases are diagnosed and treated. 1 These approaches are particularly well suited to chronic, complex diseases such as musculoskeletal pathologies. 2 In recent years, next-generation sequencing-based "omics" methodologies such as genomics and RNA-Seq have seen a drastic increase in use due in part to standardization of library preparation, instrumentation, and data analysis. Over time, increased use of these methodologies has resulted in a substantial reduction in cost, making them even more accessible for use by the scientific community-at-large. 3 Mass spectrometry (MS)-based "omics" methodologies such as proteomics and metabolomics are more nascent, with metabolomics being the most recent. As such, there is a lack of consensus on methodologies for sample preparation and bioinformatic analysis. 4 Proteomics can quantitatively assess the relative abundance of thousands of proteins in a tissue or circulating in plasma with the sensitivity and dynamic range to allow for high-throughput biological insights and biomarker discovery. 5 For instance, proteomics has been used to identify biomarkers of mortality in older men, 6 characterize the effects of sustained weight loss, 7 determine the composition of cartilage, 8 and examine how cartilage responds to injury and inflammation. 9 Metabolomics allows for the comprehensive analysis of all small molecule metabolites in a diseased tissue or in circulation to develop a functional readout of the pathological state of an organism. 1 This technology has been used to develop biomarkers and therapeutic targets for numerous disorders including diabetic nephropathy, renal failure, cardiovascular disease, and prostate cancer. 2
Since MS-based proteomic and metabolomic techniques can be used for many sample types, it can be difficult to standardize instrument parameters; therefore, optimization must be done on a sample-by-sample basis. Moreover, metabolomic techniques have been difficult to standardize due to differences in physicochemical properties of metabolites. The differences in metabolite properties may necessitate the use of several techniques (eg, LC-MS, GC-MS, NMR) to be fully comprehensive.
However, the comprehensive data collected by these "omics" modalities has the potential to drastically transform our understanding of the molecular mechanisms of disease, how biomarkers are identified, and how therapeutic targets are developed. Therefore, it is essential that standardized protocols be developed within specific fields of research to enable comparisons between findings reported by multiple research groups and move the application of these technologies forward.
One such field of research is intervertebral disc (IVD) biology, which lacks standardization of disease models and model organisms.
Numerous models of common spine pathologies such as IVD degeneration have been reported, induced by aging, 10 mechanical loading (ie, compression 11 or tail-loop 12 ), surgical injury, [13][14][15] or genetic manipulation. [16][17][18][19] Many of these models have been validated in multiple model organisms including cow, pig, sheep, goat, dog, rat, and mouse, for which each has advantages and disadvantages. 20 The lack of standardization in IVD biology can also be exemplified by recent efforts to determine a standardized set of cell type-specific phenotypic markers.
For experimental strategies exploring the molecular mechanisms of disease, or seeking to identify disease biomarkers or therapeutic targets, a key model organism is the mouse. This is due to the relatively short gestation period and lifespan, ease of genetic manipulation, and robustness of bioinformatic databases compared to those of other organisms. Work by our group and others has established the strength of mouse models to study IVD biology and common spine disorders. We have explored the effect of mechanical loading on the IVD in mice [42][43][44] and used transgenic mice to study IVD development, 45 disc degeneration, 16,46 and diffuse idiopathic skeletal hyperostosis (DISH). [47][48][49][50] Important insights have likewise been provided by others using mouse models to study disc development, [51][52][53] inflammation, 54 IVD degeneration, 10,13,55-58 calcification, 59,60 and scoliosis. [61][62][63] The multitude of available mouse models of spine pathologies allows for global molecular comparisons to uncover novel biological insights.
The use of unbiased "omics" approaches increases the likelihood of uncovering novel pathways implicated in spine pathologies, and therefore candidate targets for therapeutic interventions and novel biomarkers.
Transcriptomics technologies have been applied extensively to study the IVD, identifying numerous genes associated with IVD degeneration in model organisms 37,39,46,64,65 as well as humans. [66][67][68][69] In comparison, the use of proteomics has been limited, with a few studies of IVD degeneration in humans, [70][71][72] although access to tissues at various stages of disease is limited. Proteomics has also been used in mice for global characterization of the healthy IVD, 73 to examine the response to mechanical loading, 42 to characterize different mouse strains, 55 and to investigate ectopic calcification in the IVD. 48 To date, there have been no metabolomic studies of murine IVD tissues, and the analysis of human IVD tissues is limited to a single unbiased metabolite screen using high-resolution magic angle spinning (HR-MAS) nuclear magnetic resonance (NMR), 74 which is much less sensitive than MS-based metabolomics. 75 Despite the limitations to the starting quantity of sample associated with the small size of the mouse, it is possible to gain novel insights into mechanisms, biomarkers, and therapeutic targets of IVD pathologies using optimized protocols for proteomic and metabolomic analyses.
Herein, we describe a protocol for parallel proteomic and metabolomic analysis of mouse IVD tissues, building from the combined expertise of our collaborative team (Figure 1). Our group has led technical development in proteomics [76][77][78][79][80] and applied these methodologies to develop biomarker panels to improve classifications of ovarian carcinomas, 81 evaluate the potential of multipotent stromal cells for pancreas regeneration, 82 and develop optimized protocols to characterize extracellular matrices, [83][84][85] or increase the detection of low-abundance proteins in extracellular matrix-rich samples. 86 These studies led to the development of the current protocol for label-free quantitative proteomics of murine IVD tissue. 48 Our group has also used metabolomics to develop biomarkers of chronic kidney disease, 87 muscle response to exercise in diabetes, 88 and kidney function, 89 and to characterize the association of the microbiome with atherosclerosis. 90 The metabolomics methods used previously to assess kidney and muscle tissues required minimal adaptation for use with murine IVD tissue.
FIGURE 1 Schematic overview of protocol for simultaneous assessment of proteomic and metabolomic changes
The protocol presented below was optimized to maximize the utility of a mouse model for "omics" applications, accounting for the challenges of minimal starting quantity of sample due to small tissue size as well as the extracellular matrix-rich nature of the tissue. This protocol could be used to standardize tissue isolation, sample preparation, fractionation, and run parameters to allow comparative analysis between datasets generated from different research groups using mouse models to study IVD biology. The protocol was developed to investigate proteomic and metabolomic changes in annulus fibrosus (AF) tissue from the thoracic spine of a transgenic mouse model of diffuse idiopathic skeletal hyperostosis (DISH). For these analyses, the methods are expected to be robust and reproducible. While the methodologies are sensitive, a primary limitation is obtaining enough tissue from each sample (discs will have to be pooled) and ensuring complete homogenization of small fibrous tissues. Incomplete homogenization will result in a lower number of features being detected and ultimately fewer peptides and metabolites being identified. Importantly, for metabolomics, prior to experimentation, ensure the detector, lockspray, and calibration setups have been performed and the sample cone has been cleaned by sonication in formic acid.
In addition, our method will detect molecules with m/z between 50 and 1200. Features greater than 1200 m/z will not be captured.
For our specific experimental question, the protocol was designed to isolate protein and metabolites from the annulus fibrosus of IVDs within a specific anatomical region from each mouse; sample isolation should therefore be more straightforward for experiments that allow for pooling of IVDs from multiple anatomical regions or those focused only on one type of analysis. This protocol is not limited to a particular genetic background of mouse, as IVD size is generally consistent across strains. Furthermore, this protocol is not limited to thoracic IVDs: lumbar and caudal IVDs are larger, making sample isolation easier, though cervical IVDs may be challenging. Theoretically, this protocol could be used for nucleus pulposus (NP) tissue; however, there are typically fewer NP cells compared to AF cells in an IVD, so tissue yield by weight (or total protein for proteomics) would need to be equivalent to AF, requiring the use of more IVDs. Using a whole IVD should not present any challenges compared to use of AF alone. These methodologies should also be applicable to other small rodent models such as the rat, as tissue composition is very similar and tissues are even larger.

| ANTICIPATED RESULTS
Proteomics: Based on our experiments using 2- or 6-month-old C57/Bl6 mice, AF tissues isolated from 4 thoracic IVDs should yield at least 1.5 mg of tissue (wet weight). This should lead to a total protein yield between 40 and 80 μg per sample, of which 25 μg is needed per run. With fractionation, this proteomics protocol consistently quantified >5000 unique proteins per sample (98% of identified proteins were quantified), a greater than 2-fold increase in detection compared to our previous method that did not use fractionation and used a different MS instrument 73 (Table 1).
Importantly, proteins are detected from all cell compartments (cytoplasm, nucleus, mitochondrion, plasma membrane, extracellular matrix) (Figure 2), suggesting this protocol reduces potential bias of high-abundance extracellular matrix proteins in IVD tissue that could obscure low-abundance intracellular proteins.
Metabolomics: Due to the low concentration of metabolites in the IVD compared to plasma, the 300 μL of disc metabolite preparation (generated from eight thoracic AFs; ~3 mg of tissue) will allow for 2 to 3 metabolomics runs, if samples are pooled for a validation run with analytical standards. The disc metabolite preparation will not yield high-quality results after 6 months of being stored at −80°C (loss of signal intensity due to low starting quantity), but plasma samples will last over 1 year (retaining the same signal intensity). Based on this protocol, detection of >300 features should be expected in both plasma and IVD samples (Table 2). However, it is important to carefully consider sample size for metabolomics studies, as abundance of metabolites is often highly variable even within sample groups. We suggest a minimum of 10 biological replicates stratified by sex to reduce sex-related differences in metabolite abundance. Metabolomics validation is also critical, whereby one should aim for level 1 validation by analytical standard of at least five metabolites-of-interest based on criteria developed in the field. 91 Overall, the use of this protocol should be expected to provide additional biological insights into molecular mechanisms of spine pathologies, biomarker development, and therapeutic targets beyond the use of genomics or transcriptomics alone. Furthermore, the acquisition of data by multiple "omics" modalities from the same animal allows for integration of multiomics datasets in the future when more powerful bioinformatic tools are developed.

| Metabolomics
For the Waters ACQUITY system, set injection volume of 2 μL for plasma metabolomics, 5 μL for disc tissue metabolomics. Randomize the sample injection order for both plasma and disc runs.
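A randomized run list of this kind can be sketched in Python as follows. This is a minimal illustration rather than part of the published workflow: the pooled quality-control label, the spacing of pooled injections (`qc_every`), the fixed random seed, and the choice to begin and end the run with a pooled injection are all assumptions made for the example.

```python
import random


def build_injection_order(sample_ids, qc_every=5, seed=0, qc_label="Pool"):
    """Shuffle samples and interleave pooled QC injections at regular intervals.

    qc_every, seed, and qc_label are hypothetical defaults chosen for illustration.
    """
    rng = random.Random(seed)      # fixed seed so the run list can be regenerated
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)

    order = [qc_label]             # assumption: the run begins with a pooled injection
    for position, sample in enumerate(shuffled, start=1):
        order.append(sample)
        if position % qc_every == 0:
            order.append(qc_label)
    if order[-1] != qc_label:
        order.append(qc_label)     # assumption: the run ends with a pooled injection
    return order


# Example with twelve hypothetical sample names
run_list = build_injection_order([f"S{i}" for i in range(1, 13)])
```

Generating separate run lists for plasma and disc samples keeps the two runs independent, as each uses its own injection volume.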
Gradient conditions are as follows.
5 Make transverse cuts within the sacral region (Figure 3I), below the bottom rib (Figure 3J), and above the top rib to isolate the cervical, thoracic, and lumbar spines (Figure 3K).
9 Once an IVD is isolated, use a scalpel to scrape away the hard, cartilaginous endplate from both surfaces (Figure 3R), then lacerate the AF on one side (Figure 3S) and place the IVD briefly into PBS to allow the NP to leak out (Figure 3T; as previously reported 46). Scrape along the inner AF to remove any remaining NP tissue.

| Gradient conditions
10 Quickly transfer the AF to an aluminum foil pouch (Figure 3U,V) and immediately snap freeze in liquid nitrogen (Figure 3W). Repeat to collect a total of four thoracic AFs, adding each to a single pouch for proteomics. Repeat to collect an additional eight thoracic AFs into a single pouch for metabolomics (Figure 3X,Y).
Overview of proteomics is provided in Figure 4A.
CRITICAL STEP: Do not touch the protein pellet at this step, as it will stick to the pipette tip and significant sample loss will occur.
Dislodge the pellet from the side of the tube with a short vortex to maximize surface area exposure to enzymes.
12 Place samples in ThermoMixer at 37°C overnight (approximately 18 hours) at 700 RPM.
13 Add an additional aliquot of 6.25 μL Trypsin/Lys-C (1:100 enzyme to protein ratio vol/vol) to each sample and continue mixing in ThermoMixer at 37°C at 700 RPM for an additional 4 hours.
CRITICAL STEP: Centrifugation must be done following acidification to remove insoluble material that will interfere with injection into the mass spectrometer.
Overview of metabolomics is provided in Figure 4B.
1 For metabolite extraction, transfer tissue to Navy RINO screw-cap tubes (Next Advance, cat. no. NAVYR1) and add 300 μL cold acetonitrile to each sample tube.
2 Homogenize using the Bullet Blender Storm (Next Advance, cat. no. BBY24M) set to Time: 5, and Power: 12, two times.
Note: We found that using the Bullet Blender was the most efficient method for multiple samples. However, this step is for simple tissue homogenization which could be done using a mortar and pestle with liquid nitrogen to freeze the tissue or a tissue homogenizer.
3 Remove the metal beads from RINO tubes using a strong magnet slid upwards slowly along the outside of the tube.
4 Immediately place tubes into −20°C freezer for 20 minutes to allow precipitation of proteins.
5 Cool microcentrifuge to 4°C while samples are in freezer.
7 Remove supernatant and transfer samples to new 1.5 mL microcentrifuge tubes.
PAUSE POINT: Samples can be stored at −80°C for up to 6 months.

| Sample preparation (timing: ~5 hours)
Total volume of solvent required for protein precipitation will depend on number of samples to be analyzed. Acetonitrile (ACN) is the organic solvent used in this protocol for protein precipitation. One hundred and fifty microliters of acetonitrile solvent containing internal standards will be added to 50 μL of each sample for protein precipitation. We recommend making 1.2 times the required amount of precipitation solvent.
1 To prepare the precipitation solvent, add chlorpropamide, atenolol-d7, flurazepam, and DL-2-aminoheptanedioic acid to the appropriate volume of LC-MS grade acetonitrile to achieve concentrations of 1.1 μg/mL, 500 ng/mL, 50 ng/mL, and 17.5 μg/mL, respectively. This solution should be prepared fresh.
Note: Chlorpropamide, atenolol-d7, flurazepam, and DL-2-aminoheptanedioic acid will serve as the internal standards in the experiment. In our experience, using these internal standards allows analysis of positive and negative ionization for reverse phase chromatography.
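The solvent volume and the mass of each internal standard follow directly from the numbers above (150 μL per sample, a 1.2-fold excess, and the stated final concentrations). The Python sketch below works through that arithmetic; the function name and output layout are illustrative only, and weighing or serial dilution of the standards is left to the user.

```python
def precipitation_solvent_plan(n_samples, per_sample_ul=150.0, overage=1.2):
    """Return total acetonitrile volume (mL) and mass of each internal standard (ug)."""
    # Final internal standard concentrations from the protocol, expressed in ug/mL
    targets_ug_per_ml = {
        "chlorpropamide": 1.1,
        "atenolol-d7": 0.5,                    # 500 ng/mL
        "flurazepam": 0.05,                    # 50 ng/mL
        "DL-2-aminoheptanedioic acid": 17.5,
    }
    total_ml = n_samples * per_sample_ul * overage / 1000.0
    masses_ug = {name: conc * total_ml for name, conc in targets_ug_per_ml.items()}
    return total_ml, masses_ug


# Example: 20 samples require 20 x 150 uL x 1.2 = 3.6 mL of precipitation solvent
volume_ml, standards_ug = precipitation_solvent_plan(20)
```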
2 Thaw samples on ice. Identical sample preparation was done for both AF and plasma samples.
10 Vortex the dilution for 10 seconds.
11 Pool a small volume from each water dilution tube (see note below) into a separate 1.5 mL microcentrifuge tube. This pooled sample will be used for quality control and will be injected at regular intervals throughout the metabolomics run.
CRITICAL STEP: The amount taken from each diluted sample will depend on the number of samples. As large metabolomics runs take a long time and the pooled sample will be injected frequently, you must ensure there is adequate volume to last the entire analytical run. We recommend pooling enough from each sample to reach a total volume of 300 μL or more.
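As a worked example of this recommendation, the sketch below computes how much to draw from each diluted sample so that the pool reaches at least 300 μL. Rounding up to a whole microliter is an assumption made here for pipetting convenience; the available volume of each diluted sample is not checked.

```python
import math


def qc_aliquot_per_sample_ul(n_samples, target_pool_ul=300.0):
    """Return (aliquot per sample in uL, resulting pooled volume in uL).

    The aliquot is rounded up to a whole microliter (an assumption for
    convenience), so the pooled volume meets or exceeds the target.
    """
    aliquot = math.ceil(target_pool_ul / n_samples)
    return aliquot, aliquot * n_samples


# Example: 24 samples -> 13 uL from each, giving a 312 uL pool
aliquot_ul, pool_ul = qc_aliquot_per_sample_ul(24)
```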

| Data analysis
1 Copy the project folder from the run, which contains the "Data" folder (Waters .raw data files), to the computer where analysis will be completed.
Note: We use a separate computer to analyze the data so other samples can be analyzed by LC-MS while data analysis is occurring.
Data analysis can be performed on the same computer as acquisition if desired.
2 Open the script "Convert Waters MSe file to mzData file" (Supporting Information File 1) in RStudio. This script uses the convert.waters.raw package (which will need to be installed) to convert .raw data files to .mzData files. Set the input folder to the full file path name of the "Data" folder and set the output folder to the same file path name of the "Data" folder, but with "/converted" at the end. This will create a subfolder within "Data" called "converted" and put all the converted files into it.
CRITICAL STEP: If using Windows and copying the file path name into RStudio, make sure the slashes separating the directories are forward slashes ("/") and not backslashes ("\"). R uses forward slashes to denote separate directories, whereas Windows uses backslashes.
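For readers who prepare file paths programmatically, the conversion can be illustrated as below. This is a Python sketch, separate from the R conversion script in the Supporting Information, and the example path is hypothetical.

```python
from pathlib import PureWindowsPath


def to_forward_slashes(windows_path):
    """Rewrite a Windows path with forward slashes so it can be pasted into R."""
    return PureWindowsPath(windows_path).as_posix()


# Hypothetical location of the "Data" folder
input_folder = to_forward_slashes(r"C:\Users\lab\Project\Data")
# The output folder is the same path with "/converted" appended
output_folder = input_folder + "/converted"
```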
3 Run the script.
4 Once the script is finished, create a subfolder in "converted" called "Neg." Move all .mzData files from the negative ionization mode into the "Neg" subfolder. Within the "Neg" subfolder, create subfolders for each experimental group, including the pooled run (ie, G1, G2, G3, Pool). Place all data files into their appropriate subfolders according to sample group.
CRITICAL STEP: Ensure that each file has been put into the correct group folder. The script will not run as intended if this is not the case.
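Sorting files by hand is error-prone for large runs, so the folder layout described above could also be produced with a short helper. The following Python sketch assumes the user supplies a mapping from file name to group label (for example from the sample list); the function name and mapping format are hypothetical, and the helper refuses to run if any file lacks a group.

```python
import shutil
from pathlib import Path


def sort_into_groups(mode_dir, group_of_file):
    """Move each .mzData file in mode_dir into a subfolder named after its group.

    group_of_file maps file names to group labels such as "G1" or "Pool".
    """
    mode_dir = Path(mode_dir)
    files = sorted(p for p in mode_dir.glob("*.mzData") if p.is_file())

    # Stop before moving anything if a file has no assigned group
    unassigned = [p.name for p in files if p.name not in group_of_file]
    if unassigned:
        raise ValueError(f"No group assigned for: {unassigned}")

    for path in files:
        target = mode_dir / group_of_file[path.name]
        target.mkdir(exist_ok=True)
        shutil.move(str(path), str(target / path.name))

    # Report the resulting layout so it can be checked against the sample list
    groups = sorted(set(group_of_file.values()))
    return {
        g: sorted(p.name for p in (mode_dir / g).glob("*.mzData"))
        for g in groups
        if (mode_dir / g).is_dir()
    }
```

The returned dictionary lists the files now in each group folder, which gives a quick way to confirm the assignment before running the downstream scripts.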

IPO script:
The IPO script (Supporting Information File 2) is used to optimize parameters for XCMS data processing for peak picking.
1 In your project folder, create another subfolder called "IPO." Inside "IPO," create two folders: "IPO RPLC Neg" and "IPO RPLC Pos."
2 Copy the .mzData data files of all pooled injections from negative ionization mode, minus the first five pooled injections, into the "IPO RPLC Neg" folder created in the step above. Repeat for positive ionization mode pooled files into "IPO RPLC Pos."
3 Change the "setwd" and "save.image" parameters in the IPO script to the appropriate target folders.
5 Copy the parameters generated by IPO into a Word document.
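The selection of pooled injections for IPO (all pooled files except the first five) can be expressed as a small helper. This Python sketch assumes the caller passes the pooled file names already sorted in injection order; the file names in the example are hypothetical.

```python
def pooled_files_for_ipo(pooled_in_injection_order, n_skip=5):
    """Return the pooled-injection files to copy for IPO, dropping the first n_skip."""
    if len(pooled_in_injection_order) <= n_skip:
        raise ValueError("No pooled injections remain after skipping the first ones")
    return list(pooled_in_injection_order[n_skip:])


# Example with eight hypothetical pooled files from one ionization mode
pooled = [f"Pool_{i:02d}.mzData" for i in range(1, 9)]
to_copy = pooled_files_for_ipo(pooled)
```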

XCMS script:
XCMS is designed to provide automated processing of LC-MS metabolomics data.
8 Use the next lines of the XCMS script to make an annotated diffreport (annotateDiffreport() function, part of the CAMERA package).
9 Continue with the script to generate box plots and extracted ion chromatograms for all detected metabolites with the diffreport() function. Before you do this, you will need to denote the column numbers of the annotated diffreport that contain your sample groups. The groups will always start at column 13 (ie, G1, G2, G3, and Pool are columns 13-16).
10 Repeat steps 3-9 for the positive ionization mode files (Supporting Information File 4).
11 Create a combined positive- and negative-ion diffreport using the combine XCMS script (Supporting Information File 5). The first part of this script loads in files that were saved in the previous scripts for negative and positive ionization modes (saved just before the CAMERA package was loaded).
12 Follow the rest of the script, and you will generate the files "camAnotNeg.csv" and "camAnotPos.csv".
13 Use the last script (for combined annotated diffreport) to prepare a final output file for EZInfo (Supporting Information File 6). This script will take your annotated diffreport for negative (camAnotNeg.csv) and positive ionization (camAnotPos.csv), normalize each mode to an internal standard, combine both modes, evaluate the quality control (pooled samples) variability, and reorganize/tidy up the data for upload into EZInfo. The steps for all of these processes are outlined in detail in the script itself.
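To make the normalization and quality-control logic of this step concrete, a minimal Python sketch is given below. It is not the Supporting Information script: the data layout (one list of intensities per feature, ordered by injection), the function names, and the 30% coefficient-of-variation cutoff are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def normalize_to_internal_standard(intensities, internal_standard):
    """Divide each feature's intensity by the internal standard signal of the same injection."""
    return {
        feature: [x / s for x, s in zip(values, internal_standard)]
        for feature, values in intensities.items()
    }


def qc_cv_percent(values, is_pool):
    """Coefficient of variation (%) of one feature across the pooled QC injections."""
    pooled = [v for v, flag in zip(values, is_pool) if flag]
    return 100.0 * stdev(pooled) / mean(pooled)


def filter_by_qc(normalized, is_pool, max_cv_percent=30.0):
    """Keep features whose pooled-QC variability is at or below the cutoff.

    max_cv_percent is a hypothetical threshold, not one taken from the protocol.
    """
    return {
        feature: values
        for feature, values in normalized.items()
        if qc_cv_percent(values, is_pool) <= max_cv_percent
    }


# Hypothetical intensities for two features across four injections
raw = {"feature_A": [100.0, 200.0, 110.0, 105.0], "feature_B": [100.0, 200.0, 300.0, 50.0]}
internal_std = [1.0, 2.0, 1.0, 1.0]
pool_flags = [False, False, True, True]   # the last two injections are pooled QC

kept = filter_by_qc(normalize_to_internal_standard(raw, internal_std), pool_flags)
```

Features that vary widely across repeated injections of the same pooled sample reflect analytical rather than biological variability, which is why pooled-sample variability is evaluated before statistical analysis.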
Note: This step can be done outside of R if desired.
8 View the possible fragments between the different candidate IDs using this strategy. The lower the number beside a fragment, the more likely it is your metabolite-of-interest.
9 Once you have determined some metabolites-of-interest, you will need to order analytical standards to validate the metabolite ID using the steps below.

| Metabolite validation
To confirm the identity of metabolites of interest, analytical standards must be purchased and run in tandem with experimental samples. If the purchased standard has the same retention time, m/z, and fragmentation spectrum as the unidentified target analyte in the experimental samples, it is considered a "level 1" (highest level possible) identification according to a previously defined categorization system of metabolite identification. 91
Steps of metabolite validation for one metabolite of interest are as follows:
1 Create a stock solution of your analytical standard. The concentration of stock solution and the solvent used will depend on the quantity of standard purchased and chemical properties of the compound, respectively.
2 For your metabolite of interest, determine which experimental group yielded the highest average signal intensity from the metabolomics run.
3 Pool plasma and/or AF samples from the experimental group with highest average signal intensity. The amount taken from each sample will depend on the number of samples you have in that experimental group. We recommend a total pooled volume of 100 μL.
4 Add your stock standard solution to LC-MS grade water and plasma/disc pooled samples to achieve a concentration of 100 μM and a total volume of 50 μL, creating a "spiked" water sample, and a "spiked" plasma/disc pooled sample. Prepare a separate aliquot of 50 μL of plasma/disc pooled samples with no standard added ("non-spiked").
CRITICAL STEP: Use a minimal amount of stock standard solution to avoid drastically altering the composition of biological matrices.
We recommend no more than 1% stock (v/v).
Note: 100 μM is a recommended starting point that typically results in a strong signal for most metabolites.
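The spike volume follows from C1 × V1 = C2 × V2. For a 100 μM final concentration in 50 μL with no more than 1% stock (v/v), at most 0.5 μL of stock may be added, which implies a stock concentration of at least 10 mM. The Python sketch below performs this calculation and checks the 1% limit; the function and parameter names are illustrative.

```python
def spike_volumes_ul(stock_mM, final_uM=100.0, total_ul=50.0, max_stock_fraction=0.01):
    """Return (stock volume, matrix volume) in uL for a spiked sample.

    Raises ValueError if the stock would exceed the allowed fraction of the
    total volume, in which case a more concentrated stock is needed.
    """
    stock_uM = stock_mM * 1000.0
    stock_ul = final_uM * total_ul / stock_uM          # C1 * V1 = C2 * V2
    limit_ul = max_stock_fraction * total_ul
    if stock_ul > limit_ul + 1e-12:                    # small tolerance for rounding
        raise ValueError(
            f"{stock_ul:.2f} uL of stock exceeds the {limit_ul:.2f} uL limit; "
            "prepare a more concentrated stock"
        )
    return stock_ul, total_ul - stock_ul


# Example: a 20 mM stock needs 0.25 uL of stock and 49.75 uL of water or pooled sample
stock_ul, matrix_ul = spike_volumes_ul(20.0)
```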
5 Continue with sample preparation for "spiked" water, "spiked" plasma/disc pooled sample, and "non-spiked" plasma/disc pooled sample. Sample preparation from this point forward is identical to steps 3-13 in the metabolomics sample preparation section, excluding step 10.
6 Analyze prepared samples with the same UPLC and MS parameters as described previously.
7 Compare retention time, m/z, and fragmentation spectrum between the various "spiked" samples and the "non-spiked" sample.
If all three match, you have a level 1 metabolite identification (Figure 5).
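The three-way comparison can be summarized in code. The sketch below is a simplified Python illustration under stated assumptions: the tolerances for retention time and m/z, the rule that most of the standard's fragments must be found in the analyte spectrum, and all numeric values are hypothetical and not taken from this protocol. In practice, spectra should also be inspected visually.

```python
def ppm_difference(observed, reference):
    """Absolute mass difference in parts per million relative to the reference."""
    return abs(observed - reference) / reference * 1e6


def shared_fragments(standard_frags, analyte_frags, tol_ppm):
    """Count fragments of the standard that are also present in the analyte spectrum."""
    return sum(
        any(ppm_difference(a, s) <= tol_ppm for a in analyte_frags)
        for s in standard_frags
    )


def is_level1_match(standard, analyte, rt_tol_min=0.1, mz_tol_ppm=10.0,
                    min_fragment_fraction=0.8):
    """Check retention time, precursor m/z, and fragments against hypothetical tolerances."""
    rt_ok = abs(standard["rt_min"] - analyte["rt_min"]) <= rt_tol_min
    mz_ok = ppm_difference(analyte["mz"], standard["mz"]) <= mz_tol_ppm
    n_std = len(standard["fragments"])
    n_shared = shared_fragments(standard["fragments"], analyte["fragments"], mz_tol_ppm)
    frag_ok = n_std > 0 and n_shared >= min_fragment_fraction * n_std
    return rt_ok and mz_ok and frag_ok


# Hypothetical values for a spiked standard and the unidentified analyte
standard = {"rt_min": 3.00, "mz": 200.0000, "fragments": [100.0000, 150.0000]}
analyte = {"rt_min": 3.02, "mz": 200.0010, "fragments": [80.0000, 100.0005, 150.0005]}
match = is_level1_match(standard, analyte)
```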

CONFLICT OF INTEREST
The authors state that there are no conflicts of interest for this study.
FIGURE 5 Example validation run for metabolomics using phenyl sulfate identified as a level 1 metabolite