Integration of mass spectral fingerprinting analysis with precursor ion (MS1) quantification for the characterisation of botanical extracts: application to extracts of Centella asiatica (L.) Urban

Abstract

Introduction: The phytochemical composition of plant material governs the bioactivity and potential health benefits as well as the outcomes and reproducibility of laboratory studies and clinical trials.
Objective: The objective of this work was to develop an efficient method for the in‐depth characterisation of plant extracts and quantification of marker compounds that can potentially be used for subsequent product integrity studies. Centella asiatica (L.) Urb., an Ayurvedic herb with potential applications in enhancing mental health and cognitive function, was used as a case study.
Methods: A quadrupole time‐of‐flight analyser in conjunction with an optimised high‐performance liquid chromatography (HPLC) separation was used for in‐depth untargeted fingerprinting and post‐acquisition precursor ion quantification to determine levels of distinct phytochemicals in various C. asiatica extracts.
Results: We demonstrate the utility of this workflow for the characterisation of extracts of C. asiatica. This integrated workflow allowed the identification or tentative identification of 117 compounds, chemically interconnected based on Tanimoto chemical similarity, and the accurate quantification of 24 phytochemicals commonly found in C. asiatica extracts.
Conclusion: We report a phytochemical analysis method combining liquid chromatography, high resolution mass spectral data acquisition, and post‐acquisition interrogation that allows chemical fingerprints of botanicals to be obtained in conjunction with accurate quantification of distinct phytochemicals. The variability in the composition of specialised metabolites across different C. asiatica accessions was substantial, demonstrating that detailed characterisation of plant extracts is a prerequisite for reproducible use in laboratory studies, clinical trials and safe consumption. The methodological approach is generally applicable to other botanical products.

Phytochemicals are the primary source of medicines in many countries. 4 Their use can reach as much as 80% of people in indigenous populations, 5 and they are becoming increasingly popular in Western countries. [6][7][8] Upon ingestion, the dose-dependent effects of these compounds may have a variety of benefits for human health. 9 Numerous new reports popularise the use of plant-derived supplements, including phytochemicals, for promoting human health. The industry is responding to this demand by expanding the diversity of plant-derived products and dietary supplements. However, concomitantly, concerns regarding the quality, safety and purity of these products have been stimulated by an increasing number of toxicity reports involving plant-derived supplements. 10 Botanical supplements are typically complex mixtures of specialised metabolites of which some are biologically active compounds.
The phytochemical profiles of botanical preparations can differ significantly from batch to batch since they can be affected by geography, genetics, ontogenetic stage, plant materials used and post-harvest processing methods among others. 11,12 The stochastic nature of the phytochemical composition influences the biological and pharmacological activity of the product and thus severely impacts the reproducibility of preclinical studies and clinical trials. Many botanical supplements are plant extracts instead of raw materials. In addition to concerns of adulteration and misidentification of raw materials, the phytochemical profiles in extracts depend highly on the extraction methods used. 13 Thermal and chemical treatments of plant extracts may cause the degradation of phytochemicals. 14 During the preparation of botanical supplements, composition and levels of phytochemicals may change, which in turn may affect the bioactivity of the preparations.
Unless disease treatment is claimed, the US Food and Drug Administration (FDA) regulates plant extracts as food; therefore, the requirements to demonstrate safety and efficacy are less stringent than for pharmaceuticals. 6 Due to enhanced efforts in improving the characterisation of unregulated over-the-counter botanicals, advanced analysis strategies are needed to authenticate botanical supplements and verify their consistency. 6,10,[15][16][17] In the past, the analysis of phytochemical preparations was largely [...]

In this proof-of-concept study, we applied a high-resolution MS-based workflow in conjunction with post-acquisition interrogation for the characterisation of aqueous extracts of the medicinal plant Centella asiatica (CA), a member of the Apiaceae family, which has been used to improve memory and mental health. [18][19][20][21] Recent studies in humans and rodent models are supportive of C. asiatica preparations as complementary medicine to improve memory in ageing-related mild cognitive decline and potentially Alzheimer's disease. [21][22][23] Centella asiatica has also been reported to possess other biological activities benefiting human health, including anti-inflammatory and immunostimulant properties, promoting wound healing, and ameliorating leprosy, lupus, tuberculosis and gastric ulcers. 19,21,[24][25][26] The activities of C. asiatica have largely been attributed to its constituent triterpenes, saponins and sapogenins. 27 Almost 60 compounds belonging to these and other phytochemical classes have been reported in C. asiatica. 21 For many of these, the role in C. asiatica's biological activity or the mode of action is not known, and many other, as yet unidentified, compounds are present that may also contribute to its activity. We report here a new phytochemical analysis workflow that allows untargeted fingerprinting for determining distinct phytochemicals in various C. asiatica plant extracts, the tentative identification of 117 of these compounds, and simultaneously the accurate, targeted quantification of eight caffeoylquinic acids, seven flavonoids, five hydroxycinnamic acids and four pentacyclic triterpenoids. We report the characteristics of the quantification method, including limit of detection (LOD), limit of quantification (LOQ), dynamic range and reproducibility. The integrated workflow was applied to the characterisation of aqueous extracts of C. asiatica plant materials from multiple sources.

[...] (6), caffeic acid (7), epicatechin (8), 1,5-dicaffeoylquinic acid (9), 1,3-dicaffeoylquinic acid (10), rutin (11), dihydroferulic acid (12), 3,4-dicaffeoylquinic acid (13), 3,5-dicaffeoylquinic acid (14), ferulic acid (15), 4,5-dicaffeoylquinic acid (16), naringin (17), isoferulic acid (18), quercetin (19), madecassoside (20), asiaticoside (21), kaempferol (22), madecassic acid (23) and asiatic acid (24).

Caffeoylquinic acids are prone to degradation or isomerisation depending on pH, light exposure and temperature. [28][29][30] To protect compounds from degradation, all standards and samples were prepared in methanolic solutions containing 0.1% v/v formic acid and kept in the dark at −20 °C until analysis.

| Plant materials and preparation of aqueous extracts of Centella asiatica
The identity of the plant materials was confirmed at the supplier (Oregon's Wild Harvest) by organoleptic analysis and Fourier-transform infrared (FTIR) analysis. Identity was further verified at Oregon Health and Science University by thin-layer chromatographic comparison of zone profiles with earlier batches of C. asiatica, and with reference standards of characteristic triterpenes (asiatic acid, madecassic acid, asiaticoside, madecassoside) as well as caffeoylquinic acids known to be found in C. asiatica. 23 The preparation of the C. asiatica water extracts was reported previously. 23,31 In brief, dried C. asiatica extracts were prepared by refluxing aerial parts of the plant (stems, leaves and flowers but not roots; 80 g per 1 L of deionised water) for 1.5 h, cooling for 30 min to allow for handling, and then filtering the suspension to remove plant debris. The aqueous extracts were freeze-dried, stored at −20 °C and analysed within 2 months of preparation. This method was modified from earlier work by Veerendra et al., 32,33 who showed that exhaustive water extraction of C. asiatica produced a residue with greater cognitive enhancing properties than extracts made with methanol or chloroform.
For quantification of the individual compounds in dried C. asiatica extracts, a stock solution was prepared as follows: 10 mg of each freeze-dried extract powder was resuspended in 10 mL of aqueous methanol (70% v/v with 0.1% v/v of formic acid) by sonication (30 min, 25 °C; see Supporting Information Figure S9), centrifuged (14000 × g for 10 min) and filtered with 0.22-μm polyvinylidene fluoride (PVDF) Whatman filters before analysis. This procedure was used to prepare extracts from eight different accessions of the plant material, labelled CA1 to CA8. Aliquots of 1 mL from each extract were pooled to generate a quality control (QC) sample used for evaluating LC-MS/MS platform performance.

| Fingerprinting of Centella asiatica extracts by untargeted data-dependent analysis
For the chemical profiling analyses, a pooled CA sample (QC sample) was used. Untargeted high-performance liquid chromatography (HPLC) combined with high-resolution accurate-mass MS/MS was conducted using a Shimadzu Nexera UHPLC system connected to an AB SCIEX TripleTOF® 5600 mass spectrometer equipped with a Turbo V ionisation source operated in the electrospray ionisation (ESI) mode.

Chromatographic separation was achieved using an Inertsil Phenyl-3 column (4.6 mm × 150 mm, 100 Å, 5 μm; GL Sciences, Torrance, CA, USA). The injection volume was 10 μL. Three technical replicates were conducted. Gradient elution was performed using a mobile phase consisting of solvent A, water containing 0.1% v/v formic acid, and solvent B, methanol containing 0.1% v/v formic acid. The flow rate was 0.4 mL/min. The chromatographic method was 30 min, and the gradient design was as follows: an initial 1 min at 5% B, followed by 5 to 30% B from 1 to 10 min, then 30 to 100% B from 10 to 20 min, hold at 100% B from 20 to 25 min, and then return to 5% B from 25 to 30 min.
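As an illustration of the gradient programme described above, the following minimal Python sketch encodes the stated breakpoints and returns the percentage of solvent B at a given time. It assumes linear ramps between breakpoints, including the final return to 5% B from 25 to 30 min (the text does not state whether that step is a ramp or a step change followed by re-equilibration). The names `GRADIENT` and `percent_b` are hypothetical.

```python
# Gradient breakpoints (time in min, % solvent B) taken from the method text.
# Assumption: the composition changes linearly between consecutive breakpoints.
GRADIENT = [
    (0.0, 5.0),
    (1.0, 5.0),
    (10.0, 30.0),
    (20.0, 100.0),
    (25.0, 100.0),
    (30.0, 5.0),
]


def percent_b(t_min):
    """Return % solvent B at time t_min by linear interpolation."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-30 min method window")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0.5, 5.5, 15.0, 22.0):
        print(f"{t:5.1f} min: {percent_b(t):6.1f} % B")
```

Such a lookup can be convenient when relating the retention time of an annotated feature to the approximate mobile phase composition at elution.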
Data-dependent acquisitions (DDAs) were conducted to obtain precursor and fragment ion information to aid in annotating compounds in the CA extracts. 34 DDA analyses were con- [...] Figure S1 describes the workflow used to obtain putative annotations [Level 2 (L2) annotations 35,36 ] based on exact mass, isotopic pattern and MS/MS spectral data. In addition, three manual data eval- [...] Table 1. It is important to note that these compounds are tentative annotations only, in accordance with the guidelines described by Sumner et al., 35 and will need to be confirmed in future work.
Raw data processing was performed using Progenesis QI™ software with METLIN™ plugin V1.0.6499.51447 (NonLinear Dynamics, Newcastle Upon Tyne, UK) and entailed peak picking, alignment and searching of multiple databases to assist in compound annotations.
For the current study, we searched the mass spectral data against METLIN, 37 [...] 40 Progenesis QI™ uses Metascope as its built-in search engine and provides a "score" for the quality of the compound annotation on a scale from 0 to 100, with 100 being a perfect match, based on the mean of multiple similarity metrics. 41 The current data were evaluated based on accurate mass similarity, isotope similarity and fragmentation score. The fragmentation score ranges from 0 to 60 and represents how well the observed data match the spectral library entries or the theoretical fragment data; the latter are derived with the bond dissociation approach, a computational method that calculates expected fragments from theoretically derived bond dissociation energies. 42 Progenesis QI's fragmentation algorithm is described by Wolf et al. 42 and Horai et al. 43 A Progenesis QI score ≥ 50 is typically reached when isotopic pattern similarity is above 90%, MS/MS spectral similarity is > 50% and the deviation of the accurate mass from the exact mass is lower than 5 ppm. A Progenesis QI score ≥ 50 was considered adequate for a compound to be a candidate for putative annotation (L2 annotations according to Sumner et al. 35 ). This threshold is more rigorous than those in previous reports using Progenesis QI™, which relied on a score > 31.6, 44 on putative annotations based only on accurate mass and isotope similarity, 45 or on a mass error of 20 ppm. 46 Additional features were assigned by querying and comparison with the KNApSAcK online library. 47 Supporting Information Table S1 lists identified (L1) and putatively assigned (but unverified) metabolites (L2 annotations), and provides the following properties: RT, monoisotopic ion mass, ions observed and molecular formula. Figure S7 compiles positive matches (red lines) with the entries in the respective spectral libraries. In the case of structural isomers, the best match (highest score) against the MS/MS spectral data was selected.
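The mass accuracy criterion mentioned above (deviation of the accurate mass from the exact mass below 5 ppm) can be illustrated with a short Python sketch. This is not the Progenesis QI scoring algorithm; it only shows the conventional ppm error calculation. The function names are hypothetical, and the observed m/z in the example is an invented value used purely for illustration.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million (ppm)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_mass_tolerance(observed_mz, theoretical_mz, tol_ppm=5.0):
    """True if the absolute mass error is below tol_ppm (5 ppm as in the text)."""
    return abs(ppm_error(observed_mz, theoretical_mz)) < tol_ppm


if __name__ == "__main__":
    # Theoretical [M-H]- of a mono-caffeoylquinic acid (C16H18O9): m/z 353.0878.
    theoretical = 353.0878
    observed = 353.0890  # hypothetical measured value, for illustration only
    print(f"mass error: {ppm_error(observed, theoretical):.2f} ppm")
    print("within 5 ppm:", within_mass_tolerance(observed, theoretical))
```

In this hypothetical case the error is roughly 3.4 ppm, so the feature would pass a 5 ppm filter but would also have passed the looser 20 ppm criterion cited for earlier work.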
To illustrate chromatographic performance, we used extracted ion chromatograms (XICs) of 22 annotated ions detected in the positive ion mode (Figure S2) and 24 detected in the negative ion mode (Figure S3).

| Chemical similarity network and clustering
We built a chemical space network based on the compounds listed in [...] The Tanimoto coefficient is an associative coefficient ranging from 0 to 1 that numerically expresses the structural similarity of two compounds based on a comparison of their 2D binary representations (0 being no similarity and 1 being complete similarity). 49,50 A Tanimoto coefficient greater than or equal to 0.68 indicates that the compounds being compared are structurally similar, with statistical significance at the 95% confidence level. 51 Tanimoto coefficients were exported to Cytoscape (V3.6.1) for graphic visualisation and a 2D structural similarity network was created for the compounds listed in Table 1.

TABLE 1 Summary of detected compounds in Centella asiatica extracts (pooled sample) by extensive querying and comparison with spectral data (METLIN, HMDB, ChEBI and our in-house library) and compound libraries (KNApSAcK and PlantMAT) using Progenesis QI™ and applying the workflow shown in Figure S1. Compounds are labelled with their respective PubChem CID. Additional parameters are shown in Table S1. Categories were assigned according to structural similarity using the Tanimoto algorithm, and they may correspond to more than one compound class. Compounds confirmed using authentic standards are shown in bold [Level 1 (L1) identifications]; all other compound assignments are based on Level 2 annotations (MS/MS spectral matches are presented in Figure S7). Eighty-seven compounds that were detected for the first time in C. asiatica extracts are denoted with an asterisk '*'.
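The Tanimoto comparison described in this section can be sketched in a few lines of Python. The sketch assumes that each compound is represented by the set of "on" bit positions of a 2D binary fingerprint; the fingerprint type used in the study is not specified here, and the toy fingerprints and names (`tanimoto`, `similarity_edges`, `cmpd_A`, etc.) are hypothetical. Pairs scoring at or above the 0.68 threshold become edges of the kind that could be exported to a network viewer such as Cytoscape.

```python
from itertools import combinations


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def similarity_edges(fingerprints, threshold=0.68):
    """Return (name_a, name_b, score) for all pairs with score >= threshold."""
    edges = []
    for name_a, name_b in combinations(sorted(fingerprints), 2):
        score = tanimoto(fingerprints[name_a], fingerprints[name_b])
        if score >= threshold:
            edges.append((name_a, name_b, score))
    return edges


if __name__ == "__main__":
    # Toy fingerprints (hypothetical bit positions, not real compound data).
    toy = {
        "cmpd_A": {1, 2, 3, 4, 5, 6, 7, 8},
        "cmpd_B": {1, 2, 3, 4, 5, 6, 7, 9},
        "cmpd_C": {20, 21, 22},
    }
    for a, b, s in similarity_edges(toy):
        print(f"{a} -- {b}: {s:.2f}")
```

In the toy data, `cmpd_A` and `cmpd_B` share 7 of 9 distinct bits (coefficient of about 0.78) and would be connected, whereas `cmpd_C` shares no bits with either and would remain an isolated node.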

| Accuracy and recovery experiments
To test the accuracy of the method using precursor ions, three stan-

| Application of precursor ion (MS1) quantification method for plant extracts
For precursor ion quantification (MS1 quantification) of phytochemicals in extracts, the same chromatographic runs for untargeted analysis were used for quantification of distinct phytochemicals.

TABLE 2 Analytical parameters for authentic standards. Exact m/z used for extracted ion chromatogram (XIC), retention times (RTs), limit of detection (LOD) and limit of quantification (LOQ), percentage of accuracy for three concentrations, and percentage of relative standard deviation (%RSD) are given for 24 [...]

[...] Figure S1. To our knowledge, this analysis includes 87 compounds that have been reported in plants previously but have now been detected for the first time in C. asiatica extracts. 17,19,21,24,25,27,[52][53][54][55][56][57][58][59] MS/MS spectra and spectral matches of the newly detected compounds in CA are provided in Figure S7. Some of the most abundant compounds include six di-caffeoylquinic acid isomers, quinic acid, mono-caffeoylquinic acids, and several glycosides, such as asiaticoside, madecassoside and quercetin 3-O-glucoside. It is noteworthy that the current chromatographic separation conditions resolved the di-caffeoylquinic acid isomers 3,4-, 3,5- and 4,5-di-caffeoylquinic acid (Table S1, Figure S5). Analytical parameters for the annotated compounds, namely m/z, RT, detected adducts and molecular formulas, are shown in detail in Table S1. When compounds were detected in both ion modes, the one with the highest signal-to-noise (S/N) ratio was included. Annotated compounds include five hydroxycinnamic acids, nine mono- and di-caffeoylquinic acids, 12 terpenoids, 13 flavonoids and 11 hexosides, among other phytochemicals (Table 1).
In order to capture the similarities and differences in metabolite composition observed for the eight CA accessions available to us, we [...]

[...] Figure 5. The area under the curve was averaged across three replicates. The colours in the heatmap indicate the z-score, which was calculated by subtracting the mean peak area of a metabolite across all samples from its peak area in a given sample and dividing the result by the standard deviation of that metabolite across all samples. Red indicates a positive z-score, white a z-score of zero and blue a negative z-score; higher colour intensity indicates a higher magnitude of the z-score. The dendrogram on the x-axis indicates the degree of similarity between the metabolites: the closer the metabolites, the higher their similarity. The metabolites were clustered using hierarchical clustering. Similarly, the dendrogram on the y-axis indicates the degree of similarity between the different samples (different CA accessions): the closer the samples, the higher their similarity. The samples were clustered using hierarchical clustering (Ward, Euclidean distance). PCA was performed using MetaboAnalyst V4.0.

[...] these two samples are linearly related, indicating similarities in metabolite contents for CA4 and CA6 extracts. The heatmap with hierarchical clustering (Figure 2D) visualises the precursor ion peak areas for 14 compounds evaluated in all CA extracts; these compounds were selected because they were present at relatively high concentration in all plant accessions and authentic standards were commercially available. Peak areas were averaged across three replicates. The dendrogram on the y-axis indicates the degree of similarity or difference between the compound levels in the CA accessions, e.g. CA3 and CA8 are closer in the clustering tree, indicating higher similarity of the compound levels in extracts CA3 and CA8, whereas extract CA6 is separated in the dendrogram from CA3 and CA8, indicating little similarity of the compound levels of the CA6 extract with the levels found in the extracts of CA3 and CA8.

[...] (Table S1) found consistently in the aqueous extract of all eight CA accessions, which was created using the Tanimoto similarity score. 49 Compounds are arranged in 14 interconnected clusters that are structurally similar at the boundary nodes at the 95% confidence level. We used the 2D structural similarity network to support our tentative annotations of metabolites and the associated categorisation into compound classes or clusters ([...])
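The z-score scaling described for the heatmap can be written out as a small Python sketch. It assumes the sample standard deviation (n − 1 denominator), which is not stated in the text, and the peak areas in the example are hypothetical values rather than data from this study.

```python
from statistics import mean, stdev


def z_scores(peak_areas):
    """Scale the peak areas of one metabolite across samples to z-scores.

    Assumption: the sample standard deviation (n - 1) is used.
    """
    mu = mean(peak_areas)
    sd = stdev(peak_areas)
    if sd == 0:
        raise ValueError("all peak areas are identical; z-score is undefined")
    return [(area - mu) / sd for area in peak_areas]


if __name__ == "__main__":
    # Hypothetical mean peak areas of one metabolite in three accessions.
    areas = [1.0e5, 2.0e5, 3.0e5]
    for label, z in zip(("CA1", "CA2", "CA3"), z_scores(areas)):
        print(f"{label}: z = {z:+.2f}")
```

Because each metabolite is scaled separately, a positive z-score only means that the accession contains more of that metabolite than the average accession; it says nothing about the absolute abundance relative to other metabolites.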

| Accurate quantification of phytochemicals in extracts using precursor ion (MS1) quantification
In this work, we demonstrate the suitability of using an LC [...] Figure S4). [...] Table 2). The analytical accuracy for three samples of known concentration at the low (0.05 ppm), medium (0.50 ppm) and high (5.0 ppm) calibration curve intervals ranged from 87 to 125% (Table 2). The RSD was measured for a solution of 1 mg/L and ranged from 6.8 to 24% for nine repetitions measured over a span of 6 months (Table 2).
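For readers less familiar with these figures of merit, the following Python sketch shows how percentage accuracy and %RSD are conventionally computed. The exact formulas used in the study are not given in the text, so the definitions below (measured over nominal concentration, and sample standard deviation over mean) are assumptions, and the numerical values are hypothetical.

```python
from statistics import mean, stdev


def percent_accuracy(measured_conc, nominal_conc):
    """Measured concentration expressed as a percentage of the nominal value."""
    return measured_conc / nominal_conc * 100.0


def percent_rsd(values):
    """Relative standard deviation (%), using the sample standard deviation."""
    return stdev(values) / mean(values) * 100.0


if __name__ == "__main__":
    # Hypothetical result for a 0.50 ppm check standard.
    print(f"accuracy: {percent_accuracy(0.55, 0.50):.0f} %")
    # Hypothetical repeated measurements of a 1 mg/L solution.
    repeats = [0.9, 1.0, 1.1]
    print(f"RSD: {percent_rsd(repeats):.0f} %")
```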

| Method validation for selected compounds
The availability of high-resolution accurate mass data allowed us to obtain comprehensive fingerprints for botanical extracts that can be further interrogated post acquisition to obtain accurate quantification of phytochemicals by extracting the precursor ions and using the area under the peak for quantification (MS1 quantification) in the same analytical run. Quantified compounds showed good linearity over three orders of magnitude (0.005-5.0 mg/L, r > 0.990, Table S2).
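The linearity check and the conversion of a precursor ion peak area into a concentration can be sketched as follows. This is a minimal illustration assuming an unweighted least-squares calibration line; the regression model actually used is not stated in the text, the function names are hypothetical, and the calibration data are invented to span the reported 0.005-5.0 mg/L range.

```python
def fit_calibration(concentrations, peak_areas):
    """Unweighted least-squares line; returns (slope, intercept, r)."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(peak_areas) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    syy = sum((y - mean_y) ** 2 for y in peak_areas)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, peak_areas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


def concentration_from_area(peak_area, slope, intercept):
    """Invert the calibration line to estimate a concentration."""
    return (peak_area - intercept) / slope


if __name__ == "__main__":
    # Hypothetical calibration points (mg/L, XIC peak area).
    conc = [0.005, 0.05, 0.5, 5.0]
    area = [20.0, 110.0, 1010.0, 10010.0]
    slope, intercept, r = fit_calibration(conc, area)
    print(f"slope={slope:.1f}, intercept={intercept:.1f}, r={r:.4f}")
    print(f"unknown: {concentration_from_area(2500.0, slope, intercept):.3f} mg/L")
```

With calibration points spread over three orders of magnitude, an unweighted fit is dominated by the highest standards, so a weighted regression may be preferable in practice for accuracy near the lower end of the range.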
Matrix effects are frequently observed when analysing complex samples. In order to evaluate matrix effects in the CA extracts, pooled CA extract samples were spiked with the 24 available standards. Total ion chromatograms (TICs) obtained for a CA extract and for the same sample after standard addition are shown in Figure S6. For plant extracts, recoveries of individual compounds ranged from 71 to 144% and from 91 to 132% for 0.25 and 5.0 ng on-column, respectively (Table 3), confirming the feasibility of the proposed procedure for quantitative analysis of marker compounds in CA extracts.
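A spike recovery of this kind is conventionally calculated as the difference between the amount found in the spiked and unspiked sample, divided by the amount added. The text does not spell out the formula, so the Python sketch below uses that conventional definition as an assumption; the function name and the example amounts are hypothetical.

```python
def percent_recovery(found_spiked, found_unspiked, amount_added):
    """Spike recovery (%) = (spiked - unspiked) / added * 100.

    All three amounts must be in the same unit (e.g. ng on-column).
    """
    if amount_added <= 0:
        raise ValueError("amount_added must be positive")
    return (found_spiked - found_unspiked) / amount_added * 100.0


if __name__ == "__main__":
    # Hypothetical example: 0.25 ng added to an extract that already
    # contains 0.15 ng on-column of the same compound.
    recovery = percent_recovery(found_spiked=0.37, found_unspiked=0.15,
                                amount_added=0.25)
    print(f"recovery: {recovery:.0f} %")
```

Recoveries well above or below 100% at the low spike level point to ion enhancement or suppression by co-eluting matrix components, which is one reason the lower spike level shows the wider range.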
A range of three orders of magnitude is typical for time-of-flight (TOF) analysers, which is a disadvantage compared with triple quadrupole analysers, which usually feature a linear dynamic range that extends over six orders of magnitude. Nevertheless, the high resolution allows us to obtain a chemical fingerprint and quantification of marker compounds in the same analytical run, 64 [...]

FIGURE 4 Extracted ion chromatograms (XICs) of 18 compounds that were used for precursor ion (MS1) quantification. Individual analytical parameters are shown in Table 2.

Our study shows that CA extracts were particularly rich in mono-caffeoylquinic acids, such as 3-caffeoylquinic acid, 4-caffeoylquinic acid and 5-caffeoylquinic acid, and di-caffeoylquinic acids, such as 1,3-di-caffeoylquinic acid, 1,5-di-caffeoylquinic acid, 3,4-di-caffeoylquinic acid, 3,5-di-caffeoylquinic acid and 4,5-di-caffeoylquinic acid, as well as some triterpenoids such as asiaticoside, madecassoside and their aglycones (Table 3, Figures 4 and 5). CA extracts also contained several flavonoids and hydroxycinnamic acid derivatives. Figure S8 compiles positive matches with certified standard compounds. Figure 4 shows XICs for 18 selected compounds quantified in CA water extracts using the area under the curve of the precursor ion acquired in DDA mode.
Under the HPLC conditions used, 1,3- and 1,5-di-caffeoylquinic acid co-elute since they are stereoisomers (structures shown in Figure S4). For all other compounds quantified in this study, the combination of suitable separation conditions with extraction of molecular ion (MS1) chromatograms at accurate m/z values (Figure S5, Table 2) enabled detection limits in the low nanomolar to picomolar range for 24 phytochemicals (Tables 2 and S2). The optimised analytical procedure minimised interferences by improving the chromatographic separation of isomers with similar fragmentation patterns and thereby also optimised detection limits.
In this study, eight accessions of C. asiatica were quantified using precursor ion (MS1) quantification (Figure 5). Comparing samples CA6 and CA1, the concentration of asiaticoside varied 19.9-fold, that of asiatic acid 9.1-fold and that of madecassoside 1.5-fold. In the case of di-caffeoylquinic acids, comparing CA5 and CA1, 4,5-di-caffeoylquinic acid varied 11-fold. Mono-caffeoylquinic acids showed less variation across the accessions (Figure 5). This emphasises the importance of establishing rigorous analytical procedures for botanical extracts and supplements to ensure product integrity and batch-to-batch reproducibility.
To conclude, we developed a method for untargeted and targeted characterisation of CA extracts using the same chromatographic run.
The combination of suitable separation conditions with mass spectral data acquired with high resolving power using DDA enables the [...]

Overall, the current study underscores the need for methods to efficiently analyse highly complex plant extracts to support the standardisation of botanicals destined for preclinical studies, clinical trials and commercial products.