Species identification and decay assessment of Late Pleistocene fragmentary vertebrate remains from Pin Hole Cave (Creswell Crags, UK) using collagen fingerprinting

Ancient bone remains are widely utilized when investigating vertebrate biodiversity of past animal populations but are often so highly fragmented that the majority of specimens cannot be identified to any meaningful taxonomic level. Recently, high‐throughput methods for objective species identification using collagen peptide mass fingerprinting have been created to overcome this with the added indication that they could also offer a means of relative ageing through decay measurement. Here we explore both species identification and decay measurements for the Pin Hole Cave ‘microfaunal’ assemblage, the site that has been designated as a representative for Marine Oxygen Isotope Stage 3 in Britain in terms of its suite of mammalian fauna. We explore the technique's potential to corroborate the faunal diversity established previously by macroscopic studies and evaluate the decay measurements across the species boundary. The results support that the analysis of fragmentary remains by collagen fingerprinting can yield a more diverse set of fauna, and offer additional information relating to taphonomy, than the analysis of morphologically intact bones on their own. However, although useful for identifying likely contaminations of an assemblage, there was an unexpected decrease in the decay measurements observed for some megafauna compared with much younger microfauna, indicating that other factors need to be carefully monitored before it could be used as a relative ageing technique in Quaternary deposits.

Animal bone is one of the most common finds on archaeological and palaeontological sites and can be particularly informative on the temporal and geographical distributions of extant and extinct taxa, as well as on the ways in which past human groups interacted with and/or managed the animals with which they shared their immediate environment. However, faunal assemblages are often highly fragmentary in nature, whether this is due to human-mediated processing of bone, the actions of other carnivores prior to burial, or unintentional taphonomic damage that occurred after the bones entered the burial environment, such as trampling and sediment compaction (Lyman 1994). The subsequent alterations to the original bones result in a loss of morphological integrity, impeding species identification based on methods that utilize predefined sets of morphological criteria.
Several alternative approaches to species identification of fragmentary bones have been considered, ranging from histology (Harsányi 1993; Hillier & Bell 2007; Cuijpers & Lauwerier 2008; Greenlee & Dunnell 2010) to biomolecular analyses (Lowenstein et al. 2006; Blow et al. 2008). Histological analysis, which can be considered a type of morphological analysis on a microscopic scale, is relatively low cost but has not been adopted widely in archaeology or palaeontology due to the amount of time, expense and technical expertise required in relation to the limited taxonomic resolution available (Cuijpers 2006). The most obvious biomolecular approach is the use of ancient DNA (aDNA), which yields the greatest taxonomic resolution and can be carried out at ever-decreasing costs. However, aDNA analysis still requires substantial investment in laboratory procedures and, more importantly, often fails due to poor DNA survival rates in temperate environments, driven by a number of different extrinsic factors such as soil hydrology, pH and temperature (Nielsen-Marsh & Hedges 2000). Other biomolecules that might also provide species information are available, such as lipids. Lipids can yield information relative to particular groups (e.g. separating ruminants from non-ruminants) but the taxonomic resolution provided by lipid analysis remains particularly limited. Perhaps one of the most promising compromises between taxonomic resolution and endogenous signal (i.e. biomolecule) survival is in the use of proteins. Immunoassay techniques were initially utilized (Fletcher et al. 1984; Hyland et al. 1990; Cattaneo et al. 1992), but within the past decade an increasing number of applications of soft-ionization mass spectrometry have been developed.
The use of collagen amino acid variation is increasingly being applied to archaeological material (Buckley et al. 2008a, 2009, 2010; Richter et al. 2011). Initially described as a type of 'Zooarchaeology by Mass Spectrometry' (ZooMS), it was applied to closely related domesticated taxa such as sheep (Ovis) and goat (Capra) faunal remains (Buckley et al. 2009, 2010; Buckley & Kansa 2011). Applications of this method have been expanded to include a much wider range of wild taxa (Buckley et al. 2014) and the technique has been shown to offer genus level identifications in most mammalian taxa, with species level resolution in some groups such as camels (Rybczynski et al. 2013) and rodents.
Archaeological caves represent an invaluable source of information that encapsulates early human and animal activity and faunal (and vegetation) histories that can allude to past climatic changes. Due to the geographical isolation of Britain, diagnostic groups of particular mammalian taxa are considered to be useful in Quaternary biostratigraphy (Schreve 2001), where the 'mammal assemblage zone' for Marine Isotope Stage 3 is based on the distinctive fauna described from the Lower Cave Earth in Pin Hole Cave (Currant & Jacobi 2001). The primary aims of this work were to determine the extent to which an analysis of fragmentary remains collected from an archaeological site can support or improve upon those based on morphology alone, as well as to evaluate the potential for decay measurements as a means of relative ageing across species. Specifically, we compare assemblage composition between morphologically identifiable bones excavated in the 1920s to the fragmentary material excavated in the 1980s. The secondary aim of this study was to improve our understanding of the taphonomic history of the Pin Hole Cave faunal assemblage through the species identification of the fragmentary bone remains. For example, we were interested in determining whether the fragments were predominantly from taxa likely to have been gnawed upon by the hyaenas that intermittently occupied the caves or whether fragmentation is more likely to have occurred from human activity. Finally we discuss how this newfound molecular-based knowledge compares to what we knew solely from the morphologically identified material.

Study site - Pin Hole Cave, Creswell Crags
The Creswell area is situated on the border between the counties of Derbyshire and Nottinghamshire (UK), and is dominated by a north-south linear outcrop of Upper Permian Cadeby Formation limestone (formerly known as the Lower Magnesian Limestone). The limestone outcrop consists of a prominent west-facing escarpment from which the surface of the outcrop slopes gently to the east where it is overlain by younger rock formations. Creswell Crags constitutes a narrow fluvial gorge running west-east through this limestone outcrop with the entrances to the caves on both the north and south sides of the gorge; radiometric dating of flowstones has shown that the oldest sediments in the caves date to c. 300 ka (Rowe et al. 1989).
The cave has an entrance in the north side of the Creswell Crags gorge and measures 31 m long by ~1-2 m wide. To the east of the main passage about 18 m from the entrance there is a small chamber called the 'Inner Chamber'. Similar to today, it is likely that fine-grained sediments washed down into the cave through small fissures in the limestone, mainly in Devensian times (c. 50 to 10 ka). Excavations in the late 19th century and early 20th century revealed that there were at least two principal sediment bodies dating to the Pleistocene period, an upper red cave earth and a lower yellow cave earth but with faunal remains and lithic artefacts found throughout both (Armstrong 1932). The thickness of these deposits increased with distance from the present-day cave entrance, indicating that the sediment pile dips to the south towards the cave entrance (Jacobi et al. 1998). Further excavations were carried out in the 1980s of two small areas from the remaining sediments approximately 30 m into the cave, with one ~1.5×1.0 m at the top of the sequence, and the other ~1.0×0.5 m investigating much earlier deposits at the base (Jenkinson 1989) to more carefully obtain microfaunal remains. The deposits as a whole are thought to span from 50-40 ka (Higham et al. 2006) through to the Lateglacial period (Hedges et al. 1989) including three phases of human occupation at the start of this period by Neanderthals, and subsequently between c. 40-28 and c. 12 ka by anatomically modern humans. It is the faunal remains from the 1980s excavations that have been analysed using collagen fingerprinting in this study.

Material and methods
In brief, collagen was solubilized through partial demineralization with 0.6 M hydrochloric acid for 18 h followed by ultrafiltration into 50 mM ammonium bicarbonate and digestion with sequencing grade trypsin (Promega, UK) at 37°C overnight. The peptide digests were then diluted and fingerprinted (i.e. peptide mass fingerprints (PMFs)) with a Bruker Ultraflex II matrix assisted laser desorption ionization time of flight (MALDI-ToF) mass spectrometer (reflectron mode resolution >10 000) with up to 2000 laser acquisitions. In order to cover all appropriate megafaunal species, the collagen peptide markers for 21 taxa (Mammuthus, Cervus, Rangifer, Ovibos, Capra, Ovis, Bos, Bison, Sus, Capreolus, Equus, Diceros, Panthera, Ursus, Crocuta, Meles, Canis, Vulpes, Alopex, Oryctolagus and Lepus) were taken from previous studies, with additional samples of the elk (Alces), giant Irish elk (Megaloceros), wolverine (Gulo), pine marten (Martes), weasel (Mustela), otter (Lutra) and beaver (Castor) being included here (Figs 1, S1, S2). Using this information, previously published sequence information as well as fingerprints, these markers were identified for all taxa in this study with novel ones confirmed through tandem sequencing where possible. Tandem mass spectra were acquired through the use of an LC-Orbitrap Elite mass spectrometer following Wadsworth & Buckley (2014) and searches using Mascot with a local database (Buckley et al. 2015) improved with select additional sequences obtained from a protein BLAST search (Table S1), particularly for red fox, beaver and walrus (the latter for improved understanding of peptide marker labels). Error tolerant searches (5 ppm and 0.5 Da error tolerances on precursor and tandem spectra, respectively) were used to confirm proposed peptide markers as homologous (e.g. Figs S3-S7). 
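The two error tolerances quoted above are expressed in different units: parts per million for precursor ions and daltons for tandem (fragment) ions. As a minimal sketch of what these thresholds mean in practice, the Python helpers below check whether an observed m/z falls within each tolerance. The function names and the example m/z values are hypothetical illustrations; this is not the matching code used by Mascot.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def precursor_matches(observed_mz, theoretical_mz, tol_ppm=5.0):
    """True if the precursor is within the ppm tolerance (5 ppm in the text)."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def fragment_matches(observed_mz, theoretical_mz, tol_da=0.5):
    """True if a tandem-spectrum peak is within the absolute tolerance (0.5 Da in the text)."""
    return abs(observed_mz - theoretical_mz) <= tol_da


# Hypothetical values, chosen only to illustrate the thresholds.
print(precursor_matches(1105.580, 1105.578))  # about 1.8 ppm off
print(precursor_matches(1105.590, 1105.578))  # about 10.9 ppm off
print(fragment_matches(500.3, 500.0))         # 0.3 Da off
```

A relative (ppm) tolerance scales with mass, which suits high-resolution precursor measurements, whereas the absolute (Da) tolerance is the same across the whole fragment spectrum.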
The ancient material comprised the 12 317 fragments from Jenkinson's 1980s excavations (reposited in Creswell Crags Heritage Centre), which were screened using the peptide fingerprints obtained in Buckley et al. (2016) following a similar but less damaging method (i.e. demineralization with 0.3 M HCl over 4 h). As described in detail in Buckley et al. (2016), the acid-solubilized collagen was ultrafiltered using 30 kDa molecular weight cut-off ultrafilters into 50 mM ammonium bicarbonate, digested with trypsin overnight at 37°C and subsequently diluted in alpha hydroxycinnamic acid matrix and spotted for analysis using a MALDI-ToF mass spectrometer, also collecting 2000 laser acquisitions for each. One example from each taxonomic group identified is presented in the Supporting Information (Figs S8-S12). Where the sequence information is assumed, each peptide marker is annotated with a labelling format in which the first number indicates the alpha chain (i.e. 1 or 2 in type I collagen, which is made up of two alpha 1 chains and one alpha 2 chain), followed by the letter 't', representing the enzyme (trypsin), and then the consecutive number of the peptide from start to end of the tropocollagen molecule (including telopeptides but excluding signal peptide sequences) assuming complete enzymatic cleavage. For example, 2t67 is the 67th peptide of the alpha 2(I) chain (this was labelled simply as 'G' in previous publications).
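The labelling format can be illustrated with a short Python sketch that digests a sequence in silico, numbers the resulting peptides consecutively and builds labels of the form chain, 't', index. Several things here are assumptions made only for illustration: the function names, the toy sequence (which is not a real collagen sequence) and the cleavage rule, which is the commonly used simplification of cutting after K or R except when the next residue is P. Real labels would require the full tropocollagen sequence, including telopeptides and excluding the signal peptide.

```python
import re


def tryptic_peptides(sequence):
    """Split after K or R, except when the next residue is P (complete cleavage assumed)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def label_peptides(sequence, chain):
    """Map labels such as '2t1', '2t2', ... to peptides, numbered from the start of the chain."""
    return {
        f"{chain}t{i}": peptide
        for i, peptide in enumerate(tryptic_peptides(sequence), start=1)
    }


def parse_label(label):
    """Return (alpha chain, peptide number) for a label such as '2t67'."""
    match = re.fullmatch(r"([12])t(\d+)", label)
    if match is None:
        raise ValueError(f"not a valid peptide marker label: {label!r}")
    return int(match.group(1)), int(match.group(2))


# Toy sequence for illustration only (hypothetical, not COL1A2).
toy = "GPKGERPGAKGEQ"
print(label_peptides(toy, chain=2))
print(parse_label("2t67"))
```

In the toy sequence the R is followed by P, so no cleavage is made there and the second peptide spans that site.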

Species biomarkers for macrofaunal bone collagens
As can be seen from Table 1, most taxa of interest could be separated down to at least the genus level using the markers previously described. Although the collagen markers for hyaena and lion had previously been described, one of the additional markers here provides a robust addition to the resolving ability between these closely related groups (Fig. S1). Also as expected, the wolverine (Gulo) shares the majority of its markers with the badger (Meles) fingerprint (except peptide 2t67 at m/z 2999.4 rather than m/z 2957.4), the only previously published mustelid taxon, but could not be easily separated from its closer relatives the weasel (Mustela) or pine marten (Martes) despite being of different genera (cf. Figs 1, S2). Other exceptions included the inability to separate red deer (Cervus) and the giant Irish elk (Megaloceros).
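To make the marker logic concrete, the following Python sketch assigns an observed peak to one of the two 2t67 variants mentioned above. It reads the text as placing the wolverine variant at m/z 2999.4 and the badger variant at m/z 2957.4. The matching window of 0.3 Da and the function name are hypothetical choices for illustration, and a single peak is of course not sufficient for an identification, which relies on the full set of markers in Table 1.

```python
# 2t67 variants as read from the text. Gulo could not be easily separated
# from Mustela or Martes, so a 'Gulo' match is not genus-specific.
MARKER_2T67 = {
    "Gulo": 2999.4,
    "Meles": 2957.4,
}


def match_2t67(observed_mz, tolerance_da=0.3):
    """Return the taxa whose 2t67 marker lies within tolerance_da of the observed peak.

    tolerance_da is a hypothetical window, not a value taken from the study.
    """
    return [
        taxon
        for taxon, marker_mz in MARKER_2T67.items()
        if abs(observed_mz - marker_mz) <= tolerance_da
    ]


print(match_2t67(2999.5))
print(match_2t67(2957.3))
print(match_2t67(2980.0))
```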

Pin Hole Cave collagen fingerprint identifications
From the 12 317 fingerprints from Buckley et al. (2016) initially screened for murine rodent remains, 782 could be matched to the larger mammalian taxa described above (Table S2). As expected, the most abundant taxon identified was reindeer (Rangifer), making up almost half of the identified fragments (Fig. 2), with very few from other deer. The next most abundant taxa were predominantly herbivores, such as the woolly rhinoceros (Coelodonta; 12%), bovines (Bos/Bison; 5%), wild horse (Equus; 7%) and woolly mammoth (Mammuthus; ~4%). Collectively the herbivores accounted for ~72% of the remains to the exclusion of the hares, which themselves were a further 11% (Fig. 2). The two most abundant carnivores were also some of the largest, namely the bear (Ursus; ~5%) and hyaena (Crocuta; ~5%), with a very small number of lion (Panthera; 1%) fragments. The medium range carnivores, namely the red fox, arctic fox and wolf (Fig. 3), collectively made up <5% of the total assemblage along with an equally small number of mustelines (~1%). A few specimens were also identified as ovine (sheep) and porcine (boar), likely Holocene contaminants (see below).
Post-translational modification - measuring glutamine decay for relative age estimation
Early in the development of collagen peptide analysis for the species identification of animal bone we proposed the potential of post-translational modifications (PTMs), such as asparagine and glutamine deamidation, as a means to estimate the relative age of archaeological bones from a particular site (e.g. Buckley et al. 2010). We have further investigated PTMs in this study in order to explore whether or not they could be used to identify much younger 'contaminant' Holocene taxa within the Pleistocene deposits. In our results (Fig. 4) we generally did observe much lower levels of glutamine deamidation in the m/z 1105 (1t47) peptide in the caprine and boar than we did in all but one of 23 Apodemus specimens, a species that went extinct in the early Pleistocene (Berry 1969), only returning during the Lateglacial interstadial c. 15 ka (Yalden 1982). By comparison to the same decay measurements for two Pleistocene fauna, arctic fox (Alopex) and hyaena (Crocuta), the latter were on the whole significantly more degraded than the former, both of which were more degraded than both the modern reference material and the Holocene fauna. Interestingly, a much greater level of decay was observed for the field mice (Apodemus) than the older megafauna (with hyaena becoming extinct c. 30 ka (Stuart & Lister 2014)).
Table 1. Collagen peptide markers screened in this study following Buckley & Kansa (2011) as detailed in Buckley (2016), including lettering labels from Buckley et al. (2009); typically the most abundant of several hydroxylation variants where present (±16). *Absence not unexpected due to the presence of amino-terminal proline and its impact on digestion with trypsin (Keil 2012). P Note that Panthera/Crocuta also appeared most readily distinguishable from the P markers (at m/z 2216/2246 respectively) described elsewhere for pinniped separation, identified here through sequencing of the walrus collagen (Fig. S7). E This marker was excluded from future studies due to lack of observation in other ancient samples (Buckley & Kansa 2011), but found well preserved enough here for distinctions in these taxa. c Cervid here refers to Megaloceros and Cervus.
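Glutamine deamidation converts glutamine to glutamic acid and raises the peptide mass by about 0.984 Da, so in a peptide mass fingerprint the deamidated form overlaps the isotopic envelope of the unmodified peptide. One common way to estimate the extent of deamidation from such overlapping envelopes is to fit the observed peak intensities as a mixture of the theoretical isotope pattern and a copy of it shifted by one peak. The Python sketch below illustrates that idea under stated assumptions: it treats the 0.984 Da shift as one isotope spacing, takes the theoretical pattern as an input, and uses hypothetical intensities. It is not a description of the measurement procedure used in this study.

```python
def deamidated_fraction(observed, theoretical):
    """Least-squares estimate of the deamidated fraction of a peptide.

    theoretical: isotope pattern of the unmodified peptide (M, M+1, ...).
    observed: measured intensities at the same positions plus one extra
              peak, i.e. len(observed) == len(theoretical) + 1.
    Model: observed = a * native + b * shifted, fraction = b / (a + b).
    """
    if len(observed) != len(theoretical) + 1:
        raise ValueError("observed must have one more peak than theoretical")
    native = list(theoretical) + [0.0]    # unmodified envelope
    shifted = [0.0] + list(theoretical)   # envelope moved up by one peak

    # Normal equations for the two-component fit.
    s00 = sum(x * x for x in native)
    s11 = sum(x * x for x in shifted)
    s01 = sum(x * y for x, y in zip(native, shifted))
    y0 = sum(o * x for o, x in zip(observed, native))
    y1 = sum(o * x for o, x in zip(observed, shifted))
    det = s00 * s11 - s01 * s01
    if det == 0:
        raise ValueError("theoretical pattern must contain non-zero intensities")

    a = max((y0 * s11 - y1 * s01) / det, 0.0)  # clamp negative fits to zero
    b = max((y1 * s00 - y0 * s01) / det, 0.0)
    if a + b == 0:
        raise ValueError("no signal attributable to the peptide")
    return b / (a + b)


# Hypothetical isotope pattern and a synthetic spectrum that is 30% deamidated.
pattern = [0.55, 0.30, 0.11, 0.04]
spectrum = [0.7 * n + 0.3 * s for n, s in zip(pattern + [0.0], [0.0] + pattern)]
print(round(deamidated_fraction(spectrum, pattern), 3))
```

Fitting the whole envelope rather than comparing two peak heights avoids counting the natural M+1 isotope peak of the unmodified peptide as deamidated signal.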

Comparison of the 1980s excavations with earlier Pin Hole faunal assemblages
Several authors have reported on interpretations of the faunal remains from the earlier 1920s excavations carried out by Leslie Armstrong and his team (Armstrong 1932). When comparing the number of identified specimens (NISP) results from the fragmentary specimens excavated in the 1980s to the minimum number of individuals (MNI) results determined from the morphologically identifiable 1920s excavations, we see a higher representation of the main prey items, such as the reindeer, woolly rhinoceros and bison; the main exception to this phenomenon was the horse remains (Fig. 5). Likewise the majority of carnivore taxa were represented with lower counts in the more fragmentary remains, including the fox, wolf, musteline and bear, with hyaena being the clear exception (Fig. 5). Of the species not reported by Jacobi et al. (1998) as being present within the lower cave earth assemblage but hypothesised as probably having been present based on contemporary faunal assemblages from nearby caves, the arctic fox was observed in this study (e.g. Fig. 3). Even though there are inherent limitations to this study, in that we were unable to compare MNI with NISP from exactly the same excavations, this is still a particularly strong example of the potential of collagen fingerprinting for moving beyond the taphonomic bias of species inferences from only examining morphologically intact remains.
Fig. 3. Collagen peptide mass fingerprint spectra from Pin Hole specimens focused on m/z 1400-1800 to show the newly proposed markers that distinguish amongst canids (peptide markers useful for taxonomic discrimination with selected mammals are underlined).

Significance of the identifications of the fragmentary remains
Although there is the obvious limitation that an analysis such as the one presented here only increases the NISP rather than the more useful count of MNI, it can be highly informative of the taphonomy of the assemblage. In this case we observed a relatively high number of fauna that are known prey for scavenging hyaena (Bison, Rangifer, Coelodonta, etc.). Therefore it is unsurprising that these are relatively more fragmented as well, as high levels of fragmentation are characteristic of bone accumulations in hyaena dens (Kuhn 2005). The only exception to this observed pattern is the relatively low frequency of horse in the fragmentary assemblage. As the horse skeleton is no more likely to survive diagenesis than the more robust remains of mammoth or rhinoceros, the reason for this observation could speculatively be either the ability to morphologically identify the remains of horse more easily or, perhaps more likely, temporal differences between the presence of horse and hyaena. There was a very small number of unexpected taxa, including at least two boars and three sheep, but these are most likely to be intrusive Holocene material, a conclusion that is supported by deamidation measurements (Fig. 4). A noticeable lack of human remains within the fragments, despite human presence being known indirectly from the abundant stone artefacts at the site, could also reflect connected spatial and temporal differences, where the two bone accumulators (humans and hyaena) do not tend to occupy the immediate ecospace for long (Discamps et al. 2012).

Collagen fingerprinting for species identification
The collagen fingerprinting technique has several key advantages over other approaches: it can yield objective identifications; it is cost-effective when considering human processing (analysis) time, as hundreds of samples can generate results well within a 2-day period; and it uses a robust biomolecule known to yield collagen fingerprints from specimens millions of years old (Rybczynski et al. 2013) and, using more sensitive techniques, even claimed to survive in dinosaur remains (Asara et al. 2007; Schweitzer et al. 2009), although not without criticisms (Buckley et al. 2008b; Kaye et al. 2008; Bern et al. 2009). Although we have emphasised how promising this relatively new technique of species identification is, the taxonomic resolution remains the key limitation when compared with aDNA (although it is considerably greater than can be achieved using the other molecular and nonmolecular methods discussed above). Even though the method is typically capable of obtaining genus-level information, with elephantids being a notable previously known exception, and species-level information in some groups (Rybczynski et al. 2013; Buckley et al. 2016), there were further limitations identified in this study in some lineages. These were within the carnivores, particularly the feliformes of the order Carnivora, where distinction was only observed at the level of subfamily (e.g. pantherine vs. feline), and the mustelids, where distinction could be made between the badgers, otters and mustelines, but not confidently within the musteline genera themselves. However, within the caniform taxa of the order Carnivora the three canid genera of interest to this study could be discriminated: Canis, Vulpes and Alopex. Previous research into the taxonomic resolution of species identification by collagen fingerprinting indicated that, at least with medium-large mammals (and therefore linked with generation times), divergence times of c. 4-5 million years appeared sufficient to yield identifiable discriminatory markers. Here, the collagen fingerprints were unable to separate the two Felinae taxa, which diverged c. 7.2 Ma (Flynn et al. 2005; Agnarsson et al. 2010) and yielded unexpectedly similar spectra for Panthera and Crocuta, a much greater divergence. It is likely therefore that different mammalian lineages have different rates of molecular evolution that are impacting upon collagen fingerprint-based species discrimination. However, the chemical properties of these differing peptides are also known to affect their amenability to this form of soft-ionization mass spectrometric analysis. It should be noted that although the 1t66/67 peptide biomarker is particularly valuable for separating some taxa, it is not as ideal for others given that the more common variant (at m/z 2216.1) is in a region of similar m/z to two other peptides (1t49 and 1t77 (Buckley 2016)).

Post-translational modifications for stratigraphy
Following on from our suggestion that glutamine deamidation could potentially be a useful indicator of the relative age of specimens within a deposit (Buckley et al. 2010), this concept has been explored more recently with mixed results. Some studies imply that it offers potential for relative ageing (Wilson et al. 2012) whereas others argue against this, albeit using much more complex liquid chromatography-tandem mass spectrometry data rather than PMFs (Schroeter & Cleland 2016). Our approach solely utilizes PMF data, which represent a relatively unfractionated set of collagen peptides in comparison to the LC-based methods. However, we did observe a clear distinction between the megafaunal taxa that were almost certainly from much younger intrusions to the deposit and the other Pleistocene megafauna (Fig. 4).
When looking specifically at taxa that could be constrained to some extent to particular time periods (at least with Apodemus and Crocuta), we noticed that the smaller taxa tended to have a much greater level of glutamine decay than the larger taxa. The high levels of decay observed for Apodemus could be a reflection of the fact that this measure of deamidation is affected by pH (Robinson & Rudd 1974), and that the transit through the acid digestive tract of the owl that accumulated these remains could speculatively have substantially increased the rate of this type of decay. However, this would not explain the observations of greater decay in the assumed younger arctic fox remains; either the arctic fox presence is much earlier than expected or, a more likely explanation, the size and robustness of the skeletal remains are having an impact on the decay levels measured. Further evaluation of such species-specific biases would be needed to confirm the relative impacts of the above diagenetic factors affecting decay and therefore overall preservation. Although the results presented here offer a promising means to identify remains that are likely contaminants, the range of peptides most appropriate for this must be further evaluated given that PTM data from LC-based analyses are thought inappropriate as a relative ageing tool (Schroeter & Cleland 2016). Although the more appropriate approach would clearly be the use of radiocarbon (14C) dating, the majority of microfaunal bone fragments are typically too small. Additionally, even assuming that the bone specimens were large enough for dating, this remains a pursuit that is orders of magnitude more costly; ZooMS collagen fingerprint analysis can instead identify samples that are ideal for collagen-based 14C dating (Harvey et al. 2016) from large assemblages such as that seen in Pin Hole Cave.

Conclusions
Through the application of collagen fingerprinting we were able to show that morphologically unidentifiable fragments of bone still yield information that is largely consistent with faunal compositions from morphologically intact remains, but can shed further light on taphonomic biases that would otherwise go unnoticed. Additionally, we were able to uncover the presence of a species previously unconfirmed from Pin Hole Cave: the arctic fox. The fingerprinting methods offer value in their amenability to high-throughput species identification of thousands of specimens within a short period of time. However, although we demonstrate the ability to identify contaminant (relatively recent) specimens within ancient deposits using the secondary advantages of PTM analysis, its potential use as a means of relative ageing needs much further investigation.

Supporting Information
Additional Supporting Information may be found in the online version of this article at http://www.boreas.dk. Table S1. Concatenated COL1A1 and COL1A2 sequences. Table S2. List of the accession numbers and the faunal identifications from collagen fingerprinting for the 782 Pin Hole Cave specimens identified.