Influence of testing environment and loading rate on intervertebral disc compressive mechanics: An assessment of repeatability at three different laboratories

Abstract In vitro mechanical testing of intervertebral discs is crucial for basic science and pre-clinical testing. Generally, these tests aim to replicate in vivo conditions, but simplifications are necessary in specimen preparation and mechanical testing because of the complexity of both the disc structure and the loading conditions. There has been growing interest in developing a consensus on testing protocols within the spine community to improve comparison of results between studies. The objective of this study was to perform axial compression experiments on bovine bone-disc-bone specimens at three institutions. No differences were observed between testing environments (air, PBS-soaked gauze, or a PBS bath; P > .206). A 100-fold increase in loading rate resulted in a small (2%) but significant increase in compressive mechanics (P < .017). A 7% difference in compressive stiffness between Labs B and C was eliminated when values were adjusted for test system compliance. Specimens tested at Lab A, however, were found to be stiffer than specimens from Labs B and C. Even after normalizing for disc geometry and adjusting for system compliance, an ∼35% difference was observed between the UK-based labs (B and C) and the USA-based lab (A). Large differences in specimen stiffness may be due to genetic differences between breeds or to differences in agricultural feed and use of growth hormones, highlighting significant challenges in comparing mechanics data across studies. This research provides a standardized test protocol for the comparison of spinal specimens and provides steps towards understanding how location and test set-up may affect biomechanical results.


| INTRODUCTION
In vitro mechanical testing of intervertebral discs (IVDs) provides a valuable tool for investigating mechanisms of disc injury and degeneration, mechanical integrity of biological repair strategies, or efficacy of medical devices. Generally, these tests aim to replicate in vivo conditions as accurately as possible; however, the complex structure and loading conditions of the spine mean that simplifications are necessary. As a result, methods used for sample preparation and testing are often inconsistent between studies, which makes direct comparison a significant challenge for the field. A number of studies have proposed standardized test methods, 1,2 primarily for testing multi-level specimens in pure moment bending. However, standard methods specifically for axial compression tests are not available, even though the spine is physiologically subjected to axial loads and a large number of research studies limit their testing to axial compression.
Previous studies demonstrated the importance of disc hydration, [3][4][5] preload, [6][7][8][9] and test rate. [10][11][12][13][14][15] More recently, studies have compared biomechanical testing between laboratories, to investigate the application of pure moments to multi-level spinal specimens 16 and to assess differences between six-axis testing systems. 17 These studies have highlighted the importance of consistent methods for data processing, yet there is still a lack of consistency in specimen preparation and testing conditions used throughout the spinal community. 18 For example, differences in axial preloading, as commonly applied during spinal testing, may have substantial effects on mechanical loading outcomes 19,20 and recovery behavior. 21 These differences make comparisons across studies challenging, or impossible.
At the 2019 Annual Meeting of the Orthopaedic Research Society (ORS), the ORS Spine Section discussed a need for consensus in biomechanical testing approaches used by the community. One challenge in developing standardized protocols is determining which protocol best represents physiological loading. However, moving towards consistent test protocols, similar to ASTM International standards for testing of materials, is important for comparing data across studies. Therefore, the aim of this study was to perform axial compression experiments on bovine bone-disc-bone motion segments at three institutions using the same testing methods, to determine whether experimental findings could be replicated across institutions and to identify which parameters were critical for achieving comparable results, allowing a move towards more standardized methods. Sample preparation and mechanical testing were performed at the University of California, Berkeley (Berkeley, California), University of Exeter (Exeter, UK), and Imperial College London (London, UK), hereafter referred to as Labs A, B, and C, respectively. As part of this investigation, the experiments provide data on the effects of testing in air, wrapped in saline-soaked gauze, or in a saline bath, at different strain rates.

| Sample preparation
Bovine IVDs were used in this study as they have been shown to have biomechanical and biochemical similarities with nondegenerate human discs. 19,22,23 Intact bovine tails were acquired from butchers or abattoirs local to each of the three institutions involved in this study, from which a total of 36 (12 at each institution) bone-disc-bone motion segments were obtained. Bovine tails were stored frozen at −20 °C on the day of acquisition, and each tail was thawed overnight at 4 °C prior to dissection and testing. Two specimens were obtained from each tail by removing soft tissue, with care taken not to damage the IVDs. The tail was then cut transversely through the first, second, and third caudal vertebral bodies at mid-height to obtain two bone-disc-bone specimens. During preparation, specimens were kept hydrated through regular spraying of phosphate buffered saline (PBS, 0.15 mol/L). The width of each IVD in the sagittal and coronal planes was measured using digital calipers. The IVD height was measured using either x-ray or micro CT, depending on equipment availability at each institution (Lab A: micro CT, Lab B: micro CT, and Lab C: x-ray).
For x-ray measurements, a calibration stick was placed in line with the sample so that lengths could be measured. For all measurements three repeats were made and an average was calculated.
Each specimen was secured in polymethyl-methacrylate (PMMA) bone cement such that the transverse plane of the IVD was horizontal and parallel with the loading platens. The specific methods to secure specimens in place differed between institutions, but broadly followed the same process of centering one vertebra within a specimen pot and fixing it in place with PMMA. The specimen was then flipped upside down to fix the other vertebral body in PMMA, while ensuring the two PMMA pots were parallel. This resulted in each institution having specimens fixed in metal pots that could be compressively loaded on testing machines (Figure 1).

| Experimental procedure
Experiments were carried out on an MTS MiniBionix 858 (MTS Systems Corp., Eden Prairie, Minnesota), an Instron E10000 (Instron Ltd., High Wycombe, UK), and an Instron 8872 (Instron Ltd., High Wycombe, UK) at Labs A, B, and C, respectively. The load string for each of these setups is shown in Figure 2. While mounting the specimens on their respective testing machines, care was taken to ensure samples were not subjected to any tensile loads.
Once mounted, specimens were subjected to an equilibration period, which involved a compressive load of 50 N for 5 minutes, followed by a 5 N compression for 15 minutes. This was followed by a compressive preload of 50 N for 5 minutes to simulate a physiological compressive force prior to the first load cycles. A 50 N compressive preload was selected to provide an initial pressure of 0.08 MPa based on previous bovine tail cross-sectional area measurements of 622 ± 71 mm², 24 which is comparable to the intradiscal pressure of 0.08 to 0.11 MPa measured in vivo in healthy participants during various lying postures. 24 Five cycles of axial compression were applied with a triangle wave between 50 N and 1000 N. A peak load of 1000 N was selected to provide an intradiscal pressure of 1.6 MPa (based on previous bovine tail cross-sectional area measurements of 622 ± 71 mm² 24). 24,25 This loading was at the higher end of physiological internal pressures but low enough to ensure that damage during testing was minimal. Three loading frequencies were applied to each specimen: fast (5.00 Hz), medium (0.50 Hz), and slow (0.05 Hz). To reduce the effects of repeated loading on water content and disc height between tests, each specimen was always tested at the fast rate first, followed by the medium rate, and finally the slow rate. Between testing at each frequency, specimens were allowed to recover with a compressive load of 5 N for 15 minutes, followed by the application of the compressive preload of 50 N for 15 minutes (Table 1).
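The pressures quoted for the preload and peak load follow from dividing the applied load by the reported cross-sectional area. The short sketch below reproduces that arithmetic; the function name `nominal_pressure_mpa` is hypothetical, and the calculation is only a load-over-area estimate, not a measurement of intradiscal pressure.

```python
def nominal_pressure_mpa(load_n, area_mm2):
    """Nominal axial pressure in MPa, since 1 N/mm^2 equals 1 MPa."""
    return load_n / area_mm2


AREA_MM2 = 622.0  # mean bovine tail disc area quoted in the text

preload = nominal_pressure_mpa(50.0, AREA_MM2)    # 50 N preload
peak = nominal_pressure_mpa(1000.0, AREA_MM2)     # 1000 N peak load
print(f"preload: {preload:.2f} MPa, peak: {peak:.1f} MPa")
```

With the 622 mm² area this gives approximately 0.08 MPa at 50 N and 1.6 MPa at 1000 N, matching the values stated above.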
Previous studies report that three to five cycles is generally sufficient to obtain a repeatable force-displacement response. 1,26 Pilot testing at each institution confirmed that force-displacement response was repeatable after 3 cycles, with less than 2% variation in mean stiffness after the third cycle. Similarly, pilot testing was conducted to confirm that equilibration and recovery periods between testing rates allowed for sufficient disc height recovery, while still ensuring that a physiological preload (50 N compression) was maintained prior to cyclic loading.
Three variations of testing environment were investigated: air (but regularly sprayed with PBS to keep specimens hydrated); wrapped in PBS-soaked gauze and food-packing plastic to minimize dehydration due to evaporation; and in a PBS bath. All tests were performed at room temperature (22 °C). Four specimens were tested in each environment at each institution, resulting in 12 tests in each environment and frequency, a total of 36 specimens, and 108 tests across all test rates (Table 2).
FIGURE 1 Typical specimens tested at Labs A, B, and C, with (A) showing a specimen tested in a saline bath
FIGURE 2 Schematic of experimental test setup and load string for a typical air test at each lab
In order to account for differences in clamp-to-clamp stiffness of each test machine, including all fixtures, a compliance test was performed at each institution with test fixtures in place. Tests were completed using five axial compressive triangle wave cycles applied between 50 N and 1000 N at 0.05 Hz.
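The triangle-wave loading used for both the specimen tests and the compliance test can be sketched as follows. This is an illustration only, not the controller program used at any lab; the names `triangle_load` and `loading_rate` are hypothetical, and the sketch assumes each cycle starts at the 50 N preload, reaches 1000 N at mid-cycle, and returns to 50 N at the end of the cycle.

```python
def triangle_load(t, freq_hz, f_min=50.0, f_max=1000.0):
    """Compressive load (N) at time t (s) for a triangle wave.

    Assumption: each cycle starts at f_min, peaks at f_max at mid-cycle,
    and returns to f_min at the end of the cycle.
    """
    phase = (t * freq_hz) % 1.0  # position within the current cycle, 0..1
    ramp = 2.0 * phase if phase < 0.5 else 2.0 * (1.0 - phase)
    return f_min + (f_max - f_min) * ramp


def loading_rate(freq_hz, f_min=50.0, f_max=1000.0):
    """Constant ramp rate (N/s): the load range is covered in half a period."""
    return 2.0 * (f_max - f_min) * freq_hz


for name, freq in [("fast", 5.0), ("medium", 0.5), ("slow", 0.05)]:
    print(f"{name}: {freq} Hz, {loading_rate(freq):.0f} N/s, "
          f"5 cycles last {5 / freq:.0f} s")
```

Under these assumptions the ramp rates are 9500, 950, and 95 N/s for the fast, medium, and slow tests, so the fast test loads the specimen 100 times faster than the slow test.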

| Data analysis
Data were acquired at 1000 Hz for the fast rate tests, 100 Hz for the medium tests, and 10 Hz for the slow tests. Force-displacement data from the fifth cycle were used to determine compressive stiffness, which was calculated using a linear regression between 500 and 900 N of the loading portion of the compressive cycle. This stiffness was converted to a modulus using the width and height measurements taken of each IVD. To calculate the area of the sample from the width, each IVD was assumed to be circular in cross-section.
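The stiffness and modulus calculations described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: all function names are hypothetical, the input is assumed to contain only the loading ramp of the fifth cycle, the modulus is assumed to be stiffness multiplied by disc height and divided by area, and the diameter is assumed to be the mean of the sagittal and coronal widths (the text does not state how the two widths were combined).

```python
import math


def linear_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def compressive_stiffness(disp_mm, force_n, f_lo=500.0, f_hi=900.0):
    """Stiffness (N/mm) fitted to loading-ramp points with f_lo <= force <= f_hi."""
    pairs = [(d, f) for d, f in zip(disp_mm, force_n) if f_lo <= f <= f_hi]
    disp = [d for d, _ in pairs]
    force = [f for _, f in pairs]
    return linear_slope(disp, force)


def compressive_modulus(stiffness_n_per_mm, height_mm, width_sag_mm, width_cor_mm):
    """Modulus (MPa) = stiffness * height / area for a circular cross-section.

    Assumption: the diameter is the mean of the two measured widths.
    """
    diameter = 0.5 * (width_sag_mm + width_cor_mm)
    area_mm2 = math.pi * (diameter / 2.0) ** 2
    return stiffness_n_per_mm * height_mm / area_mm2
```

Restricting the fit to 500 to 900 N keeps the regression away from the non-linear low-load region of the force-displacement curve and from the turnaround at peak load.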
The system compliance test was used to fit a fourth-order polynomial relating load and displacement over the compressive part of the final clamp-to-clamp test cycle at each institution. Using this polynomial, a compliance displacement for any load measured during the actual bovine sample experiments could be calculated and subtracted, resulting in a compliance-corrected force-displacement response of the sample itself (not including any displacements of components of the test setup). No adjustments were made to the load data, as these are not affected by system compliance. This method of compliance correction provided a consistent way to correct test displacement even with non-linear system stiffness.
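The correction step can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the names `poly_eval` and `compliance_correct` are hypothetical, and the polynomial is assumed to express load-string displacement as a function of load, with coefficients ordered from the highest power to the constant term. Such coefficients could be obtained from the clamp-to-clamp test with a least-squares fit, for example `numpy.polyfit(load, displacement, 4)`.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial, coefficients ordered highest power first (Horner)."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


def compliance_correct(disp_mm, force_n, compliance_coeffs):
    """Subtract the load-string displacement predicted at each measured load.

    compliance_coeffs: polynomial giving system displacement (mm) as a
    function of load (N), fitted beforehand to the clamp-to-clamp test.
    The load values themselves are left unchanged.
    """
    return [d - poly_eval(compliance_coeffs, f) for d, f in zip(disp_mm, force_n)]
```

Because the correction is evaluated at each measured load, it remains valid when the load string stiffens or softens with load, which a single linear compliance value would not capture.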
A 3x3 mixed factorial analysis was completed to compare the dependent variables of stiffness and compressive modulus (both uncorrected and corrected values) across the independent variables at each institution: environment (air, gauze, and bath); and test frequency (5.00, 0.50, and 0.05 Hz). Following these analyses, institutions were compared using one-way ANOVA. The specimen height and specimen area used in the tests were compared across the three institutions using ANOVA. All tests were completed with a significance level of .05, and post-hoc analyses were completed in cases of significance, with a Bonferroni correction used to minimize the risk of type I errors. All statistical analyses were completed in […]
TABLE 1 The test loading protocol used for each specimen at all three institutions. Shaded steps indicate the load cycles at the fast, medium and slow test frequencies
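The Bonferroni correction used for the post-hoc comparisons can be illustrated with a short sketch. The function name `bonferroni` is hypothetical, the p-values in the example are invented for illustration and are not study data, and the sketch does not reproduce the mixed factorial analysis or ANOVA themselves.

```python
def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and significance flags.

    Each p-value is multiplied by the number of comparisons (capped at 1)
    and then compared against alpha.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    significant = [p_adj < alpha for p_adj in adjusted]
    return adjusted, significant


# Illustrative p-values for three pairwise comparisons (not study data).
adjusted, significant = bonferroni([0.01, 0.02, 0.5])
print(adjusted, significant)
```

With three pairwise comparisons, as for three labs or three test rates, an unadjusted p-value must fall below about .0167 to remain significant at the .05 level.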

| Specimen details
No significant differences were observed between institutions in disc area (mean ± SD: 489 ± 71 mm², 436 ± 92 mm², and 439 ± 35 mm² for Labs A, B, and C, respectively; P = .127). However, the disc height of specimens from Lab A (6.75 ± 0.60 mm) was approximately 20% greater than that of discs from Labs B (5.48 ± 1.02 mm, P = .033) and C (5.46 ± 1.62 mm, P = .029). No significant differences were observed in disc area (P = .919) or height (P = .854) between the three environments.

| Effect of environment
No significant differences were observed at any institution in either the stiffness or compressive modulus between any test environment using data before (P > .272 and P > .227, respectively) or after (P > .286 and P > .238, respectively) compliance correction. Therefore, when completing comparisons between institutions, the environment factor was pooled.
FIGURE 3 Boxplots from each institution at each test rate of the (A) uncorrected stiffness, (B) corrected stiffness, (C) uncorrected compressive modulus, and (D) corrected compressive modulus. Gray brackets indicate a significant difference (P < .05) in test rate identified through the mixed factorial analysis; black brackets indicate a significant difference (P < .05) between institutions at a given test rate identified through ANOVA

| Effect of test rate
A small but significant decrease in compressive stiffness was observed between 5 Hz and 0.05 Hz (P < .001) and between 0.5 Hz and 0.05 Hz (P < .001) at Lab A. These differences were maintained after compliance correction (P < .007; Figure 3). Similarly, a small but significant difference was seen in compressive modulus between 5 Hz and 0.05 Hz (P < .001) and between 0.5 Hz and 0.05 Hz (P < .001), which was maintained after compliance correction (P < .004). Due to the significance observed between test rates at Lab A, the rate factor was not pooled in comparisons between institutions.

| Effect of institution
As the mixed factorial analysis showed no difference between environments, these data were pooled to compare results between institutions; as there was a significant difference between rates at Lab A, rate data were not pooled to compare institutions. Therefore, ANOVAs were used to compare institutions at each rate for both the uncorrected and corrected stiffness and compressive moduli. These analyses showed that specimens at Lab A were significantly different to those at Labs B and C at all test rates in stiffness (P < .034, Figure 3A), corrected stiffness (P < .001, Figure 3B), and corrected compressive modulus (P < .002, Figure 3D). There were no significant differences between Labs B and C (P > .699).
Overall, the compressive stiffness of specimens from Lab A was approximately 20% lower than that of specimens at Labs B and C before compliance correction (Table 3 and Figure 3A). The compliance correction resulted in the compressive stiffness of specimens at Lab A being approximately 35% higher than at Labs B and C. There was no significant difference in the stiffness of specimens tested in Labs B and C either before or after compliance correction, but the mean stiffnesses were closer after compliance correction (Figure 3B). Compliance correction in Labs B and C increased disc joint stiffness by 5% to 14%; however, the compliance correction in Lab A was greater due to a polymer base used for the PBS bath (Figure 2A). No significant differences were observed between institutions in compressive modulus before compliance correction (P > .271), but the compressive modulus of specimens from Lab A was significantly higher after compliance correction (P < .002; Figure 3C). Variation in the data increased after correcting for specimen geometry. SDs for stiffness measurements were 6.3% to 22% of the mean. In contrast, SDs for compressive modulus for each test environment were 8.5% to 40% of the mean.
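One way to see why removing load-string displacement raises the apparent stiffness is to idealize the specimen and load string as two linear springs in series. This is a simplification introduced here for illustration only; the study used a non-linear polynomial correction, and the function names below are hypothetical.

```python
def specimen_stiffness(k_measured, k_system):
    """Specimen stiffness when specimen and load string act as springs in series.

    Series springs: 1/k_measured = 1/k_specimen + 1/k_system.
    """
    return 1.0 / (1.0 / k_measured - 1.0 / k_system)


def system_to_measured_ratio(fractional_increase):
    """k_system / k_measured that produces a given fractional stiffness increase."""
    return (1.0 + fractional_increase) / fractional_increase


for increase in (0.05, 0.14):
    ratio = system_to_measured_ratio(increase)
    print(f"{increase:.0%} increase -> load string about {ratio:.1f}x the measured stiffness")
```

Under this linear idealization, stiffness increases of 5% to 14% would correspond to a load string roughly 21 to 8 times stiffer than the measured response, so a more compliant fixture, such as a polymer base, would be expected to produce a larger correction.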

| DISCUSSION
This is the first study to investigate the axial stiffness of intervertebral discs using a combined approach to investigate the effects of institution, testing environment, and test rate. This provides novel biome- […] further confidence that the differences in disc stiffness between labs were due to the bovine samples, rather than the test setup or compliance correction method. Although the discs at all labs are expected to be from animals of roughly the same age (18 months), large discrepancies in disc stiffness may be due to genetic differences between breeds of cows, differences in agricultural feed practices between countries, or differences in sex. Discrepancies in IVD mechanical properties have previously been suggested to depend on breed, 1 particularly if there are differences in internal geometries such as NP:AF ratio. Additionally, American farmers often use FDA-approved steroid hormones to increase the growth rate of livestock, 27 whilst the use of growth hormones is banned in the UK.
Had system compliance not been considered, the results would have shown that the stiffness and modulus of specimens tested at Lab A were lower, rather than higher, than those of specimens tested at Labs B and C, as was the case after compliance correction. This highlights the importance of accounting for system compliance, either through compliance correction methods, as used in the present study, by using a decoupled displacement measurement system, 10,13,28 or by using a contactless measurement system with appropriate accuracy for the relatively small displacements that occur during axial testing of the IVD. 29
The compressive loads of 50-1000 N used for the present study were selected to cause intradiscal pressures that range from a lying posture to high physiological loading (intradiscal pressure range: 0.08-1.6 MPa). 24 However, because the mean cross-sectional area of 455 mm² for the specimens in the present study was less than the 622 mm² 35 previously reported in the literature, the pressures applied to the specimens were higher than planned. The pressure at 50 N is expected to correspond to an intradiscal pressure of 0.11 MPa, which is still comparable to the intradiscal pressure during lying postures, whereas the pressure at 1000 N would be 2.2 MPa, which is quite high. It should be noted, however, that the conversion between applied load and intradiscal pressure was extrapolated from coupled in vitro and in vivo data from human discs, and may not directly apply to bovine discs. 24,25
In conclusion, although the test rate can have an effect on the compressive properties, this effect is small compared to the inter-specimen and inter-laboratory variability. Even after accounting for specimen dimensions and clamp-to-clamp compliance, compressive mechanical properties were found to differ between institutions, which may be due to the variability of specimens local to different research laboratories.
Care must therefore be taken to consider this when comparing results between studies, and the following recommendations are suggested to move towards more standardized results between labs:
1. Any compliance in the load string should be reported and accounted for in post-test analysis to ensure only IVD displacements are assessed.
2. For tests shorter than 90 minutes, testing environment and rate do not make a large difference. Therefore, we suggest wrapping specimens in PBS-soaked gauze during testing to reduce the need for additional fixtures (eg, a water bath) and to prevent the IVD from becoming dehydrated, as would be the case when testing in air. Tests only need to be conducted at a single rate; we recommend 0.5 Hz or 1 mm/s as they are closest to physiological rates that most testing machines can achieve.
3. If possible, the breed and species of the specimens should be reported to provide more information regarding this effect in the future.