The use of mice in diabetes research: The impact of experimental protocols

Mice are used extensively in preclinical diabetes research to model various aspects of blood glucose homeostasis. Careful experimental design is vital for maximising welfare and improving reproducibility of data. Alongside decisions regarding physiological characteristics of the animal cohort (e.g., sex, strain and age), experimental protocols must also be carefully considered. This includes choosing relevant end points of interest and understanding what information they can provide and what their limitations are. Details of experimental protocols must, therefore, be carefully planned during the experimental design stage, especially considering the impact of researcher interventions on preclinical end points. Indeed, in line with the 3Rs of animal research, experiments should be refined where possible to maximise welfare. The role of welfare may be particularly pertinent in preclinical diabetes research as blood glucose concentrations are directly altered by physiological stress responses. Despite the potential impact of variations in experimental protocols, there is a distinct lack of standardisation and consistency throughout the literature with regard to several experimental procedures, including fasting, cage changing and glucose tolerance test protocols. This review first highlights practical considerations with regard to the choice of end points in preclinical diabetes research and the potential for novel technologies such as continuous glucose monitoring and glucose clamping techniques to improve data resolution. The potential influence of differing experimental protocols and in vivo procedures on both welfare and experimental outcomes is then discussed, with a focus on standardisation, consistency and full disclosure of methods.


| INTRODUCTION
Animal models play an essential role in preclinical diabetes research owing to the complexity of the disease and its impact on multiple organs and pathways. 1,2 Mice are the most used animals in preclinical diabetes studies largely because their glucose handling, which is very similar to that of humans, can be studied using relatively simple procedures. 3 However, many common preclinical end points in diabetes research can be affected by the strain, sex and model of mouse chosen, so these should be carefully considered when planning in vivo experiments. 4 The choice of end points and protocols is also vital in experimental design to ensure that data are reproducible and translatable. 5 Indeed, it has been shown that poor study planning and implementation have direct implications for clinical translatability. 5 In-depth study design can also improve animal welfare and reduce animal numbers, with the contemporary 3Rs acknowledging the importance of new technologies in maximising data resolution to increase the usefulness of each animal and/or reduce sample sizes. 6 In diabetes research, the role of welfare may be particularly pertinent as blood glucose concentration, a commonly used primary end point, is directly altered by physiological stress responses. 7-10

| END POINTS
Experimental end points should be considered at the early stages of experimental design to ensure that study outcomes can be appropriately interpreted. Common end points used in preclinical diabetes research usually relate to blood glucose homeostasis, which includes blood glucose and insulin concentrations, glucose tolerance and insulin resistance, all of which can be measured in various ways. 2 It is not uncommon that several end points are used within one study as well as secondary end points (e.g., weight) which can be used to monitor animal health, disease progression and the impact of off-target effects.

| Measuring blood glucose concentrations using a glucometer
Blood glucose concentration is a common end point in diabetes research and is classically measured using a glucose meter (glucometer) at standard single time-points. This method involves delivering a needle-prick or cut to the tip of the tail and gently massaging the tail from the base upwards to generate a blood droplet. 11 This droplet is captured on a glucose strip placed inside a handheld glucometer, which normally requires 0.5-2 µl of blood. 11 This technique is simple, requires only partial restraint and provides a blood glucose concentration reading within seconds. Glucometer readings are typically taken in either a fed (often described as 'random') or fasted state. 12 Glucometer blood glucose measurements provide a rapid and simple snapshot of glycaemic control. Random measurements are useful in indicating glycaemic control under normal physiological conditions (i.e., in the presence of food) and are preferable to show the robustness of a treatment in reducing blood glucose concentrations. 12 Additionally, as random measurements do not require fasting the animal, these can be taken daily if necessary. However, measuring random blood glucose concentrations can lead to data that are influenced by variations in timing and degree of food intake. 12 This may be particularly important when comparing mice fed different diets (e.g., high-fat vs. normal chow) as sugar content can vary and influence results. For example, sugar (i.e., fructose/sucrose) comprises ~7% of the 60% high-fat diet (HFD) (D12492; Research Diets) compared with ~3.5% of a standard chow diet (Rodent Diet 20; PicoLab). This may have immediate impacts on blood glucose concentrations that could alter random glucometer measurements. However, only minimal increases in random blood glucose concentrations (~1 mM) in HFD versus standard chow mice have been observed in our laboratory, despite an impairment in glucose tolerance and insulin sensitivity developing within 2 weeks of diet induction. 13-15

What is already known
• Experimental design is vital in preclinical diabetes research for optimal mouse welfare and outcomes.
• Experimental protocols are poorly standardised throughout the literature.

What this study has found
• Small variations in experimental protocol can influence mouse welfare and scientific end points.
• Refinement of procedures still allows for reproducible data and drug effects to be observed.

What are the implications of this study?
• Highlights the need for standardisation, consistency and full disclosure of methods with improved experimental design having the ability to improve drug discovery.

On the other hand, the fasted state is a more reliable indicator of overt diabetes and reduces variability caused by food intake (e.g., variation in time since last feed and quantity of consumption). 12 Therefore, an appropriate fast length with regards to both the scientific question and welfare should be chosen and limited in frequency. Although overnight fasts are commonly used clinically, this induces a state of starvation in mice and, therefore, should be avoided. [16][17][18] It is important to note, however, that induction of a catabolic state (e.g., hypoglycaemia) may be important when testing the effects of prolonged fasting on metabolic responses and counter-regulatory mechanisms. 16,[19][20][21] Therefore, longer fasts may be used in some studies but only if scientifically justified. Considerations when fasting mice are described in more detail below.
It should be noted that, irrespective of whether measurements are taken in the fed or fasted state, mice are nocturnal and, therefore, most active at night, with approximately two thirds of their calorific intake consumed during the dark cycle. 16 This results in circadian rhythms in activity and blood glucose concentrations, both of which are higher at night (Figure 1). Consequently, random blood glucose concentrations, which are normally obtained during the light cycle, can overestimate general glycaemic control. 22 It is, therefore, often suggested that blood glucose concentration measurements across a study should be taken at similar times of the day and reported in methods sections to allow comparisons between studies.
Overall, both random and fasted blood glucose measurements can give important and complementary insights into blood glucose homeostasis. 12 These can be complemented with measurements of plasma insulin either under fed, fasted or glucose-stimulated conditions. Larger blood samples are typically required to measure insulin (~50 µl) and, therefore, it is usually not measured as frequently. 2 Irrespective of which end point is chosen it is vital that experimental protocols are consistent and fully disclosed with consideration of both welfare and scientific outcomes.

| Measuring glucose tolerance
Glucose intolerance is a key characteristic of impaired glycaemic control and can be quantified using glucose tolerance tests (GTTs), which are a fundamental tool used in diabetes research. 2,23 GTTs involve measuring baseline blood glucose concentrations before administration of a glucose bolus and subsequent repeated blood glucose measurements for ~2 h, typically at 15, 30, 60, 90 and 120 min. 2,23 The glucose bolus can be given via several routes, although glucose response varies widely depending on the method chosen as the rate of glucose delivery to the system differs. 23 Glucose response is also altered by glucose dose, with common doses ranging from 1 to 3 g/kg. 17 Prior to GTTs, it is common practice to fast mice to eliminate the impact of variability in food intake on glucose homeostasis. Although previous attempts have been made, there remains no standardised fast length, with durations commonly ranging from 6 h in the daytime to 16 h overnight. 16,17,23 Furthermore, cages are sometimes changed at the start of fasting to eliminate the risk of food remnants being left in the cage. 24,25 However, whether and/or how this is done is not consistent or well documented; the potential impacts of this are discussed later in this review.

FIGURE 1 Ten-second averages of blood glucose concentrations and activity over 72 h in 11 normal diet (ND) male mice captured by HD-XG glucose telemetry devices (Data Sciences International). Grey bars = dark phase. Blue arrows = times of disturbance. Disturbance describes times at which mice were woken by animal unit staff during daily checks but not handled.
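The repeated readings from a GTT are often condensed into a single summary value. One common option, assumed here rather than prescribed by this review, is the area under the glucose-time curve calculated with the trapezoidal rule. The sketch below is illustrative only: the function name `gtt_auc`, the optional baseline correction and the example readings are hypothetical and not taken from any study.

```python
def gtt_auc(times_min, glucose_mm, baseline_corrected=False):
    """Trapezoidal area under a GTT curve, in mM x min.

    times_min  : sampling times in minutes, starting with the baseline (0 min)
    glucose_mm : blood glucose readings (mM) at those times
    baseline_corrected : if True, subtract the baseline reading from every
        point so that only the excursion above baseline is integrated
    """
    if len(times_min) != len(glucose_mm) or len(times_min) < 2:
        raise ValueError("need at least two paired time/glucose values")
    baseline = glucose_mm[0] if baseline_corrected else 0.0
    auc = 0.0
    for i in range(1, len(times_min)):
        dt = times_min[i] - times_min[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        left = glucose_mm[i - 1] - baseline
        right = glucose_mm[i] - baseline
        auc += dt * (left + right) / 2.0
    return auc


# Illustrative (made-up) readings at the standard time-points
times = [0, 15, 30, 60, 90, 120]
glucose = [8.0, 20.0, 18.0, 13.0, 10.0, 8.0]

total_auc = gtt_auc(times, glucose)                                 # 1575.0
incremental_auc = gtt_auc(times, glucose, baseline_corrected=True)  # 615.0
```

Whichever summary is used, the sampling times and any baseline correction would need to be reported, in keeping with the emphasis on full disclosure of methods.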

| Measuring insulin resistance
Glucose intolerance can result from reduced insulin secretion and/or sensitivity, but GTTs do not provide this mechanistic information. Consequently, insulin tolerance tests (ITTs) are often undertaken to investigate insulin resistance. As with GTTs, mice are frequently fasted prior to ITTs, albeit often for shorter durations as longer fasting is associated with increased risk of hypoglycaemia. 26 However, as before, there is a lack of standardisation in protocols across the literature. 2,26 At the start of an ITT, baseline blood glucose concentration is measured before a bolus of insulin (normally 0.25-1 IU/kg) is injected i.p. or s.c. and subsequent blood glucose measurements are obtained over ~1-2 h. 26 Several factors should be considered when undertaking ITTs, the most important of which is the risk of hypoglycaemia, meaning that mice should be carefully monitored for signs of lethargy, inactivity, hunching and piloerection. If any of these are observed, and/or blood glucose concentrations fall <2.5 mM, a glucose bolus (2 g/kg) should be administered, food and warmth should be provided, mice should be separated if lethargic and blood glucose concentration and welfare should be monitored until restoration of a normal state. 26 If this happens more than very rarely, the insulin dose should be adjusted accordingly in future studies.
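The rescue criteria described above can be written as a simple decision rule. The sketch below is a minimal illustration, not a validated monitoring tool: the threshold (2.5 mM), the rescue dose (2 g/kg) and the list of welfare signs come from the text, whereas the function names and data layout are hypothetical.

```python
HYPO_THRESHOLD_MM = 2.5          # rescue if blood glucose falls below this (mM)
RESCUE_GLUCOSE_G_PER_KG = 2.0    # rescue glucose bolus (g/kg)
WELFARE_SIGNS = {"lethargy", "inactivity", "hunching", "piloerection"}


def needs_rescue(glucose_mm, observed_signs=()):
    """True if the reading or any observed welfare sign warrants rescue."""
    low_glucose = glucose_mm < HYPO_THRESHOLD_MM
    signs_present = bool(WELFARE_SIGNS & set(observed_signs))
    return low_glucose or signs_present


def rescue_glucose_mg(body_weight_g):
    """Mass of glucose (mg) in a 2 g/kg rescue bolus.

    g/kg multiplied by body weight in g gives mg directly,
    e.g. 2 g/kg x 25 g = 50 mg.
    """
    return RESCUE_GLUCOSE_G_PER_KG * body_weight_g
```

A rule like this only flags when intervention is needed; the supportive measures listed above (food, warmth, separation and continued monitoring) still apply.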
The ITT is particularly susceptible to stress-related increases in blood glucose concentrations, which in the case of the GTT are often masked by the glucose bolus. Therefore, researcher experience, refinement, consistency and full disclosure of protocols are particularly pertinent.

| METHODS THAT INCREASE DATA RESOLUTION
The techniques described above can provide a snapshot of glycaemic control at any given time-point. However, for some studies, increased data resolution may be required which is problematic with standard glucometer methods due to the need for repeated blood sampling affecting welfare. Therefore, surgical techniques involving implantation of devices or catheters can be used for more frequent sampling.

| Continuous glucose monitoring
Continuous glucose monitoring (CGM) generates high data resolution with 10-s averages of blood glucose concentration, temperature and activity recorded over several weeks in unrestrained mice. 27,28 CGM consequently allows for blood glucose concentrations to be measured at time-points, which would normally be missed, such as at night, and may, therefore, enhance understanding of 24 h glycaemic control. Furthermore, glycaemic variability can be quantified, which has been shown to be a key driver of morbidity and mortality in human diabetic patients. [29][30][31][32] CGM also offers an opportunity to measure glucose concentrations without the need for handling the mice. Indeed, mouse cages are placed on top of receiver pads, which collect the probe data and transmit it to a data acquisition system on a nearby computer meaning that data can be acquired without even entering the room. 27 This may be important as researcher intervention, which is unavoidable when using a glucometer, can increase blood glucose concentrations ( Figure 1). Hence, CGM may provide more accurate representation of absolute blood glucose concentrations without the influence of stress.
Despite the potential benefits of CGM, this technique requires invasive surgery. The glucose sensor of the telemetry probe is placed in free-flowing blood in the aortic arch and a radio-transmitter is then placed either s.c. or i.p. 27 This requires significant researcher training and experience for both optimal welfare and outcomes. Mice must also be allowed to recover from surgery prior to any further experimentation. In our lab, we have found that immediate surgical recovery is surprisingly swift although mice can lose up to 10% body weight particularly in the first 48 h. In general, recovery is achieved by day 5 post-surgery, but experimentation often begins on day 7. Overall, the drawbacks of invasive surgery may be outweighed by the reduction in researcher intervention and subsequent stress throughout experimentation. Although only one telemetered mouse can be placed on each receiver mat, mice can be housed with non-surgical cage-mates to avoid isolation. 27 Further caveats of this technique are the requirement for researchers to be trained to use the advanced equipment required and the cost of probes and equipment, which vastly exceeds that of standard glucometer methods. Hence, setting up CGM in a laboratory can be extremely costly and time-consuming. The HD-XG probes (Data Sciences International) have a guaranteed lifespan of 28 days meaning that only 21 days of data may be obtained following surgical recovery. 28 However, in our lab, the average probe lifespan achieved in both normoglycaemic and glucose intolerant mice is 8 weeks.
From our experience, CGM provides an opportunity to comprehensively understand several physiological and experimental parameters that would normally be difficult or impossible to assess: (1) 24 h blood glucose concentrations with quantification of glycaemic variability; (2) the impact of researcher intervention on both welfare and blood glucose concentrations and (3) real-time quantification of drug and treatment effects, which is particularly beneficial when responses are subtle. If glucose tolerance and/or insulin resistance are the primary end points of interest, however, we have not found that CGM provides additional information.
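Glycaemic variability, noted above as one of the quantities CGM makes accessible, can be summarised in several ways. The sketch below assumes two simple, widely used descriptors, the standard deviation and the coefficient of variation; the review does not specify which metric was used, and the function name and example values are hypothetical.

```python
from statistics import mean, stdev


def glycaemic_variability(readings_mm):
    """Summarise a series of glucose readings (mM).

    Returns the mean, the sample standard deviation and the
    coefficient of variation (SD as a percentage of the mean).
    """
    if len(readings_mm) < 2:
        raise ValueError("need at least two readings")
    m = mean(readings_mm)
    sd = stdev(readings_mm)
    return {"mean_mm": m, "sd_mm": sd, "cv_percent": 100.0 * sd / m}


# Illustrative (made-up) readings; real CGM traces contain far more points
example = [8.0, 9.0, 10.0, 9.0, 8.0, 10.0]
summary = glycaemic_variability(example)
```

With 10-s averages recorded over weeks, such summaries could be computed separately for light and dark phases, provided the time windows are defined consistently across animals.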

| Glucose clamping
A sophisticated technique used to quantify insulin resistance and secretion is glucose clamping. This method is frequently used clinically and has been translated into mice, allowing for detailed mechanistic investigation of the factors regulating blood glucose. 33 There are two main types of glucose clamp: the hyperinsulinaemic-euglycaemic clamp and the hyperglycaemic clamp. The hyperinsulinaemic-euglycaemic clamp is the gold standard to assess insulin sensitivity and is conducted by infusing insulin at a steady state to achieve hyperinsulinaemia, following which varying amounts of glucose are infused to reach a euglycaemic set-point. The amount of glucose required to reach euglycaemia is then expressed as a 'glucose infusion rate' (GIR), a calculation that also takes into account the concentration of glucose used and the weight of the animal. The hyperglycaemic clamp is a similarly sensitive technique that allows for precise measurements of insulin secretion in parallel with GIR. This technique, therefore, produces additional metabolic parameters compared with the more basic GTTs and ITTs.
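As a rough illustration of how a GIR might be derived from the quantities mentioned above, the sketch below assumes the common convention of expressing GIR in mg/kg/min from the pump rate, the glucose concentration of the infusate (as % w/v) and body weight. The exact equation used by the authors is not given in the text, so the function name, units and example values are assumptions.

```python
def glucose_infusion_rate(pump_rate_ul_per_min, glucose_percent_wv, body_weight_g):
    """Glucose infusion rate in mg/kg/min (assumed convention).

    pump_rate_ul_per_min : infusion pump rate (µl/min)
    glucose_percent_wv   : infusate concentration as % w/v
                           (20% = 20 g per 100 ml = 0.2 mg/µl)
    body_weight_g        : body weight of the animal (g)
    """
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    mg_per_ul = glucose_percent_wv * 10.0 / 1000.0   # % w/v -> mg/ml -> mg/µl
    mg_per_min = pump_rate_ul_per_min * mg_per_ul
    return mg_per_min / (body_weight_g / 1000.0)


# Hypothetical example: 5 µl/min of 20% glucose in a 25-g mouse
gir = glucose_infusion_rate(5.0, 20.0, 25.0)   # about 40 mg/kg/min
```

Normalising to body weight is what allows GIR to be compared between animals of different sizes, for example HFD versus normal chow mice.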
The setup of this technique is technically demanding as it requires the catheterisation of both the carotid artery and jugular vein. 34 Catheters are then tunnelled s.c. and exteriorised between the shoulder blades of the mouse, where they are then attached to a pin port that is sutured into the skin. This allows for direct infusion and sampling access to both vessels. The surgery is highly invasive and can take a long time to master, but competency is essential for reducing time under anaesthetic (~60 min), which in turn improves survival rates and postoperative recovery. Similar to CGM, surgical recovery is achieved by day 5 post-surgery with experimentation beginning on day 7. It is also a similarly expensive technique with the cost of probes and equipment, alongside the surgical training required, greatly outweighing that of glucometer methods.
However, as explained above, the different types of clamps allow for quantification of metabolic parameters that outreach the simpler GTTs and ITTs. In addition to this, rapid blood sampling can be conducted (similar to that of a frequently sampled intravenous GTT-FSIVGTT), while infusing erythrocytes to help maintain total blood volume and, hence, animal welfare. It is also possible to measure tissue-specific glucose uptake and endogenous glucose production using this technique by infusion of radiolabelled glucose tracers followed by harvesting of tissues and plasma. This again gleans more information than can be obtained from basic tolerance tests. Once the surgery is mastered, animal welfare is also improved with this technique as animals are conscious and freely moving during studies thanks to the easily accessible pin port located on their back. This both reduces stress associated with intervention and increases precision of data, which in turn reduces sample sizes. Although animals are singly housed during the study, they can be group housed prior to experimentation. From our experience, glucose clamping is, therefore, an unrivalled technique when investigating specific metabolic mechanistic questions. However, it is not essential for simpler end points and should, therefore, only be used as an adjunct to GTTs and ITTs when necessary.

| EXPERIMENTAL PROTOCOLS
When designing experiments, the influence of extraneous variables should be understood and minimised where possible. As previously described, there are many different techniques undertaken in diabetes research to assess blood glucose homeostasis, but protocols for these are poorly standardised despite there being many changeable aspects.

| Fasting
As previously mentioned, it is common practice to fast mice prior to both fasted blood glucose measurements and GTTs to limit the influence of food intake on end points. 12 However, fasting protocols are poorly standardised with variability in both the length and time of day of the fast. 16-18 Overnight 16-h fasts were historically common, but these have been associated with hypoglycaemia, hypothermia, weight loss and cardiovascular changes, which all act as potential stressors. 16 Data from our own laboratory using CGM also indicate that 16-h fasts induce severe hypoglycaemia with blood glucose concentrations consistently falling <2.8 mM at ~13 h after the start of fasting. 24 More recently, it has become more common practice to fast for shorter periods (3-6 h) during the daytime and this has been shown to be sufficient for ensuring gastric emptying and hepatic control of glucose homeostasis. 18,35-37 However, overnight fasting still features in many research papers. Interestingly, overnight fasts not only have welfare implications but can significantly alter glucose response during an i.p. GTT, with 16-h fasts being associated with impaired glucose tolerance, particularly in males. 24

Another poorly reported variable in the fasting protocol is whether the cage is changed at the start of the fast. Although no food remnants will remain if the cage is changed, this procedure is associated with stress responses in mice, including increased corticosterone, most likely due to removal of familiar odours. 38 Conversely, not changing the cage (i.e., removing food from the lid only) may result in small amounts of food remnants on the cage floor, which could impact the fast. A method to ensure complete food removal while reducing the stress of a full cage change is to retain the used bedding (after shaking out any food residue) and place it in the new cage. 38 Indeed, our data have shown that bedding retention reduces the initial glucose spike associated with researcher intervention at the start of the fast. 24 While removing food from the food hopper without changing the cage causes the least amount of stress at the start of the fast, the blood glucose concentration reductions are not as significant, most likely due to food remnants present in the bottom of the cage.
As with overnight fasting, cage change method not only impacts effectiveness of the fast but also GTT outcome, with whole cage changes at the start of a 6-h fast significantly impairing glucose tolerance during an i.p. GTT compared with when bedding is retained or the cage is not changed. 24 Overall, these results highlight the role of welfare in scientific outcomes and the importance of consistency in practices and full disclosure of methods.

| Moving animals from holding rooms to procedure rooms
In many animal units, mice are transported from holding rooms to procedure rooms prior to experimentation (e.g., GTTs). Conversely, because CGM requires mice to remain on their receiver mats for measurements to be taken, mice remain in their holding room for procedures. 28 This is likely to reduce stress due to minimising the degree of disturbance and novelty of environment. Indeed, we have observed that moving mice into a procedure room increases blood glucose concentrations by ~40% for up to 60 min. We have also observed worsened glucose tolerance during an i.p. GTT in mice transported to procedure rooms versus those which remained in holding rooms.

| Protocols undertaken during glucometer measurements, glucose tolerance tests and insulin tolerance tests
Protocols undertaken during experimentation (e.g., glucometer measurements, GTTs and ITTs) can also vary. These procedures all involve handling the mice and obtaining a blood sample using the tail-prick glucometer method. 11 For GTTs and ITTs, either a glucose or insulin bolus is then administered with repeated blood samples obtained over ~2 h. 2,23,26 Overall, differing procedures can significantly alter results due to the influence of both stress and direct physiological changes. Hence, the importance of consistency and full disclosure of methods should not be overlooked.

| Handling, blood sampling and intraperitoneal injections
Using CGM, we have shown that even mild researcher intervention (e.g., entering the holding room) can increase blood glucose concentrations (Figure 1). It is, therefore, unsurprising that in vivo procedures (handling, tail-prick blood sampling and i.p. injections) can also cause glucose spikes. Indeed, we have shown that in vivo procedures cumulatively increase blood glucose concentrations with a maximum increase of 0.8-3.1 mM in males and 0.8-2.3 mM in females 30 min after intervention (Figure 2). Blood glucose began increasing from 5 min post-disturbance onwards, with responses lasting between 30 and 45 min in females and between 45 and 75 min in males.
Although unavoidable, the potential impact of these procedures on outcomes should be considered. Overall, our data have shown that when glucose is administered i.p. at the start of a GTT, the ~13 mM glucose increase observed is a composite of the stress associated with the in vivo procedures required to administer glucose (~2.8 mM) and the glucose dose itself (~10.2 mM). Therefore, any refinement to improve welfare could alter scientific outcomes and this should be acknowledged.
For example, we have observed that tunnel handling tends to produce lower blood glucose responses than tail or cup handling with regards to both magnitude and duration of response.
It is also important to note that various glucometers can be used, each with differing properties. Indeed, glucometer technology has improved over the years, meaning that smaller blood samples are now required. For example, the Accu-Chek Performa meter (Roche) requires 0.6 µl of blood whereas StatStrip Xpress meters (Nova Biomedical) require 1.2 µl. 39,40 Although historically blood samples were often obtained by cutting the tip of the tail with scissors or a scalpel, these blood volumes can now be obtained via a simple needle prick at the end of the tail. 11 Choice of needle size is dependent on blood volume required, but 27-30 G needles are sufficient for most glucometers. Interestingly, we have found that using larger 27 G needles causes higher glucose spikes, indicating increased stress. However, if the needle is too small for the sample required (e.g., 30 G for 1.2 µl), glucose concentrations also increase, most likely due to the additional pressure applied to the tail.
Other differences between glucometers include testing ranges of blood glucose concentrations. Many glucometers (e.g., Accu-Chek Performa) capture concentrations of up to 33.3 mM but other, more specialised meters (e.g., StatStrip Xpress) can measure up to 50 mM. 39,41 Accuracy of glucometers also differs with only Contour Next and StatStrip Xpress meters meeting the accuracy standards set out by the International Organisation for Standardisation (ISO 2013). This may be linked to haematocrit interference, which is not always corrected for (e.g., with the Accu-Chek Performa) but can significantly alter blood glucose concentrations. [42][43][44][45] Overall, as different glucometers can alter results by as much as 30%, consistency in glucometer use is vital and should be clearly described in methods. 45

| Route of administration
When carrying out GTTs and ITTs, choosing the route of glucose/insulin administration is extremely important as it can markedly impact test outcome. Glucose is most commonly administered via i.p. injection but can also be administered orally through gavage or voluntary ingestion of a glucose gel. 23 Insulin is usually administered either via i.p. or s.c. injection as it is a peptide and, therefore, cannot be administered orally. 2,26,46 Increases in blood glucose concentrations after glucose administration are highest after i.p. administration, followed by gavage and then voluntary gel ingestion. Attenuated glucose spikes during an oral GTT are primarily due to incretin release from the gastrointestinal tract with both glucagon-like peptide 1 (GLP-1) and glucose-dependent insulinotropic polypeptide (GIP) enhancing glucose-stimulated insulin secretion. 47 The improved glucose tolerance observed with voluntary gel ingestion versus gavage is likely due to reduced stress responses as gavage requires restraint of the animals. However, it is important to note that there is also evidence for increased incretin responses with mastication. 48 Hence, voluntary gel ingestion may improve welfare while avoiding stress-induced artefacts in the GTT. Considerations when using voluntary ingestion of glucose gels include the need to separate mice to ensure correct dosing, the requirement for mouse training and potential issues of non-adherence. 23 Options to separate mice include single housing (either continually or during the GTT) or using physical barriers to separate mice (if in pairs) while ensuring that both mice have access to water. However, our data have shown that cage separation, but not single housing, immediately prior to an i.p. GTT significantly impairs glucose tolerance. A further concern with voluntary ingestion of gels is that dosing may be inaccurate due to partial and/or prolonged consumption not consistent with a bolus.
These issues can largely be overcome with training as mice are neophobic and need to become familiar with new processes. 49 In our laboratory, mice are initially given gels in a plastic pot for two nights in the presence of food and absence of researcher intervention. After this, the mice are habituated to single housing or cage separation for 2 h on two separate occasions. Finally, single housing/cage separation is combined with gel administration. Mice are considered 'trained' when they consume 90% of the gel within 1 min on two separate occasions. This is normally achieved in two training sessions with all training taking ~1 week. To calculate percentage consumption, gels are weighed in their plastic pots before and after administration.
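The training criterion described above (90% of the gel consumed within 1 min on two separate occasions) can be checked from the pot weights. The sketch below is one way this could be done. It assumes that the empty pot weight is also known, so that gel mass can be separated from pot mass, and that the two qualifying sessions need not be consecutive; neither detail is stated in the text. All names and example values are hypothetical.

```python
def percent_consumed(pot_before_g, pot_after_g, empty_pot_g):
    """Percentage of gel eaten, from pot-plus-gel weights before and after."""
    gel_offered = pot_before_g - empty_pot_g
    if gel_offered <= 0:
        raise ValueError("no gel was offered")
    return 100.0 * (pot_before_g - pot_after_g) / gel_offered


def is_trained(sessions, min_percent=90.0, max_seconds=60.0, required=2):
    """True once enough sessions meet the consumption criterion.

    sessions : list of (percent_consumed, seconds_taken) tuples,
               one per training session
    """
    passed = sum(
        1 for pct, secs in sessions
        if pct >= min_percent and secs <= max_seconds
    )
    return passed >= required


# Hypothetical mouse: three sessions, two of which meet the criterion
sessions = [(95.0, 45.0), (80.0, 50.0), (92.0, 58.0)]
trained = is_trained(sessions)   # True
```

Recording the time taken alongside the percentage consumed also documents whether intake was bolus-like, which addresses the dosing concern raised above.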

| Dose and volume of glucose/insulin
Chosen dose of glucose and insulin will have direct implications on tolerance tests. Although not standardised, glucose is typically administered at a dose of 1-3 g/kg. 17 Previous studies have found that ≥2 g/kg glucose is required to detect differences in oral glucose tolerance in HFD versus normal chow mice, although this was also true for a set dose of 50 mg (equivalent to 2 g/kg in a 25-g mouse). 17 For ITTs, doses ranging from 0.25 to 1 IU/kg have been used although we have found that 0.75 IU/kg dose is sufficient to discriminate between insulin sensitive and resistant animals with a low incidence of hypoglycaemia. 26 However, it is recommended that insulin dosage is determined using low starting doses and incremental increases to ensure ~50% blood glucose reductions with low risk of hypoglycaemic events. 26 Furthermore, genetic background, weight, phenotype and age can alter insulin dose required, so further dosage studies should be undertaken if starting experiments in a new mouse model. 26 Volume of glucose and insulin may also have an impact on GTT and ITT outcome, respectively, but local guidelines should be followed regarding maximum volumes to a particular injection site. Adherence to voluntary consumption of glucose gels can be impaired in heavier mice (e.g., HFD or Lep Ob/Ob mice) due to increases in gel volume. Consequently, gels can be made with smaller volumes of more concentrated glucose solutions which can reduce consumption time by ~30% (for 45% vs. 30% glucose gels). Importantly, these differences in rates of intake did not alter glucose response.
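The dose arithmetic above (2 g/kg in a 25-g mouse equals 50 mg) and the effect of solution strength on volume can be sketched as follows. The concentrations are assumed to be % w/v, which the text does not state explicitly, and the function names are hypothetical.

```python
def bolus_mass_mg(dose_g_per_kg, body_weight_g):
    """Glucose mass (mg) for a weight-based dose.

    g/kg multiplied by body weight in g gives mg directly.
    """
    return dose_g_per_kg * body_weight_g


def bolus_volume_ul(mass_mg, solution_percent_wv):
    """Volume (µl) of solution or gel that delivers the given mass.

    Assumes % w/v: 30% = 30 g per 100 ml = 0.3 mg/µl.
    """
    if solution_percent_wv <= 0:
        raise ValueError("concentration must be positive")
    mg_per_ul = solution_percent_wv / 100.0
    return mass_mg / mg_per_ul


mass = bolus_mass_mg(2.0, 25.0)        # 50.0 mg, matching the set dose in the text
vol_30 = bolus_volume_ul(mass, 30.0)   # about 167 µl
vol_45 = bolus_volume_ul(mass, 45.0)   # about 111 µl, one third less volume
```

The same calculation makes clear why heavier mice receive larger gel volumes at a fixed g/kg dose, and why a more concentrated gel reduces the amount that has to be eaten.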

| Timing of repeated blood samples
Blood glucose concentrations following either a glucose or insulin bolus are normally measured using a glucometer at 15, 30, 60, 90 and 120 min. However, CGM data show a maximum glucose response following i.p. glucose injection at 19.0 ± 0.8 min in normoglycaemic mice with 15-min glucometer readings underestimating the glucose concentration at this time-point (21.2 ± 0.9 vs. 20.3 ± 0.6 mM, p < 0.05, paired t-test, n = 16). Conversely, in glucose intolerant HFD mice, maximum glucose is achieved at 31.4 ± 2.2 min, which is accurately represented by the 30-min glucometer reading (25.7 ± 1.0 vs. 24.5 ± 1.1 mM, p > 0.05, paired t-test, n = 9). Although peak glucose concentrations may be missed using 15-min glucometer readings in some mice, overall glucose tolerance is accurately captured using these standard time-points and consistency between studies and mice is most important.

FIGURE 3 The effect of repeated blood sampling on blood glucose concentrations over 6 weeks in normal diet (ND) (a) male and (b) female mice. * represents a significant difference compared with other repeats. # represents a significant difference compared with the −30 min pre-intervention concentration for each repeat (p < 0.05, two-way ANOVA with Holm-Sidak post hoc test, n = 5-11).

Overall, glucose tolerance may be affected by variations in both protocols undertaken prior to GTTs (i.e., fast length, cage changing and movement of animals) and during GTTs (i.e., in vivo procedures, route/dose of glucose and timing of repeated sampling). Hence, full disclosure of methods and consistency in protocols between mice and cohorts are key.
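One common way of condensing the standard sampling timepoints into a single measure of overall glucose tolerance is the trapezoidal area under the curve (AUC). The sketch below is an assumption for illustration, not a method stated in this section: the function names are hypothetical, a 0-min baseline reading is assumed in addition to the 15- to 120-min readings, and the glucose values are invented purely to show the arithmetic.

```python
from typing import Sequence


def trapezoid_auc(times_min: Sequence[float], glucose_mM: Sequence[float]) -> float:
    """Total area under the glucose curve (mM x min) by the trapezoidal rule."""
    if len(times_min) != len(glucose_mM) or len(times_min) < 2:
        raise ValueError("need at least two paired time/glucose readings")
    auc = 0.0
    for i in range(1, len(times_min)):
        dt = times_min[i] - times_min[i - 1]
        if dt <= 0:
            raise ValueError("timepoints must be strictly increasing")
        # Area of one trapezoid: interval width x mean of the two readings.
        auc += dt * (glucose_mM[i] + glucose_mM[i - 1]) / 2.0
    return auc


def incremental_auc(times_min: Sequence[float], glucose_mM: Sequence[float]) -> float:
    """AUC above the first (baseline) reading (mM x min)."""
    duration = times_min[-1] - times_min[0]
    return trapezoid_auc(times_min, glucose_mM) - glucose_mM[0] * duration


if __name__ == "__main__":
    # Standard glucometer timepoints plus an assumed 0-min baseline.
    times = [0, 15, 30, 60, 90, 120]
    # HYPOTHETICAL readings for illustration only, not data from the review.
    glucose = [8.0, 20.0, 18.0, 13.0, 10.0, 8.0]
    print(f"total AUC: {trapezoid_auc(times, glucose):.0f} mM x min")
    print(f"incremental AUC: {incremental_auc(times, glucose):.0f} mM x min")
```

Because the trapezoidal rule integrates over the whole curve, a true peak that falls between two sampling times changes the result far less than it changes the apparent peak value. This is consistent with the observation that overall tolerance is captured by the standard timepoints even when the peak itself is missed.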

| Refinement on drug efficacy studies
GTT protocols can be refined in several ways, including by using shorter fast lengths (6 h daytime vs. 16 h overnight), modifying the cage change method (retention of used bedding vs. whole cage changes) and reducing restraint for glucose administration (oral gels vs. gavage and i.p. injection). 24 However, it could be argued that higher blood glucose concentrations, whether due to stress, sex or exogenous glucose, may be required to detect the ability of drugs to reduce them. Importantly, our data have recently refuted this by showing that the effect of both i.p. exendin-4 and oral metformin on glucose tolerance can still be observed when all these refinements are practised, even in females, which have lower blood glucose concentrations. 24 Hence, refinement of procedures in line with the 3Rs does not impede the predictive validity of the model.

| The effect of refinement on GTT reproducibility
We have also considered the reproducibility of GTTs using different protocols with regard to differences observed between days, between mice and between cohorts. Our data have shown that more refined procedures (6 h fasted oral gel GTTs with bedding retention cage change) are actually more reproducible between repeats in the same mice, whereas less refined procedures (16 h fasted i.p. GTTs) are less reproducible between mice of the same cohort and between cohorts. Hence, refinement of protocols maintains scientific integrity and reproducibility of data while improving welfare. In addition, we found that more refined in vivo procedures (i.e., tunnel vs. tail handling, 30 G vs. 27 G needles for tail-pricks and 0.6 vs. 1.2 µl blood samples) also reduced researcher-induced glucose responses and, hence, could potentially affect GTT outcome.
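The text does not state how reproducibility was quantified. As one possible approach, assumed here purely for illustration, the coefficient of variation (CV) of a GTT summary measure can be compared between repeats in the same mouse and between mice. The function names are hypothetical and the values shown in the usage comment are invented.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Sample CV in percent: 100 x standard deviation / mean."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CV is undefined")
    return 100.0 * stdev(values) / m


def within_mouse_cv(repeats_by_mouse: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """CV across repeated GTTs for each mouse (between-repeat variation)."""
    return {
        mouse: coefficient_of_variation(values)
        for mouse, values in repeats_by_mouse.items()
    }


def between_mouse_cv(repeats_by_mouse: Dict[str, Sequence[float]]) -> float:
    """CV of the per-mouse means (between-mouse variation)."""
    return coefficient_of_variation(
        [mean(values) for values in repeats_by_mouse.values()]
    )


# Example call with HYPOTHETICAL summary values (e.g., AUC) for two mice:
# within_mouse_cv({"m1": [90, 100, 110], "m2": [95, 100, 105]})
```

A lower CV under one protocol than another would indicate tighter agreement for that comparison, which is the sense in which the more refined procedures are described above as more reproducible.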

| Habituation and acclimatisation
It has previously been shown that mice can become acclimatised and habituated to stressors. 50 However, we have observed no evidence of reductions in researcher-induced glucose responses with repeated experimentation despite reduced behavioural responses including defaecation and urination. For example, there was no significant difference in response to blood sampling for three repeats over 6 weeks (Figure 3). Despite this, a 'first GTT/ITT' phenomenon has been regularly observed in our laboratory whereby glucose tolerance and/or insulin sensitivity during the first i.p. GTT or ITT is consistently and significantly impaired compared with all future repeats. Although repeating experiments may remove this effect, mice will then undergo further blood sampling and stress which could impact welfare. Therefore, prior administration of i.p. saline, which we have found to eliminate this effect, may be preferable. 51 Although not tested, prior scruffing and touching the i.p. site with a needle or finger may also provide similar habituation while ensuring further refinement.

TABLE 1 Our recommendations for protocols undertaken in preclinical diabetes research, such as those both prior to and during glucose tolerance tests (GTT), to maximise welfare and scientific outcomes

Protocol: Recommendation
Habituation: Mock injection prior to first GTT

| The effect of different researchers
Our data have shown that different researchers can profoundly alter both the response to stressors and glucose tolerance, with blood glucose responses during an i.p. GTT falling cumulatively as researcher experience and animal familiarity increase. The impact of different researchers should, therefore, be considered, especially in laboratories where numerous researchers collaborate, as data may not be directly comparable.

| Accumulation of responses
Finally, we have found evidence that glucose responses to in vivo procedures accumulate and may cumulatively impair glucose tolerance. For example, moving animals into new procedure rooms only impairs glucose tolerance when mice have undergone a whole cage change at the start of a 6 h fast or have been fasted overnight. Furthermore, sex and oestrous-related differences in glucose tolerance are exaggerated when a less experienced researcher undertakes the i.p. GTT. These results, therefore, suggest that using the most refined procedures can partially protect against unavoidable researcher-induced artefacts in preclinical diabetes studies.

| CONCLUSIONS
A summary of our recommendations for protocols undertaken in preclinical diabetes research is shown in Table 1.
In conclusion, appropriate experimental design is vital in preclinical diabetes research with various aspects that must be considered including appropriate end points, experimental protocols and the potential impact of variations/refinements in procedures. 5,52 This is paramount to maximise welfare in line with the 3Rs while ensuring the end point of interest is being tested, scientific reproducibility is maintained, and drug effects can still be observed. Most importantly, methods should be standardised, kept consistent and fully disclosed to ensure comparability of data as even minor variations in protocols can significantly impact results.