Strain differences in three measures of ethanol intoxication in mice: the screen, dowel and grip strength tests


  • J. C. Crabbe,

    Corresponding author
    1. Portland Alcohol Research Center, Department of Behavioral Neuroscience, Oregon Health & Science University and VA Medical Center, Portland, Oregon, USA,
  • C. J. Cotnam,

    1. Portland Alcohol Research Center, Department of Behavioral Neuroscience, Oregon Health & Science University and VA Medical Center, Portland, Oregon, USA,
  • A. J. Cameron,

    1. Portland Alcohol Research Center, Department of Behavioral Neuroscience, Oregon Health & Science University and VA Medical Center, Portland, Oregon, USA,
  • J. P. Schlumbohm,

    1. Portland Alcohol Research Center, Department of Behavioral Neuroscience, Oregon Health & Science University and VA Medical Center, Portland, Oregon, USA,
  • J. S. Rhodes,

    1. Portland Alcohol Research Center, Department of Behavioral Neuroscience, Oregon Health & Science University and VA Medical Center, Portland, Oregon, USA,
  • P. Metten,

    1. Portland Alcohol Research Center, Department of Behavioral Neuroscience, Oregon Health & Science University and VA Medical Center, Portland, Oregon, USA,
  • D. Wahlsten

    1. Department of Psychology and Centre for Neuroscience, University of Alberta, Edmonton, Alberta, Canada

*J. C. Crabbe, VA Medical Center (R & D 12), 3710 SW US Veterans Hospital Road, Portland, Oregon 97239 USA. E-mail:


Mice from 8 to 21 inbred strains were tested for sensitivity to ethanol intoxication using a range of doses and three different measures: the screen test, the dowel test and a test of grip strength. Strains differed under nearly all conditions. For the dowel test, two dowel widths were employed, and mice were tested immediately or 30 min after ethanol. For the dowel and screen tests, low doses failed to affect some strains, and the highest doses failed to discriminate among mice, maximally affecting nearly all. For grip strength, a single ethanol dose was used, and mice of all strains were affected. Pharmacokinetic differences among strains were significant, but these could not account for strain differences in intoxication. For doses and test conditions in the middle range, there were only modest correlations among strain means within a test. In addition, genotypic correlations across tests were modest to quite low. These results suggest that different specific versions of a test reflect the influence of different genes, and that genetic influences on different tests were also distinct.

Many neurological disorders are investigated using laboratory mice, particularly now that mouse genetics offers efficient methods for assessing genetic contributions to etiology. Motor system function is studied using a wide variety of laboratory measures, and we presume that performance on these tests probably reflects different underlying physiological substrates. Motor performance is behaviorally complex, and for many tasks it seems likely that good performance may require at least some of the following: intact vision, accurate proprioceptive feedback, balance, gait coordination, locomotion, muscle strength, attention, motivation and ability to learn and remember.

The most commonly used laboratory test of motor function for rodents is probably the rotarod (Bogo et al. 1981; Dunham & Miya 1957; Jones & Roberts 1968). This ubiquitous task can be used in fixed-speed or constantly accelerating versions, and we have recently explored several aspects of the apparatus and test procedures (Rustay et al. 2003a; Rustay et al. 2003b). In addition to the rotarod, we have experience with multiple tests of the global behavioral domain of motor performance. These include: the tasks reported in this paper, the screen test, the dowel test and a measure of grip strength (Browman & Crabbe 2000; Crabbe 1983; Schafer & Crabbe 1996), the grid test and balance beam (Crabbe et al. in press) and a test of ataxia rated by an observer (Crabbe et al. 1982; Metten et al. 2003). All these tests were sensitive to the intoxicating effects of an intraperitoneal dose of ethanol. However, for most tests we used a restricted set of apparatus and test protocol parameters for our studies (e.g., a single ethanol dose), as have other investigators for the most part.

A mouse's genotype affects its ability to perform in all of the above assays. The three tasks reported here have been used to detect differences among inbred strains (Browman & Crabbe 2000; Crabbe 1983; Damjanovich & MacInnes 1973; Kirstein & Tabakoff 2001; Kirstein et al. 2002), between lines of mice selectively bred for high or low ethanol response (Deitrich et al. 2000; Erwin & Deitrich 1996; Erwin et al. 2000; Gehle & Erwin 2000; Perlman & Goldstein 1984; Schafer & Crabbe 1996) or between mice with an engineered, targeted mutation and their wild-type controls (Le Marec & Lalonde 1997; Smith et al. 1995; Voikar et al. 2002).

It remains unknown which physiological substrates are called into play by which tasks. Most tasks doubtless require multiple skills. For example, one version of the grip strength task is said to assess both force and duration of grip, and the former begins to differentiate wobbler mutant mice with lower motor neuron disease from controls after 18 days of age (Smith et al. 1995). While remaining perched on a small-diameter dowel seems to the experimenter to require primarily good balance, the task might also require a modicum of muscle strength. In addition, reasonable motivation not to fall to the bedding below and cerebellar learning might contribute to performance. The dangers of anthropomorphizing commonly used tasks are clear (Wahlsten et al. 2003c). We do know that genetic influences are not the same for all tasks. For example, we compared mice with a null mutation of the serotonin 1B receptor gene with their wild-type controls on several tests. We found that ethanol-treated null mutant mice made fewer missteps in the grid test and on a balance beam than their wild-type controls, but there was no effect of the gene deletion on the screen test, the dowel test, the grip strength test or either of two versions of the rotarod (Boehm et al. 2000).

To add to the complexity, even for a single task, patterns of genetic influence can depend upon exactly how the task is conducted. For example, we recently optimized both the fixed speed and accelerating rotarod tasks to detect ethanol intoxication under a broad range of conditions in mice (Rustay et al. 2003b). In a subsequent comparison of 8–20 inbred strains, we found that when mice were tested 30 min after ethanol injection in both tasks, there was a great deal of genetic influence in common on fixed speed and accelerating rotarod strain differences. But when accelerating rotarod performance at 30 min was compared with fixed-speed rotarod intoxication tested immediately after injection, there was a very low genetic correlation among strain means (Rustay et al. 2003a).

In the current studies, we sought to evaluate inbred strain sensitivity differences in three tasks. We had several goals. We wanted, if possible, to identify task-specific test protocols and doses of ethanol that would affect nearly all mouse strains, and which could provide a ‘snapshot’ of strain sensitivities across a broad range of conditions for that behavior. We also wanted to see whether some tasks appeared to reflect the influence of genes similar to those affecting other tasks. For example, if we found that the rank order of strain sensitivities was identical for the screen and grip strength tests, we would not need to incorporate both in future test batteries of ethanol intoxication. Our eventual goal in these studies is to combine the results with those from several other tests of ethanol intoxication in these strains, encompassing a broader range of potentially relevant phenotypes (e.g., motor stimulation, gait). This would extend the generality of the findings.

Initially we evaluated the physical apparatus and the test conditions using genetically outbred mice. These preliminary studies, which are not presented, allowed us to narrow the range of conditions under which we tested multiple inbred strains. Our outbred stock (WSC) was derived from the HS/Ibg stock at the Institute for Behavioral Genetics (Boulder, CO). The HS stock is an intercross of 8 inbred strains (A, AK, BALB/c, C3H, C57BL, DBA/2, ISBi and RIII). Because these mice are segregating for many alleles, the range of their individual differences in a behavioral task is highly likely to encompass the range seen in most inbred strains (McClearn et al. 1970). The strain characterizations are in part designed to contribute to the Mouse Phenome Project (MPP) (Paigen & Eppig 2000), a consortium effort to provide basic behavioral and physiological data on a variety of mouse genotypes. Another, short-term, goal was to provide suggestions for appropriate methods that would enable screening of many novel genotypes. Thus, we also report reliability measures for the strain differences in these studies.

Materials and methods

Animals, husbandry and general procedures

Mice from 8 inbred strains were tested (for grip strength, we tested an additional 13 strains – see below). The strains were selected from those on the ‘A’ list of the Mouse Phenome Project ( docs/pristrains), and were obtained from The Jackson Laboratory (Bar Harbor, ME) at 5–6 weeks of age. A-list designation by the MPP was given because these strains are widely used, fertile, relatively inexpensive (average cost of the strains we studied =$15.38/mouse; for the entire A-list, cost was $26.23/mouse) and offer diverse pedigrees (Paigen & Eppig 2000). In our 8-strain panel, we also included the BTBR strain ($41.30/mouse), which once, briefly, was on the A-list, but is now on the D-list, because we have data on them for several other related tasks. We also studied B6D2F1 mice in the screen and dowel tests. Although B6D2F1 individuals are identical at each gene, they are heterozygotes for all genes where B6 and D2 mice possess different alleles. Although B6D2F1 data were collected, they were neither analyzed nor presented, but are available from the authors on request. The following eight inbred strains were used in all three tests: 129S1/SvImJ (129), A/J (A), BALB/cByJ (BALB), BTBR T + tf/tf (BTBR), C3H/HeJ (C3H), C57BL/6J (B6), DBA/2J (D2) and FVB/NJ (FVB).

Mice were housed in plastic shoebox cages lined with Bed-o-Cob™ bedding, usually 3 mice per cage. Food (Purina 5001™, Animal Specialities Inc, Hubbard, OR) and water were provided ad libitum. Colony lights were on for 12 h (06.00–18.00) and the ambient temperature was 21 ± 1 °C. Bedding was changed weekly, but always after testing. Two shipments were received for each of the screen and dowel tests, each comprising mice of all strains. For the grip strength test, mice were divided across 8 shipments based on availability. For each shipment, cages were randomized for testing order. Mice were approximately equally divided between males and females in each strain X dose condition. All mice were tested in only 1 of the 3 tests, and were naive at the start of testing. On the day of a test, mice were first moved to the test room and habituated for one hour. All housing and test procedures were approved by the Institutional Animal Care and Use Committee at the VA Medical Center and were in accordance with United States Public Health Service guidelines.

Screen test

The screen test apparatus comprises a 0.5-cm² wire grid (screen) mounted in a wooden frame. When horizontal, the grid (15 cm × 15 cm) is positioned 55 cm above bedding. Two vertical arms support the rotating grid, and the entire apparatus is attached to a horizontal wooden base.

Mice were 8 weeks old at the time of the initial (saline) test. After habituation, all animals were weighed and placed back into their home cages. Subjects were then injected i.p. with saline at five minute intervals and individually housed in holding cages lined with cob bedding. Thirty minutes after injection, mice were removed from the holding cage and placed onto the horizontal screen. Once the animal grasped the wire, the screen was slowly rotated to a vertical orientation over 2–3 seconds, and the latency to fall was recorded. A criterion latency of 120 seconds was set. The three animals that failed the first criterion test were given and passed a second test, after all remaining animals were tested.

One week later, each animal was tested again, except that it was given one of four doses of ethanol (1.5, 1.75, 2.25 or 2.5 g/kg). Doses were assigned such that within a cage at least one animal had a dose differing from the other mice. Ethanol (Pharmco Inc., Brookfield, CT) was mixed 20% v/v in physiological saline (Baxter Healthcare Corp., Deerfield, IL). Thirty min later, each mouse was tested. Animals not falling after 240 seconds were given a score of 240. One week after the first ethanol trial, all subjects were again quasi-randomized into dose groups and tested on the screen for a second time. This time we included the 2.0 g/kg dose and a higher dose and deleted the 1.5 g/kg dose, which had shown a floor effect. Doses for the second ethanol trial were 1.75, 2.0, 2.25, 2.5 or 2.75 g/kg, and were assigned such that no animal received the same dose for both trials. Numbers of mice tested for each strain × dose combination ranged from 4 to 12. A total of 212 mice were tested on the screen.

Dowel test

Many versions of this test have been reported. We selected a version adopted from others (Erwin & Deitrich 1996; Gallaher et al. 1982) with which we had previous experience (Browman & Crabbe 2000; Crabbe 1983; Schafer & Crabbe 1996). The hardwood round dowel was 9.6 mm in diameter, mounted horizontally 55 cm above the tabletop, and divided into four 15 cm long sections, with 8 cm diameter dividers between the sections; this configuration allows testing of multiple mice while permitting each a modest degree of movement along the dowel. Mice were 12–16 weeks old at the beginning of testing. After habituation, each mouse in a group of four was placed on an individual section of the dowel so that the length of its body was parallel with the dowel and the mouse was facing the divider, for a two-minute criterion test. Upon successful completion of the criterion test (all mice passed), the four mice in a group were weighed and then injected with a quasi-randomly pre-assigned dose of ethanol (1.5, 2.0 or 2.5 g/kg i.p., 20% v/v); all 4 mice were weighed and injected within one min. Immediately following injection, each animal in the squad was placed on the dowel. The latency to fall from the dowel into a cage of bedding (or a maximum score of 300 seconds) was recorded, and the mouse was then returned to its home cage. At 30 min after injection, each mouse was again placed on the dowel and tested (NB, by this time a substantial level of acute functional tolerance could have developed, contributing to strain differences; see Discussion and Erwin & Deitrich 1996).

Ten to 14 days after testing on the 9.6 mm dowel, the animals were tested on a 15.8 mm dowel, using the same dose range and following the same procedure, including the criterion test. Mice were randomized into (usually) different dose groups for this test. Latencies to fall at both T0 and T30 were recorded. At 35 min after injection, each mouse was placed in a restrainer and the tip of its tail was nicked to draw a 20-µl blood sample for blood ethanol determinations. One hundred and sixty-nine mice (four to eight per strain) were tested.

Grip strength test

The apparatus comprises a push-pull strain gauge (Amtek Accuforce Cadet; Nevins et al. 1993). Mice were housed 2–4 to a cage and were tested at 7–11 weeks of age. On the test day, mice were moved to the testing room and allowed to habituate for 60 min before testing. Each mouse was weighed and singly housed in a holding cage for the duration of the experiment. After all mice were weighed, each mouse was tested for baseline grip strength. A 3 mm diameter triangular piece of metal wire was used as the grip bar. A mouse was held near the base of its tail and lowered toward the bar until it gripped the bar with both forepaws. The mouse was then gently pulled away from the bar at a steady rate of about 2.5 cm/second until the bar was released. Pulling the mice away more slowly results in many mice prematurely releasing the bar with one or both paws, before their grip is ‘broken.’ Peak force disturbance was automatically registered in grams-force by the apparatus. Data were recorded, and two additional trials were immediately given.

After all baseline measurements were taken, mice were injected i.p. with 2.0 g/kg ethanol (20% v/v) at 2 min intervals, and were tested for grip strength 30 min following injection. Thus, each mouse served as its own control. Immediately thereafter, a 20 µl blood sample was taken from the tip of the tail for blood ethanol analysis. Additional blood samples were taken at 60, 105 and 150 min. During the first passes of testing, we noted that mice of some of the smaller strains had difficulty providing four tail blood samples. Some provided only two of the last three samples. When this occurred, in later passes, we drew the last sample from the peri-orbital sinus. In addition to the 8-strain panel tested in the screen and dowel tests, we also tested mice from the following inbred strains, generously provided from their A or B priority lists by the Mouse Phenome Project: AKR/J (AKR), C57L/J (L), C58/J (C58), CAST/Ei (CAST), MOLF/Ei (MOLF), NOD/LtJ (NOD), NZB/BlNJ (NZB), PERA/Ei (PERA), PL/J (PL), SJL/J (SJL), SM/J (SM), SPRET/Ei (SPR), SWR/J (SWR). A total of 258 mice (10–18 per strain) were tested. Equipment failure caused loss of baseline data from 6 animals and loss of post-ethanol data from 9 additional animals. Thus, the ultimate sample sizes were 7–18 mice/strain, with 243 mice successfully tested.

Blood ethanol concentration (BEC) assays

Blood samples were processed and analyzed by gas chromatography according to previously published methods (Ponomarev & Crabbe 2002). For BEC data, there were two ways for incomplete data to occur. Firstly, some blood samples were difficult to obtain at some time points. This affected the following numbers of samples at each time point: 30 min – 1 sample; 60 min – 8 samples; 105 min – 9 samples; 150 min – 19 samples. Secondly, some BEC data were eliminated as statistical outliers (i.e., beyond the mean ± 3 SD at that time point for all 258 mice). By this rule, the following numbers of samples were eliminated at each time point: 30 min – 2 samples; 60 min – 7 samples; 105 min – 4 samples; 150 min – 1 sample.
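The outlier rule described above (values beyond the mean ± 3 SD at a time point) can be sketched as follows. This is only an illustration: the BEC values are invented, the function name is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics

def flag_outliers(values, n_sd=3.0):
    """Return the values lying beyond mean +/- n_sd * SD of the whole sample."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator); an assumption
    return [v for v in values if abs(v - mean) > n_sd * sd]

# Hypothetical BEC values (mg/ml) at one time point, with one aberrant sample.
bec_30min = [1.40, 1.45, 1.50, 1.55, 1.60] * 4 + [4.00]
outliers = flag_outliers(bec_30min)
print(outliers)
```

Pooling all mice at a time point matters for a rule of this kind: in a sample of n values the largest possible standardized deviation is (n − 1)/√n, so no single value can lie more than 3 SD from the mean when n is 10 or fewer.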

Statistical analyses

One general goal was, if possible, to identify one or more particular sets of apparatus, procedure and dose parameters for each test that best described the inbred strains' behavior and sensitivity to ethanol. Thus, our statistical strategy was somewhat unusual. Depending on the dependent variables, we used either factorial analysis of variance, multiple regression or logistic regression. We had to face some unusual circumstances. For example, because strains differ markedly, a particular dose of ethanol might be too high for some strains and completely incapacitate them, but hardly affect others. In such cases, we chose to eliminate particular doses or conditions from the analyses, for we sought conditions under which the largest number of strains would yield interpretable behavior. Other issues are raised in the Results section.

Power and sample size

With Type I error level at α = 0.01 and power at 90% (β = 0.1), sample size per group can be determined using the method of Cohen (1988) for multiple degree of freedom effects. For the same 8 inbred strains, previous work (e.g., Rustay et al. 2003a) has found the strain main effect to have an effect size of about ω² = 0.40 for asymptotic performance on rotarod acquisition and a range of ω² from about 0.2–0.5 after 1–2 g/kg ethanol. A ω² = 0.4 corresponds to Cohen's effect size f = 0.82, so we will assume f_strain = 0.8 and f_drug = 0.5. For 8 strains that are more or less uniformly spread over a wide range of means (Cohen's Pattern 2), only 4 mice per group would be needed to detect the strain main effect and 5 per group to detect the ethanol main effect when the design involves 8 strains and two treatment conditions, control and ethanol.
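The conversion between the proportion of variance explained and Cohen's f used above follows the standard relation f = √(ω² / (1 − ω²)). A minimal sketch (function names are ours, chosen for illustration):

```python
import math

def cohens_f(omega_sq):
    """Cohen's f from a proportion of variance explained (0 <= omega_sq < 1)."""
    return math.sqrt(omega_sq / (1.0 - omega_sq))

def omega_sq_from_f(f):
    """Inverse relation: proportion of variance explained from Cohen's f."""
    return f * f / (1.0 + f * f)

f_strain = cohens_f(0.40)
print(round(f_strain, 2))                               # strain main effect
print(round(cohens_f(0.2), 2), round(cohens_f(0.5), 2))  # range after ethanol
```

With ω² = 0.40 this gives f of about 0.82, the value quoted in the text.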

As shown previously by Wahlsten (1990) for many common kinds of interactions, the sample size needed to detect an interaction is generally larger than what is needed to detect the main effect with the same level of power. Under moderate doses of ethanol, many mice suffer impaired motor function, but the size of the effect might differ among strains. We assumed that 2 strains would be minimally impaired, 2 strains impaired by 0.5 standard deviation, 2 strains reduced by 1.0 standard deviation and the other 2 reduced by 2.0 standard deviations. This would amount to an interaction effect of about f = 0.4, and sample size would need to be n = 11 or 12 mice per strain in each group to achieve 90% power. This kind of interaction would amount to a partial ω² = 0.14, a value that we have detected in several previous studies for the strain × lab environment effect (Crabbe et al. 1999; Wahlsten et al. 2003b). These estimates were used to guide the choices of sample sizes and aid interpretation of results in this study. Formal computations of sample size were not done for each specific experiment, which in some cases was not an 8 × 2 design with independent groups.
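One way to see where an interaction effect of "about f = 0.4" can come from is sketched below. It assumes an 8 strain × 2 treatment (control, ethanol) design, a within-cell SD of 1, no strain differences under control, and "minimally impaired" read as zero impairment. These are our reading of the text, not a reproduction of the authors' calculation.

```python
import math

# Assumed ethanol-induced impairments (in SD units) for the 8 strains.
impairment = [0.0, 0.0, 0.5, 0.5, 1.0, 1.0, 2.0, 2.0]

mean_imp = sum(impairment) / len(impairment)

# With two treatment levels, each strain's interaction effect is plus or minus
# half of its deviation from the mean impairment, so the variance of the
# interaction effects across cells is the mean of those squared half-deviations.
interaction_var = sum(((d - mean_imp) / 2.0) ** 2 for d in impairment) / len(impairment)

f_interaction = math.sqrt(interaction_var)  # divided by within-cell SD = 1
partial_omega_sq = f_interaction ** 2 / (1.0 + f_interaction ** 2)

print(round(f_interaction, 2), round(partial_omega_sq, 2))
```

Under these assumptions f comes out near 0.37, in line with the rounded value of 0.4 in the text; using f = 0.4 exactly gives a partial ω² of 0.16/1.16, or about 0.14.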

Measures of genetic commonality

The proportion of individual differences in two traits attributable to a common set of genes was estimated from the correlation of inbred strain means for selected variables (Hegmann & Possidente 1981). With only eight strains, a correlation of |r| ≥ 0.71 is required for significance (P < 0.05). Nearly all the correlations seen therefore could best be considered suggestive, but we discuss some correlations that did not reach statistical significance. We also examined scatterplots for all correlations, because single outlier strains can have a marked effect on correlation values, producing either false positive or false negative results, and we did not want to overlook potentially interesting relationships simply because of the low power of the correlational analyses. Nonetheless, very strong genetic correlations can emerge from such analyses (Palmer et al. 1987; Rustay et al. 2003a). While the range of each such estimate of genetic correlation is technically −1.0 to +1.0, the correlations are actually bounded by their true reliabilities (see Results for the task reliability estimates).
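The significance threshold for a correlation of eight strain means can be checked numerically. The sketch below assumes a two-tailed test and bivariate normality, under which the null density of r is proportional to (1 − r²)^((n − 4)/2). It uses only the standard library (Simpson's rule plus bisection); the function names are ours, and it is intended for n ≥ 4.

```python
def two_tailed_p(r, n, steps=2000):
    """Two-tailed P for a Pearson r from n pairs under H0: rho = 0."""
    k = (n - 4) / 2.0

    def density(x):
        # Unnormalized null density of r on [-1, 1].
        return (1.0 - x * x) ** k

    def simpson(a, b):
        h = (b - a) / steps  # steps must be even
        total = density(a) + density(b)
        for i in range(1, steps):
            total += (4 if i % 2 else 2) * density(a + i * h)
        return total * h / 3.0

    # The density is symmetric, so 2 * upper tail / whole = upper tail / half.
    return simpson(abs(r), 1.0) / simpson(0.0, 1.0)

def critical_r(n, alpha=0.05):
    """Smallest |r| reaching two-tailed significance at alpha, by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2.0
        if two_tailed_p(mid, n) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

r_crit_8 = critical_r(8)
print(round(r_crit_8, 3))
```

For n = 8 the threshold is close to 0.707, matching the |r| ≥ 0.71 criterion stated above.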


Screen test

Results for the screen test are shown in Fig. 1. The dose-effect curves were quite similar for all strains, with doses between 2.0 and 2.75 g/kg affecting at least some mice of all strains. FVB was somewhat of an exception, for the lowest two doses did not impair any mice, while the highest two doses impaired all mice – that is, their dose effect curve appeared to be steeper than those of the other strains. Across all strains and doses, 50% (186/374) of the mice were not impaired – i.e., remained on the screen for the entire 240 second test. Thus, instead of analyzing latencies, we used logistic regression to analyze the binomial response (impaired vs. not impaired). The strains appeared to fall into two rough groups. The 129, A, BALB and FVB mice (Figs 1a,b) had relatively steep dose-effect curves and somewhat greater ED50 values (range, 2.37–2.13 g/kg). Because strains showed more similar (and shorter) latencies at the higher doses, the differences in slopes and ED50 values occurred because some strains were more sensitive to lower ethanol doses (D2, C3H, BTBR, and B6, Fig. 1c,d; range of ED50 values, 2.02–1.88 g/kg).

Figure 1.

Mean ± SE latency (seconds) to fall from vertical screen 30 min after injection with ethanol at the indicated dose (g/kg). As noted in the methods, data were binomially distributed and were analyzed by logistic regression. Panels (a) and (b): less sensitive strains with steeper dose–response curves. Panels (c) and (d): more sensitive strains with shallower dose–response curves.

The logistic regression analysis yielded significant effects of Dose (P < 0.0001), Strain (P < 0.01), and a significant Sex × Strain interaction (P < 0.01). Males were more affected than females, and we wondered whether this was due to their greater body weight. Entering body weight first as a continuous variable in the regression model yielded significant effects for this variable (heavier animals fell sooner than lighter animals, P < 0.01), but did not change the pattern of significance of the other factors: Dose, Strain, and the Sex × Strain interaction remained significant. Thus, the strain-dependent sex difference could not be entirely accounted for by body weight. The interaction was largely because C3H females were very insensitive and C3H males very sensitive to ethanol, and A females tended to be more sensitive than A males.
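The binary analysis used for the screen test can be illustrated with a small sketch. The counts below are invented (they are not the study's data), the function name is hypothetical, and the model includes only dose, whereas the analysis reported here also included strain, sex and body weight. The ED50 is taken as the dose at which the fitted probability of impairment is 0.5, that is −b0/b1.

```python
import math

def fit_logistic(doses, impaired, totals, iterations=25):
    """Fit P(impaired) = 1 / (1 + exp(-(b0 + b1 * dose))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0          # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0  # information matrix (negative Hessian)
        for d, k, n in zip(doses, impaired, totals):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * d)))
            resid = k - n * p
            w = n * p * (1.0 - p)
            g0 += resid
            g1 += resid * d
            h00 += w
            h01 += w * d
            h11 += w * d * d
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system (information matrix) * step = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical counts for one strain: mice falling before 240 s at each dose.
doses = [1.5, 1.75, 2.0, 2.25, 2.5]   # g/kg
impaired = [1, 3, 5, 7, 9]
totals = [10, 10, 10, 10, 10]

b0, b1 = fit_logistic(doses, impaired, totals)
ed50 = -b0 / b1
print(round(ed50, 2), round(b1, 2))
```

A steeper dose-effect curve corresponds to a larger b1, and a more sensitive strain to a lower ED50.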

Dowel test

Preliminary tests with genetically outbred mice revealed that a 19 mm dowel was too wide to provide a valid test of ethanol's intoxicating effects (for example, mice could remain draped over the dowel even after ethanol doses that caused loss of consciousness). These preliminary studies (with a total of 358 mice given 1.5–2.75 g/kg ethanol) also showed that data taken 10 min after injection were more variable than those obtained either immediately or 30 min later (data not shown).

Results from the dowel are shown in Fig. 2. Because mice were assigned to different dose groups for the tests on the 9.6 mm and the 15.8 mm dowel, we analyzed the two dowel widths separately. In addition, we also analyzed the time points separately. Inspection of the data immediately after injection (T0) suggested a threshold effect, i.e., that at least the 2 and 2.5 g/kg doses yielded very similar results (Fig. 2a,b). This would be expected for a loss-of-function task where ethanol was being rapidly absorbed into brain (Ponomarev & Crabbe 2002). Finally, there was again an apparent sex difference, with shorter latencies for male mice, so we tested to see whether this might reflect their difference in body weight.

Figure 2.

Mean ± SE latency to fall from dowel after injection with ethanol at the indicated dose. Panel (a): immediately after injection, 9.6 mm dowel. Panel (b): immediately after injection, 15.8 mm dowel. Panel (c): thirty min after injection, 9.6 mm dowel. Panel (d): thirty min after injection, 15.8 mm dowel.

We first examined the T0 data from the 9.6 mm dowel (Fig. 2a). Data were highly positively skewed, so data were transformed by raising to the exponent 0.5 to yield non-skewed residuals. Linear regression showed significant effects of Dose (P < 0.001), Sex (P < 0.01) and Strain (P < 0.0001), but no interactions were significant. Entering body weight first into the regression (P < 0.0001) eliminated the effect of sex without changing other outcomes. For the 15.8 mm dowel at T0 (Fig. 2b), it was necessary to raise data to the exponent 0.25. Although Sex was not significant in the initial analysis, for parallelism, we entered body weight into the analysis. Body weight (P < 0.001), Dose (P < 0.001) and Strain (P < 0.01) were all significant, and there were no significant interactions.
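The effect of the power transformations mentioned above can be illustrated with a short sketch. The latencies are invented, and skewness is computed on the transformed values themselves rather than on regression residuals as in the analysis described here, so this shows only the general principle.

```python
def skewness(values):
    """Moment coefficient of skewness (third central moment / SD cubed)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical latencies to fall (seconds), positively skewed with a 300 s cap.
latencies = [2, 3, 4, 5, 6, 8, 10, 15, 25, 60, 120, 300]

skew_by_exponent = {}
for exponent in (1.0, 0.5, 0.25):
    transformed = [v ** exponent for v in latencies]
    skew_by_exponent[exponent] = skewness(transformed)
    print(exponent, round(skew_by_exponent[exponent], 2))
```

Smaller exponents pull in the long right tail more strongly, which is why a more skewed data set may need the exponent 0.25 rather than 0.5.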

The T30 data were not skewed, so no transformation was necessary. Body weight (P < 0.001) eliminated the further predictive value of sex (P > 0.10). It did not change the pattern of significance of the other factors: Dose (P < 0.0001), Strain (P ≤ 0.01) and their interaction (P ≤ 0.01) were all significant for both dowels after injection (Fig. 2c,d).

Figure 3.

Mean ± SE grip strength before (baseline) and 30 min after injection with 2.0 g/kg ethanol.

For the 2.5 g/kg dose, most animals in most strains were highly impaired, i.e., a ceiling effect was apparent. Similarly, a floor effect was apparent for the 1.5 g/kg dose on the larger dowel (Fig. 2d). For two strains, no mice fell, and many others were unimpaired, so this dose was too low for the wide dowel.

Grip strength

Results of the grip strength test are shown in Fig. 3. An initial analysis was performed on the baseline values, and strains differed significantly (P < 0.0001). However, males had greater grip strength than females (P = 0.001), and there was a significant Strain × Sex interaction (P < 0.0001). For most strains, males were stronger than females, and for most strains this was true even after correcting for body weight. However, the difference in strength of males vs. females after correcting for body weight was small in magnitude (only a 4% difference including all strains, and only an 8% difference after eliminating the three strains that showed the pattern opposite to the others with females being stronger than males). The post-ethanol scores were significantly correlated with baseline scores (r = 0.48, df = 241, P < 0.0001). We therefore indexed effect of ethanol on grip strength as a change from baseline, and included body weight in the analysis. With this index of strain sensitivity to ethanol (which is really the interaction of strain and ethanol effects), there were significant effects of Body weight (P < 0.0001), Strain (P < 0.001) and Sex (P < 0.001), but the interaction of Strain × Sex was not significant (P = 0.2). Males were more affected by ethanol than females.

Ethanol pharmacokinetics

BEC are given in Table 1. Data for eight strains taken after the dowel test showed a trend toward a significant strain effect after 1.5 g/kg ethanol (F = 2.12, P = 0.06), and differed significantly among strains after both 2.0 and 2.5 g/kg (F ≥ 2.90, P ≤ 0.01). B6 and D2 mice tended to have higher BEC than the other six strains, which did not appear to differ (Table 1). The 21 strains tested for grip strength also differed significantly 30 min after injection of 2.0 g/kg ethanol (F = 9.72, P < 0.01). In all these analyses, there were highly significant sex differences, with males achieving higher BEC. The interactions of Sex × Strain, however, were not significant (P > 0.01). As blood ethanol levels decline pseudolinearly after initial tissue equilibrium has been reached, we also estimated the slope of the blood ethanol metabolism curve (Widmark's β) by linear regression for each strain using the last three data points taken after the grip strength test (Grisel et al. 2002). These data are shown in Table 1.
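The slope estimate can be sketched as an ordinary least-squares fit through the last three time points. The example uses the strain-mean BEC values listed for the 129 strain in Table 1 (60, 105 and 150 min); fitting the strain means rather than individual animals is our simplification, so the result need not match the tabled value exactly. The function name is ours.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

hours = [60 / 60, 105 / 60, 150 / 60]  # sampling times converted to hours
bec_129 = [1.92, 1.55, 1.10]           # mg/ml, strain 129 means from Table 1

widmark_beta = ols_slope(hours, bec_129)
print(round(widmark_beta, 3))  # mg/ml/h; negative because BEC is declining
```

The fitted slope of roughly −0.55 mg/ml/h is close to the −0.54 ± 0.03 listed for this strain.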

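Table 1. Ethanol pharmacokinetics in inbred strains

Dowel columns: dowel test BEC ± SE (mg/ml) at each dose. Grip columns: BEC (mg/ml) at each time after 2 g/kg in the grip strength test. Widmark's β: mg/ml/h ± SE.

Strain | Dowel 1.5 g/kg | Dowel 2.0 g/kg | Dowel 2.5 g/kg | Grip 30 min | Grip 60 min | Grip 105 min | Grip 150 min | Widmark's β
------ | -------------- | -------------- | -------------- | ----------- | ----------- | ------------ | ------------ | ------------
129    | 1.19 ± 0.10 | 1.48 ± 0.09 | 2.04 ± 0.14 | 1.59 | 1.92 | 1.55 | 1.10 | −0.54 ± 0.03
A      | 1.09 ± 0.10 | 1.39 ± 0.11 | 1.84 ± |  |  |  |  | −0.57 ± 0.11
AKR    |  |  |  | 1.24 | 1.54 | 1.08 | 0.55 | −0.66 ± 0.03
BALB   | 0.96 ± 0.04 | 1.37 ± 0.11 | 1.81 ± 0.11 | 1.35 | 1.55 | 1.41 | 0.91 | −0.42 ± 0.13
BTBR   | 1.16 ± 0.10 | 1.52 ± 0.19 | 1.99 ± 0.13 | 1.34 | 1.54 | 1.19 | 0.63 | −0.60 ± 0.08
C3H    | 0.96 ± 0.05 | 1.34 ± 0.08 | 1.86 ± 0.15 | 1.31 | 1.49 | 1.12 | 0.63 | −0.57 ± 0.04
B6     | 1.28 ± 0.05 | 1.86 ± 0.17 | 2.40 ± 0.10 | 1.93 | 1.80 | 1.39 | 0.89 | −0.61 ± 0.03
L      |  |  |  | 1.72 | 1.53 | 1.33 | 0.85 | −0.45 ± 0.11
C58    |  |  |  | 1.56 | 1.39 | 1.19 | 0.54 | −0.57 ± 0.17
CAST   |  |  |  |  |  |  |  | −0.68 ± 0.46
D2     | 1.16 ± 0.07 | 1.94 ± 0.18 | 2.30 ± 0.14 | 1.70 | 1.69 | 1.07 | 0.54 | −0.76 ± 0.36
FVB    | 1.14 ± 0.16 | 1.55 ± 0.20 | 1.77 ± 0.22 | 1.64 | 1.66 | 1.01 | 0.51 | −0.76 ± 0.06
MOLF   |  |  |  | 1.47 | 1.19 | 0.80 | 0.25 | −0.63 ± 0.06
NOD    |  |  |  | 1.43 | 1.59 | 1.22 | 0.74 | −0.57 ± 0.04
NZB    |  |  |  | 1.66 | 1.56 | 1.15 | 0.63 | −0.62 ± 0.04
PERA   |  |  |  | 1.73 | 1.72 | 1.12 | 0.45 | −0.85 ± 0.02
PL     |  |  |  | 1.44 | 1.62 | 1.37 | 0.83 | −0.52 ± 0.12
SJL    |  |  |  | 1.41 | 1.43 | 1.15 | 0.62 | −0.54 ± 0.10
SM     |  |  |  | 1.80 | 1.69 | 1.07 | 0.54 | −0.77 ± 0.03
SPRET  |  |  |  | 1.76 | 1.63 | 1.08 | 0.45 | −0.79 ± 0.03
SWR    |  |  |  | 1.43 | 1.48 | 1.02 | 0.44 | −0.69 ± 0.04

BEC = blood ethanol concentration. Widmark's β is the slope of the blood ethanol elimination curve, estimated from the last 3 time points. Range of SE for the grip strength BEC data is 0.05–0.15 mg/ml.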
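
The slope estimate described above can be sketched as an ordinary least-squares fit to the last three time points. The following is a minimal illustration only, assuming the strain-mean BEC values for the 129 strain from Table 1; the function name `ols_slope` is hypothetical, and the published values were presumably fitted to the original animal-level data, so a fit to strain means can only approximate them.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Strain-mean BEC (mg/ml) for the 129 strain at 60, 105 and 150 min (Table 1).
# Times are converted to hours so the slope is in mg/ml/h.
times_h = [60 / 60, 105 / 60, 150 / 60]
bec_129 = [1.92, 1.55, 1.10]

beta_129 = ols_slope(times_h, bec_129)
print(f"Approximate Widmark's beta for 129: {beta_129:.2f} mg/ml/h")
```

For these three means the fitted slope is about −0.55 mg/ml/h, close to the −0.54 ± 0.03 listed for this strain.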

Relationships among variables

Table 2 shows the correlations among variables, and Table 3 the rank order of strain sensitivity values for each behavioral task. Selected variables were analyzed. For the screen test, we estimated the ED50 for each strain. The values were approximately bounded by the 1.75 and 2.25 g/kg doses. Future experiments to compare novel genotypes would need to choose specific doses of ethanol, so we included latency to fall after these doses as well. For the dowel test, we wanted to select an appropriate condition or conditions from among the 12 dowel size × dose × time point conditions. Because the 2 and 2.5 g/kg doses did not differ at T0 due to a threshold effect, and the mice were too intoxicated at T30 after the 2.5 g/kg dose, we deemed the highest dose inappropriate at either time, as well as the 1.5 g/kg dose for the test occurring at T30 on the 15.8 mm dowel (floor effect). For grip strength we used difference from baseline 30 min after 2 g/kg ethanol. Because strains could differ in behavioral sensitivity due to different ethanol doses reaching the brain, we examined the initial blood ethanol value taken 35 min after injection in the dowel test after each dose, and the BEC taken in the grip strength test after 2 g/kg. The blood ethanol levels correlated well with each other (0.72 ≤ r ≤ 0.89, all P < 0.01), so we included the value taken in the grip strength test at 30 min in Table 2. Selected strain mean correlations among these variables are shown in Table 2. There were no significant correlations between blood ethanol levels and post-ethanol performance on any task. This suggests that the behavioral strain differences do not represent different effective doses of ethanol in the strains.

Table 2. Correlations among measures of ethanol sensitivity

Columns 1–7 are dowel test conditions, columns 8–9 are screen test conditions and column 10 is grip strength.

                | T0 9.6 (1.5) | T0 9.6 (2.0) | T0 15.8 (1.5) | T0 15.8 (2.0) | T30 9.6 (1.5) | T30 9.6 (2.0) | T30 15.8 (2.0) | Scr (1.75) | Scr (2.25) | Grip Diff (2.0)
--------------- | ------------ | ------------ | ------------- | ------------- | ------------- | ------------- | -------------- | ---------- | ---------- | ---------------
T0 9.6 (2.0)    | 0.81 |  |  |  |  |  |  |  |  |
T0 15.8 (1.5)   | 0.23 | 0.62 |  |  |  |  |  |  |  |
T0 15.8 (2.0)   | 0.62 | 0.89 | 0.54 |  |  |  |  |  |  |
T30 9.6 (1.5)   | 0.75 | 0.44 | 0.16 | 0.18 |  |  |  |  |  |
T30 9.6 (2.0)   | 0.53 | 0.59 | 0.50 | 0.48 | 0.51 |  |  |  |  |
T30 15.8 (2.0)  | 0.25 | 0.64 | 0.86 | 0.64 | 0.40 | 0.50 |  |  |  |
Scr (1.75)      | 0.05 | 0.18 | −0.36 | 0.25 | −0.21 | −0.11 | −0.32 |  |  |
Scr (2.25)      | 0.65 | 0.52 | 0.12 | 0.18 | 0.40 | 0.12 | 0.22 | 0.38 |  |
Grip Diff (2.0) | 0.60 | 0.21 | −0.36 | −0.01 | 0.47 | −0.22 | 0.23 | 0.27 | 0.78 |
Grip BEC        | −0.21 | −0.11 | 0.38 | −0.24 | 0.08 | −0.45 | −0.46 | −0.34 | 0.02 | 0.01

For the dowel test, T0 and T30 refer to time in min after injection of the dose of ethanol shown in parentheses, and 9.6 and 15.8 refer to mm width of dowel. For the screen test (30 min after injection), doses are given in parentheses. Grip strength = difference from baseline 30 min after 2 g/kg. Grip BEC = blood ethanol concentration 30 min after injection of 2 g/kg. The variables selected as representative of that task are indicated in bold italics in the header. The full correlation matrix is available from the authors upon request. Correlations among 8 strain mean values (df = 6) are shown. |r| ≥ 0.7, P ≤ 0.05 (shown in bold). |r| ≥ 0.81, P ≤ 0.01 (shown in bold underline). *The correlation r = 0.23 was heavily influenced by the BALB strain, and became 0.64 without it.

Table 3. Strain sensitivity ranks

Screen (1.75 g/kg)    | 6 | 6 | 5 | 3 | 4 | 2 | 1 | 6
Screen (2.25 g/kg)    | 8 | 6 | 7 | 3 | 2 | 5 | 4 | 1
Dowel (T0/15.8/1.5)*  | 3 | 8 | 5 | 1 | 2 | 6 | 7 | 4
Dowel (T30/9.6/1.5)** | 6 | 2 | 8 | 1 | 5 | 4 | 7 | 3
Grip (2.0 g/kg)       | 2 | 8 | 1 | 3 | 7 | 4 | 6 | 5

1 = most sensitive, 8 = least sensitive. *Data immediately after injection of 1.5 g/kg on the 15.8 mm dowel. **Data 30 min after injection of 1.5 g/kg on the 9.6 mm dowel.
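
As a rough cross-check on how little the two selected dowel conditions agree, the ranks in Table 3 can be compared with a Spearman rank correlation. This is a sketch under stated assumptions: the correlations in Table 2 were computed on strain mean values, not on ranks, so Spearman's coefficient is a substituted technique here, and the shortcut formula used is valid only when neither row contains tied ranks (true for the two dowel rows). The function name `spearman_rho` is hypothetical.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two untied rank vectors of equal length."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Ranks copied from Table 3, in the column order given there.
dowel_t0_158 = [3, 8, 5, 1, 2, 6, 7, 4]   # Dowel (T0/15.8/1.5)
dowel_t30_96 = [6, 2, 8, 1, 5, 4, 7, 3]   # Dowel (T30/9.6/1.5)

rho = spearman_rho(dowel_t0_158, dowel_t30_96)
print(f"Spearman rho between the two dowel conditions: {rho:.2f}")
```

The rank-based value (about 0.19) is of the same small size as the strain-mean correlation of 0.16 between these conditions in Table 2.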

For both the dowel and screen tests, the patterns of strain responses were substantially different across the specific conditions of the test. Thus, it proved impossible to select a single condition that indexed strain-specific sensitivity adequately for either task. For the screen test, response latencies to the 1.75 and 2.0 g/kg doses were correlated (r = 0.74), suggesting that these two conditions were markedly influenced by the same genes. Thus, the 1.75 and 2.25 g/kg doses (which were correlated with each other at only r = 0.38) were included in Table 2. However, screen latency correlations among other doses were modest. For the screen test at the higher doses, performance was more difficult to interpret because many mice in some strains were showing maximum impairment.

For the dowel test, examination of the patterns of correlations of strain mean values revealed that at T0, nearly all combinations of dose and dowel were correlated (0.54 ≤ r ≤ 0.89). For T30, a similar, but weaker, pattern of general correlation was seen (0.40 ≤ r ≤ 0.51). Dowel test correlations between performance at T0 and T30 varied more widely, ranging between 0.16 and 0.86. This pattern of correlations suggested that no single dose, dowel and time could adequately represent the range of strain sensitivities to ethanol. Therefore, to index dowel sensitivity, we selected two responses, latency at T0 on the 15.8 mm dowel, and latency at T30 on the 9.6 mm dowel, both after 1.5 g/kg. Although each of these dose-dowel-time conditions was reasonably representative of other dose-dowel combinations at the same time, they were themselves correlated at only r = 0.16. In this way, we intentionally sought conditions reflecting the influence of at least two, largely different, groups of genes.

Sensitivity across the screen and dowel tests correlated modestly, and then only under certain conditions. On the 9.6 mm dowel, strains sensitive to 1.5 or 2.0 g/kg on the dowel at T0 were more sensitive to 2.25 g/kg ethanol on the screen (0.52 ≤ r ≤ 0.65). However, correlations with other doses on the screen apparatus were lower. Even though correlations were positive between dowel sensitivity at T0 and T30 for these doses, sensitivity on the larger dowel at T0 after 1.5 or 2.0 g/kg was essentially unrelated to screen test sensitivity (−0.36 ≤ r ≤ 0.25). The screen (at 2.25 g/kg ethanol) and grip strength tests (at 2.0 g/kg) yielded very congruent results (r = 0.78), but this correlation depended substantially on two strains highly insensitive in both tasks (129 and BALB). However, when the screen and grip strength tests were correlated at the same dose (2.0 g/kg), the correlation was much lower (r = 0.27). Comparing grip and dowel sensitivity (1.5 g/kg) revealed a modest correlation with the 9.6 mm dowel at T30 (r = 0.47), but a negative correlation with T0 performance on the 15.8 mm dowel at 1.5 g/kg (r = −0.36).

Reliability estimates

For the grip strength test, we were able to estimate reliability directly because three measurements were made both at baseline and after ethanol. Genetic correlations (among strain mean values) were high, both among the three baseline scores (0.85 ≤ r ≤ 0.94) and among the three difference-from-baseline scores (0.56 ≤ r ≤ 0.75). Interestingly, the phenotypic correlations (across all mice) were substantially lower (∼0.6 for baseline, ∼0.35 for difference after ethanol). The variable included in Table 2 for correlational analysis was the average baseline minus average post-ethanol score.

For the screen and dowel tests, animals were not tested twice under exactly the same conditions. We therefore estimated reliability by splitting the samples into two groups and correlating these values. We ignored sex in these estimates to keep subgroup sample sizes reasonable. For the screen test, we divided each strain/dose group and with these 48 pairs of data points found the split-half reliability estimate to be r = 0.86, using the Spearman-Brown correction. For the dowel test, we estimated the reliabilities to be r = 0.80 for the transformed data on the 9.6 mm dowel at T0, r = 0.61 for the transformed data on the 15.8 mm dowel at T0 and r = 0.79 and 0.87 for the untransformed data at T30, respectively.
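
The Spearman-Brown correction mentioned above steps a correlation between two half-samples up to the reliability expected for the full sample. The sketch below shows the standard two-half form of the formula and its inverse; the function names are hypothetical, and the uncorrected half-sample correlation it recovers is back-calculated from the reported r = 0.86 for illustration, not a value reported in this study.

```python
def spearman_brown(r_half):
    """Step a split-half correlation up to full-length reliability."""
    return 2 * r_half / (1 + r_half)


def half_from_corrected(r_full):
    """Invert the correction: the split-half r implied by a corrected value."""
    return r_full / (2 - r_full)


# Reported corrected split-half reliability for the screen test.
r_corrected = 0.86
r_half_implied = half_from_corrected(r_corrected)
print(f"Implied uncorrected split-half r: {r_half_implied:.2f}")
print(f"Round trip back to corrected r:  {spearman_brown(r_half_implied):.2f}")
```

Under this formula, a corrected value of 0.86 corresponds to a raw correlation of roughly 0.75 between the two halves.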


Discussion

All three tasks were useful in detecting genetic differences in sensitivity to ethanol-induced intoxication. As noted in the introduction, they had been used before for this purpose, and we found each to be quite reliable under at least some conditions. Nonetheless, the results suggest that these tasks are generally measuring different aspects of performance under the influence of ethanol. Furthermore, for the dowel test and the screen test, small changes in the dose, or the width of the dowel, or the time at which the behavioral assessment occurs, can all yield different patterns of genetic results. If one were screening a null mutant for ethanol sensitivity, it would be difficult to know which variant of these tasks, or dose of ethanol, to choose. For example, a gene deletion might influence sensitivity on the 9.6 mm dowel, but other neural systems might compensate on an easier task, so the null mutant might perform well on a 15.8-mm dowel, even if the same dose of ethanol were used to probe sensitivity. Thus, the different apparatus and test conditions, either across tasks or within a task, can best be considered discrete traits, each with a unique set of contributing genes, but each potentially reflecting the effects of some genes in common.

The results presented here are, of course, limited to the specific genotypes studied, and more robust patterns of genetic correlation might have been detected in a wider range of genotypes. Nonetheless, these strains reflect a reasonable range of genetic variation among standard inbred strains (Beck et al. 2000). In addition, these tasks were not chosen for any presumed ethological relevance, and may not be those behaviors most sensitive to ethanol's effects. For example, we have devised a rating scale of wildness and ease of handling using these strains by noting their response to being transferred to a test situation, the difficulty of capturing them by the tail in order to test them, etc. However, in that survey we found that the strains with the most substantial ‘wildness’ scores were the wild-derived strains SPRET/Ei, PERA/Ei and CAST/Ei, none of which were tested here (except in the grip strength test; see below), as we had sufficiently learned our lesson (Wahlsten et al. 2003a).

Probably the most informative (and conservative) course for any study comparing genotypes on a task would be to employ multiple versions of the task to ensure the generality of the findings. Thus, Contet et al. (2001) compared two inbred mouse strains in a task requiring them to traverse dowels of different diameters, and found that the strain differences were suppressed for the smallest diameter dowel (9 mm). Kiernan et al. (1999) found complex interactions of the ability of tenascin-C null mutants vs. wild types to traverse different width dowels, depending on the background genotype. In the current experiment, we used the dowels as a test of ability to remain upright. In other studies, we have compared the strains tested here for their ability to traverse rectangular balance beams of different widths (12.7, 15.8 and 19.0 mm; Crabbe et al. in press). Depending on the particular test parameters employed, different strains would appear to be most or least affected, and the more complete analysis of additional tasks we will perform may allow us to distinguish the contributions of dowel/rod shape vs. diameter vs. the activity demands of the task (stationary vs. moving).

Even though there were few significant genotypic correlations across tasks, it still could be possible that some strain or strains stood out as particularly sensitive or resistant to ethanol across multiple tasks. Table 3 shows that this was not really the case. BTBR came the closest to being such an outlier, as they were the most sensitive strain to several conditions on the dowel (including the two conditions selected for comparisons with other tasks, as discussed in the Results) and were the third most sensitive strain on the screen and grip strength tests. In our other studies, they were the most sensitive strain among 20 to ethanol-induced ataxia on the accelerating rotarod. They were also very sensitive on the fixed speed rotarod when tested 30 min after injection, but were among the least sensitive strains immediately after injection (Rustay et al. 2003a). Their relative sensitivity to ethanol in the grid test and the balance beam test varied across doses and conditions (Crabbe et al. in press). The A strain was quite insensitive in the screen and grip strength tests, but was insensitive under only some dowel test conditions. One other such finding was the FVB strain, which had a steep dose–response curve on the screen test. In other studies comparing these 8 genotypes, we saw that the FVB strain had a very steep dose–response curve for observer-rated ataxia (Metten et al. submitted).

The pattern of sex differences in the current work was interesting. In all three tasks, males were more impaired by alcohol than females. However, whether or not the sex difference merely reflected differences in body weight depended on the task. For the dowel test, sex per se was not a factor, as body weight accounted for all sex differences. For grip strength, a small sex difference remained unaccounted for by body weight, but this did not differ among strains. For the screen test, removal of the contribution of body weight could not entirely account for the sex difference, and there remained a significant strain × sex interaction. In studies with a balance beam in the same strains, we found that body weight differences could account for sex differences in ethanol sensitivity (Crabbe et al. in press). In yet another study, we saw no sex differences, and no significant correlation between body weights and rotarod performance either before or after ethanol (Rustay et al. 2003a).

During the scoring of behavior on the screen and dowel tests, we recorded the specific behavior of the animal at the time it fell. One factor that could contribute to performance on these tests could be activity. For example, one might surmise that an active animal might fall more readily during either test. For the screen test, activity appeared to play no role. Nearly all animals were stationary at the time of falling. For the dowel test, behavior at the time of fall appeared to differ across conditions. Nearly all animals fell when they were either stationary and oriented along the axis of the dowel, or when they were moving. When performance was tested immediately after injection, all mice that fell did so while moving, regardless of dose, strain or dowel diameter. For the T30 tests, however, mice behaved differently after different doses. For example, after the 1.5 g/kg dose, 95% of the mice that fell from the 9.6 mm dowel and 83% of the mice that fell from the 15.8 mm dowel fell while moving. However, after the 2 g/kg dose, 75% and 55% of the mice on the 9.6 and 15.8 mm dowels, respectively, were moving at fall, and the rest were stationary and appeared to lose their balance suddenly. At the 2.5 g/kg dose, these patterns tended to reverse, but differently on the two dowels. On the 9.6 mm dowel, 48% of the mice were moving, while on the 15.8 mm dowel, only 9% were. These differences in the specific behavior noted at the point of intoxication were distributed quite uniformly across strains and did not appear to account for the strain differences in sensitivity to ethanol. However, they may explain the fact that 7/8 strains fell from the 15.8 mm dowel more rapidly than from the 9.6 mm dowel when tested 30 min after 2.5 g/kg ethanol (Fig. 2). A/J were the only mice able to remain on either dowel for very long under these conditions. If moving allowed mice to remain upright for a few more seconds, the almost complete lack of movement of most animals on the 15.8 mm dowel may explain their worse performance. Future global analyses of these strains' performance across a number of tasks that include several more direct and quantitative measures of activity will allow a better understanding of the relationships between effects of ethanol on activity and motor coordination.

The dowel test required more intervention by the experimenter than many other standard ataxia tests. This illustrates the point that handling of the different strains was not entirely uniform, for the strains elicited strain-specific responses from the experimenter. This represents a type of gene-environment correlation that is unavoidable (Rowe 2002). Other sources of gene-environment correlation are also unavoidable in comparisons of purchased strains (e.g., The Jackson Laboratory feeds different strains different diets before shipment; see Crabbe et al. in press). Some strains are simply too wild to provide meaningful behavioral data in many tests (Wahlsten et al. 2003a). Strain differences in ease of handling could interact with test performance. For example, the correlation of 21 strain means on an index of wildness with their scores on the grip strength test was r = −0.47 (P = 0.04). This may reflect our observation that some of the wilder strains were apparently less willing to grasp the bar during the grip strength test.

As noted, there were no significant correlations between blood ethanol concentrations and degree of impairment for any task. While this finding might seem surprising, it is in fact found in most such analyses (Crabbe 1983; Crabbe et al. 1994; Crabbe et al. in press; Rustay et al. 2003a). Certainly, if groups of mice, of whatever genotype, are given different doses of ethanol, the means of the groups will show systematic dose–response relationships in resulting blood ethanol concentrations. Yet, there are always individual differences within any dose group. When genotypes, in this case inbred strains, are compared after a fixed dose, there will also be strain differences in blood ethanol concentrations, as we report in Table 1. However, the issue is whether the strain differences in behavioral response covary with those characterizing blood levels. If they do not significantly covary (as we found here), then mean strain blood level cannot predict behavioral sensitivity. We believe that the underlying reason for this is that strains really do not differ very substantially in blood ethanol levels. For example, Table 1 shows that the range of blood ethanol concentrations 35 min after 1.5 g/kg ethanol was between 0.96 and 1.28 mg ethanol/ml blood. Within this narrow range, the strain differences in absorption and distribution of ethanol are small as compared to the substantial behavioral differences in intoxication. An exception may be seen under conditions of chronic intoxication. When 19 inbred strains of mice were exposed to ethanol vapor for three days while metabolism was inhibited by the drug pyrazole, their blood ethanol levels after 72 h differed markedly and were significantly correlated (r = 0.57, P = 0.02) with the severity of subsequent withdrawal. In that case, we expressed correlations of withdrawal severity with other behavioral variables as the partial correlations holding blood ethanol levels constant (Crabbe et al. 1983).
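
To illustrate what holding blood ethanol levels constant does to a correlation, the first-order partial correlation formula can be applied to three values from Table 2. This is a sketch for illustration only, not an analysis reported here: it takes the screen (2.25 g/kg) vs. grip strength correlation (r = 0.78) and the correlations of each with Grip BEC (0.02 and 0.01), and the function name `partial_corr` is hypothetical.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, holding z constant."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# Values taken from Table 2 (strain mean correlations, 8 strains).
r_screen_grip = 0.78   # Scr (2.25) vs. Grip Diff (2.0)
r_screen_bec = 0.02    # Scr (2.25) vs. Grip BEC
r_grip_bec = 0.01      # Grip Diff (2.0) vs. Grip BEC

r_partial = partial_corr(r_screen_grip, r_screen_bec, r_grip_bec)
print(f"Screen vs. grip, BEC held constant: {r_partial:.2f}")
```

Because both behavioral measures are nearly uncorrelated with BEC, the partial correlation is essentially unchanged from 0.78, which is consistent with the view that strain differences in blood levels do not drive the behavioral relationships.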

Finally, as mentioned in the Materials and methods, because the nervous system can adapt rapidly to the presence of ethanol, acute functional tolerance can develop during the course of a single exposure to ethanol (Erwin & Deitrich 1996). As blood ethanol levels rise after an i.p. injection, mice may show a given level of intoxication at a specific blood ethanol concentration. If intoxication increases, and then the animal begins to recover, which will occur after blood ethanol concentrations have begun to fall, a second blood level taken at the reappearance of the same level of intoxication may be higher than that seen on the rising phase. This provides evidence for acute functional tolerance, which has been reported to differ across mouse genotypes for the dowel test task we studied here (Erwin & Deitrich 1996; Gehle & Erwin 2000), as well as for the loss of righting reflex response to ethanol (Ponomarev & Crabbe 2002). Acute functional tolerance is probably complete somewhere between 10 and 30 min after injection, although this depends no doubt on the behavioral task, strain and dose (Ponomarev & Crabbe 2002).

In both the screen test and dowel test data taken at T30, our results are likely therefore to reflect some combination of strain sensitivity differences and strain differences in tolerance development. Although lines selectively bred for acute functional tolerance differences on the dowel do not differ in acute sensitivity (Deitrich et al. 2000), differences in genotypes and specific procedures make it difficult to know the extent to which this may have contributed to our results without further experiments. However, this could easily explain why the patterns of strain sensitivities at T0 and T30 on the dowel test were not highly correlated, because the T0 measurements were taken before measurable acute functional tolerance could have occurred.

In conclusion, we found these studies helpful in that they allowed us to eliminate extremes of test apparatus, ethanol dose and test procedures for future work. Although an ideal test for examining genotypic differences in mice would elicit interpretable performance from nearly all genotypes, our results suggest that this is not always an achievable goal. The idea of reducing each test from a rather large, parametric set of variables to a single set of conditions suitable for ‘snapshot’ characterizations of genetic differences is admirable, but we needed multiple conditions to extract sufficient information for all strains. For the dowel test in particular, we propose that two conditions could be used for future studies. The test immediately after ethanol injection seemed to reflect sources of genetic influence different from those influencing the later test, at T30, which probably include tolerance development. Within a time point, dowel diameter was less crucial. If, as is often the case, a new genotype of animals to be screened were scarce, the two conditions could be combined. A single ethanol injection of 1.5 g/kg could be given just before the T0 test using a 15.8-mm dowel and then the same mice could be tested at T30 on a 9.6-mm dowel. These two conditions revealed the influence of different groups of genes, and for comparisons of novel genotypes, this would give the investigator two chances to detect any genetic differences. For the screen test, a dose of either 1.75 or 2.25 g/kg seemed to give the best results for the test 30 min later. Thus, while investigators may be able to avoid large parametric studies for future work, they are still likely to need to explore multiple test conditions in a thorough examination of any genetic influence.


Acknowledgments

Supported by a grant from the Department of Veterans Affairs, by NIH Grants P50 AA10760, R01 AA12714 and T32 AA07468, by a grant from the Natural Sciences and Engineering Research Council of Canada, and the Mouse Phenome Project of the Jackson Laboratory. We thank Karyn Best, Christopher L. Kliethermes, Nathan R. Rustay, Chia-Hua Yu and Jessica M. Zuraw for their assistance with the experiments, and Mark Rutledge-Gorman for help with the manuscript.