The Body Image Matrix of Thinness and Muscularity-Male Bodies: Development and validation of a new figure rating scale for body image in men

Correspondence: Rike Arkenau, Department of Clinical Psychology and Psychotherapy, Osnabrück University, Knollstraße 15, 49088 Osnabrück, Germany. Email: rike.arkenau@uni-osnabrueck.de

Abstract
Objective: The study aimed to validate the Body Image Matrix of Thinness and Muscularity-Male Bodies (BIMTM-MB), a two-dimensional figure rating scale consisting of 64 three-dimensional male bodies arranged in an 8 × 8 grid, with muscularity increasing stepwise on the vertical axis and body fat on the horizontal axis.
Method: The validation sample comprised 355 men who took part in an online survey. Besides the BIMTM-MB, participants completed questionnaires on body-related attitudes, behaviors, and psychopathology. Another 91 men were recruited to examine test-retest reliability of the BIMTM-MB.
Results: The BIMTM-MB showed good convergent and criterion validity. Men meeting their own body ideal showed higher body satisfaction and lower body-related psychopathology. Test-retest reliability was high.
Conclusions: The BIMTM-MB proved to be a reliable and valid measure and is recommended for use in research and clinical practice to examine central aspects of male body image.


| INTRODUCTION
Various studies indicate gender differences in body image concerns, with muscularity-related body dissatisfaction being higher in (adolescent) men, and weight/shape concerns being more pronounced in (adolescent) women (e.g., Hoffmann & Warschburger, 2017; Kelley, Neufeld, & Musher-Eizenmann, 2010). To adequately address body image concerns in men in research and clinical practice, for example, to establish prevention or treatment strategies, it is necessary to develop and provide appropriate instruments that consider these different foci. The use of figure rating scales, encompassing a muscularity and a body fat component, might represent an effective and economical option (for a review, see .
Although there is an existing pool of psychometrically validated figure rating scales focusing on the named criteria, these scales all face certain limitations. First, there are one-dimensional scales that separately display a muscularity and a body fat dimension (e.g., Ralph-Nearman & Filik, 2018;. As this separation of body fat from muscularity is artificial, the displayed body figures do not conform with real-life bodies.
This, in turn, might restrict the ecological validity of the scales (see , and limit the generalization of participants' estimations of their actual or ideal bodies on the respective scale to real-life situations. Second, some scales display only hand-drawn and thus less detailed body figures (e.g., Gruber, Pope, Borowiecki, & Cohane, 1999; Hildebrandt, Langenbucher, & Schlundt, 2004) and use the front double biceps pose (e.g., Gruber et al., 1999), which is possibly familiar mainly to respondents who practice bodybuilding. Overall, this might restrict the ecological validity of the scales and impede the identification procedure. Third, some scales lack adequate test-retest reliability (Cafri, Roehrig, & Thompson, 2004). Finally, body figures are restricted in range (e.g., Talbot, Smith, Cass, & Griffiths, 2019) and number (e.g., Ralph-Nearman & Filik, 2018;, potentially leading to biased results and further limiting the scope of administration to individuals with levels of muscularity and body fat within the displayed range of the scales (see . This latter point might be especially relevant if figure rating scales are to be used in a clinical context, that is, with underweight, obese, or extremely muscular clients.
Given these limitations, the present study aimed to develop and validate a highly differentiated figure rating scale for body image in men, combining muscularity and body fat into a two-dimensional (2D) format, and further including sufficiently extreme and realistic male bodies. It was expected that participants' estimations of their actual, felt, and ideal body on the muscularity or the body fat dimension of the Body Image Matrix of Thinness and Muscularity-Male Bodies (BIMTM-MB) would correlate positively with the respective items on the muscularity or body fat dimension of the Bodybuilder Image Grid-Original (BIG-O; Hildebrandt et al., 2004). These correlations were expected to be higher than those between inconsistent variable pairs. Furthermore, it was assumed that ideal BIMTM-MB muscularity scores would correlate positively with drives for muscularity and leanness, and physical training frequency, while ideal BIMTM-MB body fat scores would correlate negatively with drives for thinness and leanness, and physical training frequency. Additionally, ideal BIMTM-MB muscularity and body fat levels were expected to differentiate between weight-training and nonweight-training men. Moreover, participants who correspond to their BIMTM-MB muscularity or body fat ideal were expected to report higher body satisfaction, fewer eating disorder symptoms, less body image disturbance, and less dysmorphic concern compared with those who deviate from their ideal. Based on data from other figure rating scales (e.g., , modest to high test-retest reliability was expected.

| Participants and recruitment
The sample used to validate the BIMTM-MB was derived from a broader online survey on body image and sexual orientation and comprised 355 straight (n = 161) and gay (n = 194) men with a mean age of 29.99 years (standard deviation [SD] = 10.84; range = 18-79), and a mean body mass index (BMI) of 24.72 (SD = 5.11).
The educational level was relatively high, as nearly half of the sample reported having a university/polytechnic degree (42.6%, n = 151). The survey protocol was approved by the ethics committee of Osnabrück University. Participants were recruited via university press releases and email lists, notices in local newspapers and on websites, social media, and flyers at local gyms or university buildings. The sample used to assess test-retest reliability was a prospective community-based sample comprising 91 straight (n = 86) and gay (n = 5) men with a mean age of 40.70 years (SD = 14.69), and a mean BMI of 26.21 (SD = 3.38). Again, the educational level was relatively high, with 45.1% (n = 41) of men reporting a university/polytechnic degree. Participants were recruited face-to-face at local student cafés and on campus via notices and flyers. In both samples, inclusion criteria were age ≥18 years and male gender. No measure for ethnicity was included. However, given the structure of the German population (Statistisches Bundesamt, 2018), and considering that mainly German men participated in the online sample (91%), it is likely that most of the participants were Caucasians.

| Procedure
The online survey was conducted using UniPark (Questback GmbH, Berlin, Germany). Participants of the online sample used a link or a QR code to access the online survey and took part after having provided informed consent.
The survey duration was approximately 35 min. Participants of the prospective sample were given a paper-and-pencil questionnaire package, which had to be completed directly after recruitment. For the retest, participants were given a second questionnaire package, together with return envelopes, which had to be completed after a 2-week interval. Participants had the opportunity to win online shopping vouchers (1 out of 10, worth 20 Euros each in the online sample and 1 out of 5, worth 10 Euros each in the prospective sample).

| Development of the BIMTM-MB
The BIMTM-MB was constructed using the rendering software DAZ Studio 4.9 Pro, with the 3D model Michael 6.0 HD as the basic figure. Using different body characteristics that can be varied by the software, the 3D model was modulated to create an 8 × 8 grid with 64 figures varying along the orthogonal dimensions of muscularity and body fat. The figures at the polar extremes of the matrix were created first and used as reference points for further construction. This method of construction enabled the creation of figures with extreme values in muscle mass and body fat, for example, an extremely underweight body. The figures forming the frame of the matrix were created next, followed by a stepwise filling in of the remaining figures. In this regard, muscle mass increased progressively from top to bottom on the vertical axis and body fat increased progressively from left to right on the horizontal axis. For each of the figures, a team of experts was consulted, and adaptations were made based on the experts' consensus. The final matrix consists of 64 colored and realistic male bodies, numbered from 1 to 64 in the direction of reading. All figures wear neutral gray underwear, do not vary in skin pigmentation, and are presented from the neck down in a standardized pose (Figure 1). Participants were asked to select their actual ("How do you actually look?"), felt ("How do you feel you look?"), and ideal body ("How would you like to look?") out of the 64 displayed figures.
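As an illustration of the numbering scheme, a selected figure number can be converted into its two dimension scores. The following Python sketch assumes that "direction of reading" means left to right and top to bottom, so that the row carries the muscularity score and the column carries the body fat score. The function name `figure_to_scores` is hypothetical and not part of the published instrument.

```python
def figure_to_scores(figure_number: int, size: int = 8) -> tuple:
    """Convert a figure number (1..size*size) into (muscularity, body_fat).

    Assumption: figures are numbered row by row, left to right, with
    muscularity increasing from top to bottom (rows) and body fat
    increasing from left to right (columns).
    """
    if not 1 <= figure_number <= size * size:
        raise ValueError(f"figure_number must be between 1 and {size * size}")
    row, col = divmod(figure_number - 1, size)
    # Both dimensions are scored from 1 (low) to `size` (high).
    return row + 1, col + 1
```

Under these assumptions, figure 1 is the least muscular figure with the least body fat, and figure 64 is the most muscular figure with the most body fat.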

| Measures
The

| Statistical analysis
Participants' answers on the BIMTM-MB were replaced by the respective values on the muscularity and body fat dimensions (1 = low muscularity/body fat, 8 = high muscularity/body fat), resulting in two separate variables for each of the BIMTM-MB items. The same procedure was applied to the items of the BIG-O. Participants with missing values in the above-named measures were excluded from the respective analysis. Outliers (identified via visual inspection of boxplots and defined as scores below Q1 − 1.5 × interquartile range or above Q3 + 1.5 × interquartile range) were checked for all measures in both samples; the proportion of outliers ranged between 0.5% and 20.0%.
Apart from one participant who was excluded from the analysis of criterion validity due to a probable typing error (training frequency = 90 times/week), all outliers were retained to preserve the full variance. Furthermore, because the BIMTM-MB was measured at ordinal scale level and the data were non-normally distributed (checked via visual inspection of histograms and Kolmogorov-Smirnov tests), nonparametric statistical methods were used.
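The outlier rule described above can be sketched in a few lines of Python. This is an illustration only: the analyses were run in SPSS, whose quartile computation may differ from the `method="inclusive"` option of `statistics.quantiles` assumed here, and the function name `tukey_outliers` and the training frequencies in the usage note are hypothetical.

```python
from statistics import quantiles


def tukey_outliers(values):
    """Return values below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR."""
    # Quartile definition is an assumption; other software may use another one.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]
```

For a made-up list of weekly training frequencies such as `[2, 3, 3, 4, 4, 4, 5, 5, 6, 90]`, only the implausible value of 90 falls outside the fences.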
Examination of test-retest reliability (2-week interval) and the association with the BIG-O, the DMS, DLS, DTS, and the physical training frequency was performed using Spearman's rank correlation. Fisher's z tests were calculated to test for significant differences between Spearman's rank correlation coefficients of consistent and inconsistent item pairs (e.g., ratings on the BIMTM-MB muscularity and the BIG-O body fat dimension; Myers & Sirois, 2006).
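To make the comparison of correlation coefficients concrete, the sketch below shows the Fisher r-to-z transformation and the basic z statistic for two correlations from independent samples. It is not the authors' procedure: the text does not specify the exact variant, and because the correlations compared in this study stem from the same participants, a test for dependent correlations may have been used instead. The standard error sqrt(1/(n1 − 3) + 1/(n2 − 3)) is the textbook form for Pearson correlations and is applied to rank correlations here only as a simplifying assumption. The function names are hypothetical.

```python
import math


def fisher_z(r):
    """Fisher r-to-z transformation (inverse hyperbolic tangent)."""
    return math.atanh(r)


def compare_independent_correlations(r1, n1, r2, n2):
    """z statistic and two-sided p value for H0: rho1 == rho2.

    Assumes two independent samples and the standard error of the
    Pearson-based Fisher z, which is an approximation for Spearman's r_s.
    """
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (fisher_z(r1) - fisher_z(r2)) / se
    # Two-sided p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p
```

A positive z indicates that the first coefficient is larger than the second; with identical coefficients, z is 0 and p is 1.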
Kruskal-Wallis H tests and/or (post hoc) Mann-Whitney U tests (including Bonferroni correction [α = .016]) were conducted to test for subgroup differences, for example, between weight-training and nonweight-training men, or between participants who correspond to or deviate from their body ideal on the BIMTM-MB. Wilcoxon signed-rank tests were conducted to test the short-term stability of the BIMTM-MB. Effect sizes of correlation coefficients and group differences were interpreted using the recommendations of Cohen (1992). All statistical analyses were conducted using IBM SPSS 24.
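As a worked illustration of the Bonferroni correction and of an effect size r for a Mann-Whitney U test, the sketch below divides the significance level by the number of pairwise comparisons and converts U into a z value via the normal approximation, with r = |z| / sqrt(N). Both the r = |z| / sqrt(N) convention and the omission of a correction for ties are assumptions; the text does not state how the reported effect sizes were computed. The function names are hypothetical.

```python
import math


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under Bonferroni correction."""
    return alpha / n_comparisons


def mann_whitney_effect_size(u, n1, n2):
    """Return (z, r) for a Mann-Whitney U statistic.

    Uses the normal approximation without tie correction (an assumption),
    and the convention r = |z| / sqrt(n1 + n2).
    """
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    r = abs(z) / math.sqrt(n1 + n2)
    return z, r
```

With three pairwise comparisons, `bonferroni_alpha(0.05, 3)` gives roughly .0167, in line with the threshold of .016 given above. Applied to the group sizes and U value reported for ideal muscularity in the Results (U = 2845.5, n = 109 and n = 65), the uncorrected approximation yields r of roughly .16, close to the reported .17; the small difference would be consistent with a tie correction, although this cannot be confirmed from the text.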

| RESULTS
Descriptive statistics of participants' estimations of their actual, felt, and ideal body on the BIMTM-MB dimensions, separately for the online and the prospective sample, are displayed in Table 1. Intercorrelations between the actual, felt, and ideal items within the two BIMTM-MB dimensions were modest to high (online sample: .32 ≤ r_s ≤ .77; prospective sample: .35 ≤ r_s ≤ .68).

| Convergent validity of the BIMTM-MB
There were significant positive correlations between the actual, felt, and ideal BIMTM-MB scores and the respective BIG-O scores within the same dimensions of the two scales (Table 2). Separate Fisher's z tests revealed that the correlations between the actual, felt, and ideal BIMTM-MB and BIG-O muscularity scores were significantly higher than those between the actual, felt, and ideal BIMTM-MB muscularity scores and the respective BIG-O body fat scores (4.86 ≤ z ≤ 6.76; all p < .001). Likewise, the correlations between the actual, felt, and ideal BIMTM-MB and BIG-O body fat scores were significantly higher than those between the actual, felt, and ideal BIMTM-MB body fat scores and the respective BIG-O muscularity scores (6.62 ≤ z ≤ 11.55; all p < .001).
Moreover, Table 2 displays correlations between participants' estimations of their actual, felt, and ideal body on the BIMTM-MB muscularity and body fat dimensions and mean DMS, DLS, and DTS scores. Ideal muscularity scores correlated significantly positively with mean DMS, DLS, and DTS scores, and ideal body fat scores correlated significantly negatively with mean DLS and DMS scores. No significant correlation between ideal body fat scores and mean DTS scores was found.  n = 165; one-tailed) with self-reported frequency of physical training. Mann-Whitney U tests indicated that weight-training men (n = 109) compared with nonweight-training men (n = 65) preferred an ideal body with higher muscularity (U = 2845.5; p = .028; r = .17) and lower body fat (U = 2760.0; p = .012; r = .19).

| Group differences in ratings on the BAS-2, BIDQ, DCQ, and EDE-Q
On the BIMTM-MB muscularity dimension, a Kruskal-Wallis H test revealed significant group differences in ratings on the BAS-2 between participants who correspond to their muscularity ideal (Subgroup 1) and those rating themselves as less muscular (Subgroup 2) or more muscular (Subgroup 3) than desired (Table 3). No further significant group differences in symptomatology on the BIDQ, DCQ, or the EDE-Q subscales emerged (all p ≥ .088). Post hoc Mann-Whitney U tests showed that participants' ratings on the BAS-2 were significantly higher for Subgroup 1 compared with Subgroups 2 or 3. Moreover, Subgroup 2 displayed significantly higher values on the BAS-2 compared with Subgroup 3 (Table 3). On the BIMTM-MB body fat dimension, significant group differences in BAS-2, DCQ, and EDE-Q subscale ratings emerged between participants who correspond to their body fat ideal (Subgroup 1) and those who reported having more body fat (Subgroup 2) or less body fat (Subgroup 3) than desired ( Regarding the EDE-Q, Subgroup 2 reported more eating disorder symptoms on all four subscales than Subgroup 1, and compared with Subgroup 3 further showed higher symptomatology on the restraint, the weight concern, and the shape concern subscale (Table 3). None of the other post hoc Mann-Whitney U tests regarding subgroup differences in BAS-2, DCQ, or EDE-Q subscale scores reached statistical significance (all p ≥ .036).

| DISCUSSION
The aim of the present study was to develop and validate a figure  concern, and disordered eating were found according to whether participants correspond to or deviate from their muscularity or body fat ideal. These group differences emerged in the expected direction, that is, with higher body appreciation and/or lower body-related psychopathology in subgroups corresponding to their ideal. Test-retest reliability of the BIMTM-MB was high over a mean interval of about 17 days. Overall, these findings conform with validity and reliability reports of other figure rating scales (e.g., Hildebrandt et al., 2004;. Nevertheless, there were also some inconsistent findings. Given the nonsignificant association between ideal BIMTM-MB body fat scores and drive for thinness, one could question the adequacy of the DTS to assess male-specific body image concerns, for example, muscularity dissatisfaction (e.g., Hoffmann & Warschburger, 2017), as it was primarily developed to assess symptoms typical for anorexia and bulimia nervosa (e.g., Paul & Thiel, 2005). The merely small effect sizes for the association between ideal BIMTM-MB scores and physical training frequency could be explained by factors such as lack of time, or a possible moderating effect of the importance of one's appearance for one's own self-worth, which, for instance, was found to contribute to exercise dependence over and above mere self-ideal discrepancy (Lamarche & Gammage, 2012). The nonsignificant subgroup differences in eating disorder symptomatology and body dysmorphic concern between participants who correspond to and deviate from their muscularity ideal, in contrast to the significant subgroup differences along the BIMTM-MB body fat dimension, were in line with  and , who also found no significant association between actual-ideal muscularity discrepancy and disordered eating. 
Hence, these results might indicate that deviating from one's individual muscularity ideal might affect body-related thoughts and feelings, such as measured by the BAS-2, more than enhancing specific psychopathological behaviors, for example, restrictive dietary behavior. Alternatively, other behaviors not assessed in the current study might be more relevant regarding muscularity dissatisfaction, for example, the use of performance-enhancing supplements.
Despite these findings underlining the BIMTM-MB as a reliable and valid instrument, the results need to be interpreted in light of some limitations. First, the generalization of results to a broader population might be limited due to the non-representativeness of the online and the prospective sample, especially regarding age distribution, educational level, and participants' sexual orientation. Second, participants' answers might have been affected by the different presentation forms of the questionnaire packages (online vs. paper-and-pencil), and in the online sample might be biased due to the anonymous nature of the online survey (see Hydock, 2018). Third, in the prospective sample, test-retest reliability was assessed by instructing participants to complete the BIMTM-MB again after 2 weeks; hence, there was no objective measure guaranteeing compliance with this intended time interval. Fourth, the BIMTM-MB does not provide anthropometric data for the displayed body figures. Thus, it is not possible to draw conclusions about the degree of perceptual body image disturbance, for example, over- or underestimation of actual body dimensions, unless a group of independent raters compares an individual's actual body image chosen on the BIMTM-MB and his real anthropometric body dimensions. Furthermore, without underlying anthropometric data, the BIMTM-MB can only be measured on an ordinal scale level, which limits the range of admissible statistical analyses and also impairs the interpretation of actual-ideal discrepancy scores, which are typically calculated and interpreted as indices of body dissatisfaction (e.g., . Finally, body figures were only presented with Caucasian skin pigmentation. Individuals with different skin pigmentations might identify less with the displayed body figures, fostering invalid results.

| CONCLUSION
The BIMTM-MB constitutes a reliable and valid 2D figure rating scale. Due to its quick administration and mainly language-free procedure, it can be easily employed in research and clinical practice of diverse language areas, for example, as a screening instrument or in addition to traditional body image questionnaires to examine body image ideals or assess whether men meet their ideals with respect to muscularity and body fat. By including body figures at the extreme ends of the two dimensions, for example, an extremely underweight, obese, or muscular body figure, the BIMTM-MB might be especially useful within a clinical context.