Hungarian validation of the Buss–Perry Aggression Questionnaire—Is the short form more adequate?

Abstract
Objective: We aim to provide a publicly available Hungarian version of the BPAQ; to compare the BPAQ factors to other personality traits; and to compare the original BPAQ factor structure provided by Buss and Perry (J. Pers. Soc. Psychol., 63, 1992, 452), the revised BPAQ-SF factor structure by Bryant and Smith (J. Res. Pers., 35, 2001, 138), and the BAQ by Webster et al. (Aggress. Behav., 40, 2014, 120).
Methods: The validation of the Hungarian version of the BPAQ was carried out on a Hungarian university sample (N = 841). There were three main focuses of data analysis: descriptive statistics, correlations, and confirmatory factor analyses.
Results: CFA-related statistics showed an adequate fit for the four BPAQ factors; however, contrary to prior validations of the BPAQ, we were not able to clearly define the verbal aggression factor. We found that the shorter form of the BPAQ has a better model fit on our sample than the original form, while the model fit of the BAQ fell in between. BPAQ scales showed low to moderate relationships with the Barratt Impulsivity Scale and the Hospital Anxiety and Depression Scale.
Conclusion: The BPAQ, the BPAQ-SF, and the BAQ all provide acceptable model fit on a Hungarian sample of university students. While most BPAQ items provided adequate loadings on their hypothesized factors, two items (21 and 27) did not. We argue this is the result of conceptual inaccuracy of the original items.

The Buss-Durkee Hostility Inventory (BDHI; Buss & Durkee, 1957) was developed in the 1950s and remained in use for decades. The BDHI measures 7 dimensions of aggression (Assault, Indirect Hostility, Irritability, Negativism, Resentment, Suspicion, and Verbal Hostility). It was criticized for its lack of discriminant validity (Biaggio, 1981), for its unstable factor loadings (Buss & Perry, 1992), and because it was found to correlate strongly with social desirability (Biaggio et al., 1981). Thus, a new questionnaire, the Buss-Perry Aggression Questionnaire (BPAQ), was developed.
The BPAQ is a 29-item inventory scored on a five-point Likert scale, which replaced the true-or-false questions of the BDHI. It measures four factors of aggression: Verbal Aggression, Physical Aggression, Anger, and Hostility. The BPAQ has been used in different languages, and its psychometric properties appear to substantiate the four-factor model of aggression originally proposed by Buss and Perry. The 29-item BPAQ has gone through some changes since its inception.
For example, Archer et al. (1995) criticized the questionnaire for the difficulty of obtaining classically "pure" measures, like the ones proposed by Buss and Perry. Others criticized the 29-item BPAQ because the four-factor model explained only a portion of the common variance and model fit was only moderate (Bryant & Smith, 2001; Harris, 1995; Williams et al., 1996).
This criticism was led by Bryant and Smith (2001), who suggested removing items from the questionnaire to improve its psychometric properties. The resulting Buss-Perry Aggression Questionnaire-Short Form (BPAQ-SF) aimed to improve the common explained variance of the BPAQ (for which variance explained was only 80%). Omitting seventeen items from the original questionnaire left 12 items with better psychometric properties than the original 29-item scale. This reduced form produced four factors conceptually similar to the original ones: Anger, Physical Aggression, Hostility, and Verbal Aggression. Confirmatory factor analyses of the BPAQ-SF supported the new model, and fit was satisfactory on independent data sets (Diamond & Magaletta, 2006; Gerevich et al., 2007; Reyna et al., 2011; Webster et al., 2014).
Webster et al. (2014) also proposed a 12-item model, the Brief Aggression Questionnaire (BAQ), as an alternative to the BPAQ-SF. Its items were likewise selected from the BPAQ, using multiple criteria (item-total correlations, factor loadings from principal axis factoring [PAF], confirmatory factor analyses, and the wording and face validity of items). Unlike the BPAQ-SF, the BAQ kept the reversed items. While Bryant and Smith (2001) used 5 moderately sized samples to develop the BPAQ-SF, Webster et al. (2014) used one larger sample to select the ideal items. The BAQ shows the same factor structure as the BPAQ and BPAQ-SF. It outperformed the BPAQ-SF in validity tests but underperformed in internal consistency.
The BPAQ has been translated into various languages with adequate psychometric properties. Translations are available in Dutch (Meesters et al., 1996; Morren & Meesters, 2002), Japanese (Nakano, 2001), Spanish (Alvarado et al., 2007; García-León et al., 2002; Santisteban & Alvarado, 2009), German (von Collani & Werner, 2005), Chinese (Maxwell & Maxwell, 2007), Pakistani (Iftikhar & Malik, 2014), and many others. A Hungarian version was also validated: Gerevich et al. (2007) conducted a study on a representative Hungarian sample (N = 1,200). They concluded that the four-factor model originally proposed by Buss and Perry is valid. The factor loadings, however, were not as clear as they were in the original BPAQ, and they found the BPAQ-SF more adequate. However, this Hungarian translation is not publicly available.
Furthermore, aggression has been connected to various other personality traits, such as impulsivity, depression, and anxiety.
In sum, the BPAQ is a widely used tool to assess aggression and has been translated into many different languages. It has previously been translated to Hungarian, showing adequate psychometric properties (Gerevich et al., 2007); however, during our backward translation process certain translation errors were noticed. Thus, we made a modified Hungarian BPAQ, which we also provide here as the first publicly available Hungarian version of the full Buss-Perry Aggression Questionnaire (see BPAQ in Appendix 1), together with the shorter versions (see BPAQ-SF in Appendix 2 and BAQ in Appendix 3). We also aim to compare the factors of Aggression to other personality traits: Impulsivity, Anxiety, and Depression. Furthermore, some studies suggest that shorter forms of the original 29-item BPAQ might have better psychometric properties. Thus, the purpose of this study was to attempt to remedy the issue of factor loadings by comparing the original BPAQ factor structure provided by Buss and Perry (1992) with the revised BPAQ-SF factor structure by Bryant and Smith (2001), and by also investigating the BAQ by Webster et al. (2014), in our Hungarian university sample.

| Participants and procedure
The study protocol was approved by the Institutional Ethical Board. Recruitment took place in four different educational institutions (N = 841, mean age = 23.55, SD = 8.04). The inclusion criterion was the absence of present or past psychiatric illness. Participants received no compensation for participation. Participants were all Caucasian college and university students (53.5% female and 46.5% male).
The minimum target sample size for the confirmatory factor analysis was based on the "rule of thumb" which suggests 10 participants per item (10 × 29 = 290), and we doubled this to be able to test the two planned models (BPAQ and BPAQ-SF), resulting in a planned minimum sample size of 600 participants.
All participants provided written informed consent and filled out the Hungarian translation of the 29-item BPAQ questionnaire. Data collection for the BPAQ was performed in two waves. The instructions and statements of the questionnaire were exactly the same in both; however, in the first wave of data collection (80.1% of the total sample; n = 674) the answers were given on a seven-point Likert scale. In the second wave, we used a five-point Likert scale (19.9%; n = 167). In both cases, the answers ranged from "absolutely not true" to "absolutely true." For the analysis, we transformed the seven-point Likert scale data to five-point Likert scale data in order to simplify data analysis. When converting items from the 7- to the 5-point scale, we used 0.66 step increments (1 = 1; 2 = 1.66; 3 = 2.33; 4 = 3; 5 = 3.66; 6 = 4.33; 7 = 5). t tests were performed to compare the BPAQ and BPAQ-SF scales between the 5- and 7-point Likert scale versions (see Tables 1 and 2). In most cases, the difference between the scale means of the 5-point and the 7-point Likert scale versions was not significant, and even where the difference reached significance, the differences in the means were rather small.
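The 7- to 5-point conversion described above amounts to a linear rescaling. The following is a minimal Python sketch, assuming the mapping is y = 1 + (x − 1) · 4/6; the function name is hypothetical and is not taken from the authors' scripts. Under this assumption the step is 2/3, so the values 1.66 and 3.66 listed in the text would be truncated rather than rounded.

```python
def rescale_7_to_5(score: float) -> float:
    """Linearly map a response on a 1-7 scale onto the 1-5 range.

    Assumption: endpoints map to endpoints (1 -> 1, 7 -> 5) and the
    spacing between adjacent categories is constant (4/6 per step).
    """
    if not 1 <= score <= 7:
        raise ValueError("score must lie between 1 and 7")
    return 1 + (score - 1) * (5 - 1) / (7 - 1)


# Convert every possible 7-point response and round to two decimals.
converted = [round(rescale_7_to_5(s), 2) for s in range(1, 8)]
print(converted)
```

The midpoint is preserved (4 maps to 3), which is the property that makes scale means from the two waves comparable after conversion.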

| Questionnaires
The BPAQ explores aggression on four subscales: Anger (items 1, 9, 12, 18, 19, 23, 28), Physical Aggression (items 2, 5, 8, 11, 13, 16, 22, 25, 29), Hostility (items 3, 7, 10, 15, 17, 20, 24, 26), and Verbal Aggression (items 4, 6, 14, 21, 27). The 29-item BPAQ has two reversed items, one loading on the Anger subscale (item 9) and the other on the Physical Aggression subscale (item 16). The factors of the BPAQ-SF remained conceptually the same (Appendix 2): Anger (items 12, 18, 28), Physical Aggression (items 11, 13, 25), Hostility (items 7, 17, 24), and Verbal Aggression (items 6, 21, 27). For both forms, subscale scores were calculated by taking the mean of the appropriate items. The psychometric properties of the first Hungarian version of the BPAQ have previously been investigated (Gerevich et al., 2007), yielding adequate reliability indexes. As part of the present study, we made further corrections to this first Hungarian translation of the BPAQ, since our backward translation revealed small inaccuracies in about half of the items and significant translation errors in about two items. This modified version of the BPAQ was applied in the present study to investigate its psychometric properties.
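The scoring rule above (subscale score = mean of the subscale's items, with items 9 and 16 reversed) can be sketched in Python as follows. The item key is taken from the text; the function name, the example responses, and the reversal formula (scale maximum + 1 − raw score) are assumptions for illustration and are not taken from the authors' scripts.

```python
from statistics import mean

# Item-to-subscale key of the 29-item BPAQ, as listed in the text.
BPAQ_KEY = {
    "Anger": [1, 9, 12, 18, 19, 23, 28],
    "Physical Aggression": [2, 5, 8, 11, 13, 16, 22, 25, 29],
    "Hostility": [3, 7, 10, 15, 17, 20, 24, 26],
    "Verbal Aggression": [4, 6, 14, 21, 27],
}
REVERSED_ITEMS = {9, 16}


def score_bpaq(responses: dict[int, float], scale_max: int = 5) -> dict[str, float]:
    """Return the mean score per subscale for one respondent.

    Assumption: reversed items are recoded as (scale_max + 1 - raw).
    """
    def recoded(item: int) -> float:
        raw = responses[item]
        return scale_max + 1 - raw if item in REVERSED_ITEMS else raw

    return {name: mean(recoded(i) for i in items) for name, items in BPAQ_KEY.items()}


# Hypothetical respondent who answers "2" to every item.
example = {item: 2 for item in range(1, 30)}
scores = score_bpaq(example)
print(scores)
```

For the BPAQ-SF, the same function could be reused with a key restricted to the 12 retained items; that form contains neither of the reversed items.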
A common measure of impulsivity is the Barratt Impulsivity Scale (BIS; Barratt, 1959). The BIS has been through numerous revisions; we used the 11th version (Patton et al., 1995). The Hospital Anxiety and Depression Scale (HADS; Zigmond & Snaith, 1983) is a self-report tool containing 14 intermixed items, 7 for an Anxiety (Ax) scale and 7 for a Depression (Dp) scale, including both straightforward and reversed items. It was translated to Hungarian by Muszbek et al. (2006). The HADS is scored on a four-point Likert scale, and while answers differ in wording, their content ranges from "not at all" to "most of the time." For both questionnaires, subscale scores were calculated by taking the mean of the appropriate items.

| Data analysis in R
The questionnaire was analyzed in two ways: using the 29-item BPAQ factor structure of Buss and Perry (1992), and using the 12-item BPAQ-SF factor structure of Bryant and Smith (2001). Analyses were carried out in R (R Core Team, 2015).
There were three main focuses of data analysis: descriptive statistics, correlations, and confirmatory factor analyses. For variable selection, we used the sqldf package (Grothendieck & Grothendieck, 2017), and for plotting, the corrplot (Wei et al., 2017) and ggplot2 packages. To test measurement invariance between the 5- and 7-point Likert scale BPAQ groups and between genders, we used the measurementInvariance function of semTools.
To test the hypothesized factor structure of Bryant and Smith (2001) as well, we used the same statistical methods as described above for the original factor structure. Instead of the four factors derived from the 29 items of the BPAQ, we used the four conceptually equivalent factors of the BPAQ-SF, derived from the 12 BPAQ items recommended by Bryant and Smith. We also tested the 12-item Brief Aggression Questionnaire (BAQ; Webster et al., 2014, 2015). For both the original and the short versions of the BPAQ, we considered an item to be loading correctly on a factor if its factor loading was at least 0.3 on the hypothesized factor and less than 0.3 on every other factor (see Buss & Perry, 1992).
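The loading criterion above can be written as a simple check. The following is a minimal Python sketch, assuming the criterion means "at least 0.3 on the hypothesized factor and below 0.3 on every other factor" and that it is applied to absolute loadings; the function name and the loadings in the example are hypothetical and are not values from this study.

```python
def loads_correctly(loadings: dict[str, float], target: str, cutoff: float = 0.3) -> bool:
    """Check whether an item loads only on its hypothesized factor.

    Assumptions: the cutoff applies to absolute loadings, and the
    target loading must be greater than or equal to the cutoff.
    """
    on_target = abs(loadings[target]) >= cutoff
    off_target = all(
        abs(value) < cutoff for factor, value in loadings.items() if factor != target
    )
    return on_target and off_target


# Hypothetical loadings for two items, for illustration only.
clean_item = {"Anger": 0.10, "Physical": 0.05, "Hostility": 0.12, "Verbal": 0.55}
cross_loading_item = {"Anger": 0.45, "Physical": 0.08, "Hostility": 0.15, "Verbal": 0.22}

print(loads_correctly(clean_item, target="Verbal"))          # meets the criterion
print(loads_correctly(cross_loading_item, target="Verbal"))  # loads on Anger instead
```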

| Descriptive statistics
Mean age of the sample was 23.55 years (SD = 8.04). The gender distribution was slightly skewed toward women (53.86%). Mean score on the total Aggression questionnaire (BPAQ) was 2.29 (SD

| Internal consistency and correlations among aggression factors in the original and short form of the Aggression Questionnaire and Brief Aggression Questionnaire
Internal consistency was computed for the subscales Anger, Physical Aggression, Hostility, and Verbal Aggression and for the total mean score of the Aggression Questionnaire. Cronbach's alpha coefficient was used to determine the reliability of the scales (Table 3). A Shapiro-Wilk test was conducted to assess the normality of the scales, which showed significant deviation from the normal distribution for all scales. Due to these results, nonparametric Spearman correlations were used to investigate the correlations among the scales and subscales. All scales had low or moderate intercorrelations (see Table 4). We also tested the intercorrelations of the BPAQ-SF and BAQ subscales, which yielded significant results similar to those of the BPAQ (Table 4).
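For readers unfamiliar with the reliability coefficient used here, Cronbach's alpha for a scale of k items is k/(k − 1) · (1 − Σ item variances / variance of the sum score). The following is a minimal Python sketch of that standard formula; the function name and the response matrix are hypothetical, and the sketch is not the authors' R code.

```python
from statistics import variance


def cronbach_alpha(rows: list[list[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix.

    Each inner list holds one respondent's answers to the k items of a
    single scale. Sample variances (n - 1 denominator) are used.
    """
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of five respondents to a three-item subscale.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Because alpha depends on the number of items, a three-item subscale will tend to show lower values than a longer subscale measuring the same construct with similar inter-item correlations.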

| Correlation with other traits
We tested the relationship of aggression to other personality traits, namely the three impulsivity factors (Im, Ic, Inp) of BIS and the de-

| Confirmatory factor analysis based on Buss and Perry (1992)
Next, we carried out a confirmatory factor analysis (CFA) to assess model fit. Then, we conducted a maximum likelihood factor analysis to assess the factor loadings of each item. In this analysis, we used varimax rotation and, based on the original factor structure, fixed the number of factors at 4. The loadings on each factor can be seen in Table 5. For each item, factor loadings in bold represent the factor hypothesized based on the BPAQ.
All Anger items load on the first factor (loadings range between

| Confirmatory factor analysis based on Bryant and Smith (2001)
The next analysis examined the model fit based on the shorter version of the Aggression Questionnaire (BPAQ-SF), originally presented by Bryant and Smith (2001). This confirmatory factor analysis was conducted on the same dataframe as before, except this time it only con-

| Confirmatory factor analysis based on Webster et al. (2014)
We examined the 12-item model proposed by Webster et al. (2014).
Conducted on the same dataset as before, containing only the items proposed by the authors ( was .07, once again improved from the BPAQ, but did not match the BPAQ-SF.

TABLE 5 Confirmatory factor analysis based on Buss and Perry (1992): factor loadings of the BPAQ items (n = 841)

| Exploratory factor analysis of newly proposed models
As we have seen with the CFAs of the models proposed by Buss and Perry (1992;

| Measurement invariance
When comparing groups, it is assumed that the measurement captures the same underlying psychological construct in all groups (Milfont & Fischer, 2010; Webster et al., 2015). Using multiple-group CFAs, we tested configural, metric (constraining factor loadings to be equivalent across groups), and scalar invariance (also constraining item intercepts alongside factor loadings to be equivalent across groups) to compare groups based on gender and on whether they completed the 5- or 7-point Likert scale version of the BPAQ. Configural invariance means the basic organization of the construct is supported in each group, while noninvariance means patterns of factor loadings differ between groups; metric invariance means fixing factor loadings across groups does not significantly alter model fit, while noninvariance means at least one factor loading differs across groups, and its source should be investigated; scalar invariance means item intercepts are not significantly different across groups, while noninvariance means at least one item intercept differs across groups (Putnick & Bornstein, 2016). Configural invariance is assessed by the overall fit of the multiple-group model, in which the factor models of both groups are estimated simultaneously. If configural invariance is met, the metric invariance model is tested against the configural invariance model in which it is nested. If metric invariance is also met, the scalar invariance model is tested against the metric invariance model. The evaluation of measurement invariance is the subject of debate; however, a significant change in χ² or a change of −0.01 in CFI between nested models are commonly used criteria.
TABLE 8 Measurement invariance of BPAQ, BPAQ-SF and BAQ
As seen in Table 8, configural invariance is met for the BPAQ, BPAQ-SF, and BAQ models in both the gender and the scale-format (5- vs. 7-point) comparisons. Using the significance of the change in χ² as the criterion, metric noninvariance can be observed across genders in the BPAQ and BAQ, and scalar noninvariance in the BPAQ-SF. These results suggest that gender differences influence our model.
For the 5- and 7-point scale models of the BPAQ, metric invariance is seen, while scalar invariance is not met, indicating that at least one intercept is not equal between the two groups. When comparing the 5- and 7-point scale models of the BPAQ-SF and BAQ, scalar invariance is seen, indicating that the difference between the 5- and 7-point Likert scales on the BPAQ has no effect on our 12-item models.
Using instead a minimal change of −0.01 in CFI between nested models as the criterion, scalar noninvariance can be observed between gender groups in the BPAQ, BPAQ-SF, and BAQ. When comparing the two scale types, scalar invariance is obtained for all three instruments. All models had similar measurement invariance across all measures.
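The sequential logic of the CFI-based criterion can be sketched as follows. This is a minimal Python illustration, assuming that configural fit has already been judged acceptable and that a drop in CFI larger than 0.01 between nested models indicates noninvariance; the function name and the CFI values are hypothetical, and the χ² difference test is not shown.

```python
def highest_invariance_level(cfi: dict[str, float], max_drop: float = 0.01) -> str:
    """Return the highest invariance level supported by the delta-CFI rule.

    Assumptions: the configural model fits acceptably, and a step is
    rejected when CFI decreases by more than max_drop relative to the
    less constrained model it is nested in.
    """
    level = "configural"
    for previous, current in (("configural", "metric"), ("metric", "scalar")):
        if cfi[previous] - cfi[current] > max_drop:
            break  # stop at the first step that worsens fit too much
        level = current
    return level


# Hypothetical CFI values for two group comparisons, for illustration only.
comparison_a = {"configural": 0.950, "metric": 0.945, "scalar": 0.920}
comparison_b = {"configural": 0.950, "metric": 0.947, "scalar": 0.943}

print(highest_invariance_level(comparison_a))  # scalar step fails
print(highest_invariance_level(comparison_b))  # all steps pass
```

Testing the steps in order matters: scalar invariance is only evaluated once metric invariance holds, mirroring the nesting of the models described in the text.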

| DISCUSSION
This article investigated the model fit of factor structures based on the Buss-Perry Aggression Questionnaire (Buss & Perry, 1992). We also examined the relations of the BPAQ to other personality traits measured by questionnaires, namely Impulsivity (Barratt Impulsivity Scale; Barratt, 1959) and Anxiety and Depression (Hospital Anxiety and Depression Scale; Zigmond & Snaith, 1983). In line with previous findings (Fossati et al., 2002; Vigil-Colet & Codorniu-Raga, 2004), we found Nonplanning Impulsivity, Motor Impulsivity, and Cognitive Impulsivity to be positively correlated with all BPAQ factors. We also found Depression to be correlated with all BPAQ factors, while Anxiety correlated with every factor except the Physical Aggression factor.
The strongest relationships were observed between Anger and Motor Impulsivity, Cognitive Impulsivity, and Anxiety, and between Hostility and both Anxiety and Depression. Regarding gender differences, we found higher overall BPAQ mean scores and higher Physical Aggression mean scores for men, and higher Anger mean scores for women. To test the model fit of the Buss-Perry Aggression Questionnaire (Buss & Perry, 1992) on our sample, we ran a confirmatory factor analysis (CFA). The chi-square value and other CFA-related statistics showed a poor fit; however, it is well known that the chi-square statistic is sensitive to sample size. The results of the CFA were similar to those reported by others before (Gerevich et al., 2007; Santisteban & Alvarado, 2009; von Collani & Werner, 2005).
Contrary to prior validations of the BPAQ, we were not able to clearly define the Verbal Aggression factor. Three items (items 6, 21, and 27) had higher factor loadings on a different factor (Anger) than on the target one (Verbal Aggression). Item 6 states "I can't help getting into arguments when people disagree with me," item 21 states "I often find myself disagreeing with people," while item 27 states "My friends say that I'm somewhat argumentative."
In the BPAQ-SF, as in the case of the BPAQ, the items in the Verbal Aggression (VA) factor did not reach the cut-off point. The VA factor of the BPAQ-SF contained the three items that were problematic on the BPAQ: items 6, 21, and 27.
We were interested in seeing how the shortened and modified structure of the questionnaire would affect the factor loadings and the distribution of these items. This time, item 6 had a very high factor loading, contributing most of the factor's conceptual meaning. Items 21 and 27 did not meet the required factor loading of 0.30 on the hypothesized factor, but had higher factor loadings on Anger. These factor loadings paint a picture similar to the one we have seen on the BPAQ. On the original form, we found all of these items loading incorrectly. While we argued that items 21 and 27 did not fit the concept of verbal aggression well and leaned more toward assertiveness, we found item 6 to be in line with the definition of verbal aggression despite its incorrect factor loadings. In the case of the BPAQ-SF factor structure, this argument seems to be supported. We can see that item 6 differs in conceptual meaning from the other two, contributing most of the VA factor of the BPAQ-SF. These results also underline that items 21 and 27 should undergo some conceptual changes to fit better with the definition of verbal aggression.
Besides the BPAQ-SF, we investigated the Brief Aggression Questionnaire proposed by Webster et al. (2014). The BAQ proved to be the middle ground in terms of the model fit of the hypothesized factor structure. The AN, PA, and HS factors had appropriate factor loadings once again, while the VA factor was the outlier. The Verbal Aggression factor is mostly defined by item 6 ("I can't help getting into arguments when people disagree with me."), which has a very high factor loading. The BAQ, however, did not contain the other two problematic items, 21 and 27. It could be hypothesized that their absence would contribute to a more straightforward VA factor; however, this was not the case.
We also tested, on an exploratory basis, the feasibility of BPAQ, BPAQ-SF, and BAQ models with items 21 and 27 moved to the Anger factor. While the BPAQ showed little to no improvement with the altered factor structure, the BPAQ-SF and especially the BAQ models improved drastically. In the case of the BPAQ-SF and BAQ, both CFI and TLI values changed for the better. It seems that the wording of items 21 and 27 does indeed fit other scales better and is not in line with the previously introduced definitions of Verbal Aggression. Once again, the wording of these items should be changed in future studies to be more in line with VA constructs. Besides these attempts, a wide range of further measurement practices (observational, self-report, or laboratory) is available to gauge aggression (Suris et al., 2004), which could provide a valuable comparison to this questionnaire.
In conclusion, the BPAQ, the BPAQ-SF, and the BAQ all provided acceptable model fit on a Hungarian sample of university students. However, the shorter form provides better model fit in terms of the confirmatory factor analysis measures. The factors derived from the BPAQ-SF explain a higher proportion of the common variance. While most BPAQ items provided adequate loadings on their hypothesized factors, two items (21 and 27) did not. We argue this is the result of conceptual inaccuracy of the original items. Therefore, we suggest changing their wording to better fit the concept of verbal aggression. Furthermore, a freely accessible version of the BPAQ (Appendix 1) is also provided to the Hungarian scientific community.
It also has to be noted that although the shorter versions showed clearer factor structures than the BPAQ, the internal consistency values of their subscales were lower, suggesting that the reliability of the subscales did suffer from the item reduction. Thus, although the Cronbach's alpha values of the short-version subscales are still around the acceptable range, interpreting these subscale scores as independent scores should be done with caution. They should rather be used as additional information on the pattern of aggression characteristics of the participants; for example, if a participant has a high total score, the subscale scores could indicate which facet it mostly comes from.

CONFLICT OF INTEREST
The authors (S. Zimonyi, K. Kasos, Z. Halmai, L. Csirmaz, H. Stadler, E. Kotyuk) declare that they do not have any interests which could constitute a real, potential or apparent conflict of interest with respect to their involvement in the publication. The authors also declare that they do not have any financial or other relations (e.g., directorship, consultancy, or speaker fee) with companies, trade associations, unions, or groups (including civic associations and public interest groups) that may gain or lose financially from the results or conclusions in the study.

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1002/brb3.2043.

DATA AVAILABILITY STATEMENT
The dataframe (df) and scripts of all the analyses are freely available at the Open Science Framework website (https://osf.io/afzb7/?view_only=e6ba90a3e59d450084788707e6ed27fe, Zimonyi et al., 2021). The dataframe has been stored as a comma separated Excel file and imported into R as such.

REFERENCES
Alvarado, M., Recio, P., & Santisteban, C. (2007). Evaluation of a Spanish version of the Buss and Perry aggression questionnaire: Some