Variability in power generation from wind farms is an important issue in the energy industry. If sub-hour variability events can be predicted, potential disruptions to grid operations might be mitigated. Using 4 years of 5 min wind power data from the Australian Energy Market Operator for an 80 MW wind farm in south-east Australia, we fit statistical models of variability on meteorological reanalysis data from the US National Centers for Environmental Prediction. The National Centers for Environmental Prediction fields were transformed into spatial empirical orthogonal functions, and 6 h projections onto these became explanatory covariates for generalized linear, random forest (RF), gradient boosting and support vector machine classification models. Other covariates considered were local wind speed and 6 h-lagged empirical orthogonal function differences. Models were selected by minimizing cross-validated misclassification rate and assessed using area under the receiver operating characteristic curve and reliability score. Considering performance and ease of tuning, RFs were preferred. Performance was poorer for larger ramps. The RFs accurately predicted their performance on the validation set. For asymmetric costs (miss-to-false alarm cost ratio = 10), RFs yielded competitive low-cost models. Support vector machines produced slightly superior models but needed to be tuned manually. RF models using atmospheric model output provide a robust approach to predicting wind power variability and relatively large ramp events. We recommend the RF models as a practical and skilful method to feed into an early warning system for energy/electricity operators. Copyright © 2014 John Wiley & Sons, Ltd.
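
The asymmetric-cost setting mentioned above (miss-to-false alarm cost ratio = 10) can be illustrated with a minimal sketch. It assumes a classifier that outputs calibrated ramp probabilities and applies the standard expected-cost decision rule. The abstract does not state that thresholds were chosen this way, and the function names, probabilities and labels below are hypothetical rather than taken from the study.

```python
def cost_threshold(cost_miss, cost_false_alarm):
    """Probability above which issuing a warning has lower expected cost.

    Warn when p * cost_miss > (1 - p) * cost_false_alarm,
    i.e. when p > cost_false_alarm / (cost_false_alarm + cost_miss).
    """
    return cost_false_alarm / (cost_false_alarm + cost_miss)


def total_cost(probs, labels, threshold, cost_miss, cost_false_alarm):
    """Sum the cost of misses and false alarms at a given threshold."""
    cost = 0.0
    for p, y in zip(probs, labels):
        warned = p > threshold
        if y == 1 and not warned:
            cost += cost_miss          # ramp occurred, no warning issued
        elif y == 0 and warned:
            cost += cost_false_alarm   # warning issued, no ramp occurred
    return cost


# Hypothetical predicted ramp probabilities and observed outcomes (1 = ramp).
probs = [0.05, 0.15, 0.30, 0.60, 0.08, 0.20]
labels = [0, 1, 0, 1, 0, 0]

# Assumed costs: a miss is 10 times as costly as a false alarm.
threshold = cost_threshold(cost_miss=10.0, cost_false_alarm=1.0)  # 1/11
cost_weighted = total_cost(probs, labels, threshold, 10.0, 1.0)
cost_default = total_cost(probs, labels, 0.5, 10.0, 1.0)
```

With these made-up numbers, lowering the threshold from 0.5 to 1/11 accepts two extra false alarms in exchange for catching a ramp that the default threshold would miss, which reduces the total cost under the assumed 10:1 ratio.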