This study reports on changes in the magnitude and frequency of extremes in reproductive phenology using long-term records (1951–2008) for plant species widely distributed across Germany. For each of the fourteen indicator phases studied, time series of annual onset dates at up to 119 stations, each providing 50–58 years of observations, were standardized by their station mean and standard deviation. Four alternative statistical models were applied and compared to derive probabilities of extreme early or late onset times for the phases: (1) Gaussian models were used to describe decadal probabilities of standardized anomalies, with extremes defined as data either falling below the 5th or exceeding the 95th percentile; (2) semi-parametric quantile regression was employed for flexible and robust modelling of trends in different quantiles of onset dates; (3) generalized extreme value (GEV) distributions were fitted to annual detrended minima and maxima of standardized anomalies; and (4) generalized Pareto distributions (GPD) were fitted to extremes defined as peaks over threshold. Probabilities of extreme early phenological events inferred from Gaussian models increased on average from 3 to 12%, whereas probabilities of extreme late phenological events decreased from 6 to 2% over the study period. Based on quantile regressions, summer and autumn phases revealed a more pronounced advancing pattern than spring phases. Return levels estimated by the GEV and GPD methods were similar, indicating that extreme early phenological events of magnitudes 2.5, 2.8, and 3.6 on the detrended standardized anomaly scale would occur on average once every 20 years for spring, summer and autumn phases, respectively. This corresponds to absolute onset advances of up to 2 months, depending on the season and species.
This study demonstrates how extreme phenological events can be accurately modelled even in cases of inherently small numbers of observations, and underlines the need to further evaluate their impacts on ecosystem functioning.
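As an illustration of three of the computations named above (station-wise standardization, Gaussian tail probabilities, and a GEV return level), a minimal Python sketch follows. It is not the study's implementation: the function names, the choice of the standard normal 5th and 95th percentiles as thresholds, and all input values are assumptions made for illustration, and no model fitting or detrending is performed.

```python
import math
from statistics import NormalDist, mean, stdev


def standardize(onset_days):
    """Standardize one station's annual onset dates (day of year)
    by that station's own mean and sample standard deviation."""
    m, s = mean(onset_days), stdev(onset_days)
    return [(d - m) / s for d in onset_days]


def gaussian_tail_probs(anomalies, lower, upper):
    """Describe a set of anomalies (e.g. one decade) by a Gaussian and
    return (P(x < lower), P(x > upper)) under that Gaussian."""
    dist = NormalDist(mean(anomalies), stdev(anomalies))
    return dist.cdf(lower), 1.0 - dist.cdf(upper)


def gev_return_level(mu, sigma, xi, period_years):
    """Level exceeded on average once every `period_years` for annual
    maxima following GEV(mu, sigma, xi); xi = 0 is the Gumbel case."""
    y = -math.log(1.0 - 1.0 / period_years)
    if abs(xi) < 1e-9:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


# Assumed thresholds: 5th and 95th percentiles of the standard normal.
LOWER = NormalDist().inv_cdf(0.05)
UPPER = NormalDist().inv_cdf(0.95)

if __name__ == "__main__":
    # Hypothetical onset dates (day of year) for one station, not real data.
    onsets = [118, 121, 115, 124, 110, 119, 108, 113, 105, 102]
    anomalies = standardize(onsets)
    p_early, p_late = gaussian_tail_probs(anomalies[-5:], LOWER, UPPER)
    print(f"P(extreme early) = {p_early:.3f}, P(extreme late) = {p_late:.3f}")
    # Hypothetical GEV parameters; in practice these come from a fitted model.
    print(f"20-year return level = {gev_return_level(1.5, 0.4, -0.1, 20):.2f}")
```

For extreme early events, which are annual minima, the same return-level formula can be applied to the negated anomalies, with the sign of the result flipped back afterwards.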