Reuse of this article is permitted in accordance with the terms and conditions set out at http://wileyonlinelibrary.com/onlineopen#OnlineOpen__Terms.
Conditional transformation models
Version of Record online: 20 MAR 2013
© 2013 Royal Statistical Society
Journal of the Royal Statistical Society: Series B (Statistical Methodology)
Volume 76, Issue 1, pages 3–27, January 2014
How to Cite
Hothorn, T., Kneib, T. and Bühlmann, P. (2014), Conditional transformation models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76: 3–27. doi: 10.1111/rssb.12017
- Issue online: 6 JAN 2014
Keywords
- Conditional distribution function;
- Conditional quantile function;
- Continuous ranked probability score;
- Prediction intervals;
- Structured additive regression
The ultimate goal of regression analysis is to obtain information about the conditional distribution of a response given a set of explanatory variables. This goal is, however, seldom achieved because most established regression models estimate only the conditional mean as a function of the explanatory variables and assume that higher moments are not affected by the regressors. The underlying reason for such a restriction is the assumption of additivity of signal and noise. We propose to relax this common assumption in the framework of transformation models. The novel class of semiparametric regression models proposed herein allows transformation functions to depend on explanatory variables. These transformation functions are estimated by regularized optimization of scoring rules for probabilistic forecasts, e.g. the continuous ranked probability score. The corresponding estimated conditional distribution functions are consistent. Conditional transformation models are potentially useful for describing possible heteroscedasticity, comparing spatially varying distributions, identifying extreme events, deriving prediction intervals and selecting variables beyond mean regression effects. An empirical investigation based on a heteroscedastic varying-coefficient simulation model demonstrates that semiparametric estimation of conditional distribution functions can be more beneficial than kernel-based non-parametric approaches or parametric generalized additive models for location, scale and shape.
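To make the scoring-rule idea concrete, the following minimal sketch evaluates the continuous ranked probability score, CRPS(F, y) = ∫ (F(z) − 1{y ≤ z})² dz, for Gaussian forecasts and compares a homoscedastic forecast with one whose spread depends on the explanatory variable. It is an illustration only, not the estimation procedure of the paper: the functions `mu` and `sigma`, the simulated data and all other names are hypothetical assumptions, and the conditional distribution is taken to be Φ((y − mu(x)) / sigma(x)), i.e. a transformation of y that depends on x through more than a shift.

```python
import math
import random


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def crps_gaussian(y, mean, sd):
    """Closed-form CRPS of a N(mean, sd^2) forecast at observation y."""
    z = (y - mean) / sd
    return sd * (z * (2.0 * norm_cdf(z) - 1.0)
                 + 2.0 * norm_pdf(z) - 1.0 / math.sqrt(math.pi))


def crps_numeric(cdf, y, lo, hi, n=20000):
    """Trapezoidal approximation of the CRPS integral over [lo, hi]."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        z = lo + i * step
        indicator = 1.0 if y <= z else 0.0
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * (cdf(z) - indicator) ** 2
    return total * step


# Hypothetical data-generating process (an assumption for illustration):
# both the location and the scale of the response vary with x.
def mu(x):
    return 2.0 * x


def sigma(x):
    return 0.2 + x


def compare_forecasts(n=5000, seed=1):
    """Return the mean CRPS of a homoscedastic and a heteroscedastic forecast."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [mu(x) + sigma(x) * rng.gauss(0.0, 1.0) for x in xs]
    # Homoscedastic forecast: correct conditional mean, one pooled spread.
    pooled_sd = math.sqrt(sum((y - mu(x)) ** 2 for x, y in zip(xs, ys)) / n)
    homo = sum(crps_gaussian(y, mu(x), pooled_sd) for x, y in zip(xs, ys)) / n
    # Heteroscedastic forecast: the spread is allowed to depend on x.
    hetero = sum(crps_gaussian(y, mu(x), sigma(x)) for x, y in zip(xs, ys)) / n
    return homo, hetero


if __name__ == "__main__":
    homo, hetero = compare_forecasts()
    print("mean CRPS, homoscedastic forecast:  ", homo)
    print("mean CRPS, heteroscedastic forecast:", hetero)
```

Because the CRPS is a proper scoring rule, the forecast that matches the data-generating distribution should attain the lower average score in this simulation, which is the sense in which modelling effects beyond the conditional mean can pay off.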