On the use of log-transformation vs. nonlinear regression for analyzing biological power laws

Authors

  • Xiao Xiao,

    Corresponding author
    1. Department of Biology, Utah State University, Logan, Utah 84322-5305 USA
    2. Ecology Center, Utah State University, Logan, Utah 84322-5205 USA
    3. Department of Mathematics and Statistics, Utah State University, Logan, Utah 84322-3900 USA
  • Ethan P. White,

    1. Department of Biology, Utah State University, Logan, Utah 84322-5305 USA
    2. Ecology Center, Utah State University, Logan, Utah 84322-5205 USA
  • Mevin B. Hooten,

    1. Ecology Center, Utah State University, Logan, Utah 84322-5205 USA
    2. Department of Mathematics and Statistics, Utah State University, Logan, Utah 84322-3900 USA
    • Present address: U.S. Geological Survey, Colorado Cooperative Fish and Wildlife Research Unit, Departments of Fish, Wildlife and Conservation Biology, and Statistics, Colorado State University, 1484 Campus Delivery, Fort Collins, Colorado 80523-1484 USA.

  • Susan L. Durham

    1. Ecology Center, Utah State University, Logan, Utah 84322-5205 USA

Abstract

Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain.
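The comparison described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' code. The parameter values (a = 2, b = 0.75, x drawn uniformly from [10, 100], the two error standard deviations, 200 replicates), the function names, and the use of a golden-section search for the nonlinear fit are all assumptions chosen for illustration. The sketch simulates a power law y = a·x^b under each error structure, fits it by LR on log-transformed data and by NLR on the original scale, and reports the root-mean-square error of the estimated exponent for each method.

```python
import math
import random


def fit_lr(x, y):
    """LR: ordinary least squares on log(y) vs. log(x). Returns (a, b)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx
    return math.exp(my - b * mx), b


def _profile(b, x, y):
    """For a fixed exponent b, the least-squares a has a closed form."""
    xb = [v ** b for v in x]
    a = sum(p * q for p, q in zip(xb, y)) / sum(p * p for p in xb)
    sse = sum((q - a * p) ** 2 for p, q in zip(xb, y))
    return sse, a


def fit_nlr(x, y, lo=0.0, hi=2.0, tol=1e-6):
    """NLR: least squares on the original scale. Returns (a, b).

    Assumption: the profiled sum of squares is unimodal in b on [lo, hi],
    so a golden-section search is adequate for this illustration.
    """
    r = (math.sqrt(5) - 1) / 2
    c, d = hi - r * (hi - lo), lo + r * (hi - lo)
    fc, fd = _profile(c, x, y)[0], _profile(d, x, y)[0]
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - r * (hi - lo)
            fc = _profile(c, x, y)[0]
        else:
            lo, c, fc = c, d, fd
            d = lo + r * (hi - lo)
            fd = _profile(d, x, y)[0]
    b = (lo + hi) / 2
    return _profile(b, x, y)[1], b


def simulate(error, rng, n=50, a=2.0, b=0.75, sigma=1.0):
    """Draw one data set with additive normal or multiplicative lognormal error."""
    x = [rng.uniform(10, 100) for _ in range(n)]
    y = []
    for v in x:
        mu = a * v ** b
        if error == "additive":
            obs = mu + rng.gauss(0, sigma)
            while obs <= 0:  # redraw so the log-transformation is defined
                obs = mu + rng.gauss(0, sigma)
        else:
            obs = mu * math.exp(rng.gauss(0, sigma))
        y.append(obs)
    return x, y


def compare(reps=200, seed=1, b_true=0.75):
    """RMSE of the estimated exponent for each method and error structure."""
    rng = random.Random(seed)
    sigmas = {"additive": 1.5, "multiplicative": 0.3}  # illustrative values
    out = {}
    for error, sigma in sigmas.items():
        sq = {"LR": 0.0, "NLR": 0.0}
        for _ in range(reps):
            x, y = simulate(error, rng, b=b_true, sigma=sigma)
            sq["LR"] += (fit_lr(x, y)[1] - b_true) ** 2
            sq["NLR"] += (fit_nlr(x, y)[1] - b_true) ** 2
        out[error] = {m: math.sqrt(s / reps) for m, s in sq.items()}
    return out


if __name__ == "__main__":
    for error, rmse in compare().items():
        print(error, rmse)
```

Under these assumed settings, the method whose error model matches the simulated data should recover the exponent with the smaller error, which is the pattern the abstract reports. Exact values depend on the seed and the chosen parameters.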
