Irregular Identification, Support Conditions, and Inverse Weight Estimation


  • Shakeeb Khan, Dept. of Economics, Duke University, 213 Social Sciences Building, Durham, NC 27708, U.S.A.
  • Elie Tamer, Dept. of Economics, Northwestern University, 2001 Sheridan Road, Evanston, IL 60208, U.S.A.
    • We thank a co-editor and three referees for comments that improved the content and exposition in the paper. We also thank J. Heckman, M. Ponomareva, J. Powell, and seminar participants at many universities, as well as conference participants at the 2007 NASM at Duke University and at the 2008 ES Winter Meeting in New Orleans, for helpful comments. Support from the National Science Foundation is gratefully acknowledged by Tamer.


In weighted moment condition models, we show a subtle link between identification and estimability that limits the practical usefulness of estimators based on these models. In particular, if it is necessary for (point) identification that the weights take arbitrarily large values, then the parameter of interest, though point identified, cannot be estimated at the regular (parametric) rate and is said to be irregularly identified. This rate depends on relative tail conditions and can be as slow as n^{-1/4} in some examples. Such a nonstandard rate of convergence can lead to numerical instability and/or large standard errors. We examine two weighted model examples: (i) the binary response model under a mean restriction introduced by Lewbel (1997), and further generalized to cover endogeneity and selection, in which the estimator is weighted by the density of a special regressor, and (ii) the treatment effect model under exogenous selection (Rosenbaum and Rubin (1983)), in which the resulting estimator of the average treatment effect is weighted by a variant of the propensity score. Without strong relative support conditions, these models, like well known "identified at infinity" models, lead to estimators that converge at slower than the parametric rate since, essentially, to ensure point identification one requires some variables to take values on sets with arbitrarily small probabilities, or thin sets. For the two models above, we derive rates of convergence and propose conducting inference using rate adaptive procedures analogous to those of Andrews and Schafgans (1998) for the sample selection model.
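To make the instability concrete, here is a minimal simulation sketch of the inverse-propensity-weighted estimator of the average treatment effect discussed in example (ii). The data-generating process, function names, and parameter values are hypothetical illustrations, not taken from the paper; the point is only that the weights 1/e(X) and 1/(1-e(X)) are unbounded when the propensity score approaches 0 or 1:

```python
# Hypothetical illustration of inverse-propensity weighting (IPW) for the
# average treatment effect (ATE). All names and the simulation design below
# are the editor's assumptions, chosen only to exhibit large inverse weights.
import numpy as np

def ipw_ate(y, d, e):
    """Horvitz-Thompson style IPW estimate of the ATE:
    mean(D*Y/e(X)) - mean((1-D)*Y/(1-e(X)))."""
    return np.mean(d * y / e) - np.mean((1 - d) * y / (1 - e))

def simulate(n, rng, true_effect=1.0):
    x = rng.normal(size=n)
    # Propensity score comes close to 0 and 1 in the tails of x,
    # so the inverse weights below are nearly unbounded in the sample.
    e = 1.0 / (1.0 + np.exp(-3.0 * x))
    d = (rng.uniform(size=n) < e).astype(float)
    y = true_effect * d + x + rng.normal(size=n)
    return y, d, e

rng = np.random.default_rng(0)
y, d, e = simulate(100_000, rng)
print("IPW ATE estimate:", ipw_ate(y, d, e))
# When point identification relies on observations in these thin tail sets,
# the variance of the inverse weights can diverge and the estimator converges
# at a slower-than-root-n rate, with large or unstable standard errors.
print("largest inverse weight:", max((1 / e).max(), (1 / (1 - e)).max()))
```

Trimming or bounding the weights restores root-n convergence but changes the estimand, which is one reason the paper instead considers rate adaptive inference procedures.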