Discussion on “A formal causal interpretation of the case‐crossover design” by Zach Shahn, Miguel A. Hernán, and James M. Robins

It has always been clear that the case-crossover design works, for some definition of “works,” but some of the details have been surprisingly elusive, and it is good to see more of them nailed down by Shahn et al. My interest in case-crossover analyses has mostly been in the context of air pollution epidemiology mentioned at the end of the paper. The air pollution setting is distinctive for several reasons: the exposure variable is plausibly exogenous, so it is possible to use control times after the case time; the effects of interest are quite small; and the same measured exposure series is shared over many, perhaps all, of the cohort.

The approximate exogeneity of particulate air pollution concentrations (at least in the days before widespread use of masks and high-efficiency particle filters) will simplify the causal analysis substantially. In a careful analysis, there will be a complication that the measured neighborhood-level or city-level exposure time series is not the actual personal exposure. Zeger et al. (2000) argued that much of the difference manifests as Berkson error and that the bias is not as large as one might naively expect. Their analysis does suggest that assumptions about measurement error structure will also be important for the case-crossover design in this setting.
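As a rough numerical illustration of why Berkson error tends to be more benign than classical error, the following sketch compares the two in a simple linear outcome model. It is not taken from Zeger et al. and is not a case-crossover analysis; the function names, unit error variances, and sample size are all illustrative assumptions.

```python
import numpy as np

def ols_slope(w, y):
    """Least-squares slope of y regressed on the measured exposure w."""
    return np.cov(w, y)[0, 1] / np.var(w, ddof=1)

def compare_error_types(beta=1.0, n=100_000, seed=0):
    """Return (classical-error slope, Berkson-error slope) for a linear model.

    Assumed setup (hypothetical, for illustration only): standard normal
    exposures, unit-variance measurement error, unit-variance outcome noise.
    """
    rng = np.random.default_rng(seed)

    # Classical error: measured value = true exposure + independent noise.
    x_true = rng.standard_normal(n)
    w_classical = x_true + rng.standard_normal(n)
    y_classical = beta * x_true + rng.standard_normal(n)

    # Berkson error: true personal exposure = measured (area-level) value + noise.
    w_berkson = rng.standard_normal(n)
    x_berkson = w_berkson + rng.standard_normal(n)
    y_berkson = beta * x_berkson + rng.standard_normal(n)

    return ols_slope(w_classical, y_classical), ols_slope(w_berkson, y_berkson)
```

Under these assumptions the classical-error slope is attenuated to about half of `beta`, whereas the Berkson-error slope stays close to `beta`. In a logistic model Berkson error is not entirely harmless, so this should be read only as a heuristic.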
One issue where a counterfactual analysis might be illuminating for air pollution case-crossover studies is the distinction between events that could recur and those that cannot when using control times after the case time. Under a rare-event assumption, few or no individuals will experience a second event, but intuitively, it seems more appropriate to use postevent control times when a second event is possible than when one is impossible. Using control times after a fatal event, in particular, raises face validity concerns and has been controversial. On the other hand, postevent control times reduce or remove bias from time trends in exposure, and they do this whether or not a second event is possible.
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2022 The Authors. Biometrics published by Wiley Periodicals LLC on behalf of International Biometric Society.
Having a shared exposure time series does not affect the causal estimand, but it does affect the choice of estimators and the magnitude of various biases, and it may affect asymptotics. A popular estimator based on conditional logistic regression, while not a true conditional likelihood estimator, often performs well in "standard" case-crossover designs. It has noticeable bias in the air pollution context, unless the control times are chosen appropriately (Janes et al., 2005; Lumley & Levy, 2000). This bias tends to average out when individuals have different exposure series.
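One referent scheme studied in the cited papers is the time-stratified design, in which the control times for a case are the other times in a fixed, prespecified stratum containing the case time, so that controls may fall before or after the event. The sketch below is a minimal illustration under assumed conventions: days are indexed by integers starting at 0, strata are consecutive 28-day blocks, and controls are matched on day of week. The function name and these defaults are hypothetical choices, not a prescription from those papers.

```python
def time_stratified_referents(case_day, stratum_length=28, period=7):
    """Return control (referent) days for a case day under a time-stratified scheme.

    Assumptions (illustrative only): integer day indices starting at 0,
    strata are consecutive blocks of `stratum_length` days, and controls
    share the case day's position within a `period`-day cycle (day of week).
    """
    if stratum_length % period != 0:
        raise ValueError("stratum_length should be a multiple of period")
    # First day of the fixed stratum that contains the case day.
    start = (case_day // stratum_length) * stratum_length
    return [
        d
        for d in range(start, start + stratum_length)
        if d != case_day and d % period == case_day % period
    ]
```

For example, a case on day 10 falls in the stratum of days 0 to 27, and its controls are days 3, 17, and 24. The strata do not depend on when the event occurred, so every day in a referent set would generate the same set if it were the case day.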
Perhaps most seriously, in the air pollution context, it is almost certainly not true that the entire effect of exposure is transient. What is plausibly true is that the effect of exposure includes a transient component and a chronic component.
Table 1 shows a simple simulation. $Y$ is a binary variable in a cohort of 1000 people observed for 100 times, $t = 1, 2, \ldots, 100$, with event probability $p_t$ at time $t$ of $\operatorname{logit}(p_t) = \alpha + \beta X_t + \gamma \bar{X}_t$, where $\alpha \sim N(\mu, 1)$ and $\mu$ is chosen to give about 200 events in the $10^5$ person-times. In this model, $\beta$ is a transient effect of $X$ at the same time, and $\gamma$ represents a continuing effect, with $\bar{X}_t$ being the cumulative mean of $X$ over time. We consider different values of $\beta$ and three scenarios for transient/chronic effects. In the first scenario, $\gamma = 0$ and the $X_t$ are iid Bernoulli(0.5). In the second scenario, $X_t$ is the same, but $\gamma = 1$. In the third scenario, $\gamma = 1$ and $X_t$ is generated by thresholding an AR(1) process with lag-1 autocorrelation of 0.5 at zero, to produce autocorrelated Bernoulli(0.5) data. The MH estimator is computed ignoring $\bar{X}_t$, so we have a setting where there is truly both a transient and a chronic effect, but only the transient effect is being considered; this is how air pollution case-crossover studies were often performed. Code for this simulation is in the Supporting Information.
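The simulation design described above can be sketched in a few lines. The following is a minimal illustration and not the Supporting Information code. It assumes a model of the form logit(p) = alpha + beta * X + gamma * Xbar with a person-specific intercept alpha drawn from N(mu, 1), an individual-specific exposure series, a cumulative mean that includes the current time, analysis of first events only, a single control time immediately before the case time, and an MH estimator computed as the ratio of discordant-pair counts. The function names and the default `mu` are hypothetical, and `mu` is not calibrated to the event count quoted in the text.

```python
import numpy as np

def simulate_exposure(rng, n, T, rho=0.0):
    """Binary exposure, marginally Bernoulli(0.5), from a thresholded latent AR(1).

    rho is the lag-1 autocorrelation of the latent Gaussian process;
    rho = 0 gives iid Bernoulli(0.5) exposures.
    """
    z = np.empty((n, T))
    z[:, 0] = rng.standard_normal(n)
    for t in range(1, T):
        innov = rng.standard_normal(n)
        z[:, t] = rho * z[:, t - 1] + np.sqrt(1.0 - rho**2) * innov
    return (z > 0).astype(float)

def mh_case_crossover(X, Y):
    """Ratio of discordant pairs, comparing each case time with the time before it."""
    has_event = Y.any(axis=1)
    first = Y.argmax(axis=1)  # index of the first event in each row
    cases = np.flatnonzero(has_event & (first >= 1))  # need a prior control time
    x_case = X[cases, first[cases]]
    x_ctrl = X[cases, first[cases] - 1]
    n10 = np.sum((x_case == 1) & (x_ctrl == 0))  # exposed at case time only
    n01 = np.sum((x_case == 0) & (x_ctrl == 1))  # exposed at control time only
    return float(n10) / float(n01) if n01 > 0 else float("nan")

def simulate_once(beta, gamma, rho=0.0, n=1000, T=100, mu=-7.0, seed=0):
    """Simulate one cohort and return the MH estimate that ignores Xbar."""
    rng = np.random.default_rng(seed)
    X = simulate_exposure(rng, n, T, rho)
    xbar = np.cumsum(X, axis=1) / np.arange(1, T + 1)  # cumulative mean of X
    alpha = rng.normal(mu, 1.0, size=n)  # assumed person-specific intercepts
    eta = alpha[:, None] + beta * X + gamma * xbar
    p = 1.0 / (1.0 + np.exp(-eta))
    Y = rng.random((n, T)) < p
    return mh_case_crossover(X, Y)
```

Under these assumptions, the three scenarios correspond to calls such as `simulate_once(beta, gamma=0.0)`, `simulate_once(beta, gamma=1.0)`, and `simulate_once(beta, gamma=1.0, rho=0.5)`, repeated over many seeds to assess bias relative to `exp(beta)`.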
The simulation shows that the MH estimator does reasonably well in recovering the transient component of the effect, but there is definite bias and the bias is aggravated by autocorrelation in exposure. It would be interesting to characterize this bias and whether it can be mitigated, because case-crossover designs are used to study the transient effects of exposures that are expected to also contribute to chronic effects.
Finally, the case-specular design (Zaffanella et al., 1998) raises some related questions. In this design, the controls are not sampled people but the houses "opposite" those where a case lives, originally to measure electromagnetic field exposure matched on neighborhood characteristics. The case-specular design works to control the major causes of confounding for which it was designed, but a counterfactual analysis of it seems more difficult. Because the specular controls are not necessarily places where a noncase lives, it seems that stronger assumptions about both the exposure model and confounding would be required to estimate an exact causal estimand.