Published Online: 15 JUL 2005
Copyright © 2005 John Wiley & Sons, Ltd
Encyclopedia of Biostatistics
How to Cite
Greenhouse, S. W. and Greenhouse, J. B. 2005. Cornfield, Jerome. Encyclopedia of Biostatistics. 2.
Born: October 30, 1912, in New York City, New York.
Died: September 17, 1979, in Herndon, Virginia.
Reproduced by permission of the Royal Statistical Society
Jerome Cornfield was arguably the most influential statistician in the biomedical sciences in the US from the 1950s until his death. He was the consummate statistical scientist. His understanding of the nature of the subject-matter of statistics and of its essential role in the inductive process of integrating data into a body of empirical knowledge, particularly in the biomedical sciences, was outstanding. This thorough view of statistics and scientific research enabled him to identify essential statistical problems. He exercised considerable influence as an advisor and consultant, and for over two decades was a major advocate for statistical reasoning in clinical research.
After attending elementary and high schools in the Bronx, New York, he entered New York University, graduating in 1933 with a major in history. Cornfield did not receive any advanced degrees. He did, however, take some formal graduate courses in history at Columbia University. After moving to Washington, DC, in 1935, Cornfield took a number of courses in statistics at the US Department of Agriculture Graduate School during the period 1936–1938, including courses with M.A. Girshick in general statistics and multivariate analysis. He also took a course in sampling which, together with what he learned on the job from Duane Evans, enabled him to advance the cause of getting probability sampling accepted by several Federal Agencies. Because his formal training was minimal, most of what he had to learn about statistical theory, reasoning, and methodology was self-taught from a continually expanding literature. This enabled him to be discriminatingly selective both as to subject-matter and to the time at which he felt it necessary to learn about a subject. In later years, biomedical associates and statistical colleagues were surprised to discover that he had no doctorate.
A brief review of the major positions he held begins with the Bureau of Labor Statistics, where he was a statistician from 1935 to 1947. In 1947 he joined Harold Dorn's methods unit in the Public Health Service. This unit was shortly transferred to the National Cancer Institute on the campus of the National Institutes of Health (NIH). Cornfield remained in the Cancer Institute until 1955 or 1956 when both he and Dorn moved over to a new Division of Research Services. Here, he consulted with investigators in various Institutes of the NIH. In 1958 he was invited to succeed William Cochran as Chairman of the Department of Biostatistics in the School of Hygiene and Public Health of the Johns Hopkins University. He was also appointed Professor of Biomathematics in the School of Medicine. He returned to the NIH in 1960 as Assistant Chief of the Biometrics Research Branch of the National Heart Institute, became Branch Chief in 1963, and served in that position until his retirement from the NIH in 1967. In 1968 he joined the Graduate School of Public Health of the University of Pittsburgh as a Research Professor of Biostatistics. At the same time he founded a biostatistics research group with offices in the Washington, DC, area. In 1972 he joined the Department of Statistics at the George Washington University as Professor of Statistics and brought his research group into the Department as the Biostatistics Center. He served as Chairman of the Department from 1973 to 1976 and continued as Professor of Statistics and Director of the Center until his terminal illness.
Over a span of three decades, from 1947 to 1979, Professor Cornfield was one of the leading statisticians working in the biomedical area. He made many original contributions to biostatistics, epidemiology, clinical trials, and to quantitative methods in the design and analysis of experiments (see Experimental Design) conducted in clinical and laboratory research. In addition, he wrote a number of papers on Bayesian inference and on the application of Bayesian methods in the biomedical sciences. Before presenting the highlights of the work in this period, it is important to comment on his contributions to economic statistics and sampling while at the Bureau of Labor Statistics (BLS).
From the very beginning of his career, Cornfield was a creative and original thinker, motivated by important real-world problems. He made a number of important contributions to economics and economic statistics during his work at the BLS. He played a major role in the revision of the Consumer Price Index, 1938–1940, introducing several new procedures. He developed a keen interest in sampling, which led to the development of a survey using probability sampling for a study of Family Spending and Saving in wartime. This complex design, according to Duncan & Shelton [26, pp. 46–49], “represented a significant advance in a number of respects. Indeed, it was the precursor of several ideas which were worked out more fully and justified mathematically a year later by Hansen and Hurwitz”. In 1941 Cornfield consulted with the Bureau of Home Economics on a nutrition-related problem known as the “diet” problem. Mathematically, it requires minimizing a linear function subject to a set of linear inequality constraints, which is the problem of linear programming. Zelen [31, p. 12] refers to a 1958 book on linear programming by Dorfman et al. as crediting Cornfield “as being the first person to formulate the linear programming problem and find an approximate solution”. His work appeared in 1941 in an unpublished BLS memorandum. It was also at the BLS that Cornfield made his first contribution to statistical theory. He developed a method using indicator variables for easily obtaining the first few moments of the sample mean when sampling from finite populations. He thus obtained an unbiased estimate of the sample variance and of the variance of the sample mean.
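The finite-population results mentioned at the end of the preceding paragraph can be checked numerically. The sketch below is an illustration only, not a reconstruction of Cornfield's derivation: it assumes simple random sampling without replacement, reads the variance being estimated as the population variance with divisor N − 1, and uses a small made-up population. The function name and the data are hypothetical. Inclusion indicators (1 if a unit is in the sample, 0 otherwise) are used to write the sample mean and sample variance, and every possible sample is enumerated so that the expectations are exact.

```python
from fractions import Fraction
from itertools import combinations


def finite_population_check(values, n):
    """Enumerate all samples of size n drawn without replacement (hypothetical helper)."""
    N = len(values)
    y = [Fraction(v) for v in values]
    pop_mean = sum(y) / N
    # Population variance with divisor N - 1 (assumed target of the unbiased estimate).
    S2 = sum((v - pop_mean) ** 2 for v in y) / (N - 1)

    means, sample_vars = [], []
    for idx in combinations(range(N), n):
        # a[i] is the indicator variable: 1 if unit i is in the sample, else 0.
        a = [1 if i in idx else 0 for i in range(N)]
        mean = sum(a[i] * y[i] for i in range(N)) / n
        s2 = sum(a[i] * (y[i] - mean) ** 2 for i in range(N)) / (n - 1)
        means.append(mean)
        sample_vars.append(s2)

    k = len(means)  # number of equally likely samples
    expected_mean = sum(means) / k
    return {
        "pop_mean": pop_mean,
        "S2": S2,
        "expected_mean": expected_mean,
        "expected_s2": sum(sample_vars) / k,
        "var_of_mean": sum((m - expected_mean) ** 2 for m in means) / k,
        # Textbook formula with the finite population correction (1 - n/N).
        "theory_var_of_mean": (1 - Fraction(n, N)) * S2 / n,
    }


# Made-up population of five values, samples of size three.
result = finite_population_check([3, 7, 8, 12, 20], n=3)
```

Because exact fractions are used, the averages over all samples can be compared with the population quantities by equality rather than within a tolerance.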
From 1948 to his death 31 years later, Cornfield devoted the major portion of his career to the development and application of statistical theory and methods to the biomedical sciences. His contributions were diverse both in the nature of his statistical interests and in the areas of biostatistical applications. He was involved in every major public health issue that arose in that period: the polio vaccines, smoking and lung cancer (see Smoking and Health) [22, 29], risk factors for cardiovascular disease [5, 30], and the difficult statistical issues of estimating the low-dose carcinogenic effects in humans (see Extrapolation, Low Dose) of a food additive that becomes suspect because it produces cancer in animals at much higher doses [14, 20].
In the broad area of biomedical research, Cornfield was involved in a wide variety of problems, in each of which he made significant and lasting contributions. These studies and problems include the following: an imaginative method for estimating the volume–surface ratio of individual cells as observed under the microscope, the statistics of bioassay (see Biological Assay, Overview) [3, 6, 19], photosynthesis, the analysis of the toxicity of mixtures of the essential amino acids, chemical kinetic experiments using radioactive compounds (see Pharmacokinetics and Pharmacodynamics), the physiological and biological effects of radiation on animals (see Radiation), and the computer diagnosis of electrocardiograms (see Clinical Signals).
In the amino acid problem, the question was: Which mixtures of the 10 essential amino acids were toxic? The investigators called on Cornfield for help when they were confronted with the impractical task of conducting 1013 experiments, one for every mixture of two or more of the amino acids. Cornfield considered the issue of measuring the joint effects of two or more drugs administered in combination. The method usually employed was to assume the joint effects were additive in their individual responses. Cornfield saw that this simple method could give strange results. Instead, he chose a measure of additivity introduced by Gaddum, namely additivity of doses conditioned on a given response, a concept which Cornfield called dose-wise additivity. After some persuasion, the biochemists proceeded to conduct experiments implied by dose-wise additivity. These turned out to be highly successful, leading to the previously unknown result that L-arginine was essential for the combination of the 10 amino acids to be nontoxic in the human.
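The figure of 1013 experiments is consistent with counting every subset of the 10 amino acids that contains at least two of them. This reading is an interpretation, not something the text states outright; the short sketch below simply does the arithmetic under that assumption.

```python
from math import comb

n_amino_acids = 10

# Count subsets containing 2, 3, ..., 10 of the amino acids
# (assumed interpretation of "mixtures").
mixtures = sum(comb(n_amino_acids, k) for k in range(2, n_amino_acids + 1))

# Equivalent shortcut: all subsets, minus the empty set and the 10 singletons.
shortcut = 2 ** n_amino_acids - 1 - n_amino_acids

print(mixtures, shortcut)
```

Both expressions give the same count, which shows how quickly an exhaustive design grows and why a modelling assumption such as dose-wise additivity was needed to reduce the number of experiments.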
The animal radiation study is noteworthy for the development of a methodology that would later become a fundamental tool in epidemiologic research, namely multiple logistic regression. The issue in the animal data was the effect on survival of irradiated mice as a function of certain observed blood characteristics, such as lymphocytes and granulocytes. This is clearly a regression problem with a straightforward solution if the traits could be controlled at a set of fixed values. Since survival is a 0, 1 variable, the solution would be a multiple logistic function. (Of course, since that period much work has been done on regression with variables subject to error.) Since the observed blood properties were uncontrolled, Cornfield chose to treat the analysis as a problem of discrimination between two multivariate populations, those of surviving and nonsurviving animals. With the additional assumption of multivariate normality and equal covariance matrices, he derived the multiple logistic risk function whose coefficients were the same as those found by R.A. Fisher in the linear discrimination problem (see Discriminant Analysis, Linear) [5, 21]. This solution is obtained directly, requiring only the inversion of a matrix but no iterations. Cornfield would later say that the simplicity of the solution appealed to him and he believed that if the assumptions were reasonable, then the solution would be close to that of the regression approach. Cornfield later applied the same reasoning to use the multiple risk function to identify cardiovascular risk factors on the basis of data obtained from the famous Framingham Study [5, 30].
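The link between discrimination and the logistic function described above can be sketched numerically. Under the stated assumptions (two multivariate normal populations with a common covariance matrix), Bayes' theorem gives a risk of the form 1/(1 + exp(−(α + β′x))), with β equal to the inverse covariance matrix times the difference in means. The code below is a minimal two-variable illustration; all function names and numerical values are hypothetical and are not taken from the mouse or Framingham data.

```python
import math


def inv2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def logistic_from_discriminant(mu1, mu0, sigma, prior1):
    """Closed-form logistic coefficients: one matrix inversion, no iterations."""
    diff = [mu1[0] - mu0[0], mu1[1] - mu0[1]]
    beta = matvec(inv2(sigma), diff)  # linear discriminant coefficients
    midpoint = [(mu1[0] + mu0[0]) / 2, (mu1[1] + mu0[1]) / 2]
    alpha = math.log(prior1 / (1 - prior1)) - dot(midpoint, beta)
    return alpha, beta


def logistic_risk(x, alpha, beta):
    return 1 / (1 + math.exp(-(alpha + dot(beta, x))))


def normal_density(x, mu, sigma):
    """Bivariate normal density, used only to check the closed form."""
    d = [x[0] - mu[0], x[1] - mu[1]]
    det = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]
    quad = dot(d, matvec(inv2(sigma), d))
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))


def bayes_risk(x, mu1, mu0, sigma, prior1):
    f1 = prior1 * normal_density(x, mu1, sigma)
    f0 = (1 - prior1) * normal_density(x, mu0, sigma)
    return f1 / (f1 + f0)


# Hypothetical means for the two groups and a shared covariance matrix.
mu1, mu0 = [2.0, 1.0], [1.0, 1.5]
sigma = [[1.0, 0.3], [0.3, 2.0]]
alpha, beta = logistic_from_discriminant(mu1, mu0, sigma, prior1=0.2)
x = [1.7, 0.4]
closed_form = logistic_risk(x, alpha, beta)
direct = bayes_risk(x, mu1, mu0, sigma, prior1=0.2)
```

The two risk values agree to rounding error, which is the sense in which the logistic coefficients coincide with the linear discriminant coefficients when the normality and equal-covariance assumptions hold.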
Cornfield made another very important contribution to epidemiology. When epidemiologists began turning their attention to the study of chronic diseases, prospective cohort designs for finding causes of, or risk factors for, chronic diseases were in many instances impractical. They therefore turned to case–control or retrospective types of strategies. A problem with these designs, even when they are well planned, is that they do not yield traditional estimates of absolute risk or relative risk. Cornfield, in 1955 at the Third Berkeley Symposium on Mathematical Statistics and Probability [4, 18], presented a derivation which demonstrated that under a rather strong assumption (though a reasonable one in the case of chronic diseases) the odds ratio or cross product ratio (in a 2 × 2 table) is a fairly good approximation of the relative risk. The assumption was that the incidence of the disease under study should be small. This result strengthened and increased the use of the case–control design, since it set this research strategy on a much more solid inferential foundation.
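The rare-disease approximation described above is easy to see with numbers. The following sketch uses made-up 2 × 2 tables (the counts and function names are hypothetical) to compare the odds ratio with the relative risk when the disease is rare and when it is common, and to show that the odds ratio is unchanged when only a fraction of the non-cases is sampled, as happens in a case–control study.

```python
import math


def relative_risk(a, b, c, d):
    """a, b: exposed cases and non-cases; c, d: unexposed cases and non-cases."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Cross product ratio of the 2 x 2 table."""
    return (a * d) / (b * c)


# Rare disease: 30 cases per 10,000 exposed, 10 per 10,000 unexposed.
rr_rare = relative_risk(30, 9970, 10, 9990)
or_rare = odds_ratio(30, 9970, 10, 9990)

# Common disease with the same relative risk of 3.
rr_common = relative_risk(3000, 7000, 1000, 9000)
or_common = odds_ratio(3000, 7000, 1000, 9000)

# Case-control sampling: keep every case but only 1% of the non-cases.
# Relative risk can no longer be computed from such a table, but the
# odds ratio is the same as in the full cohort.
or_sampled = odds_ratio(30, 9970 * 0.01, 10, 9990 * 0.01)

print(rr_rare, or_rare, rr_common, or_common, or_sampled)
```

With the rare disease the odds ratio is within about 0.2% of the relative risk, whereas with the common disease it overstates it noticeably; the sampled table reproduces the cohort odds ratio.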
In an important paper, responding to critics of the purported causal relationship between smoking and lung cancer, Cornfield argued for preferring measures of association based on relative risk to those based on differences of absolute risk, at least for scientific purposes. However, the significant matter here is not the issue of risks but the example he used to justify his position. The illustration bears on the question of the effect of latent, unobservable variables. Sir Ronald Fisher, in arguing against the smoking–lung cancer relationship, had offered a hypothesis that postulated the existence of some constitutional factor (latent and unobservable), e.g. genetic, that caused cancer and that was also associated with the need to smoke. Without giving the details of his argument here, Cornfield demonstrated that if cigarette smokers are shown to have nine times the risk of nonsmokers of getting lung cancer, but that this elevated risk is due, not to cigarettes, but to some latent factor X, then the proportion of smokers having X must be larger than nine times the proportion of nonsmokers having X. Cornfield's conclusion was that if X was a causative agent of this magnitude, then the relationship between the latent factor X and the observed agent would probably have been detected long before that of the agent and the disease. No such factor has been found.
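The argument whose details are omitted above can be illustrated with a simple model. The sketch assumes, for illustration only, a binary latent factor X that multiplies disease risk by a factor gamma, and assumes smoking itself has no effect; the function name, the parameter names, and the numerical values are hypothetical. Under these assumptions the apparent relative risk for smokers is (1 + (gamma − 1)·p1) / (1 + (gamma − 1)·p0), where p1 and p0 are the proportions of smokers and nonsmokers having X, and this quantity always stays below p1/p0.

```python
def apparent_relative_risk(p_smokers, p_nonsmokers, gamma):
    """Smoker/nonsmoker risk ratio produced by X alone (smoking assumed harmless).

    p_smokers, p_nonsmokers: proportion carrying the latent factor X.
    gamma: factor by which X multiplies the baseline risk (hypothetical).
    """
    return (1 + (gamma - 1) * p_smokers) / (1 + (gamma - 1) * p_nonsmokers)


# Even an extremely strong factor cannot produce a ninefold risk when its
# prevalence ratio between smokers and nonsmokers is only nine.
example = apparent_relative_risk(0.9, 0.1, gamma=1000)

# Grid check of the bound: apparent relative risk < prevalence ratio.
bound_holds = all(
    apparent_relative_risk(p1, p0, g) < p1 / p0
    for p0 in (0.01, 0.05, 0.1, 0.3)
    for p1 in (0.35, 0.5, 0.9, 1.0)
    for g in (1.5, 2, 9, 50, 1000)
)

print(example, bound_holds)
```

In this model a prevalence ratio of exactly nine yields an apparent relative risk just under nine even with gamma equal to 1000, so an observed ninefold risk requires X to be more than nine times as common among smokers.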
In addition to epidemiologic methods, Cornfield devoted a substantial portion of his career to the theory and practice of randomized, controlled clinical trials (RCTs). His influence was far-reaching. He wrote papers on aspects of design of RCTs both for therapeutic and prevention trials [10, 13, 17, 24], on statistical problems in the interpretation of results, and on a Bayesian test of hypotheses arising in RCTs. But the totality of his publications constituted only a small part of his vast influence as an advisor and consultant. He was personally involved in the Coronary Drug Project, one of the earliest multicenter trials sponsored by the NHLBI. It was in this trial that Cornfield introduced the Bayesian concept of relative betting odds (a measure related to the Bayes Factor) as a measure to assess the efficacy of a therapy instead of the classical P value. He also took part in many other major multicenter trials, serving in various capacities as a member of planning committees, steering committees, policy advisory boards, and data monitoring and safety committees. These RCTs include the National-Diet Study, a trial of urokinase in the treatment of myocardial infarction, the Coronary Drug Project (CDP), the University Group Diabetes Program (UGDP), the Urokinase Pulmonary Embolism Trial (UPET), the Diabetic Retinopathy Study (DRS), the Multiple Risk Factor Intervention Trial (MRFIT), the Program for the Surgical Control of the Hyperlipidemias (POSCH), and the Persantin Aspirin Reinfarction Study (PARIS).
Throughout his career in statistics Cornfield was interested in, and contributed to, the foundations of statistics, first as a frequentist and then as a Bayesian. The first manifestation of his interest in Bayesian inference was his joint work with Geisser on deriving the posterior distribution for the multivariate normal parameters. He then followed with a number of papers on the theory of Bayesian inference and on its practice and application to clinical trials [8, 11], to estimation in higher order cross-classifications, and to the analysis of life tables.
Cornfield was also actively engaged as a consultant in areas other than clinical trials and epidemiology. He served on the Three Mile Island Advisory Committee, the NHLBI Policy Advisory Board on Coronary Bypass Surgery, and the Scientific Advisory Board for the Sloan–Kettering Institute for Cancer Research, among others, and chaired the Committee on Biometry and Epidemiology for the Food and Drug Administration. He also served in a number of editorial roles, the principal ones being Associate Editor for the Journal of the American Statistical Association and Consulting Editor for the Journal of Chronic Diseases. Cornfield was President of the American Statistical Association, of the American Epidemiological Society, and of the Eastern North American Region of the International Biometric Society, and Vice-President of the American Heart Association.
For a more detailed review of Cornfield's contributions to the theory of statistics, laboratory research, clinical trials, and epidemiology, the reader is referred to the March 1982 supplement to Biometrics vol. 38. Furthermore, Cornfield's American Statistical Association Presidential Address is a wonderful account in his own words of his contributions to statistics and science, and his personal perspective on being a statistician.
Cornfield married Ruth Bittler, and they had two daughters, Ann and Ellen.
- [2] (1944). On samples from finite populations, Journal of the American Statistical Association 39, 236–239.
- [3] (1955). Review: the statistics of bioassay, Journal of the American Statistical Association 50, 1368–1371.
- [4] (1956). A statistical problem arising from retrospective studies, in Proceedings of the Third Berkeley Symposium, Vol. 4, J. Neyman, ed. University of California Press, Berkeley, pp. 135–148.
- [7] (1966). A Bayesian test of some classical hypotheses, Journal of the American Statistical Association 61, 577–594.
- [10] (1970). Design of primary and secondary prevention trials, in Atherosclerosis: Proceedings of the Second International Symposium, R. J. Jones, ed. Springer-Verlag, New York, pp. 566–571.
- [11] (1970). The frequency theory of probability, Bayes' theorem, and sequential clinical trials, in Bayesian Statistics, D. L. Myers & R. O. Collier, Jr, eds. Peacock, Ithaca, pp. 1–28.
- [12] (1972). Statistical classification methods, in Computer Diagnosis and Diagnostic Methods, J. A. Jacquez, ed. Charles C. Thomas, Springfield, pp. 108–130.
- [15] (1951). A problem in geometric probability, Journal of the Washington Academy of Sciences 41, 226–229.
- [16] (1977). Bayesian life table analysis, Journal of the Royal Statistical Society, Series B 39, 86–94.
- [17] (1967). On certain aspects of sequential clinical trials, in The Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 4, J. Neyman & L. M. Le Cam, eds. University of California Press, Berkeley, pp. 813–829.
- [20] (1977). “Safe doses” in carcinogenic experiments, Biometrics 33, 21–30.
- [21] (1961). Quantal response curves for experimentally uncontrolled variables, Bulletin of the International Statistical Institute 37(3), 97–115.
- [26] (1978). Revolution in United States Government Statistics 1926–1976. US Government Printing Office, Washington.
- [27] (1963). Posterior distributions for multivariate parameters, Journal of the Royal Statistical Society, Series B 25, 368–376.