
‘Blue Sky’ epidemiology: definition, examples and a plea for understanding


  • Funding: David Whiteman is supported by a Future Fellowship (ID FT0990987) from the Australian Research Council.

  • Disclaimer: the genesis of this essay lies in a plenary presentation to the Public Health Association of Australia, Canberra, ACT, Australia in September 2009.

Correspondence to: David Whiteman, Cancer Control Laboratory, Queensland Institute of Medical Research, 300 Herston Rd, Herston, Queensland 4006; e-mail: David.Whiteman@qimr.edu.au

More than 50 years ago, in the aftermath of World War II, the term ‘blue sky science’ was coined to describe that type of science which had long prevailed, but which had been neglected in pursuit of focused, war-time objectives. Most simply defined as ‘curiosity-driven science’, the unifying theme was of science performed for the sake of knowledge alone. Proponents justified this approach on the grounds that unanticipated scientific breakthroughs are sometimes (perhaps often) more valuable than the outcomes of agenda-driven research. Early adopters of blue sky science included large and outwardly conservative organisations such as the US Air Force, oil companies and, more recently, most of Silicon Valley.

In the field of health research, history is replete with instances of curiosity-driven science that have led to entirely unforeseeable medical breakthroughs. Penicillin was discovered when a perspicacious bacteriologist with an interest in the growth characteristics of Staphylococci noticed a zone of lysis surrounding one of his cultures.1 An interest in cellular responses to injury led to the discovery of apoptosis by young Australian pathologist John Kerr;2,3 an observation that shifted paradigms for cancer research, embryology and stem cells. The genomics revolution can be traced directly to the thought experiments of Kary Mullis, who imagined the polymerase chain reaction (‘PCR’) late one night while driving to his weekend hideaway in northern California.4

Genomics owes even more to blue sky science than the musings of a gifted chemist. If not for Thomas D. Brock's discovery of a new species of bacteria (Thermus aquaticus) in the scalding geysers of Yellowstone National Park, genetic research as we currently know it may never have happened.5 At the time, Brock's microbial discovery was mildly interesting, but its full importance was not appreciated until almost 20 years later, when Mullis realised that such organisms must be able to reproduce naturally at very high temperatures and therefore must be able to synthesise DNA under extreme conditions of heat. By utilising the DNA polymerases from these hardy bacteria, Mullis was able to perfect PCR and thereby synthesise industrial quantities of nucleic acids. As a consequence, PCR was lauded as ‘Molecule of the Year’,6 Mullis won the Nobel Prize,4 his employers made a fortune, and the genomics revolution was born.

A slight digression

These classical examples of the benefits of blue sky science, drawn from many possible candidates, illustrate perfectly the notion that the starting point for many scientific investigations in no way predicts the utility of the final application. What the preceding examples have in common is two ingredients vital for successful research: serendipity (once defined as ‘looking in a haystack and finding a farmer's daughter’) and sagacity (the capacity to recognise the singular importance of one particular finding from all of the other findings that arise from investigations).

‘Serendipity’ is a neologism, introduced to the world by Horace Walpole after reading an old fable, ‘The Three Princes of Serendip’.7 The tale describes the journeys of the protagonists, sons of a king, who had been instructed to go and learn about the world before they could be considered ready to inherit their kingdom. The Princes did their father's bidding, and along the way demonstrated a flair for making useful discoveries (hence Walpole's neologism). This flair drew attention from the King of Persia, who had them imprisoned as unwelcome troublemakers. Eventually they found freedom and returned home. The tyrant who imprisoned them, Bahram, has lent his name to the recently coined term ‘bahramdipity’7 to describe the act of suppressing or obstructing discoveries by those in positions of power. While the term is not yet in common use, most researchers in the modern world would recognise the concept, for instance in the motive sometimes ascribed to journal editors, grant reviewers and administrators for rejecting the work of scientists.

Blue sky epidemiology

Epidemiology is an applied science that draws upon a broad array of health and statistical disciplines. Epidemiological studies are designed to answer research questions relating to human health; they address real world problems for which solutions are demanded. Typically, such investigations are labour-intensive, costly and long term, requiring large amounts of funding and team-work to produce results. Viewed in such a way, it is challenging, perhaps impossible, to sustain an argument for pursuing epidemiology solely to extend knowledge. But perhaps there are possibilities for blue sky research in our discipline: for example, opportunities arise to look at things in different ways, or to explore novel exposures, or to investigate unfashionable or rare diseases that might yield previously unpredicted insights of a general, even paradigm-shifting, nature.

An example of the latter might include the investigation of ‘galloping senescence of the juvenile’ or kuru, a subacute degeneration of the brain, unremittingly fatal, that became highly prevalent among the Fore people in Papua New Guinea in the early years of the 20th century.8 Epidemiological and anthropological fieldwork established that the disease was common in females of all ages, but among males it occurred only before puberty. The cause was determined to be mortuary cannibalism, in which the female relatives (and youngest boys) of a deceased person dismembered the body and ate it – adult males did not participate. However, no transmissible agent – bacterial, parasitic, viral – could be identified,9 and it was not until decades later (and overcoming enormous amounts of bahramdipity) that proteinaceous infectious particles (prions) were determined to be the causal agents.10 Two Nobel prize-winning discoveries were generated from this initially obscure epidemiological investigation.

A different example of blue sky epidemiology came about through the simplest of research instruments, the questionnaire. As part of a population-based study of skin cancer in the high incidence population of Queensland, Adèle Green and colleagues asked participants whether they had ever used any folk remedies for the treatment of skin cancer, and if so, what they used and their own perception of its efficacy.11 Among scores of positive responses, one stood out for the consistency with which respondents reported a ‘response’. Euphorbia peplus (‘radium weed’) was reportedly effective in all cases. The finding was duly published, and the investigators notified colleagues with biochemical and pharmacological expertise, who quickly determined that the sap of the weed did indeed have biological properties, but whether harmful or helpful they could not be sure. Events moved quickly thereafter; a novel anti-cancer agent (ingenol mebutate) was isolated, patented, tested in animal models and entered into the clinical trial pathway.12 Phase III trials have now been completed, and the agent looks set to become approved for use as a topical pharmaceutical for the treatment of skin cancers – a major source of morbidity in fair-skinned populations around the world.13 All from one simple question on a survey form.

My final example is personal, to illustrate that blue sky epidemiology might be more accessible than many might realise. At the time of my doctoral studies (the mid-’90s), cancer researchers were enthused by the discovery of tumour-suppressor genes. One, p53, was famously designated the ‘guardian of the genome’ and was implicated in cancers of many organs.14 At a loose end, I decided to stain the melanoma samples from my patients with antibodies against the p53 protein. Behold! Around 30% of melanomas stained brightly with the p53 antibody. But so what? I went back to my epidemiological analyses, but now I separated the patients with p53-positive melanomas from those who had p53-negative melanomas to see whether they differed in their patterns of association with risk factors. The differences were striking. Whereas the p53-positive tumours were predominantly from older people and arose on sun-exposed body sites, the p53-negative tumours arose largely on the trunk, and were most common among people with large numbers of moles on their skin.15 From there, a new hypothesis for the development of melanoma was proposed, which has been confirmed in many subsequent tests.16 An unexpected insight arose simply by extending a classical epidemiological study into a laboratory-assisted investigation.

These Editorials represent the views of the authors and not necessarily the views of this Journal or the Public Health Association of Australia.

Practice tips for blue sky epidemiology

Clearly, epidemiology has many faces, and these selected illustrations may not be analogous to investigations conducted by the majority of practitioners in the field. So how can epidemiologists practise blue sky science?

Question dogma. Dogma may often be correct, but there is little harm in testing how much evidence exists to support the prevailing view. Neil Pearce encapsulated this sentiment beautifully in a recent essay in which he described a discussion among research colleagues during an asthma conference in Barcelona.17 One of the epidemiologists ventured that he had recently found that having a cat early in life was protective against asthma, but he had not published the finding fearing it could not be true. As talk turned around the table, a further five studies were identified showing that cat exposure was protective, but only one had ever published the finding, and even that was presented as an incidental finding appearing only in a table. Pearce concluded:

“now a number of studies showing a protective effect of cat exposure have been published, and those of us who work in the field can claim at least one important public health victory – despite years of research, we may not have prevented any cases of asthma, but we have saved the lives of many cats.”

Pilot studies are the perfect vehicle for blue sky science, since they provide an opportunity to test unconventional ideas in a low-risk setting.

Record-linkage studies are the apotheosis of hypothesis-free, curiosity-driven research, and the potential boons to human health research are difficult to overestimate. The concern, of course, is the high number of spurious associations that will be identified through mindless trawling of datasets. As Mantel and Haenszel observed more than 50 years ago, however, multiple testing is only a concern if inferences are to be drawn from a single set of data. In their words ‘a single … study does not yield conclusions, only leads’.18 To paraphrase, associations identified through record-linkage studies should be the start of formal hypothesis-testing studies, not the final word.
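The arithmetic behind this concern, and behind the remedy of treating findings as leads, can be sketched in a few lines of Python. This is an illustration only, not a method taken from the sources cited here: the function names, the significance threshold of 0.05 and the figure of 1,000 tested associations are all assumptions, and the calculation supposes that every association tested is truly null and that the tests are independent.

```python
def expected_false_leads(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of spurious 'significant' associations when
    n_tests truly null associations are each tested at level alpha."""
    return n_tests * alpha


def prob_at_least_one_false_lead(n_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one truly null association reaches
    significance, assuming the tests are independent."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_leads_after_replication(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of spurious associations that reach significance
    in the trawl AND again in one independent follow-up study."""
    return n_tests * alpha * alpha


if __name__ == "__main__":
    n = 1000  # hypothetical number of exposure-disease pairs trawled
    print(expected_false_leads(n))
    print(prob_at_least_one_false_lead(n))
    print(expected_false_leads_after_replication(n))
```

Under these assumptions, a trawl of 1,000 null associations is expected to throw up about 50 spurious leads, whereas only two or three would be expected to survive a single independent test. Real datasets will contain correlated exposures and some genuine associations, so the numbers are indicative only.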

‘Non-pre-specified’ analyses, including subgroup analyses and explorations of secondary outcomes, offer the opportunity to explore new territory without the attendant costs of de novo data collection. Again, caution must prevail when interpreting findings of such analyses; any associations, positive or negative, must be viewed with scepticism. But at the very least they should be the basis for further thought.

‘Speculative measures’. Space on questionnaires is always tight, but a place should be reserved for testing the brave new hypothesis (to find the next radium weed, perhaps). Soliciting feedback from study participants is often a fruitful source of ideas: we were recently contacted by a participant who suggested that our future studies of reflux-associated cancers might include questions about bulimia, an idea we had not previously entertained.

Laboratory collaborations. Finally, the analysis of biospecimens in conjunction with epidemiological data offers enormous opportunities for blue sky epidemiology. Genome-wide association studies have made ‘blue sky’ research almost passé; what could have less application than scanning every nucleotide of the genome for associations with every known disease, trait or idiosyncrasy?

Threats to blue sky epidemiology

The paradigm of epidemiology is purpose-driven, and hence seemingly at odds with blue sky research. This tension manifests in numerous ways. Grant review panels, charged with the task of distributing limited resources to applications of the greatest merit, will seldom favour the risky over the focused. There has been a worldwide tendency to apply selection criteria that place a higher value on trials than observations, and on translation than basic knowledge. Increasingly, research agendas are being directed toward chosen diseases, inevitably directing funds away from other, less ‘popular’ maladies and concentrating expertise into narrower and narrower silos. Journal editors tend to subscribe to the same value systems as grant reviewers, which blue sky proponents might deem ‘bahramdipitous’ behaviour, at least if practised unconsciously and without wisdom. Some reluctance to fund or publicise blue sky research is understandable, since the world is full of unsubstantiated and unreplicated findings, but a policy of intentionally restricting the subject and manner of scientific investigations would seem counter to the pursuit of inquiry.

Of all examples of creeping bahramdipity, probably the most pervasive has been the introduction of sweeping privacy regimes in the early years of the 21st century. Masquerading as a libertarian ideal, ‘privacy concerns’ have been used to prevent much potentially valuable epidemiological research. Fortunately, data custodians are becoming more sophisticated at preserving individual privacy yet permitting linkage of patient records, and one hopes that the full potential of this approach might be realised.

Finally, the career structure for epidemiologists renders blue sky research a risky proposition. For those pursuing a ‘research only’ track, continued employment is contingent on grants and publications, both of which are hard to come by. It is much safer to lead a large, well-conducted trial (even to a null finding) than to pursue a folly. For epidemiologists in service positions, the imperative to monitor patterns of disease or to investigate clusters or outbreaks leaves little scope for application-free investigations.

Final thoughts

Epidemiology will always be grounded in practical concerns, yet as we strive toward our goal of reducing human suffering through better understanding of diseases and their causes, we should keep one eye skyward, alert for the possibility that solutions to our questions may be found where we least expect them.