PubMed search strings for the study of agricultural workers' diseases


  • Disclosure Statement: The authors report no conflicts of interest.
  • Full address of the institution where the work was performed: Unità Operativa di Medicina del Lavoro, Policlinico S. Orsola-Malpighi, via Pelagio Palagi 9, 40138 Bologna, Italy.



Several optimized search strategies have been developed in Medicine, and more recently in Occupational Medicine. The aim of this study was to identify efficient PubMed search strategies to retrieve articles regarding putative occupational determinants of agricultural workers' diseases.


We selected the Medical Subject Headings (MeSH) term agricultural workers' diseases and six MeSH terms describing farm work (agriculture, agrochemicals NOT pesticides, animal husbandry, pesticides, rural health, rural population) alongside 61 other promising terms. We estimated proportions of articles containing potentially pertinent information regarding occupational etiology to formulate two search strategies (one “more specific,” one “more sensitive”). We applied these strategies to retrieve information on the possible occupational etiology among agricultural workers of kidney cancer, knee osteoarthritis, and multiple sclerosis. We evaluated the number needed to read (NNR), that is, the number of abstracts that must be read to identify one potentially pertinent article, in the context of these pathologies.


The “more specific” search string was based on the combination of terms that yielded the highest proportion (at least 40%) of potentially pertinent abstracts. The “more sensitive” string was based on the use of broader search fields and the additional coverage provided by the other search terms under study. Using the “more specific” string, the NNR values to find one potentially pertinent article were 1.1 for kidney cancer, 1.4 for knee osteoarthritis, and 1.2 for multiple sclerosis. Using the “more sensitive” string, the NNR values were 1.4, 3.6, and 6.3, respectively.


The proposed strings could help health care professionals explore putative occupational etiology for agricultural workers' diseases (even if not generally thought to be work related). Am. J. Ind. Med. 56:1473–1481, 2013. © 2013 The Authors. American Journal of Industrial Medicine published by Wiley Periodicals, Inc.


During the past decades, the number of medical articles published has increased steeply. The number of citations added to PubMed—the leading health science database managed by the U.S. National Library of Medicine (NLM)—rose from 406,000 in 1990 to 926,000 in 2010. The explosion of accessible knowledge has affected many research topics, including etiological research: in 2010 more than one third of PubMed new articles were classified using the subheading “etiology.” Therefore, when investigating the putative causes of a disease, researchers and practitioners need to be provided with time-effective tools for the retrieval of pertinent literature.

The consultation of electronic databases has nowadays become the standard approach to literature search. Among others, PubMed represents the reference standard as: (1) it is free; (2) it takes advantage of MeSH terms conceived to increase the specificity of search strategies; (3) it indexes more than 21 million citations. Nevertheless, no database can be considered comprehensive on its own [Verbeek et al., 2005]. Hence, researchers and practitioners are often advised to consult several databases, including PubMed and Embase [Gehanno et al., 1998].

Unfortunately, the drawback of such in-depth research is a high consumption of time and resources. However, questions may arise about the quality of the articles missed by a careful consultation of PubMed. Rollin et al. [2010] recently estimated that the recall ratio of Medline for high-quality intervention studies is close to 90%. Based on their findings, the authors concluded that limiting the search for pertinent literature to Medline alone is more cost-effective than previously thought and that it represents a valuable tool for answering daily practice questions. Although the authors of that study were exploring a particular area (i.e., intervention studies), it is possible that the recall ratio of Medline is also sufficiently high for high-quality etiological papers.

Several optimized search strategies have been developed to speed up and facilitate the search process in PubMed [Haynes et al., 1994]. In the Occupational Medicine field, search strategies have been proposed for the retrieval of intervention studies or etiological research papers [Verbeek et al., 2005; Haafkens et al., 2006; Schaafsma et al., 2006; Mattioli et al., 2010]. In addition, a controlled trial showed that evidence-based search strategies may enhance the effectiveness of the consultation of PubMed [Schaafsma et al., 2007].

Although rural work poses many hazards to health, job exposures and their consequences have been neglected or scarcely investigated [Alavanja et al., 1996]. Occupations in the rural context are often characterized by the presence of multiple risk factors acting simultaneously. Workers are frequently exposed to complex machinery, chemicals, extreme working hours, noise pollution, harsh climates, and physically demanding tasks. Moreover, the health problems that are most frequently reported (e.g., musculoskeletal disorders, hearing loss, or skin cancers) often have a multifactorial etiology, involving occupational, para-occupational, and non-occupational risk factors.

In this challenging context, search strategies designed for the broader field of Occupational Medicine might be ineffective due to low retrieval or lack of specificity. Hence, we decided to construct PubMed search strings designed to investigate possible occupational determinants of diseases occurring in agricultural workers. We adopted the systematic approach previously used to create PubMed search strategies to retrieve articles regarding putative occupational determinants of conditions not generally considered to be work related [Mattioli et al., 2010].

Our final aim was to provide health care professionals with “copy and paste” tools for the identification of articles on established or putative determinants of agricultural workers' diseases.


Rationale and Study Design

We adopted the same study design used in a previous, recently published study [Mattioli et al., 2010]. We compiled a list of search terms (either Medical Subject Headings [MeSH] or non-MeSH) that seemed pertinent to occupational determinants of diseases in rural populations. Limits were set for articles added to PubMed by December 31, 2009 and, since availability of an English language abstract can be of practical importance when assessing the potential relevance of an article, we decided to also introduce the Limit “abstracts.” Later on, in order to exclude articles dealing only with animal diseases, we added to each search string the words NOT (animals [MH] NOT humans [MH]). The use in PubMed of this sub-string means that all papers with the MeSH term “animals” and without the MeSH term “humans” will be excluded. The same result can be obtained by writing NOT (animals [MH] NOT (animals [MH] AND humans [MH])).
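The effect of the human-only filter can be checked on a toy example. The sketch below models each record as a set of MeSH terms and applies NOT (animals NOT humans) with Python set operations; the four record identifiers and their indexing are invented purely for illustration.

```python
# Toy records: identifier -> MeSH terms assigned (hypothetical data).
records = {
    "r1": {"animals"},            # animal-only study: should be excluded
    "r2": {"animals", "humans"},  # indexed with both terms: kept
    "r3": {"humans"},             # human-only study: kept
    "r4": set(),                  # neither term assigned: kept
}

def indexed_with(term):
    """Identifiers of records carrying the given MeSH term."""
    return {rid for rid, mesh in records.items() if term in mesh}

everything = set(records)
animals = indexed_with("animals")
humans = indexed_with("humans")

# NOT (animals [MH] NOT humans [MH])
kept = everything - (animals - humans)

# "animals NOT (animals AND humans)" selects the same records as
# "animals NOT humans", so excluding either gives the same result.
kept_alternative = everything - (animals - (animals & humans))
```

Only the animal-only record is dropped; records indexed with both terms, or with neither, remain in the result set.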

First of all, the coverage of each search term was explored considering the absolute number of articles evoked in PubMed. After limiting the search to the records with an available English abstract, we estimated the proportion of retrieved articles that could be potentially pertinent for the field of occupational etiology in rural populations. We then designed two search strategies aimed at (1) conducting a time-effective investigation, or (2) performing more in-depth literature searches. Finally, we used “number needed to read” (NNR) values to evaluate the performance of the two search strategies in the context of three diseases that are generally considered, with different degrees of probability, work related in the field of agriculture.

Selection of Terms to Be Tested

We consulted the Medline MeSH database to identify MeSH terms (and their subheadings) pertinent to the agriculture field. MeSH is the NLM controlled vocabulary thesaurus used for indexing articles for PubMed. With it, users can more easily retrieve abstracts related to a particular field or subject by using the appropriate MeSH term in their search strategy. A search executed with MeSH terms is more specific than a search with only free-text words. On the other hand, users have to take into consideration that MeSH terms may not be applied properly in all cases and that MeSH indexing can lag the publication date by some months.

Words associated with the highest number of relevant MeSH terms were “agriculture” (n = 3), “rural” (n = 12), “pesticides” (n = 33), and “farm” (n = 11). The MeSH term agricultural workers' diseases appeared to be the most closely related to occupational etiology in the agricultural context; hence, citations evoked in PubMed by this term were used to evaluate the additional contribution of other (apparently less specific) search terms. After preliminary analysis, and according to the definitions provided in the Medline MeSH database, we added six other MeSH terms to the pool of search terms to be studied (see Table I). The MeSH term “pesticides” was explored as (pesticides [MH] NOT (pesticides [pharmacological action])) in order to exclude papers on the use of warfarin (the most widely prescribed oral anticoagulant in humans) or similar substances.

Table I. Terms Selected for the Construction of the Search Strings Divided by Category
MeSH terms
Agriculture; agrochemicals NOT pesticides; animal husbandry; pesticides; rural health; rural population
Free-text terms
“Agricultural environment”; agricultural exposure*; “agricultural health”; agricultural machin*; “agricultural medicine”; agricultural work*; agriculture work*; agronomist*; “animal feeding”; “animal husbandry” NOT animal husbandry [MH]; (breed* NOT breed* [AU]) AND humans; CAFO OR CAFOS (i.e., Concentrated Animal Feeding Operation); cattle; crofter*; cropper* NOT cropper* [AU]; cultivator*; dairy farm*; farm; farm NOT farm [AD]; farmhand*; farm operator*; farm work*; farmer* NOT farmer* [AU]; farming; farmwork*; (fertilizer* OR fertiliser*) AND humans [MH]; field hand*; forestry work*; fungicid*; gardening; gleaner*; granger* NOT granger* [AU] NOT (granger AND caus*); greenhouse*; grower*; harvester*; herbicid*; herdsman; homesteader*; insecticid*; livestock*; nester* NOT nester* [AU]; (pesticide* NOT pesticides [MH]) NOT (pesticides [pharmacological action]); plantation* AND humans [MH]; planter*; plowman NOT plowman [AU]; rancheman; rancher* NOT rancher* [AU]; reaper* NOT reaper* [AU]; rodenticid* NOT (rodenticides [pharmacological action]); “rural area”; “rural environment”; “rural health” NOT rural health [MH]; “rural population” NOT rural population [MH]; rural work*; sharecropper*; shepherd NOT shepherd [AU]; sower* NOT sower* [AU]; stockman NOT stockman [AU]; tiller*; tractor; yeoman NOT yeoman [AU]
Terms tested with sub-search strings
(agriculture [MH] OR agricult*); (farmer* NOT farmer* [AU]); fungicid*; herbicid*; insecticid*; ((pesticide* OR pesticides [MH]) NOT (pesticides [pharmacological action])); rural

Our study was extended to terms not indexed in the MeSH database. Preliminary samples of abstracts were collected for all terms identified on the basis of findings from a single available study on PubMed searches regarding occupational etiology [Schaafsma et al., 2006], analysis of MeSH entry terms, the authors' experience and brainstorming. After extensive evaluation (data not shown), 61 terms were included in our study as free-search (see Table I). Every term was tested as free text either using truncation or inverted commas in order to select the most comprehensive search term. Of note, the use of idiosyncratic spelling was also required with the aim of being as comprehensive as possible. For example, the use of farmwork* added 352 abstracts to the 835 retrieved by farm work*.

After that, every word was evaluated in terms of additional contributions compared to the search conducted using the MeSH term agricultural workers' diseases by itself.

Finally, we decided to also test some complex terms (sub-search strings), recalling basic search strategies for the agricultural field, obtained by mixing the “more specific” string proposed by Mattioli et al. [2010] with terms related to farm work (see Table I).

Estimating Proportions of Pertinent Articles

For each studied term, we sampled 100 abstracts from PubMed; searches were limited to citations accompanied by an English abstract. If appropriate, search field tags were used. To obtain systematically recruited samples, we set the PubMed “display settings” function in such a way as to obtain a number of pages approximately corresponding to a multiple of 100: we then selected the “top-of-the-page” citation, skipping appropriate numbers of pages.
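The page-skipping procedure described above amounts to systematic sampling. A minimal sketch follows, under stated assumptions: the list of page sizes is a guess at the display options that were available, and the rule for choosing among them (smallest fraction of unused pages) is our own reading of “approximately corresponding to a multiple of 100”, not the authors' documented procedure.

```python
import math

# Assumed PubMed "items per page" options; not taken from the source.
PAGE_SIZES = (5, 10, 20, 50, 100, 200)

def systematic_sample(total_hits, sample_size=100, page_sizes=PAGE_SIZES):
    """Return 0-based result ranks of the sampled 'top-of-the-page' citations."""
    if total_hits <= sample_size:
        return list(range(total_hits))  # small result sets are read in full
    best = None
    for size in page_sizes:
        pages = math.ceil(total_hits / size)
        if pages < sample_size:
            continue  # too few pages to draw one citation per sampled page
        step = pages // sample_size  # visit every step-th page
        leftover = (pages - step * sample_size) / pages  # share of pages never reached
        if best is None or leftover < best[0]:
            best = (leftover, size, step)
    if best is None:
        # Fewer than sample_size pages at every page size: fall back to
        # taking every k-th citation directly.
        size, step = 1, total_hits // sample_size
    else:
        _, size, step = best
    # First citation on pages 0, step, 2*step, ...
    return [k * step * size for k in range(sample_size)]

ranks = systematic_sample(2745)  # e.g., the 2,745 abstracts of the core MeSH term
```

With 2,745 hits this sketch settles on 5 items per page (549 pages) and reads the first citation of every fifth page.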

The rationale to sample 100 articles is that the number of abstracts to be sampled to estimate the proportion of possibly pertinent abstracts (assuming an alpha error of 0.05 and a precision level of 90%) is definitely less than 100 (even in case of maximum variability, i.e., 50%). Only in the case of more than 20,000 abstracts retrieved (by a specific entry term) should the number of abstracts to be sampled be exactly 100 [Cochran, 1963].
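The sample-size argument can be reproduced with Cochran's formula for a proportion, including the finite population correction. This is a sketch that assumes “a precision level of 90%” means a margin of error of ±10 percentage points and that an alpha error of 0.05 corresponds to z = 1.96; both readings are ours.

```python
import math

def cochran_sample_size(population, p=0.5, margin=0.10, z=1.96):
    """Abstracts to sample to estimate a proportion p within +/- margin."""
    n0 = z ** 2 * p * (1 - p) / margin ** 2   # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)      # finite population correction
    return math.ceil(n)

# Worst case p = 0.5 (maximum variability), for result sets of different sizes.
sizes = {N: cochran_sample_size(N) for N in (500, 2_745, 20_000, 10_000_000)}
```

Under these assumptions the required size stays below 100 for any number of retrieved abstracts, which is consistent with the choice of a fixed sample of 100.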

Two physicians (D.G.; L.R.) independently examined each abstract and declared the article to be potentially pertinent or not. This judgment was based on the subject of the article (i.e., rural occupational determinants of disease), irrespective of study design and quality. Inter-rater agreement, explored in a preliminary assessment of 100 abstracts, was “good” (κ = 0.65) [Altman, 1991]. A third physician (V.D.G.) evaluated the pertinence of abstracts for which an agreement was not reached between the first two evaluators.
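For readers unfamiliar with the κ statistic quoted above, the following sketch computes Cohen's kappa for two raters. The example ratings are invented for illustration; they are not the authors' data.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of abstracts given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / n ** 2
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical ratings: 1 = potentially pertinent, 0 = not pertinent.
example = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is lower than the simple percentage of matching judgments.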

Agricultural workers' diseases entered as a MeSH term was considered as the “core” of our search strategies. Therefore, we evaluated the performance of each of the 61 other search items (entered as listed above) in terms of the proportion of pertinent abstracts among those retrieved while excluding agricultural workers' diseases entered as a MeSH term.
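The incremental queries described above can be assembled mechanically. The sketch below is illustrative only: the authors applied PubMed Limits through the interface, so expressing them as `hasabstract` and an entry-date range with the `[EDAT]` tag is our assumption, the start date is arbitrary, and `hasabstract` does not by itself guarantee an English abstract.

```python
CORE = "agricultural workers' diseases [MH]"
HUMAN_FILTER = "NOT (animals [MH] NOT humans [MH])"

def incremental_query(term, cutoff="2009/12/31"):
    """Query for records found by `term` but not by the core MeSH term."""
    query = f"(({term}) NOT {CORE}) {HUMAN_FILTER}"
    query += " AND hasabstract"                        # assumed stand-in for the "abstracts" Limit
    query += f" AND (1800/01/01:{cutoff}[EDAT])"       # assumed stand-in for the date Limit
    return query

# A few of the terms from Table I, entered as listed there.
queries = {t: incremental_query(t) for t in ("farming", "farm NOT farm [AD]", "tractor*")}
```

Each generated string can be pasted into PubMed to count the abstracts a term adds beyond the core MeSH term.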

Formulation of Search Strings

Based on the estimated proportions, we distinguished the search terms characterized by a high retrieval rate of pertinent articles from those performing poorly. For this purpose, we arbitrarily established a cut-off of 40% of pertinent articles, corresponding to an NNR value of 2.5.
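The correspondence between the 40% cut-off and an NNR of 2.5 follows from the NNR being the reciprocal of the proportion of pertinent abstracts. A minimal sketch (function names are ours):

```python
def nnr_from_proportion(proportion_pertinent):
    """Number needed to read = 1 / proportion of pertinent abstracts."""
    if not 0 < proportion_pertinent <= 1:
        raise ValueError("proportion must be in (0, 1]")
    return 1 / proportion_pertinent

def nnr_from_counts(retrieved, pertinent):
    """Same quantity computed from raw counts of abstracts."""
    if pertinent <= 0 or pertinent > retrieved:
        raise ValueError("need 0 < pertinent <= retrieved")
    return retrieved / pertinent

def is_high_yield(proportion_pertinent, cutoff=0.40):
    """True when a term meets the cut-off used for the 'more specific' string."""
    return proportion_pertinent >= cutoff

cutoff_nnr = nnr_from_proportion(0.40)  # the cut-off expressed as an NNR
```

The same helper applies to the overall counts reported in Table IV, for instance 43 abstracts retrieved with 34 pertinent, or 280 retrieved with 67 pertinent.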

Finally, we created two distinct search strategies to be proposed for routine use: one conceived to be “more specific” and one rather “more sensitive.”

The “more specific” string was designed to retrieve fewer “false positive” abstracts (at the cost of missing some potentially pertinent ones), while the “more sensitive” string was designed to retrieve a greater number of potentially pertinent abstracts, at the cost of a higher number of “false positive” ones.

Assessment of Proposed Search Strategies

The two search strategies were tested in the context of three diseases that have been reported [Cooper et al., 2002; Walker-Bone and Palmer, 2002; Bassil et al., 2007], with different degrees of probability, to be work related in the field of agriculture, from the most to the least putatively associated: kidney cancer, knee osteoarthritis, and multiple sclerosis. Firstly, we collected all abstracts evoked in PubMed for these diseases by the proposed search strategies and by the “more specific” string filtered with the two “Clinical Queries”—a narrow and a broad one—concerning “etiology” provided by PubMed as PubMed Tools [National Center for Biotechnology Information, 2012]. Secondly, pertinent articles were identified by D.G., L.R., and V.D.G. (using the aforementioned methods). Finally, the NNR values were calculated for each string [Bachmann et al., 2002].


Numbers of Articles Identified and Their Overlaps

Agricultural workers' diseases [MH] identified 5,541 articles (2,745 with abstracts) in PubMed, representing ∼0.03% of all 19,769,836 PubMed articles (and ∼0.03% of the 10,932,393 with abstracts). We estimated that about 80% (2,196) of the 2,745 retrieved abstracts were pertinent to rural occupational etiology of the diseases reported. For each of the other six MeSH terms considered (agriculture, agrochemicals NOT pesticides, animal husbandry, pesticides, rural health, rural population), overlaps with the agricultural workers' diseases MeSH term were minimal, ranging from 1% to 10%. On the whole, these six MeSH terms added another 53,313 abstracts, bringing the coverage for all seven MeSH terms to ∼0.5% of all PubMed abstracts retrieved with the same limits (56,058/10,932,393). Data on the proportion of pertinent abstracts are reported in Table II.

Table II. Numbers of Abstracts Identified by Agricultural Workers' Diseases MeSH Term and Overlaps With the Other Six MeSH Terms and Estimates of Numbers Potentially Pertinent to Rural Occupational Etiology
PubMed query | Tree numbers for MH | Absolute numbers of abstracts retrieved (a) | Estimated proportion (%) of potentially pertinent additional abstracts (b) | Estimated absolute numbers of potentially pertinent additional abstracts (c)
Agricultural workers' diseases [MH] | C24.080 | 2,745 | 80 | 2,196
Agriculture [MH] | J01.040 | 14,583 | 25 | 3,646
Agrochemicals [MH] NOT pesticides [MH] | D27.720.031 | 2,526 | 2 | 51
Animal husbandry [MH] | C24.080 | 1,503 | 16 | 240
(Pesticides [MH] NOT (pesticides [pharmacological action])) | D27.720.031.700, D27.720.723, D27.888.723 | 7,942 | 25 | 1,986
Rural health [MH] | N01.400.650 | 9,522 | 7 | 667
Rural population [MH] | N01.600.725 | 21,258 | 7 | 1,488
  1. MH, medical subject heading.
  2. (a) Number of abstracts retrieved by agricultural workers' diseases [MH] and number of abstracts retrieved by the other six search terms excluding those overlapping with those retrieved by agricultural workers' diseases [MH].
  3. (b) Estimates were based on reviews of 100 systematically recruited sampled abstracts.
  4. (c) Calculated by multiplying the number of abstracts additionally identified (i.e., n in column 3) by the estimated proportion of potentially pertinent additional abstracts (column 4).
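The last column of Table II is simple arithmetic on the two preceding columns, and the coverage figure quoted in the text follows from the abstract counts. A short sketch that recomputes both from the published values (the dictionary layout and the half-up rounding rule are our choices):

```python
# (abstracts retrieved, estimated % potentially pertinent) as reported in Table II
table_ii = {
    "Agricultural workers' diseases [MH]": (2745, 80),
    "Agriculture [MH]": (14583, 25),
    "Agrochemicals [MH] NOT pesticides [MH]": (2526, 2),
    "Animal husbandry [MH]": (1503, 16),
    "Pesticides [MH] NOT pesticides [pharmacological action]": (7942, 25),
    "Rural health [MH]": (9522, 7),
    "Rural population [MH]": (21258, 7),
}

def estimated_pertinent(n_abstracts, percent):
    """n x proportion, rounded half up using integer arithmetic."""
    return (n_abstracts * percent + 50) // 100

estimates = {query: estimated_pertinent(n, pct) for query, (n, pct) in table_ii.items()}

# Coverage of all seven MeSH terms: core abstracts plus the 53,313 additional
# abstracts reported in the text, over all PubMed abstracts with the same limits.
total_seven_terms = 2745 + 53313
coverage_percent = 100 * total_seven_terms / 10_932_393
```

The 53,313 additional abstracts are taken from the text rather than summed from the table, since the six terms also overlap with one another.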

The non-MeSH search terms evoked 120,722 abstracts (almost 1.1% of all articles listed in PubMed). The overlap with the agricultural workers' diseases MeSH term was minimal, and the incremental contribution amounted to 118,488 (∼98%) citations. Twenty-four non-MeSH terms (agricultural exposure*, agricultural machin*, agriculture work*, agronomist*, CAFO OR CAFOS, crofter*, cropper* NOT cropper* [AU], cultivator*, farmhand*, farm operator*, field hand*, gleaner*, granger* NOT granger* [AU] NOT (granger AND caus*), herdsman, homesteader*, nester* NOT nester* [AU], plowman NOT plowman [AU], rancheman, rancher* NOT rancher* [AU], reaper* NOT reaper* [AU], sharecropper*, sower* NOT sower* [AU], stockman NOT stockman [AU], yeoman NOT yeoman [AU]) were excluded from the study as the number of articles retrieved was very small (68, 102, 39, 52, 28, 5, 7, 102, 5, 61, 14, 2, 43, 15, 8, 83, 8, 0, 56, 57, 9, 55, 23, and 8, respectively). Furthermore, at the end of the study we checked the contribution of these 24 non-MeSH terms to the formulated search strings and found no relevant differences in the articles retrieved by the “more sensitive” search string. Only agricultural exposure* made a valuable contribution to the “more specific” string (10 of the 14 added papers were pertinent), and we consequently decided to add this non-MeSH term back to that strategy.

Data on the proportion of pertinent abstracts for the other 37 non-MeSH terms (“agricultural environment”; “agricultural health”; “agricultural medicine”; agricultural work*; “animal feeding”; “animal husbandry” NOT animal husbandry [MH]; (breed* NOT breed* [AU]) AND humans; cattle*; dairy farm*; farm; farm NOT farm [AD]; farm work*; farmer* NOT farmer* [AU]; farming; farmwork*; (fertilizer* OR fertiliser*) AND humans [MH]; forestry work*; fungicid*; gardening; greenhouse*; grower*; harvester*; herbicid*; insecticid*; livestock*; (pesticide* NOT pesticides [MH]) NOT (pesticides [pharmacological action]); plantation* AND humans [MH]; planter*; rodenticid* NOT (rodenticides [pharmacological action]); “rural area”; “rural environment”; “rural health” NOT rural health [MH]; “rural population” NOT rural population [MH]; rural work*; shepherd NOT shepherd [AU]; tiller*; tractor*) are reported in Supplemental Material, Table SI.

Finally, we evaluated the ability of each of the “complex terms” (sub-search strings) to identify abstracts not picked up by the agricultural workers' diseases MeSH term (see Supplemental Material, Table SII).

Formulation of Search Strings

The two proposed search strings are presented in Table III. The “more specific” search strategy included those search terms which retrieved an estimated proportion of pertinent articles ≥40% (corresponding to an NNR value ≤2.5). The other search terms, for which the proportions of pertinent abstracts retrieved were <40%, were included in the “more sensitive” search strategy, which also included all the search terms of the “more specific” string.

Table III. Proposed PubMed Search Strategies for Identifying Potentially Pertinent Articles in Rural Occupational Field
  1. Usage notes: 1. It is possible to “copy and paste” each of the two strings into PubMed from a .doc file. Alternatively, the strings can be evoked in PubMed by entering the following shortened URLs (Uniform Resource Locators) in the browser address box: for the “more specific” string; for the “more sensitive” string. 2. The name-of-the-disease should be entered without any search tag. For diseases that have more than one name, the various “names-of-the-disease” should be entered in brackets, connected by the OR operator: e.g., … AND (epicondylitis OR tennis elbow).
1. “More specific” search strategy:
(agricultural workers' diseases [MH] OR agricultural exposure* OR “agricultural health” OR “agricultural medicine” OR agricultural work* OR (farm NOT farm [AD]) OR farm work* OR farming OR forestry work* OR tractor*) OR ((agriculture [MH] OR agricult*) OR (farmer* NOT farmer* [AU]) OR fungicid* OR herbicid* OR insecticid* OR ((pesticide* OR pesticides [MH]) NOT pesticides [pharmacological action]) AND (occupational diseases [MH] OR occupational exposure [MH] OR occupational medicine [MH] OR occupational risk [TW] OR occupational hazard [TW] OR (industry [MH] AND mortality [SH]) OR occupational group* [TW] OR work-related OR occupational air pollutants [MH] OR working environment [TW])) NOT (animals [MH] NOT humans [MH]) AND name(s)-of-the-disease
2. “More sensitive” search strategy:
(“agricultural environment” OR agricultural workers' diseases [MH] OR “agricultural health” OR “agricultural medicine” OR agricultural work* OR agriculture [MH] OR agricult* OR (agrochemicals [MH] NOT pesticides [MH]) OR “animal feeding” OR “animal husbandry” OR animal husbandry [MH] OR ((breed* NOT breed* [AU]) AND humans) OR cattle* OR dairy farm* OR farm OR farm work* OR (farmer* NOT farmer* [AU]) OR farming OR farmwork* OR ((fertilizer* OR fertiliser*) AND humans [MH]) OR forestry work* OR fungicid* OR gardening OR greenhouse* OR grower* OR harvester* OR herbicid* OR insecticid* OR livestock* OR ((pesticide* OR pesticides [MH]) NOT pesticides [pharmacological action]) OR (plantation* AND humans [MH]) OR planter* OR (rodenticid* NOT rodenticides [pharmacological action]) OR “rural area” OR “rural environment” OR “rural health” OR rural health [MH] OR “rural population” OR rural population [MH] OR rural work* OR tractor*) NOT (animals [MH] NOT humans [MH]) AND name(s)-of-the-disease
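Following the usage notes of Table III, the placeholder at the end of either string is replaced by the disease name, or by several names joined with OR inside brackets. The sketch below automates that step and builds a search link. The `TEMPLATE` value is a deliberately shortened stand-in, not one of the published strings, and the helper names are hypothetical; the full strings from Table III should be pasted in for real use.

```python
from urllib.parse import quote_plus

PLACEHOLDER = "name(s)-of-the-disease"

# Shortened stand-in for illustration only; paste a full Table III string here.
TEMPLATE = ("(agricultural workers' diseases [MH] OR farming) "
            "NOT (animals [MH] NOT humans [MH]) AND " + PLACEHOLDER)

def fill_disease(search_string, names):
    """Insert one or more disease names, entered without any search tag."""
    if PLACEHOLDER not in search_string:
        raise ValueError("search string has no disease placeholder")
    names = [n.strip() for n in names if n.strip()]
    if not names:
        raise ValueError("at least one disease name is required")
    disease = names[0] if len(names) == 1 else "(" + " OR ".join(names) + ")"
    return search_string.replace(PLACEHOLDER, disease)

def brackets_balanced(query):
    """Cheap sanity check for a pasted string: every '(' has a matching ')'."""
    depth = 0
    for ch in query:
        depth += (ch == "(") - (ch == ")")
        if depth < 0:
            return False
    return depth == 0

def pubmed_url(query):
    """Link that opens the query in the PubMed web interface."""
    return "https://pubmed.ncbi.nlm.nih.gov/?term=" + quote_plus(query)

query = fill_disease(TEMPLATE, ["epicondylitis", "tennis elbow"])
link = pubmed_url(query)
```

Checking bracket balance before searching is worthwhile because a bracket lost while copying a long string silently changes how the Boolean operators are grouped.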

Assessment of Proposed Search Strategies

Available literature on three diseases (namely, kidney cancer, knee osteoarthritis, and multiple sclerosis) was used to test the two search strategies. The numbers of abstracts retrieved in PubMed, the proportion of abstracts that were considered pertinent, and the NNR values are reported for both strategies in Table IV. Overall, the NNR value was substantially lower for the “more specific” strategy compared to the “more sensitive” strategy. As expected, according to the degree of relatedness to the field of agriculture, we found—using the “more sensitive” string—the lowest NNR for the most probably associated disease (namely kidney cancer) and the highest NNR for the least probably associated disease (namely multiple sclerosis).

Table IV. Application of Search Strategies to Three Pathologies: Numbers of Citations Retrieved, Proportions of Potentially Pertinent Articles and Overall NNR Values
Each cell reports: citations retrieved (n); potentially pertinent articles, n (%); NNR.
PubMed query | “Kidney cancer” (n = 1,196) | “Knee osteoarthritis” OR gonarthrosis (n = 10,396) | “Multiple sclerosis” (n = 27,290) | Overall (n = 38,882)
“More specific” string | 11; 10 (91); 1.1 | 20; 14 (70); 1.4 | 12; 10 (83); 1.2 | 43; 34 (79); 1.3
“More specific” string + narrow PubMed clinical query for etiology (a) | 10; 9 (90); 1.1 | 5; 5 (100); 1 | 2; 2 (100); 1 | 17; 16 (94); 1.1
“More specific” string + broad PubMed clinical query for etiology (b) | 11; 10 (91); 1.1 | 16; 13 (81); 1.2 | 8; 7 (88); 1.1 | 35; 30 (86); 1.2
“More sensitive” string | 20; 14 (68); 1.4 | 96; 27 (28); 3.6 | 164; 26 (16); 6.3 | 280; 67 (24); 4.2
“More sensitive” string NOT “more specific” string (incremental contribution of the “more sensitive” string) | 9; 4 (44); 2.3 | 76; 13 (17); 5.8 | 152; 16 (11); 9.5 | 237; 33 (14); 7.2
  1. NNR, number needed to read value.
  2. (a) (Etiology/Narrow[filter]).
  3. (b) (Etiology/Broad[filter]).

In the overall search, the “more specific” search string retrieved 43 of the 38,882 articles indexed for the terms “kidney cancer,” “knee osteoarthritis” OR gonarthrosis, and “multiple sclerosis” in the Medline database (0.1% of the total). Thirty-four of these 43 articles were adjudicated as pertinent by two reviewers (D.G., L.R.), corresponding to a pertinence of 79% and hence an NNR of 1.3.

The proportion of pertinent articles improved when the “more specific” string was filtered with the narrow and the broad “Clinical Queries” concerning “etiology” [National Center for Biotechnology Information, 2012], reaching 94% and 86%, respectively (corresponding to NNR values of 1.1 and 1.2; see Table IV).

In the overall search, the “more sensitive” search string retrieved 280 of the 38,882 articles indexed for the terms “kidney cancer,” “knee osteoarthritis” OR gonarthrosis, and “multiple sclerosis” in the Medline database (0.7% of the total). Sixty-seven of these 280 articles were adjudicated as pertinent by the same team of reviewers, corresponding to a pertinence of 24% and hence an NNR of 4.2.

Obviously, all the abstracts retrieved by the “more specific” string are also retrieved by the “more sensitive” string. This depends on the fact that the “more specific” string is entirely included in the sensitive one.
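Because every abstract retrieved by the “more specific” string is also retrieved by the “more sensitive” one, the incremental contribution of the “more sensitive” string can be obtained by subtraction. The sketch below recomputes that row of Table IV from the per-disease counts reported for the two strings (variable names are ours).

```python
# (citations retrieved, potentially pertinent) per disease, from Table IV
more_specific = {
    "kidney cancer": (11, 10),
    "knee osteoarthritis": (20, 14),
    "multiple sclerosis": (12, 10),
}
more_sensitive = {
    "kidney cancer": (20, 14),
    "knee osteoarthritis": (96, 27),
    "multiple sclerosis": (164, 26),
}

# Valid only because the specific result set is a subset of the sensitive one.
incremental = {
    disease: (more_sensitive[disease][0] - more_specific[disease][0],
              more_sensitive[disease][1] - more_specific[disease][1])
    for disease in more_sensitive
}

overall_retrieved = sum(retrieved for retrieved, _ in incremental.values())
overall_pertinent = sum(pertinent for _, pertinent in incremental.values())
overall_nnr = overall_retrieved / overall_pertinent
```

The subtraction reproduces the 237 additional abstracts, 33 of them pertinent, and the overall incremental NNR of about 7.2.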

Finally, about 46% and 15% of the abstracts retrieved by the “more specific” and the “more sensitive” strategies that we propose for the field of agriculture, respectively, overlap with those retrieved by the already published search strings for the study of occupational diseases [Mattioli et al., 2010].


A bibliometric study was performed to identify readily applicable PubMed search strategies for use by health professionals when investigating occupational determinants of medical conditions that could be related to the agricultural field. Two search strings were created: one “more specific” and the other “more sensitive” (see Table III). These strings offer a complement to the previously proposed search strings designed for evaluation of occupational etiology in the broader context of Occupational Medicine [Mattioli et al., 2010].

We applied our search strategies to three diseases and calculated the NNR to evaluate their performance. On the one hand, the very low NNR estimated for the “more specific” string (overall NNR 1.3, corresponding to 79% of pertinent abstracts) suggests that this search strategy represents a valuable tool to answer etiological questions encountered in routine practice. On the other hand, the overall number of potentially pertinent articles almost doubled (from 34 to 67) when using the “more sensitive” strategy, although the overall NNR was as high as 4.2 (Table IV). Of note, the incorporation of etiology search filters provided by PubMed for clinical queries [National Center for Biotechnology Information, 2012] did not appreciably improve the performance of our “more specific” string. In fact, when we applied this strategy to the three selected pathologies, the overall NNR values were 1.1 and 1.2 for the narrow and broad clinical queries, respectively. Furthermore, the use of the narrow PubMed clinical query for etiology led to the loss of 18 of the 34 pertinent papers (Table IV).

Based on our findings, we believe that the “more specific” string may provide an efficient frontline approach for health care professionals who need to explore the occupational etiology of agricultural work-related diseases in practice-based situations ranging from primary care to medico-legal issues or insurance claims. An easy and fast “copy and paste” tool could be particularly relevant since it could help to fill the gap between the standard practice of Occupational Medicine and the practice of Occupational Medicine in the rural context. Indeed, agricultural workers, in comparison to other workers, generally have less (or no) systematic health surveillance provided by an occupational physician. Furthermore, agricultural workers' diseases have often been neglected both in the medical literature and in insurance systems.

The “more sensitive” string presented in this paper could represent a second line approach, particularly useful in the presence of diseases which elicit only a few articles or to explore a little studied disease in more depth. For instance, we also tried applying the “more sensitive” string to Leber's Hereditary Optic Neuropathy (often referred to with the acronym LHON), a rare disease that has been studied little from the standpoint of occupational etiology but that has recently been related to solvent exposure [Carelli et al., 2007]. In this challenging context, the “more sensitive” string retrieved a total of 35 articles (34 with abstracts), 2 (6%) of which appeared to be potentially pertinent, while the “more specific” string was not able to retrieve any article. These findings suggest that while the “more specific” string could provide an efficient tool for initial research, with the potential to save time by rapidly retrieving most of the available articles, the “more sensitive” string may represent the core of a more complex search strategy to be employed when conducting systematic reviews of the literature both for research or medico-legal purposes.

Finally, the “more sensitive” string proposed for occupational diseases related to agricultural exposures and the one already published for occupational etiology [Mattioli et al., 2010] could be combined to conduct comprehensive research in multiple occupational settings (service, industry, or agriculture). In fact, the two search strings showed minimal overlap, indicating that the already published search string [Mattioli et al., 2010] is not sufficiently suitable for agricultural exposures.

The practical decision to base the assessments of pertinence on articles with available (English language) abstracts may have led to some selection bias due to exclusion of certain article types, such as letters, which could contain relevant information. However, an analysis based on information contained in titles, performed in the previous study [Mattioli et al., 2010], suggested that this factor would not have constituted a major bias. Moreover, the assessments based on the abstracts could not take into account relevant information reported in the main body of the articles but not in the abstracts, the quality of which can vary considerably, especially in the absence of widespread implementation of more informative abstracts [Gehanno et al., 1998]. Furthermore, we did not attempt to evaluate the quality of the individual studies. Since no gold standard instrument exists for retrieval of pertinent articles, we were unable to evaluate sensitivity and specificity values for the two proposed search strings (although the NNR values do give some indication of specificity). It could be argued that our selection of non-MeSH search terms was to some extent arbitrary. However, the ability of the “more sensitive” string to retrieve most of the available pertinent abstracts for a range of diseases (see above) suggests that this a priori limitation did not greatly affect the end product. It should be underlined that this study was restricted to PubMed: systematic reviews of the literature would require additional bibliographic searches using other relevant databases, such as Embase. In particular, in the field of agricultural workers' diseases the TOXNET database could be suggested, especially if the disease is thought to be related to a toxic agent. In that case, it would be necessary to create a customized search string according to the search options and features of the database: the proposed PubMed search strings are not readily applicable to other databases, partly because of their wide use of asterisks and tags.

However, a recent article by Rollin et al. [2010] states that high quality papers, included in Cochrane Reviews, could be retrieved in 90% of the cases by PubMed. Changes in research and reporting practices (e.g., choice of key words) over time [Wilczynski and Haynes, 2003] will inevitably affect the retrievability of future literature. For instance, implementation of the STROBE guidelines [Vandenbroucke et al., 2007] could (hopefully) improve the reporting quality of titles and abstracts of epidemiologic studies, thereby facilitating the identification of pertinent articles.

In conclusion, Table III reports two proposed PubMed search strings (one “more specific,” one “more sensitive”) which may be used for rapid (or more lengthy) explorations of evidence regarding the existence of possible occupational determinants of diseases that could be related to agricultural exposures. Either string can be pasted into the PubMed search box alongside the name(s)-of-the-disease (see Table III). We recommend trying the “more specific” string first and then, if necessary, the “more sensitive” string. These search strings could be useful to many different types of health professionals, ranging from primary care physicians, especially in rural areas, to specialized librarians, in contexts ranging from evidence-based patient evaluation to original research or planning rural health services. Moreover, these search methods could also be used to stay up to date in this field, for example by saving the searches with the “My NCBI” tool [National Center for Biotechnology Information, 2013].

Field tests are required to assess the effectiveness of applying these strategies in the real world.


We are particularly grateful to Professor Keith Palmer (University of Southampton, UK) for his useful comments and suggestions.