  • Mini Review
  • Open access
  • Published: 07 March 2017

Study designs may influence results: the problems with questionnaire-based case–control studies on the epidemiology of glioma

  • Christoffer Johansen 1 , 2 ,
  • Joachim Schüz 3 ,
  • Anne-Marie Serena Andreasen 2 &
  • Susanne Oksbjerg Dalton 2  

British Journal of Cancer volume 116, pages 841–848 (2017)

  • Cancer epidemiology
  • Cancer therapy
  • Risk factors

Glioma is a rare brain tumour with a very poor prognosis and the search for modifiable factors is intense. We reviewed the literature concerning risk factors for glioma obtained in case–control designed epidemiological studies in order to discuss the influence of this methodology on the observed results. When reviewing the association between three exposures, medical radiation, exogenous hormone use and allergy, we critically appraised the evidence from both case–control and cohort studies. For medical radiation and hormone replacement therapy (HRT), questionnaire-based case–control studies appeared to show an inverse association, whereas nested case–control and cohort studies showed no association. For allergies, the inverse association was observed irrespective of study design. We recommend that the questionnaire-based case–control design be placed lower in the hierarchy of studies for establishing cause-and-effect for diseases such as glioma. We suggest that a state-of-the-art case–control study should, as a minimum, be accompanied by extensive validation of the exposure assessment methods and the representativeness of the study sample with regard to the exposures of interest. Otherwise, such studies cannot be regarded as ‘hypothesis testing’ but only ‘hypothesis generating’. We consider that this holds true for all questionnaire-based case–control studies on cancer and other chronic diseases, although perhaps not to the same extent for each exposure–outcome combination.

Studies of the aetiology of glioma, the commonest malignant brain tumour, with a very poor prognosis, are urgently needed, specifically to identify modifiable risk factors. The main reason that researchers have used the case–control design as the model of choice for epidemiological studies on the causes of glioma is that it is a rare cancer, with an incidence of 4 per 100 000 people (World Standard Population) in Denmark, an incidence typical for a high-income country ( Christensen et al, 2003 ). Furthermore, the design limits the time required to obtain data, the cost is lower than that of more time-consuming designs and a wide range of suspected risk factors can be examined in the same study. In case–control studies, questionnaire data, blood samples and tissue specimens can be obtained from both cases and controls, thereby allowing analysis of both environmental and genetic factors and their interactions.

In questionnaire-based case–control studies, it is anticipated that cases can recall past events with sufficient accuracy. This a priori assumption is somewhat naive in the case of glioma in view of the well-known clinical presentation of the disease. The cancer itself, surgery, radiotherapy, chemotherapy and any combination of treatment may strongly influence the overall cognitive capacity of patients. Some have overt cognitive deficits and may therefore be unable to remember past events or have selective recall. Researchers may have to interview a proxy of the patient, as is often the case in case–control studies of risk factors for glioma.

Finding suitable controls presents another challenge. They must be from the same study population as the cases, as the source may influence reported exposures, and selection may be introduced when potential controls decide whether to participate in a study. If these sources of error are systematically different in terms of the exposures of interest from those in the case group, bias will be present. Bias can be addressed partly with statistical tools; however, they require either some idea of the nature and magnitude of bias from validation studies or assumptions about potential bias in sensitivity analyses. Neither necessarily leads to a satisfactory outcome, especially if the results differ substantially according to the assumptions. Despite these potentially serious limitations of case–control studies, there has been no in-depth debate about situations in which questionnaire-based case–control studies are unlikely to provide reliable results. In some narrative syntheses and meta-analytical reviews, the results of such studies contribute equally to the evidence base, even though application of study quality indicators is recommended when summarising evidence. It is therefore important to consider the level of evidence from case–control studies based solely on differential reconstruction of past exposures as compared with that from prospective studies for investigating glioma, when reconstruction of exposure is hampered by the outcome itself.

In this review, we critically appraise the evidence from both case–control and cohort studies of three risk factors for glioma in humans: medical radiation, exogenous hormone use and allergy. The objective is to provide some insight into the difficulty associated with choosing a study design when studying the risk factors for glioma. We also propose considerations for applying scientific weight to the results of case–control studies in this context.

We searched the Medline–PubMed database on 18 November 2015 using the following search strategy:

Search ((‘Glioma/epidemiology’[Majr] OR (glioma AND epidemiology))) AND ((((‘Risk Factors’[Mesh]) OR ‘Environment and Public Health’[Mesh])) OR (Risk OR exposure OR factor* OR cause*)) Filters: Humans; Meta-Analysis; Review; Systematic Reviews

Search (((((‘Glioma/epidemiology’[Majr] OR (glioma AND epidemiology))) AND ((((‘Risk Factors’[Mesh]) OR ‘Environment and Public Health’[Mesh])) OR (Risk OR exposure OR factor* OR cause*))) AND Humans[Mesh])) AND (((‘Case-Control Studies’[Mesh]) OR ‘Cohort Studies’[Mesh]) AND Humans[Mesh]) Filters: Humans

This search provided 3018 hits. Using the inclusion criteria English language paper, adult glioma, case–control study or cohort study and excluding reviews and/or meta-analyses, overview or commentary, qualitative methodology, children and adolescents, genetic exposures and mortality or survival as the outcome, we identified reports of original studies that included the three selected risk factors for glioma. Our search was intended to be neither comprehensive nor systematic for this review. We are aware that we did not identify some studies, such as those in which the word ‘glioma’ was not in the title, abstract or keywords and those in which none of the three risk factors was mentioned in the title or abstract.

From the selected papers, we extracted the characteristics of the study. We then compared the evidence from studies based on recall by cases and controls with that from studies with either a case–control design with objective (recall-independent) assessment of exposure or a prospective cohort design.

We selected 30 case–control studies and six cohort studies on the association between glioma and medical radiation ( Preston-Martin et al, 1989 ; Neuberger et al, 1991 ; Schlehofer et al, 1992 ; Ryan et al, 1992 ; Zampieri et al, 1994 ; Ruder et al, 2006 ; Blettner et al, 2007 ; Davis et al, 2011 ), exogenous hormone use ( Huang et al, 2004 ; Hatch et al, 2005 ; Wigertz et al, 2006 ; Silvera et al, 2006 ; Benson et al, 2008 , 2015 ; Felini et al, 2009 ; Michaud et al, 2010 ; Kabat et al, 2011 ; Andersen et al, 2013 ; Anic et al, 2014 ; 2015 ) or allergic diseases ( Cicuttini et al, 1997 ; Schlehofer et al, 1999 , 2011 ; Wiemels et al, 2002 , 2004 , 2009 ; Brenner et al, 2002 ; Schwartzbaum et al, 2003 ; Schoemaker et al, 2006 ; Wigertz et al, 2007 ; Scheurer et al, 2008 ; Berg-Beckhoff et al, 2009 ; Il’yasova et al, 2009 ; 2009 ; McCarthy et al, 2011 ; Calboli et al, 2011 ; Turner et al, 2013 ; Cahoon et al, 2014 ). Table 1 lists the key characteristics of the selected studies.

Medical radiation: information from participants only

Ionising radiation is a long-established human carcinogen. Early cohort studies showed that patients who received radiation treatment to the scalp for tinea capitis or skin haemangioma during childhood had an increased risk for glioma, especially after treatment at an early age (see, e.g., Ron et al, 1988 ). Early case–control studies suggested increased risks for glioma after exposure to dental X-rays or X-rays to the head and neck ( Preston-Martin et al, 1989 ; Neuberger et al, 1991 ; Ryan et al, 1992 ; Schlehofer et al, 1992 ). In contrast, the German part of the Interphone study (a multinational interview-based case–control study on mobile phone use and other risk factors for brain tumours, acoustic neuroma and salivary gland tumours) indicated that exposure to any medical ionising radiation significantly reduced the risk for glioma (OR, 0.63; 95% CI, 0.48–0.83) in a study of 366 glioma patients (of whom 11% reported information on exposure through proxies) and 1538 controls ( Blettner et al, 2007 ). Other research groups have reported a similar protective effect of medical ionising radiation. In a study in Italy in 1984, of 195 cases and hospital controls, in which all information was obtained from proxies, the OR for any diagnostic X-ray was 0.4 (95% CI, 0.1–1.0; Zampieri et al, 1994 ). In two studies in the USA with 798 and 205 cases (proportions of proxies not reported), reduced ORs were found after exposure to full-mouth dental X-rays (OR, 0.75; 95% CI, 0.61–0.92; Ruder et al, 2006 ) and after one or more yearly dental X-rays or three or more full-mouth X-rays (ORs ranging from 0.60; 95% CI, 0.21–1.73 to 0.70; 95% CI, 0.40–1.21; Davis et al, 2011 ). In personal communications, we have been informed that medical radiation appears to be protective against glioma in the entire Interphone data set and that similar results were obtained in the Gliogene study.

Although authors usually appropriately discuss the possibility of chance findings, residual confounding and (more importantly) recall bias, use of proxies and selection bias, data are required to estimate the magnitude and direction of the potential error; otherwise, most of the conclusions remain speculative. Most case–control studies continue to rely on self-reported information, whereas validation from records of medical radiation or dental records should be a minimal quality assurance component of studies. This may be difficult in countries where X-ray machines are available in all hospitals, big or small, and even in some general practices, so that it would be virtually impossible to review all the records for false negatives (that is, examinations not reported by study participants). It should, however, be feasible on a small sample.

Exogenous hormones: self-reported use versus prescription data

Two methods have been used to collect information on exposure in studies of the relation between use of exogenous hormones and glioma: self-reported use and prescription data. In a case–control study in the USA with 619 women with glioma and 650 controls, self-reported use of hormone replacement therapy (HRT) was associated with an OR of 0.56 (95% CI, 0.37–0.84; Felini et al, 2009 ). This result is in line with those of a number of other case–control studies of self-reported use of oral contraceptives or HRT, reported separately ( Huang et al, 2004 ; Hatch et al, 2005 ; Wigertz et al, 2006 ; Anic et al, 2014 ), which did not, however, reach statistical significance. In two case–control studies nested in population-based registries, with data on prescriptions collected prospectively and independently of the study hypothesis, use of HRT did not decrease the risk for glioma, based on 689 cases (OR, 1.14; 95% CI, 0.93–1.40) and 658 cases (OR, 0.9; 95% CI, 0.8–1.1; Benson et al, 2015 and Andersen et al, 2013 ). These results, based on administrative sources, corroborated those of several very large prospective cohort studies with self-reported data on use of oral contraceptives or HRT obtained before diagnosis of a glioma ( Silvera et al, 2006 ; Benson et al, 2008 ; Michaud et al, 2010 ; Kabat et al, 2011 ). Overall, therefore, relying on self-reported information on use of exogenous hormones obtained retrospectively resulted in systematically lower risk estimates than when exposure was measured prospectively or from prescription data, when no convincing reductions in risk were observed.
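The contrast between the self-reported and the prescription-based estimates can be made concrete with a short calculation. The sketch below is an illustration only and is not part of the reviewed studies or of this review's methods. It assumes that the published 95% CIs are Wald intervals symmetric on the log scale and that the two studies are independent, recovers the standard error of each log OR from its confidence limits, and computes a z statistic for the difference between the two log ORs. The function names are hypothetical.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_se(lower, upper):
    """Standard error of log(OR), assuming a Wald interval on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)


def compare_log_ors(or1, ci1, or2, ci2):
    """z statistic for the difference between two independent log ORs."""
    se1 = log_or_se(*ci1)
    se2 = log_or_se(*ci2)
    diff = math.log(or1) - math.log(or2)
    return diff / math.sqrt(se1 ** 2 + se2 ** 2)


# Estimates quoted in the text above
self_report = (0.56, (0.37, 0.84))   # self-reported HRT use (Felini et al, 2009)
prescription = (1.14, (0.93, 1.40))  # nested case-control study with 689 cases

z = compare_log_ors(self_report[0], self_report[1],
                    prescription[0], prescription[1])
# Two-sided p-value from the standard normal distribution
p_two_sided = math.erfc(abs(z) / math.sqrt(2))
print(f"z = {z:.2f}, two-sided p = {p_two_sided:.4f}")
```

Under these assumptions, a large absolute z value indicates that the two designs yield estimates that differ by more than sampling error alone would readily explain, which is consistent with a systematic difference between retrospective self-report and prospectively recorded prescriptions.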

Allergy: same direction in risk irrespective of study design

The search for an immune factor that may have a role in glioma aetiology has led to studies with several different definitions of exposure, ranging from self-reported allergic conditions or autoimmune disorders to discharge records of allergic disorders and use of serum IgE levels as a measure of a hyperactive immune system. Several case–control studies showed consistently that self-reported allergic conditions protect against glioma. For example, in the International Adult Brain Tumour Study, with 1178 glioma patients (26% of whom reported through proxies) and 2493 population controls, an OR of 0.59 (95% CI, 0.49–0.71) was found for any self-reported allergy ( Schlehofer et al, 1999 ). Other case–control studies found similarly reduced ORs; these often had substantial proportions of proxy informants: 44% ( Cicuttini et al, 1997 ), 24% ( Wiemels et al, 2002 ), 24% ( Brenner et al, 2002 ), 13% ( Wigertz et al, 2007 ), 4% ( Scheurer et al, 2008 ), 3% ( Berg-Beckhoff et al, 2009 ), 24% ( Wiemels et al, 2009 ) and 17% ( Turner et al, 2013 ); others did not provide information on the proportion of proxies ( Wiemels et al, 2004 ; Schoemaker et al, 2006 ; Il’yasova et al, 2009 ; McCarthy et al, 2011 ). In two Swedish twin cohorts with self-reported allergies, the risk for glioma was non-significantly reduced (OR, 0.45; 95% CI, 0.19–1.07) among twins born in 1886–1925 but non-significantly increased (OR, 1.09; 95% CI, 0.48–2.48) among twins born in 1926–1958 ( Schwartzbaum et al, 2003 ). In a combined analysis of the two twin cohorts, with discharge records of immune-related diseases (including both atopic allergic diseases and autoimmune diseases such as diabetes and rheumatoid arthritis) as the exposure measure, the risk was reduced but not significantly (HR, 0.46; 95% CI, 0.14–1.48).

The biological marker immunoglobulin E (IgE) may provide more specificity and reduce bias stemming from self-report. In a case–control study from 2004, both self-reported allergies and IgE levels were inversely associated with gliomas in 258 cases and 289 controls but, as expected, concordance between the two measures was not high ( Wiemels et al, 2004 ). In a further study from 2009, both self-reported allergies and IgE levels were inversely associated with glioma in 535 cases and 532 controls, but analyses showed that IgE levels obtained in glioma patients were affected by treatment with temozolomide, underscoring the need for prospectively collected data ( Wiemels et al, 2009 ). A case–control study nested in the EPIC cohort ( Schlehofer et al, 2011 ) and thus with prospectively collected data on serum IgE levels reported a statistically nonsignificant OR of 0.73 (95% CI, 0.51–1.06) based on 275 cases. Another case–control study, nested in four large cohorts in the USA with 181 cases of glioma, found an almost identical OR of 0.72 (95% CI, 0.51–1.03) for a serum IgE level above normal ( Calboli et al, 2011 ). A cohort study of hospital discharge records of 4.5 million men with a mean 12-year follow-up and 4383 events of glioma showed that any allergy was associated with an HR for glioma of 0.85 (95% CI, 0.72–1.01) with a latency of >2 years and 0.6 (95% CI, 0.4–0.8) with a latency of >10 years ( Cahoon et al, 2014 ). In a meta-analysis of the 14 studies in the international Gliogene case–control study, published after our literature search, with 4533 cases and 4177 controls and <10% proxies, respiratory allergy was associated with an OR of 0.72 (95% CI, 0.58–0.90; Amirian et al, 2016a ).
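To illustrate how two individually nonsignificant but nearly identical estimates relate to one another, the following sketch pools the two nested case–control ORs for serum IgE quoted above by fixed-effect inverse-variance weighting. This is a hypothetical illustration, not an analysis performed in this review: it assumes Wald intervals on the log scale, independent studies and comparable exposure definitions, none of which was verified here. The function name is illustrative.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def pool_fixed_effect(estimates):
    """Fixed-effect inverse-variance pooling of ORs.

    `estimates` is a list of (or, lower, upper) tuples with 95% limits.
    Returns the pooled OR and its 95% confidence limits.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for or_, lower, upper in estimates:
        se = (math.log(upper) - math.log(lower)) / (2 * Z95)
        weight = 1.0 / se ** 2          # inverse of the variance of log(OR)
        weighted_sum += weight * math.log(or_)
        total_weight += weight
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (math.exp(pooled_log),
            math.exp(pooled_log - Z95 * pooled_se),
            math.exp(pooled_log + Z95 * pooled_se))


# Nested case-control estimates quoted in the text above
nested_ige = [
    (0.73, 0.51, 1.06),  # Schlehofer et al, 2011 (EPIC)
    (0.72, 0.51, 1.03),  # Calboli et al, 2011 (four US cohorts)
]
pooled_or, pooled_lo, pooled_hi = pool_fixed_effect(nested_ige)
print(f"pooled OR = {pooled_or:.2f} (95% CI, {pooled_lo:.2f}-{pooled_hi:.2f})")
```

The pooled interval is narrower than either input interval because the weights add; whether pooling is appropriate depends on how similar the exposure definitions and populations really are.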

Imprecisely defined exposures such as allergic disease probably affect the validity of the findings of both case–control and cohort studies. The heterogeneous description of allergy in studies, different levels of detail in self-reporting on individual allergies and use of objective measures of serum IgE levels or discharge records further complicate interpretation of the results. Nevertheless, there is no doubt that most studies of any design, type of measure and size indicate that allergy or a hyperactive immune system, through some as yet unidentified biological mechanisms, might be protective against the development of glioma.

Synthesis of the three examples

In two of our examples, medical radiation and HRT, questionnaire-based case–control studies appeared to show an inverse association, whereas nested case–control and cohort studies showed no association. For allergies, the inverse association is observed irrespective of study design. If the inverse associations with medical radiation and HRT use are spurious, possible explanations are over-reporting by controls, under-reporting by cases or selection bias in relation to the exposure of interest. Over-reporting by controls seems unlikely, unless the time between the reference date (censoring of risk time) and the interview date is long, when controls may incorrectly remember the dates of examinations and report those occurring after censoring of the risk time, as observed in a case–control study on paediatric brain tumours in Germany ( Schüz et al, 2001 ). Selection bias may have some role, as medical radiation and HRT use are more common among more affluent people, while participation as a control is often associated with higher education and income. Under-reporting is a concern. It might occur because a patient with the very serious diagnosis of a glioma might view other medical events as less important, so that these events could easily be forgotten in an interview. The last finding is curious, because, for environmental exposures, validation studies suggest over-reporting or exaggeration by cases (for instance, in studies on mobile phone use or occupational exposure), perhaps because they try not to miss reporting something they may consider relevant in terms of their cancer diagnosis. (For discussions on bias in case–control studies on brain tumours, see, for example, Vrijheid et al, 2006 , 2009 ).
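The way in which under-reporting by cases can produce a spurious inverse association can be shown with simple arithmetic. The sketch below is a hypothetical illustration with invented numbers, not data from any of the reviewed studies. It assumes that exposure is equally common among cases and controls (true OR of 1), that nobody falsely reports an exposure (specificity of 1) and that cases recall a smaller share of their true exposures than controls do. The function names and all prevalence and sensitivity values are assumptions.

```python
def odds(p):
    """Convert a proportion into odds."""
    return p / (1 - p)


def observed_or(true_prev_cases, true_prev_controls, sens_cases, sens_controls):
    """Odds ratio based on reported exposure when recall is imperfect.

    Assumes perfect specificity, so the reported prevalence is the true
    prevalence multiplied by the recall sensitivity in each group.
    """
    reported_cases = true_prev_cases * sens_cases
    reported_controls = true_prev_controls * sens_controls
    return odds(reported_cases) / odds(reported_controls)


# Hypothetical scenario: 40% truly exposed in both groups, so the true OR is 1
TRUE_PREVALENCE = 0.40
SENS_CONTROLS = 0.95  # assumed share of true exposures that controls report

for sens_cases in (0.95, 0.85, 0.75):
    or_obs = observed_or(TRUE_PREVALENCE, TRUE_PREVALENCE,
                         sens_cases, SENS_CONTROLS)
    print(f"case recall sensitivity {sens_cases:.2f}: observed OR = {or_obs:.2f}")
```

With equal recall in both groups the observed OR stays at 1, whereas lower recall among cases alone pushes it below 1 although no true association exists. Validation against records is what would supply real values for the sensitivities assumed here.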

After 30 years of research, we still do not know much about what causes glioma or protects people from the disease. In the search for causality, many researchers who are systematically evaluating the evidence give more weight to that from cohort studies than from case–control studies (e.g., Cochrane reviews); others go as far as considering case–control studies useful only for hypothesis generating because of their retrospective nature ( Mann, 2003 ). In many systematic reviews and meta-analyses in the peer-reviewed literature, however, there is a tendency to categorise the evidence from case–control studies with evidence derived from prospective cohort studies and to give them equal weight. In studying glioma, we consider it critical that studies based on the recall of patients with a disease that affects the brain and possibly cognition should not be given the same weight as nested case–control studies or cohort studies. In addition to the limitations inherent in questionnaire-based case–control studies on other diseases, the risk for recall bias among cases makes it difficult to draw firm conclusions. Validation studies of recall of exposures by glioma cases and by controls often show that cases recall the past differently from controls ( Vrijheid et al, 2006 , 2009 ). The treatment and even the symptoms that arise before treatment, due to the presence of the tumour, may influence cognitive function, underscoring these objections. In studies of glioma, the widespread acceptance of information obtained from the closest relative (a proxy) adds to the problem of the accuracy of self-reported information. Going back to our examples, would proxies really know about the dental X-rays that the patient had during childhood? Recall bias is an issue not only for the exposure of interest but also for potential confounders in analyses of the exposure–disease relationship, as inaccurately measured confounders obviate appropriate adjustment.

As we have illustrated, studies in which information on exposure is obtained from sources other than memory for both cases and controls and in which the information on outcome is from high quality sources, are more reliable, depending on the completeness and quality of the data that can be obtained.

The cohort design is not free of problems, but it is less vulnerable to methodological errors than case–control studies that rely on the memory of cases and controls. The cohort design is therefore the preferred type for observational studies. Nevertheless, because glioma is a rare event, the case–control design may be the only one possible. During critical appraisal of the evidence derived from such studies, however, quality indicators should be applied, as they should for cohort studies. These quality indicators should address the study population (sampling frame, response rates), exposure measures (ideally showing results from validations), and discussion of potential bias affecting the risk estimation.

The superiority of the cohort design and/or access to data obtained independently of the hypothesis in studies of potential risk factors for cancer has been illustrated by cohort studies of various issues, for example, the hypotheses that abortions increase the risk for breast cancer ( Melbye et al, 1997 ) and that our minds cause cancer ( Johansen, 2012 ). One may say that when studying, for example, low-dose radiation and rare outcomes such as gliomas, with complicating problems of recall bias and lack of validation, the question cannot be reduced to just choosing cohort studies over case–control studies. Cohort studies may actually not be feasible for evaluation of this exposure. One solution might instead be to extrapolate from cohort studies with greater ranges of exposure, like those of atomic bomb survivors or people exposed to nuclear accidents. Poorly conducted studies give rise to risk, as their outcomes often contribute to public concern and may shift the focus from the relevant to the irrelevant, as for instance in the debate about cancer risks and mobile technologies.

Observational studies on the risk factors for glioma, for example, reports from the early case–control studies conducted at the University of California at San Francisco (USA; see, e.g., Wrensch et al, 2000 ) and the University of California at Los Angeles (USA; Preston-Martin et al, 1989 ), coordinated by the US National Cancer Institute ( Inskip et al, 2001 ), the first international case–control study ( Schlehofer et al, 1999 ), the Interphone study ( Cardis et al, 2007 ) and probably also the most recent Gliogene case–control study ( Malmer et al, 2007 ), do not provide much evidence on what causes this devastating cancer. Thus, despite all the resources that went into those studies, the results did not provide striking evidence on which to base prevention. Nevertheless, as lifestyle and environmental factors were studied comprehensively, the results may suggest that not many of the usual cancer-causing suspects have an important role in glioma aetiology. This is an important finding to be acknowledged and suggests that novel ideas are needed to identify causes. Recent reports on genetic risk factors for glioma suggest that these factors do have a crucial role in the risk pattern ( Amirian et al, 2016b ).

The criteria for causality are the strength of the evidence, consistency across populations, specificity, temporality, dose–response and biological plausibility ( Hill, 1965 ). The temporal criterion should always be addressed in evaluating the evidence, whereas in case–control studies, unless secondary data sources can be used, the information is collected after diagnosis of a disease, that is, the reverse sequence in temporality. Furthermore, there are major problems in self-reporting, as cases are aware of having a fatal disease and may unconsciously change their way of looking at past events. Even physical measurements should be evaluated for the representativeness of contemporary measurements of exposure during the aetiologically relevant period, which might have been decades previously.

On the basis of this review, we recommend that the case–control design be placed lower in the hierarchy of studies for establishing cause-and-effect for diseases such as glioma, which pose challenges for accurate collection of retrospective data. A state-of-the-art case–control study should, as a minimum, be accompanied by extensive validation of the exposure assessment methods and the representativeness of the study sample with regard to the exposures of interest. Otherwise, such studies cannot be termed ‘hypothesis testing’ but only ‘hypothesis generating’. We consider that this holds true for all questionnaire-based case–control studies on all cancers and chronic diseases, although perhaps not to the same extent for each exposure–outcome combination. For example, case–control studies clearly linked smoking with lung cancer in the 1950s, prenatal radiation to the fetus with childhood leukemia in the late 1950s/early 1960s, postmenopausal oestrogens with uterine endometrial cancer in the 1960s and diethylstilbestrol with vaginal adenocarcinoma in 1971. Almost all known risk factors for breast cancer were identified in case–control studies and much of the evidence that identified smoking and types of tobacco as the cause of about 50% of bladder cancer was based on case–control studies. However, this list does not include risk factors for glioma and these earlier studies, in some cases, showed risk estimates robust to such a degree that even potential bias could not hamper the associations observed.

We hope that the examples we have provided underscore our points and that our recommendation will be taken into account in ranking the evidence obtained from case–control studies and also in the design of such studies in cancer epidemiology.

Amirian ES, Zhou R, Wrensch MR, Olson SH, Scheurer ME, Il'yasova D, Lachance D, Armstrong GN, McCoy LS, Lau CC, Claus EB, Barnholtz-Sloan JS, Schildkraut J, Ali-Osman F, Sadetzki S, Johansen C, Houlston RS, Jenkins RB, Bernstein JL, Merrell RT, Davis FG, Lai R, Shete S, Amos CI, Melin BS, Bondy ML (2016a) Approaching a scientific consensus on the association between allergies and glioma risk: a report from the glioma international case-control study. Cancer Epidemiol Biomarkers Prev 25 : 282–290.

Amirian ES, Armstrong GN, Zhou R, Lau CC, Claus EB, Barnholtz-Sloan JS, Il'yasova D, Schildkraut J, Ali-Osman F, Sadetzki S, Johansen C, Houlston RS, Jenkins RB, Lachance D, Olson SH, Bernstein JL, Merrell RT, Wrensch MR, Davis FG, Lai R, Shete S, Amos CI, Scheurer ME, Aldape K, Alafuzoff I, Brännström T, Broholm H, Collins P, Giannini C, Rosenblum M, Tihan T, Melin BS, Bondy ML (2016b) The glioma international case-control study: a report from the Genetic Epidemiology of Glioma International Consortium. Am J Epidemiol 183 : 85–91.

Andersen L, Friis S, Hallas J, Ravn P, Gaist D (2013) Hormone replacement therapy and risk of glioma: a nationwide nested case-control study. Cancer Epidemiol 37 : 876–880.

Anic GM, Madden MH, Nabors B, Olson JJ, LaRocca RV, Thompson ZJ, Pamnani SJ, Forsyth PA, Thompson RC, Egan KM (2014) Reproductive factors and risk of primary brain tumors in women. J Neurooncol 118 : 297–304.

Benson VS, Pirie K, Green J, Casabonne D, Beral V the Million Women Study Collaboration (2008) Lifestyle factors and primary glioma and meningioma tumours in the Million Women Study cohort. Br J Cancer 99 : 185–190.

Benson VS, Kirichek O, Beral V, Green J (2015) Menopausal hormone therapy and central nervous system tumor risk: large UK prospective study and meta-analysis. Int J Cancer 136 : 2369–2377.

Berg-Beckhoff G, Schüz J, Blettner M, Münster E, Schlaefer K, Wahrendorf J, Schlehofer B (2009) History of allergic disease and epilepsy and risk of glioma and meningioma (Interphone study group, Germany). Eur J Epidemiol 24 : 433–440.

Blettner M, Schlehofer B, Samkange-Zeeb F, Berg G, Schlaefer K, Schüz J (2007) Medical exposure to ionising radiation and the risk of brain tumours: Interphone study group, Germany. Eur J Cancer 43 : 1990–1998.

Brenner AV, Linet MS, Fine HA, Shapiro WR, Selker RG, Black PM, Inskip PD (2002) History of allergies and autoimmune diseases and risk of brain tumors in adults. Int J Cancer 99 : 252–259.

Cahoon EK, Inskip PD, Gridley G, Brenner AV (2014) Immune-related conditions and subsequent risk of brain cancer in a cohort of 4.5 million male US veterans. Br J Cancer 110 : 1825–1833.

Calboli FCF, Cox DG, Buring JE, Gaziano JM, Ma J, Stampfer M, Willett WC, Tworoger SS, Hunter DJ, Camargo-Jr CA, Michaud DS (2011) Prediagnostic plasma IgE levels and risk of adult glioma in four prospective cohort studies. J Natl Cancer Inst 103 : 1588–1595.

Cardis E, Richardson L, Deltour I, Armstrong B, Feychting M, Johansen C, Kilkenny M, McKinney P, Modan B, Sadetzki S, Schüz J, Swerdlow A, Vrijheid M, Auvinen A, Berg G, Blettner M, Bowman J, Brown J, Chetrit A, Christensen HC, Cook A, Hepworth S, Giles G, Hours M, Iavarone I, Jarus-Hakak A, Klaeboe L, Krewski D, Lagorio S, Lönn S, Mann S, McBride M, Muir K, Nadon L, Parent ME, Pearce N, Salminen T, Schoemaker M, Schlehofer B, Siemiatycki J, Taki M, Takebayashi T, Tynes T, van Tongeren M, Vecchia P, Wiart J, Woodward A, Yamaguchi N (2007) The Interphone study: design, epidemiological methods, and description of the study population. Eur J Epidemiol 22 : 647–664.

Christensen HC, Kosteljanetz M, Johansen C (2003) Incidences of gliomas and meningiomas in Denmark 1943 to 1997. Neurosurgery 52 : 1327–1334.

Cicuttini FM, Hurley SF, Forbes A, Donnan GA, Salzberg M, Giles GG, McNeil JJ (1997) Association of adult glioma with medical conditions, family and reproductive history. Int J Cancer 71 : 203–207.

Davis F, Il’yasova D, Rankin K, McCarthy K, McCarthy B, Bigner DD (2011) Medical diagnostic radiation exposures and risk of gliomas. Radiation Res 175 : 790–796.

Felini MJ, Olshan AF, Schroeder JC, Carozza SE, Miike R, Rice T, Wrensch M (2009) Reproductive factors and hormone use and risk of adult gliomas. Cancer Causes Control 20 : 87–96.

Hill AB (1965) The environment and disease: association or causation? Proc R Soc Med 58 : 295–300.

Hatch EE, Linet MS, Zhang J, Fine HA, Shapiro WR, Selker RG, Black PM, Inskip P (2005) Reproductive and hormonal factors and risk of brain tumors in adult females. Int J Cancer 114 : 797–805.

Huang K, Whelan EA, Ruder AM, Ward EM, Deddens JA, Davis-King KE, Carreón T, Waters MA, Butler MA, Calvert GM, Schulte PA, Zivkovich Z, Heineman EF, Mandel JS, Morton RF, Reding DJ, Rosenman KD the Brain Cancer Collaborative Study Group (2004) Reproductive factors and risk of glioma in women. Cancer Epidemiol Biomarkers Prev 13 : 1583–1588.

Il’yasova D, McCarthy B, Marcello J, Schildkraut JM, Moorman PG, Krishnamachari B, Ali-Osman F, Bigner DD, Davis F (2009) Association between glioma and history of allergies, asthma and eczema: a case-control study with three groups of controls. Cancer Epidemiol Biomarkers Prev 18 : 1232–1238.

Inskip PD, Tarone RE, Hatch EE, Wilcosky TC, Shapiro WR, Selker RG, Fine HA, Black PM, Loeffler JS, Linet MS (2001) Cellular-telephone use and brain tumors. N Engl J Med 344 : 79–86.

Johansen C (2012) Mind as a risk factor for cancer – some comments. Psychooncology 21 : 922–926.

Kabat GC, Park Y, Hollenbeck AR, Schatzkin A, Rohan TE (2011) Reproductive factors and exogenous hormone use and risk of adult glioma in women in the NIH-AARP Diet and Health study. Int J Cancer 128 : 944–950.

Malmer B, Adatto P, Armstrong G, Barnholtz-Sloan J, Bernstein JL, Claus E, Davis F, Houlston R, Il'yasova D, Jenkins R, Johansen C, Lai R, Lau C, McCarthy B, Nielsen H, Olson SH, Sadetzki S, Shete S, Wiklund F, Wrensch M, Yang P, Bondy M (2007) Gliogene: an international consortium to understand familial glioma. Cancer Epidemiol Biomarkers Prev 16 : 1730–1734.

Mann CJ (2003) Observational research methods. Research design II: cohort, cross sectional, and case-control studies. Emerg Med J 20 : 54–60.

McCarthy BJ, Rankin K, Il’yasova D, Erdal S, Vick N, Ali-Osman F, Bigner DD, Davis F (2011) Assessment of type of allergy and antihistamine use in the development of glioma. Cancer Epidemiol Biomarkers Prev 20 : 370–378.

Melbye M, Wohlfahrt J, Olsen JH, Frisch M, Westergaard T, Helweg-Larsen K, Andersen PK (1997) Induced abortion and the risk of breast cancer. N Engl J Med 336 : 81–85.

Michaud DL, Gallo V, Schlehofer B, Tjønneland A, Olsen A, Overvad K, Dahm CC, Kaaks R, Lukanova A, Boeing H, Schütze M, Trichopoulou D, Bamia C, Kyrozis A, Sacerdote C, Agnoli C, Palli D, Tumino R, Mattiello A, Bueno-de-Mesquita HB, Ros MM, Peeters PH, van Gils CH, Lund E, Bakken K, Gram IT, Barricarte A, Navarro C, Dorronsoro M, Sánchez MJ, Rodríguez L, Duell EJ, Hallmans G, Melin BS, Manjer J, Borgquist S, Khaw K-T, Wareham N, Allen NE, Tsilidis KK, Romieu I, Rinaldi S, Vineis P, Riboli E (2010) Reproductive factors and exogenous hormone use in relation to risk of glioma and meningioma in a large European cohort study. Cancer Epidemiol Biomarkers Prev 19 : 2562–2569.

Neuberger JS, Brownson RC, Morantz RA, Chin TDY (1991) Association of brain cancer with dental x-rays and occupation in Missouri. Cancer Detect Prev 15 : 31–34.

Preston-Martin S, Mack W, Henderson BE (1989) Risk factors for gliomas and meningiomas in males in Los Angeles County. Cancer Res 49 : 6137–6143.

Ron E, Modan B, Boice JD Jr (1988) Mortality after radiotherapy for ringworm of the scalp. Am J Epidemiol 127 : 713–725.

Ruder AM, Waters MA, Carreón T, Butler MA, Davis-King KE, Calvert GM, Schulte PA, Ward EM, Connally LB, Lu J, Wall D, Zivkovich Z, Heineman EF, Mandel JS, Morton RF, Reding DJ, Rosenman KD Brain Cancer Collaborative Study Group (2006) The Upper Midwest Health Study: a case-control study of primary intracranial gliomas in farm and rural residents. J Agric Saf Health 12 : 255–274.

Ryan P, Lee MW, North B, McMichael AJ (1992) Amalgam fillings, diagnostic dental x-rays and tumours of the brain and meninges. Oral Oncol Eur J Cancer 28B : 91–95.

Scheurer ME, El-Zein R, Thompson PA, Aldape KD, Levin VA, Gilbert MR, Weinberg JS, Bondy ML (2008) Long-term anti-inflammatory and antihistamine medication use and adult glioma. Cancer Epidemiol Biomarkers Prev 17 : 1277–1281.

Schlehofer B, Blettner M, Becker N, Martinsohn C, Wahrendorf J (1992) Medical risk factors and the development of brain tumors. Cancer 69 : 2541–2547.

Schlehofer B, Blettner M, Preston-Martin S, Niehoff D, Wahrendorf J, Arslan A, Ahlbom A, Choi WN, Giles GG, Howe GR, Little J, Ménégoz F, Ryan P (1999) Role of medical history in brain tumour development. Results from the international adult brain tumour study. Int J Cancer 82 : 155–160.

Schlehofer B, Siegmund B, Linseisen J, Schüz J, Rohrman S, Becker S, Michaud D, Melin B, Bas Bueno-de-Mesquita H, Peeters PHM, Vineis P, Tjønneland A, Olsen A, Overvad K, Romieu I, Boeing H, Aleksandrova K, Trichopoulou A, Bamia C, Lagiou P, Sacerdote C, Palli D, Panico S, Sieri S, Tumino R, Barricarte A, Borgquist S, Manjer J, Gallo V, Allen NE, Key TJ, Riboli E, Kaaks R, Wahrendorf J (2011) Primary brain tumours and specific serum immunoglobulin E: a case-control study nested in the European Prospective Investigation into Cancer and Nutrition cohort. Allergy 66 : 1434–1441.

Schoemaker MJ, Swerdlow A, Hepworth SJ, McKinney PA, van Tongeren M, Muir KR (2006) History of allergies and risk of glioma in adults. Int J Cancer 119 : 2165–2172.

Schüz J, Kaletsch U, Kaatsch P, Meinert R, Michaelis J (2001) Risk factors for pediatric tumors of the central nervous system: results from a German population-based case-control study. Med Pediatr Oncol 36 : 274–282.

Schwartzbaum J, Jonsson F, Ahlbom A, Preston-Martin S, Lönn S, Söderberg KC, Feychting M (2003) Cohort studies of association between self-reported allergic conditions, immune-related diagnoses and glioma and meningioma risk. Int J Cancer 106 : 423–428.

Silvera SAN, Miller AB, Rohan TE (2006) Hormonal and reproductive factors and risk of glioma: a prospective cohort study. Int J Cancer 118 : 1321–1324.

Turner MC, Krewski D, Armstrong BK, Chetrit A, Giles GG, Hours M, McBride ML, Parent M-E, Sadetzki S, Siemiatycki J, Woodward A, Cardis E (2013) Allergy and brain tumors in the Interphone study: pooled results from Australia, Canada, France, Israel, and New Zealand. Cancer Causes Control 24 : 949–960.

Vrijheid M, Deltour I, Krewski D, Sanchez M, Cardis E (2006) The effects of recall errors and of selection bias in epidemiologic studies of mobile phone use and cancer risk. J Expo Sci Environ Epidemiol 16 : 371–384.

Vrijheid M, Armstrong BK, Bédard D, Brown J, Deltour I, Iavarone I, Krewski D, Lagorio S, Moore S, Richardson L, Giles GG, McBride M, Parent ME, Siemiatycki J, Cardis E (2009) Recall bias in the assessment of exposure to mobile phones. J Expo Sci Environ Epidemiol 19 : 369–381.

Wiemels JL, Wiencke JK, Sison JD, Miike R, McMillan A, Wrensch M (2002) History of allergies among adults with glioma and controls. Int J Cancer 98 : 609–615.

Wiemels JL, Wiencke JK, Patoka J, Moghadassi M, Chew T, McMillan A, Miike R, Barger G, Wrensch M (2004) Reduced immunoglobulin E and allergy among adults with glioma compared with controls. Cancer Res 64 : 8464–8473.

Wiemels JL, Wilson D, Pater C, Patoka J, McCoy L, Rice T, Schwarzbaum J, Heimberger A, Sampson JH, Chang S, Prados M, Wiencke JK, Wrensch M (2009) IgE, allergy, and risk of glioma: update from the San Francisco Bay Area Adult Glioma Study in the temozolomide era. Int J Cancer 125 : 680–687.

Wigertz A, Lönn S, Schwartzbaum J, Hall P, Auvinen A, Christensen HC, Johansen C, Klæboe L, Salminen T, Schoemaker MJ, Swerdlow AJ, Tynes T, Feychting M (2007) Allergic conditions and brain tumor risk. Am J Epidemiol 166 : 941–950.

Wigertz A, Lönn S, Mathiesen T, Ahlbom A, Hall P, Feychting M the Swedish Interphone Study Group (2006) Risk of brain tumors associated with exposure to exogenous female sex hormones. Am J Epidemiol 164 : 629–636.

Wrensch M, Miike R, Lee M, Neuhaus J (2000) Are prior head injuries or diagnostic X-rays associated with glioma in adults? The effects of control selection bias. Neuroepidemiology 19 : 234–244.

Zampieri P, Meneghini F, Grigoletto F, Gerosa M, Licata C, Casentini L, Longatti PL, Padoan A, Mingrino S (1994) Risk factors for cerebral glioma in adults: a case-control study in an Italian population. J Neurooncol 19 : 61–67.

Author information

Authors and affiliations.

Oncology Clinic, Finsen Centre, Rigshospitalet, University of Copenhagen, Blegdamsvej 9, 2100, Copenhagen, Denmark

Christoffer Johansen

Unit of Survivorship, Danish Cancer Society Research Center, Strandboulevarden 49, Copenhagen, 2100, Denmark

Christoffer Johansen, Anne-Marie Serena Andreasen & Susanne Oksbjerg Dalton

Section of Environment and Radiation, International Agency for Research on Cancer, 150 Cours Albert Thomas, Lyon Cedex 08, 69372, France

Joachim Schüz

Corresponding author

Correspondence to Christoffer Johansen.

Ethics declarations

Competing interests.

The authors declare no conflict of interest.

Rights and permissions

This work is licensed under the Creative Commons Attribution-Non-Commercial-Share Alike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/

About this article

Cite this article.

Johansen, C., Schüz, J., Andreasen, AM. et al. Study designs may influence results: the problems with questionnaire-based case–control studies on the epidemiology of glioma. Br J Cancer 116 , 841–848 (2017). https://doi.org/10.1038/bjc.2017.46

Received: 22 March 2016

Revised: 30 January 2017

Accepted: 02 February 2017

Published: 07 March 2017

Issue Date: 28 March 2017

DOI: https://doi.org/10.1038/bjc.2017.46

  • case–control study
  • medical radiation
  • exogenous hormone use

hypothesis generating case control study

Case-Control Study

Study Design

Now that you have thoroughly assessed the situation, you have enough information to generate some hypotheses. The two suspected causal agents of the outbreak of Susser Syndrome are Quench-It and EnduroBrick. Use the case-control method to design a study that will allow you to compare the exposures to these products among your cases of Susser Syndrome and healthy controls of your choice. From all of your class work, you know that you want your hypotheses to be as explicit and detailed as possible.

1. Based on the information you gathered, which of the following hypotheses is the most appropriate for your case-control study?

  • (1) Those who consumed EnduroBrick are more likely to be diagnosed with Susser Syndrome than those who did not; (2) Those who consumed Quench-It are more likely to be diagnosed with Susser Syndrome than those who did not consume Quench-It.
  • Individuals diagnosed with Susser Syndrome are more likely to have been members of the Superfit Fitness Center than individuals without Susser Syndrome.
  • Individuals diagnosed with Susser Syndrome are likely to be exposed to a variety of different exposures than are individuals not diagnosed with Susser Syndrome

Now that you have hypotheses, the next step is to prepare the case definition. This requires us to understand how Susser Syndrome is diagnosed. The more certain you are about your diagnosis the less error you will introduce into your study by incorrectly specifying cases. Based on information from the EDOH website, you decide that your case definition will be based on a clinical diagnosis of Susser Syndrome.

After you establish your case definition, you need to decide on the population from which the cases for your study will be obtained. Since the majority of cases from the recent outbreak were active members of the Superfit Fitness Center, you decide to base your study on this population.

Next you need to decide how you will classify your cases and controls based on exposure status. Remember, we are actually operating under two hypotheses here, each with its own unique exposure variable. Scientists working on the possible causal connection between consumption of EnduroBrick or Quench-It and the development of Susser Syndrome suggest that both exposures may have an induction time of at least 6 months. Under this hypothesis, any cases of Susser Syndrome that occurred within 6 months of initial consumption of either EnduroBrick or Quench-It could not plausibly have been caused by the exposure. Thus, you stipulate that at least 6 months must have elapsed since the initial exposure before an individual will be considered "exposed".
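The 6-month induction rule can be written down as a small classification function. The sketch below is only an illustration and not part of the Epiville module: the function names, the choice of a reference date (for example the diagnosis date for cases), and the convention of counting completed calendar months are all assumptions.

```python
from datetime import date
from typing import Optional

INDUCTION_MONTHS = 6  # minimum induction time suggested in the text


def whole_months_between(start: date, end: date) -> int:
    """Count completed calendar months from start to end (assumed convention)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month has not been completed yet
    return months


def is_exposed(first_consumption: Optional[date], reference_date: date) -> bool:
    """Return True only if at least 6 months elapsed since first consumption.

    first_consumption is None for people who never consumed the product.
    reference_date is a hypothetical anchor, e.g. the date of diagnosis.
    """
    if first_consumption is None:
        return False
    elapsed = whole_months_between(first_consumption, reference_date)
    return elapsed >= INDUCTION_MONTHS
```

Under this convention a person who first consumed the product 5 months before the reference date is classified as unexposed, even though consumption occurred. The same function would be applied separately for EnduroBrick and for Quench-It, since each hypothesis has its own exposure variable.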

Once all of these decisions have been made, it is time to create appropriate eligibility criteria for your cases and controls.

2. Which of the following do you think are the best eligibility criteria for the cases? [Aschengrau & Seage, pp. 239-243]

  • Cases should have been members of the Superfit Center in the last two years for at least 6 months (total) and consumed either EnduroBrick or Quench-It.
  • Cases should be correctly diagnosed with Susser Syndrome and be employed at Glop Industries.
  • Cases should be correctly diagnosed with Susser Syndrome and have been members of the Superfit Fitness Center for at least 6 months in the last two years.

Now you need to decide who is eligible to be a control.

You recall from your wonderful learning experience in P6400 that valid controls in a case-control study are individuals that, had they acquired the disease under investigation, would have ended up as cases in your study. The best way to ensure this is to sample controls from the same population that gave rise to the cases. To ensure that the controls accurately represent a sample of the distribution of exposure in the population giving rise to the cases, they should be sampled independently of exposure status.
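The idea of sampling controls independently of exposure can be illustrated with a short sketch. The roster, its keys ("id", "has_disease", "exposed") and the function name are hypothetical; the only point is that the sampling step never looks at exposure status.

```python
import random


def sample_controls(source_population, n_controls, seed=0):
    """Draw controls at random from non-diseased members of the source population.

    Each person is a dict with the hypothetical keys "id", "has_disease"
    and "exposed". The "exposed" key is deliberately never read here, so
    selection cannot depend on exposure status.
    """
    eligible = [p for p in source_population if not p["has_disease"]]
    if n_controls > len(eligible):
        raise ValueError("not enough eligible controls in the source population")
    rng = random.Random(seed)  # fixed seed only to make the example repeatable
    return rng.sample(eligible, n_controls)


# Hypothetical roster: ids 0-2 are cases, ids 3-11 are free of the disease.
roster = [
    {"id": i, "has_disease": i < 3, "exposed": i % 2 == 0}
    for i in range(12)
]
controls = sample_controls(roster, n_controls=6)
```

Because the controls come from the same roster that produced the cases, any of them would have been eligible as a case had they developed the disease.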

3. Which of the following do you think are the best eligibility criteria for the controls?

  • Controls should be residents of Epiville who have not been diagnosed with Susser Syndrome.
  • Controls should be members of the Superfit Fitness Center who have been diagnosed with Susser Syndrome but have not consumed either EnduroBrick or Quench-It.
  • Controls should be members of the Superfit Center for at least 6 months in the last 2 years and not be diagnosed with Susser Syndrome at the time of data collection.

Now that the eligibility criteria have been set, you must determine the specifics of the case-control study design.

How many cases and controls should you recruit?

The answer to this question obviously depends on your time and resources. However, an equally important consideration is how much power you want the study to have. Conventionally, we want a study to have at least 80 percent power, that is, at least an 80 percent chance of finding a significant difference between the groups if one truly exists. Generally, if the study has less than 80 percent power, we conclude that the study is underpowered. This does not mean our results are incorrect; but if we observe a nonsignificant result in an underpowered study, we may not be able to tell whether this is because there truly is no association or because the study lacked power.

After crunching the numbers, you determine that the study will require the following size to achieve a desired power of 80 percent:

  • Number of cases: 112
  • Number of controls: 224
  • Total number of subjects: 336
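The module does not state which inputs produced these numbers. As a rough illustration of what "crunching the numbers" might involve, the sketch below uses a standard normal-approximation formula for comparing two proportions with unequal group sizes (no continuity correction). The exposure prevalence among controls and the odds ratio to be detected are hypothetical inputs chosen for the example, not values taken from Epiville.

```python
import math
from statistics import NormalDist


def exposure_prob_in_cases(p_controls: float, odds_ratio: float) -> float:
    """Exposure prevalence among cases implied by the control prevalence and OR."""
    return odds_ratio * p_controls / (1.0 + p_controls * (odds_ratio - 1.0))


def cases_needed(p_controls, odds_ratio, controls_per_case=2.0,
                 alpha=0.05, power=0.80):
    """Approximate number of cases for a two-sided test of two proportions."""
    if odds_ratio == 1.0:
        raise ValueError("an odds ratio of 1 means there is no difference to detect")
    p0 = p_controls
    p1 = exposure_prob_in_cases(p0, odds_ratio)
    r = controls_per_case
    p_bar = (p1 + r * p0) / (1.0 + r)  # pooled exposure prevalence
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt((1.0 + 1.0 / r) * p_bar * (1.0 - p_bar))
        + z_beta * math.sqrt(p1 * (1.0 - p1) + p0 * (1.0 - p0) / r)
    ) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)


# Hypothetical inputs: 30% of controls exposed, odds ratio of 2 to be detected.
n_cases = cases_needed(p_controls=0.30, odds_ratio=2.0)
n_controls = 2 * n_cases  # two controls per case, as in the design above
```

Smaller odds ratios or higher desired power increase the required number of cases, which is why the target effect size should be fixed before recruitment begins.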

Bear in mind that the study is voluntary. Subjects, even when eligible, are in no way required to participate. Furthermore, subjects may drop out of the study before completion, further decreasing your sample size. Study participation depends in large part on the methods of recruitment. In-person recruitment is generally regarded as the most effective, followed by telephone interviews, and then mail invitations. The participation rate that you expect to achieve, given your method of recruitment, will help you to calculate approximately how many individuals you will need to contact in order to meet your sample size.
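The recruitment arithmetic can be sketched as follows. The participation percentages are hypothetical placeholders, since the module does not give expected rates for each recruitment method; only the required sample sizes (112 cases, 224 controls) come from the text above.

```python
def contacts_needed(required_participants: int, participation_percent: int) -> int:
    """Number of eligible people to contact to reach the required sample size.

    participation_percent is the expected share of contacted people who
    take part, given as a whole-number percentage (an assumption made here
    to keep the arithmetic exact).
    """
    if not 0 < participation_percent <= 100:
        raise ValueError("participation_percent must be between 1 and 100")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-required_participants * 100 // participation_percent)


# Hypothetical participation rates: 70% among cases, 50% among controls.
cases_to_contact = contacts_needed(112, 70)
controls_to_contact = contacts_needed(224, 50)
```

Expected dropout before study completion could be handled the same way, by lowering the percentage to reflect the share of contacted people who both enrol and finish.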

Should you recruit cases and controls simultaneously, or cases first and then all controls?

Website URL: http://epiville.ccnmtl.columbia.edu/

Results From a Hypothesis Generating Case-Control Study: Herpes Family Viruses and Schizophrenia Among Military Personnel

David W. Niebuhr, Amy M. Millikan, Robert Yolken, Yuanzhang Li, Natalya S. Weber, Results From a Hypothesis Generating Case-Control Study: Herpes Family Viruses and Schizophrenia Among Military Personnel, Schizophrenia Bulletin , Volume 34, Issue 6, November 2008, Pages 1182–1188, https://doi.org/10.1093/schbul/sbm139

Background: Herpes family viruses can cause central nervous system inflammatory changes that can present with symptoms indistinguishable from schizophrenia and therefore are of interest in schizophrenia research. Most existing studies of herpes viruses have used small populations and postdiagnosis specimens. As part of a larger research program, we conducted a hypothesis-generating case-control study of selected herpes virus antibodies among individuals discharged from the US military with schizophrenia and pre- and postdiagnosis sera.

Methods: Cases (n = 180) were servicemembers hospitalized and discharged from military service with schizophrenia. Controls, 3:1 matched on several factors, were members not discharged. The military routinely collects and stores members' serum specimens. We used microplate enzyme immunoassay to measure immunoglobulin G (IgG) antibody levels to 6 herpes viruses in pre- and postdiagnosis specimens. Conditional logistic regression was used, and the measure of association was the hazard ratio (HR).

Results: Overall, we found a significant association between human herpes virus type 6 and schizophrenia, with an HR of 1.17 (95% confidence interval [CI] = 1.04, 1.32). Women and blacks had significant negative associations with herpes simplex virus type 2 and cytomegalovirus; among blacks, there was a significant positive association with herpes simplex virus type 1. Among men, there was an HHV-6 temporal effect with an HR of 1.41 (95% CI = 1.02, 1.96) for sera drawn 6–12 months before diagnosis.

Discussion: Findings from previous studies of herpes family viruses and schizophrenia have been inconsistent. Our study is based on a larger population than most previous studies and used serum specimens collected before onset of illness. This study adds to the body of knowledge and provides testable hypotheses for follow-on studies.

Recent research has focused on infectious agents as potential players in the etiologic pathway of chronic diseases, 1–3 including psychiatric illnesses such as schizophrenia. 4–10 Due to their potential neurotropism and latency, viral organisms in particular are considered possible agents in many chronic central nervous system (CNS) disorders. 2 , 11–13 Encephalitis and other conditions leading to CNS inflammatory changes often present with symptoms that are difficult to distinguish from new-onset schizophrenia. As significant causes of encephalitis, viruses in the Herpes Simplex Family (herpes simplex virus type 1 [HSV-1], herpes simplex virus type 2 [HSV-2], Epstein-Barr virus [EBV], cytomegalovirus [CMV], varicella-zoster virus [VZV], and human herpes virus type 6 [HHV-6]) are of prime interest in schizophrenia research. 14

Although laboratory-based research into herpes family viruses as possible etiologic agents for schizophrenia goes back decades, 15–22 ascertaining the nature of a possible etiologic association between infection and schizophrenia is highly challenging. There have been few consistent findings between studies, which could be due to many factors, including the heterogeneity of schizophrenia itself, the use of different immunologic assays across the studies and over time as technology changes, focusing attention on a variety of different infections, and the possibility that maternal infection occurring at the right time during pregnancy may be enough to increase the risk of psychosis in offspring. 23 There is more information regarding risks associated with maternal, neonatal, or childhood infection and schizophrenia than for adult infection, with a number of studies reporting increased risk of schizophrenia among persons exposed in utero to a number of infections 24–26 or born after epidemics. 27

Little is known about potential mechanisms of action for herpes family virus infections and risk of schizophrenia. Studies of maternal infection provide some evidence that modulation of immune response 28 and adverse effects on in utero maturation of critical brain structural and functional components 24 may correlate with increased risk of schizophrenia in offspring. From animal models, maternal infection can alter interleukin 6, interleukin 1b, or tumor necrosis factor α (TNF-α) in amniotic fluid or placenta and TNF-α in the fetal brain. 29 There is evidence that cytokines have an important function in the development of fetal neurons and glial cells, and abnormal levels of these (proinflammatory) cytokines may contribute to abnormal brain development, 30–34 at least in animals.

Findings from epidemiologic studies of herpes family viruses and schizophrenia among adults are mixed. In their 1995 review article, Yolken and Torrey 14 identified numerous published studies assessing viral (not just herpes family viruses) antibodies in serum of schizophrenia cases. Several of the studies were not interpretable because they lacked control groups or were evaluating changes in antibody titers over time. Of 21 interpretable studies, 3 reported positive significant associations with HSV-1 (or HSV unspecified), 35–37 1 study reported an association with EBV, 18 and 1 found a negative association with CMV. 38

The Military New-Onset Psychosis Project hypothesis-generating studies afford unique opportunities to correct for some of the weaknesses identified in other studies of herpes viruses and schizophrenia. Because the US military routinely collects and stores serum samples and medical data from all active duty personnel, we are able to study large numbers of cases who have prediagnostic serum available and document the prevalence of infection prior to onset of disease. Because military members provide a serum sample at accession and about every 2 years thereafter while they remain on active duty, more than one prediagnostic sample will be available for many individuals. Sera from these routine samples are stored in the Department of Defense Serum Repository (DoDSR). 39 Informed consent is not routinely obtained when the serum is drawn; however, the DoDSR is maintained for surveillance and research purposes. Information obtained from this study may lead to more effective means of preventing, identifying, and treating new-onset schizophrenia in this population. This project was reviewed and approved by the appropriate human protection committees at the authors' institutions.

We assayed the serum of 180 individuals diagnosed with schizophrenia who had been hospitalized in a military facility with a mental health diagnosis and subsequently medically discharged or retired from the military between 1992 and 2001 and the serum of 532 controls with no evidence of any mental illness. Diagnosis date for cases was estimated as the date of first mental health hospitalization, used as a proxy for onset. Controls were selected 3:1 to cases and matched on the following variables: date of birth (± 1 year), the corresponding case's accession date ± 6 months, sex, race (black, white, other), branch of military service, and the number of serum specimens available. We attempted to obtain the same number of specimens for cases and controls: the first available specimen (usually collected during the accession medical examination), a specimen collected in the 3- to 24-month period prior to the matched case's first mental health hospitalization, and the first available after the matched case's hospitalization. Additional details on study subject selection and inclusion, serum selection and shipping, and sources and types of ancillary data have been published. 40 Sera were assayed for 6 herpes family viruses: HSV-1, HSV-2, EBV, CMV, VZV, and HHV-6.
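The matching criteria listed above can be expressed as a simple eligibility check. This is a sketch for illustration only: the record layout and function names are hypothetical, and the windows of 365 and 183 days are approximations of "± 1 year" and "± 6 months" rather than the study's actual implementation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ServiceMember:
    birth_date: date
    accession_date: date
    sex: str
    race: str          # "black", "white", or "other"
    branch: str
    n_specimens: int


def is_eligible_match(case, control, birth_window_days=365,
                      accession_window_days=183):
    """Check one candidate control against one case on all matching variables."""
    return (
        abs((case.birth_date - control.birth_date).days) <= birth_window_days
        and abs((case.accession_date - control.accession_date).days)
        <= accession_window_days
        and case.sex == control.sex
        and case.race == control.race
        and case.branch == control.branch
        and case.n_specimens == control.n_specimens
    )


def select_controls(case, candidates, ratio=3):
    """Return up to `ratio` matching controls; fewer if not enough qualify."""
    matches = [c for c in candidates if is_eligible_match(case, c)]
    return matches[:ratio]
```

Returning fewer than three controls when the pool is too small mirrors the situation reported in the results, where some cases could only be matched to 2 controls.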

IgG antibodies to human herpes viruses were measured using microplate enzyme immunoassay. Reagents were obtained from the following sources: HSV-1 and HSV-2 from Focus Diagnostics, Cypress, CA; HHV-6 from Advanced Biotechnologies, Columbia, MD; VZV and EBV nuclear antigen from IBL Laboratories, Minneapolis, MN; and CMV from Viro-Immun Labor Diagnostika, Oberursel, Germany.

Enzyme immunoassay consists of binding serum to solid-phase antigen and subsequent reactions with enzyme-labeled antihuman IgG and enzyme substrate. The amount of color generated by the enzyme substrate reaction was measured in optical density (OD) units with a microplate colorimeter. This method of analysis was selected because it allows for high throughput measurement of antibodies using a common platform and requiring only small amounts of sample. Samples were run on plates under code in matched groups in which case and control status was not identified.

Quantitative Antibody Measurements Data Normalization

To investigate the relationship between subject status (case or control) and antibody level in a matched study design, conditional logistic regression was used. Failure to account for matching in the analysis may lead to biased results, usually toward the null. The conditional analysis, which has a higher (less negative) log likelihood, suggests a better “fit” for these data. 42 A guiding concern in regression modeling is that the relationship between the independent and dependent variables (the latter assumed to be continuous) should be as inherently “linear” as possible; hence, OD was analyzed as a continuous, rather than dichotomous (positive vs negative) variable. This approach also preserves the power to detect a difference between cases and controls, particularly because infection with most of these herpes viruses is ubiquitous and differences cannot be detected based on prevalence data.

For this analysis, we chose the proportional hazards (PH) model for conditional logistic regression. 42 The PH model is similar to regular conditional logistic regression but modified to allow for multiple and variable numbers of controls per case and specimens per subject. Dummy “survival time” to diagnosis was generated so that all samples for a given case had the same event time with corresponding controls censored at a later time. Proportional hazard regression was then used to study associations, reported as hazard ratios (HRs), between “survival time” and the risk factors. Because the PH assumption might not be true for all the data, but may be true among specific demographic subgroups, we performed stratified analysis on several of the matched variables.
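The link between the two models can be made concrete. When each case is given an event time earlier than the censoring time of its matched controls, the Cox partial likelihood within a matched set reduces to exp(βx_case) / Σ exp(βx_j), summed over all members of the set, which is the conditional logistic likelihood. The sketch below fits that likelihood for a deliberately simplified situation: one covariate, one case per matched set, and one measurement per person. The study's actual model handled multiple specimens per subject and several covariates, so the function names and data layout here are assumptions for illustration only.

```python
import math


def cond_loglik(beta, matched_sets):
    """Conditional log-likelihood; each set is (case_x, [control_x, ...])."""
    total = 0.0
    for case_x, control_xs in matched_sets:
        denom = math.exp(beta * case_x) + sum(math.exp(beta * x) for x in control_xs)
        total += beta * case_x - math.log(denom)
    return total


def score_and_information(beta, matched_sets):
    """First derivative and negative second derivative of the log-likelihood."""
    score, info = 0.0, 0.0
    for case_x, control_xs in matched_sets:
        xs = [case_x] + list(control_xs)
        weights = [math.exp(beta * x) for x in xs]
        w_sum = sum(weights)
        mean = sum(w * x for w, x in zip(weights, xs)) / w_sum
        mean_sq = sum(w * x * x for w, x in zip(weights, xs)) / w_sum
        score += case_x - mean
        info += mean_sq - mean * mean
    return score, info


def fit_hazard_ratio(matched_sets, max_iter=50, tol=1e-10):
    """Newton-Raphson with step halving; returns (beta, exp(beta))."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_information(beta, matched_sets)
        if abs(score) < tol or info <= 0.0:
            break
        step = score / info
        current = cond_loglik(beta, matched_sets)
        # Halve the step until the log-likelihood no longer decreases.
        while cond_loglik(beta + step, matched_sets) < current and abs(step) > tol:
            step /= 2.0
        beta += step
    return beta, math.exp(beta)
```

Here exp(beta) is the HR per one-unit increase in the covariate. If every case has the highest value in its matched set, the likelihood has no finite maximum and the estimate will drift upward until the iteration limit is reached.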

First, we assessed the overall antibody effect for each agent separately. The analysis was stratified by the matched factors with case or control status as the outcome. Variables analyzed included antibody level, the time from serum collection to the case's diagnosis, and time in service for both cases and controls. Logistic models were developed to assess the antibody effect for modeling the agents separately as well as simultaneously.

To study the homogeneity of the agent effect across demographic levels and time, interaction terms were also evaluated. Where an interaction with a demographic factor was observed, separate models were developed to explore the possibility of effect modification by that factor. For homogeneity of the agent effect across time, we studied the interaction with time to diagnosis. Time to diagnosis was categorized as follows: greater than 2 years, 1–2 years, 0.5–1 year, less than 0.5 year before diagnosis, and after diagnosis. The interaction with time to diagnosis shows the temporal effect of antibody level and describes the consistency of risk across time periods. The interactions of the agents with demographic factors show the heterogeneity of the agent effect by those factors.
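The five time-to-diagnosis categories can be sketched as a lookup function. The article does not say how values falling exactly on a boundary were assigned, so the inclusive and exclusive edges below, like the function name and labels, are assumptions.

```python
def time_to_diagnosis_category(years_before_diagnosis: float) -> str:
    """Assign a serum specimen to one of the five time-to-diagnosis categories.

    years_before_diagnosis is positive for specimens drawn before the
    estimated diagnosis date and zero or negative for specimens drawn on
    or after it. Boundary handling is an assumption of this sketch.
    """
    if years_before_diagnosis <= 0:
        return "after diagnosis"
    if years_before_diagnosis < 0.5:
        return "<0.5 year before diagnosis"
    if years_before_diagnosis <= 1:
        return "0.5-1 year before diagnosis"
    if years_before_diagnosis <= 2:
        return "1-2 years before diagnosis"
    return ">2 years before diagnosis"
```

Each category would then enter the model as an interaction with antibody level, giving one HR per time window.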

Because we used scaled values to represent measured antibody level, there is no recognized unit with which to describe increasing or decreasing levels. In this case, we chose the SD as the unit of measure. All results are reported as HRs for each increase of 1 SD of antibody level. This method also allows comparison between the effects of different antibody agents on schizophrenia. The SD of antibody level for the 6 agents ranged from 0.10 for HHV-6 to 0.59 for HSV-2.
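Reporting per 1 SD amounts to rescaling the regression coefficient. As a small sketch, if beta is the log HR per one scaled unit of antibody level, the HR per 1 SD increase is exp(beta × SD). The function name and the coefficient value used below are hypothetical; only the two SD values (0.10 and 0.59) come from the text.

```python
import math


def hr_per_sd(beta_per_unit: float, sd: float) -> float:
    """Convert a log hazard ratio per scaled unit into an HR per 1 SD increase."""
    return math.exp(beta_per_unit * sd)


# Hypothetical coefficient applied to the smallest and largest SDs reported.
example_beta = 1.0
hr_small_sd = hr_per_sd(example_beta, 0.10)  # SD reported for HHV-6
hr_large_sd = hr_per_sd(example_beta, 0.59)  # SD reported for HSV-2
```

The same per-unit coefficient therefore implies very different per-SD HRs for agents with different spreads, which is why the per-SD scale is needed for comparing agents.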

A total of 180 cases (contributing 404 serum specimens) and 532 controls (with 1180 specimens) were included in the study population. Eight cases could only be matched to 2 controls. Table 1 shows the distribution of cases and controls by demographic factors. Overall, about 83% were males, 49% were whites, 44% were blacks, over 57% were younger than 25 years, 10% were older than 35 years, about 12% were Hispanic, and over 56% were in the army. Approximately 35% of cases had greater than 3 years of military service.

Demographic and Military Characteristics of Study Subjects

Overall Antibody Effects

We found that antibody levels for all 6 agents were only weakly correlated (data not shown), and therefore, collinearity between agents was unlikely to be a source of bias or instability in the regression modeling. There was little difference in the results between the separate agent antibody models and the model with all agents considered simultaneously. For example, when each agent was modeled separately, the HR per 1 SD increase was 0.87 (95% confidence interval [CI] = 0.77, 0.98) for CMV and 1.13 (95% CI = 1.00, 1.27) for HHV-6, compared with HRs of 0.86 (95% CI = 0.76, 0.98) and 1.17 (95% CI = 1.04, 1.32), respectively, when all 6 agents were modeled together. Therefore, only the simultaneous models are presented in table 2, which presents the overall estimates from conditional logistic modeling. No consistent pattern emerged: a significant protective HR (0.86, 95% CI = 0.76, 0.98) was observed for CMV, a significant positive HR (1.17, 95% CI = 1.04, 1.32) was observed for HHV-6, and no other agents were significantly associated with schizophrenia.
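A reader can check how an HR and its 95% CI relate to statistical significance with a short calculation. The sketch below assumes the interval is symmetric on the log scale (a Wald interval), which the article does not state explicitly, and the results are approximate because the published values are rounded. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def z_and_p_from_ci(hr, lower, upper, level=0.95):
    """Recover an approximate z statistic and two-sided P value from an HR and CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    # Standard error of log(HR), taken from the width of the log-scale interval.
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Values reported above for the multiagent model.
z_hhv6, p_hhv6 = z_and_p_from_ci(1.17, 1.04, 1.32)
z_cmv, p_cmv = z_and_p_from_ci(0.86, 0.76, 0.98)
```

An interval that excludes 1 corresponds to a two-sided P value below .05, and the sign of z shows the direction of the association.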

Schizophrenia Hazard Ratio associated with Antibody Levels in Multiagent Model a

Note : HSV-1, herpes simplex virus type 1; HSV-2, herpes simplex virus type 2; CMV, cytomegalovirus; EBV, Epstein-Barr virus; VZV, varicella-zoster virus; HHV-6, human herpes virus type 6.

a Adjusted for age, race, sex, hospitalization status, years of service, and temporal relationship to diagnosis.

Hazard ratio expressed per 1 SD increase of each agent antibody level.

Stratified Analysis

No significant effect modification was noted for sex (male vs female) or age (<25 vs ≥25) for any of the infectious agents ( P values > .10). A significant interaction with race (black vs white) was found for HSV-1 ( P < .01) and for CMV ( P = .03), suggesting that these antibody effects differ between black and white cases. As seen in table 3, the HR among black cases was significant for HSV-1 (1.23, 95% CI = 1.02, 1.47), HSV-2 (0.83, 95% CI = 0.70, 0.99), and CMV (0.69, 95% CI = 0.57, 0.84), but no significant findings were observed among white cases. The absence of a significant interaction between agents and sex is likely due to the small number of female cases ( n = 30) in the analyses; nevertheless, the estimated antibody effects differed between men and women. Among women, HRs significantly below 1 were noted for 2 agents, HSV-2 (0.69, 95% CI = 0.52, 0.92) and CMV (0.62, 95% CI = 0.44, 0.89). The HHV-6 effect among males (HR of 1.17, 95% CI = 1.03, 1.33) was almost the same as for all subjects because the majority of subjects were male (83%). For females, the HHV-6 HR was lower (1.08, 95% CI = 0.70, 1.66) and not significant, with a broader CI reflecting less precision in the estimate.

Schizophrenia Hazard Ratio associated with Antibody Levels in Multiagent Models by Sex and Race a

a Adjusted for age, hospitalization status, years of service, and temporal relationship to diagnosis.

Among men, only HHV-6 showed weak temporal variability, with the highest HR for sera drawn 6–12 months before onset (1.41, 95% CI = 1.02, 1.96) and the lowest HR for sera drawn 2 years or longer before diagnosis (1.08, 95% CI = 0.88, 1.33), as shown in table 4. No temporal effect was observed for women or other demographic groups, owing to small sample sizes.

Hazard Ratios for HHV-6 Antibody Levels by Temporal Relationship to Diagnosis Among Men

Hazard ratio expressed per 1 SD increase of the human herpes virus type 6 (HHV-6) immunoglobulin G antibody level.

Our hypothesis-generating study found a statistically significant positive association between HHV-6 antibody level and schizophrenia among men, and between HSV-1 antibody level and schizophrenia among blacks, in a population of individuals discharged or retired from the military with a diagnosis of schizophrenia and a history of mental health hospitalization.

A negative association with HSV-2 and CMV was noted among women and blacks. Blacks dominated the results for women. These findings should be interpreted with caution, however, because they are driven by a small number of subjects ( n  = 80 for black cases, n  = 30 for female cases, n  = 22 for black females) and may be the result of type I error. Further analysis is warranted with a larger sample size. No significant associations were observed for HSV-2, EBV, or VZV among men or women. No significant association was found among whites for any agent.

Our subanalysis of HHV-6 IgG levels by time period for males around diagnosis is obviously limited by the sample size. We conducted this analysis to replicate the analytic modeling in our previous, related work on Toxoplasma gondii IgG level and risk of schizophrenia. 40 The P values of .04 in this hypothesis-generating study warrant further evaluation in the hypothesis-testing phase of our research.

More recently, studies of antibody levels in serum and cerebrospinal fluid have yielded mixed findings. One analysis of untreated subjects with recent-onset schizophrenia found increased IgG antibody levels to CMV, decreased levels of antibodies to HHV-6 and VZV, and no differences in antibody level to HSV-1 and -2 and EBV. 9 Several other studies of cerebrospinal fluid yielded conflicting results, with some reporting increased CMV antibody titers 38 , 43–45 while others demonstrated no association. 46–48 Increased levels of HSV-1 antibody were demonstrated in one group of schizophrenic patients compared with normal controls, and cases with higher levels of antibody also demonstrated decreased gray matter in 2 areas of the brain. 49 Another study noted that deficit schizophrenics were more likely to have antibodies to CMV than were nondeficit patients. 50

A recent review of the literature regarding CMV and schizophrenia identified a number of studies reporting more frequent infection or higher levels of antibody in serum or cerebrospinal fluid. 51 The authors noted that studies conducted in 1992 were all null but that the serum assays utilized had been complement fixation or other less sensitive methods. They note 3 unpublished studies (M. J. Schwarz and N. Mueller, S. Bachmann; J. Schröder; and R. H. Yolken, unpublished data) in which patients with schizophrenia were more likely to have antibodies to CMV, or had higher levels of antibodies, than did the control subjects. One of these studies (F. B. Dickerson, C. Stallings, A. Origoni, J. J. Boronow, R. H. Yolken, unpublished data) included 415 outpatients with schizophrenia and 164 matched controls; the odds ratio for CMV positivity was 2.1. The authors note that patients who were seropositive were more likely to be female, black, older, and less educated. Leweke et al 9 found that CMV IgG antibody levels, but not HSV-1, HSV-2, EBV, HHV-6, or VZV levels, were higher among patients with schizophrenia.

Given the limited amount of research reported and the discordant findings among the existing articles regarding herpes viruses and schizophrenia, interpretation of our findings is challenging. Recent work has implicated HHV-6 in acute 52 and chronic 53 , 54 neurologic diseases. We note the negative association with HSV-2 and CMV among women and blacks and the positive association with HSV-1 among blacks. Although speculative, and limited by sample size, there is a potential for underlying genetic differences that could explain some portion of the racial differences.

There are a number of factors that could potentially account for the discrepancies observed between the various reports above and the present study. These include, but are not limited to, differences in diagnostic criteria for schizophrenia, the restriction of our cases to adult-onset illness, different time frames of serum collection relative to illness onset, differences in laboratory assays, and the use of multiple vs single serum specimens. In addition, our sample was drawn from the military population, which differs from the general US population in several important ways. The male to female ratio in the military is much higher than in the general population, making it difficult to achieve adequate power when analyzing females separately. Comprehensive medical screening prior to entry into the military creates a healthy worker effect in the population and skews our sample toward individuals with later onset of schizophrenia. Also, our cases are a convenience sample of individuals with schizophrenia in the military. A small degree of bias introduced by any of these factors could account for the significance and direction of our findings.

This study has 2 important strengths. First, we used cases that were diagnosed and discharged from the military after a careful evaluation process. 40 A record review conducted on a sample of study subjects demonstrated a high level of concordance between the diagnoses documented in the military records and those assigned by 2 psychiatrist reviewers. 55 Also, because military applicants receive an extensive administrative and medical examination prior to accession, are directly supervised while on active duty, and have access to health care, we assumed that cases were not psychotic on accession and that the onset of psychosis would generally result in a mental health hospitalization for an expedited psychiatric evaluation. This assumption was validated by the record review. 55 Therefore, we are confident that diagnostic misclassification is not a major source of error in our findings.

In addition, the current study obtained multiple (between 1 and 3) specimens for most subjects both prior to and after onset of illness in the matched case. The second specimen, collected in the 3- to 24-month period prior to first mental health hospitalization, was chosen to represent the period before onset of psychosis. The ability to analyze longitudinal specimens may be important if illness alters behaviors in a way that could impact antibody levels or if medical treatment for illness changes antibody responses.

Although this hypothesis-generating study does not resolve the issue of diverse findings between studies, we feel that our study has advantages over other studies with our high degree of diagnostic validity, adequate numbers of appropriate controls, and multiple serum specimens. It is clear that additional studies are needed to clarify the many remaining questions, particularly regarding HHV-6, CMV, and HSV-1. As part of our broad research program, we will be conducting a much larger case-control study with approximately 1600 cases and 2100 controls. Although herpes family viruses will not be the primary focus of this follow-on study, we intend to further explore the associations identified in this hypothesis-generating study.

Stanley Medical Research Institute, Bethesda, MD; Department of the Army; Funding Cooperative Research and Development Agreement (DAMD control, No: 17-04-0041).

The views expressed are those of the authors and should not be construed to represent the positions of the Department of the Army or Department of Defense. None of the authors have any associations, financial or otherwise, that may present a conflict of interest.

Affiliations

Schizophrenia International Research Society

  • Online ISSN 1745-1701
  • Print ISSN 0586-7614
  • Copyright © 2024 Maryland Psychiatric Research Center and Oxford University Press

THE CDC FIELD EPIDEMIOLOGY MANUAL

Designing and Conducting Analytic Studies in the Field

Brendan R. Jackson and Patricia M. Griffin

Analytic studies can be a key component of field investigations, but beware of an impulse to begin one too quickly. Studies can be time- and resource-intensive, and a hastily constructed study might not answer the correct questions. For example, in a foodborne disease outbreak investigation, if the culprit food is not on your study’s questionnaire, you probably will not be able to implicate it. Analytic studies typically should be used to test hypotheses, not generate them. However, in certain situations, collecting data quickly about patients and a comparison group can be a way to explore multiple hypotheses. In almost all situations, generating hypotheses before designing a study will help you clarify your study objectives and ask better questions.

  • Generating Hypotheses
  • Study Designs for Testing Hypotheses
  • Types of Observational Studies for Testing Hypotheses
  • Selection of Controls in Case–Control Studies
  • Matching in Case–Control Studies
  • Example: Using an Analytic Study to Solve an Outbreak at a Church Potluck Dinner (But Not That Church Potluck)
  • Outbreaks with Universal Exposure

The initial steps of an investigation, described in previous chapters, are some of your best sources of hypotheses. Key activities include the following:

  • By examining the sex distribution among persons in outbreaks, US enteric disease investigators have learned to suspect a vegetable as the source when most patients are women. (Of course, generalizations do not always hold true!)
  • In an outbreak of bloodstream infections caused by Serratia marcescens among patients receiving parenteral nutrition (food administered through an intravenous catheter), investigators had a difficult time finding the source until they noted that none of the 19 cases were among children. Further investigation of the parenteral nutrition administered to adults but not children in that hospital identified contaminated amino acid solution as the source ( 1 ).
  • Focus on outliers. Give extra attention to the earliest and latest cases on an epidemic curve and to persons who recently visited the neighborhood where the outbreak is occurring. Interviews with these patients can yield important clues (e.g., by identifying the index case, secondary case, or a narrowed list of common exposures).
  • Determine sources of similar outbreaks. Consult health department records, review the literature, and consult experts to learn about previous sources. Be mindful that new sources frequently occur, given ever-changing social, behavioral, and commercial trends.
  • Conduct a small number of in-depth, open-ended interviews. When a likely source is not quickly evident, conducting in-depth (often >1 hour), open-ended interviews with a subset of patients (usually 5 to 10) or their caregivers can be the best way to identify possible sources. It helps to begin with a semistructured list of questions designed to help the patient recall the events and exposures of every day during the incubation period. The interview can end with a “shotgun” questionnaire (see activity 6) ( Box 7.1 ). A key component of this technique is that one investigator ideally conducts, or at least participates in, as many interviews as possible (five or more) because reading notes from others’ interviews is no substitute for soliciting and hearing the information first-hand. For example, in a 2009 Escherichia coli O157 outbreak, investigators were initially unable to find the source through general and targeted questionnaires. During open-ended interviews with five patients, the interviewer noted that most reported having eaten strawberries, a particular type of candy, and uncooked prepackaged cookie dough. An analytic study was then conducted that included questions about these exposures; it confirmed cookie dough as the source ( 3 ).
  • Ask patients what they think. Patients can have helpful thoughts about the source of their illness. However, be aware that patients often associate their most recent food exposure (e.g., a meal) with illness, whereas the inciting exposure might have been long before.
  • Consider administering a shotgun questionnaire. Such questionnaires, which typically ask about hundreds of possible exposures, are best used on a limited number of patients as part of hypothesis-generating interviews. After generating hypotheses, investigators can create a questionnaire targeted to that investigation. Although not an ideal method, shotgun questionnaires can be used by multiple interviewers to obtain data about large numbers of patients ( Box 7.1 ).

In November 2014, a US surveillance system for foodborne diseases (PulseNet) detected a cluster (i.e., a possible outbreak) of listeriosis cases on the basis of Listeria monocytogenes isolates that appeared similar by pulsed-field gel electrophoresis. No suspected foods were identified through routine patient interviews by using a Listeria -specific questionnaire with approximately 40 common food sources of listeriosis (e.g., soft cheese and deli meat). The outbreak’s descriptive epidemiology offered no clear leads: the sex distribution was nearly even, the age spectrum was wide, and the case-fatality rate of approximately 20% was typical. Notably, however, 3 of the 35 cases occurred among previously healthy school-aged children, which is highly unusual for listeriosis. Most cases occurred during late October and early November.

Investigators began reinterviewing patients by using a hypothesis-generating shotgun questionnaire with more than 500 foods, but it did not include caramel apples. By comparing the first nine patient responses with data from a published survey of food consumption, strawberries and ice cream emerged as hypotheses. However, several interviewed patients denied having eaten these foods during the month before illness. An investigator then conducted lengthy, open-ended interviews with patients and their family members. During one interview, he asked about special foods eaten during recent holidays, and the patient’s wife replied that her husband had eaten prepackaged caramel apples around Halloween. Although produce items had been implicated in past listeriosis outbreaks, caramel apples seemed an unlikely source. However, the interviewer took note of this connection because he had previously interviewed another patient who reported having eaten caramel apples. This event underscores the importance of one person conducting multiple interviews because that person might make subtle mental connections that may be missed when reviewing other interviewers’ notes. In fact, several other investigators listening to the interview noted this exposure—among hundreds of others—but thought little of it.

In this investigation, the finding of high strawberry and ice cream consumption among patients, coupled with the timing of the outbreak during a holiday period, helped make a sweet food (i.e., caramel apples) seem more plausible as the possible source.

To explore the caramel apple hypothesis, investigators asked five other patients about this exposure, and four reported having eaten them. On the basis of these initial results, investigators designed and administered a targeted questionnaire to patients involved in the outbreak, as well as to patients infected with unrelated strains of L. monocytogenes (i.e., a case–case study). This study, combined with testing of apples and the apple packing facility, confirmed that caramel apples were the source (2). Had a single interviewer performed multiple open-ended interviews to generate hypotheses before the shotgun questionnaire, the outbreak might have been solved sooner.

As evident in public health and clinical guidelines, randomized controlled trials (e.g., trials of drugs, vaccines, and community-level interventions) are the reference standard for epidemiology, providing the highest level of evidence. However, such studies are not possible in certain situations, including outbreak investigations. Instead, investigators must rely on observational studies, which can provide sufficient evidence for public health action. In observational studies, the epidemiologist documents rather than determines the exposures, quantifying the statistical association between exposure and disease. Here again, the key when designing such studies is to obtain a relevant comparison group for the patients ( Box 7.2 ).

Because field analytic studies are used to quantify the association between exposure and disease, defining what is meant by exposure and disease is essential. Exposure is used broadly, meaning demographic characteristics, genetic or immunologic makeup, behaviors, environmental exposures, and other factors that might influence a person’s risk for disease. Because precise information can help accurately estimate an exposure’s effect on disease, exposure measures should be as objective and standard as possible. Developing a measure of exposure can be conceptually straightforward for an exposure that is a relatively discrete event or characteristic—for example, whether a person received a spinal injection with steroid medication compounded at a specific pharmacy or whether a person received a typhoid vaccination during the year before international travel. Although these exposures might be straightforward in theory, they can be subject to interpretation in practice. Should a patient injected with a medication from an unknown pharmacy be considered exposed? Whatever decision is made should be documented and applied consistently.

Additionally, exposures often are subject to the whims of memory. Memory aids (e.g., restaurant menus, vaccination cards, credit card receipts, and shopper cards) can be helpful. More than just a binary yes or no, the dose of an exposure can also be enlightening. For example, in an outbreak of fungal bloodstream infections linked to contaminated intravenous saline flushes administered at an oncology clinic, affected patients had received a greater number of flushes than unaffected patients ( 4 ). Similarly, in an outbreak of Listeria monocytogenes infections, the association with deli meat became apparent only when the exposure evaluated was consumption of deli meat more than twice a week ( 5 ).

Defining disease (e.g., does a person have botulism?) might sound simple, but often it is not; read more about making and applying disease case definitions in Chapter 3 .

Three types of observational studies are commonly used in the field. All are best performed by using a standard questionnaire specific for that investigation, developed on the basis of hypothesis-generating interviews.

Observational Study Type 1: Cohort

In concept, a cohort study, like an experimental study, begins with a group of persons without the disease under study, but with different exposure experiences, and follows them over time to find out whether they experience the disease or health condition of interest. However, in a cohort study, each person’s exposure is merely recorded rather than assigned randomly by the investigator. Then the occurrence of disease among persons with different exposures is compared to assess whether the exposures are associated with increased risk for disease. Cohort studies can be prospective or retrospective.

Prospective Cohort Studies

A prospective cohort study enrolls participants before they experience the disease or condition of interest. The enrollees are then followed over time for occurrence of the disease or condition. The unexposed or lowest exposure group serves as the comparison group, providing an estimate of the baseline or expected amount of disease. An example of a prospective cohort study is the Framingham Heart Study. By assessing the exposures of an original cohort of more than 5,000 adults without cardiovascular disease (CVD), beginning in 1948 and following them over time, the study was the first to identify common CVD risk factors ( 6 ). Each case of CVD identified after enrollment was counted as an incident case. Incidence was then quantified as the number of cases divided by the sum of time that each person was followed (incidence rate) or as the number of cases divided by the number of participants being followed (attack rate, risk, or incidence proportion). In field epidemiology, prospective cohort studies also often involve a group of persons who have had a known exposure (e.g., survived the World Trade Center attack on September 11, 2001 [ 7 ]) and who are then followed to examine the risk for subsequent illnesses with long incubation or latency periods.
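The two incidence measures defined above differ only in their denominators. The following is a minimal Python sketch; the function names and the cohort numbers are hypothetical and chosen only for illustration.

```python
def incidence_proportion(cases: int, participants: int) -> float:
    """Cases divided by the number of participants followed (attack rate or risk)."""
    return cases / participants

def incidence_rate(cases: int, person_time: float) -> float:
    """Cases divided by the summed follow-up time of all participants."""
    return cases / person_time

# Hypothetical cohort: 1,000 participants, 50 incident cases,
# and 9,500 person-years of total follow-up.
risk = incidence_proportion(50, 1000)
rate_per_1000_py = incidence_rate(50, 9500) * 1000

print(f"Incidence proportion: {risk:.3f}")
print(f"Incidence rate: {rate_per_1000_py:.2f} per 1,000 person-years")
```

The incidence rate is usually the more informative measure when participants are followed for unequal lengths of time, because each person contributes only the time actually observed.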

Retrospective Cohort Studies

A retrospective cohort study enrolls a defined participant group after the disease or condition of interest has occurred. In field epidemiology, these studies are more common than prospective studies. The population affected is often well-defined (e.g., banquet attendees, a particular school’s students, or workers in a certain industry). Investigators elicit exposure histories and compare disease incidence among persons with different exposures or exposure levels.
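Comparing disease incidence between exposure groups in a retrospective cohort is commonly summarized as a risk ratio (the attack rate among the exposed divided by the attack rate among the unexposed). The sketch below assumes a simple two-group comparison; the function names and the banquet counts are hypothetical.

```python
def attack_rate(ill: int, total: int) -> float:
    """Proportion of a group that became ill."""
    return ill / total

def risk_ratio(ill_exposed: int, total_exposed: int,
               ill_unexposed: int, total_unexposed: int) -> float:
    """Attack rate among the exposed divided by attack rate among the unexposed."""
    return (attack_rate(ill_exposed, total_exposed)
            / attack_rate(ill_unexposed, total_unexposed))

# Hypothetical banquet: 30 of 40 attendees who ate a dish became ill,
# compared with 5 of 50 attendees who did not eat it.
rr = risk_ratio(30, 40, 5, 50)
print(f"Risk ratio: {rr:.1f}")
```

A risk ratio well above 1 for one food item, with ratios near 1 for the others, is the pattern that typically points to the vehicle in a defined-cohort outbreak.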

Observational Study Type 2: Case–Control

In a case–control study, the investigator must identify a comparison group of control persons who have had similar opportunities for exposure as the case-patients. Case–control studies are commonly performed in field epidemiology when a cohort study is impractical (e.g., no defined cohort or too many non-ill persons in the group to interview). Whereas a cohort study proceeds conceptually from exposure to disease or condition, a case–control study begins conceptually with the disease or condition and looks backward at exposures. Excluding controls by symptoms alone might not guarantee that they do not have mild cases of the illness under investigation. Table 7.1 presents selected key differences between a case–control and retrospective cohort study.
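Because a case–control study samples persons by disease status, attack rates cannot be calculated directly; the usual measure of association is the odds ratio. The following minimal sketch assumes an unmatched design and no empty cells in the 2×2 table; the function name and counts are hypothetical.

```python
def odds_ratio(exposed_cases: int, exposed_controls: int,
               unexposed_cases: int, unexposed_controls: int) -> float:
    """Cross-product odds ratio for an unmatched 2x2 table (assumes no zero cells)."""
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)

# Hypothetical study: 30 of 40 case-patients and 10 of 40 controls were exposed.
print(f"Odds ratio: {odds_ratio(30, 10, 10, 30):.1f}")
```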

Observational Study Type 3: Case–Case

In case–case studies, a group of patients with the same or similar disease serve as a comparison group (8). This method might require molecular subtyping of the suspected pathogen to distinguish outbreak-associated cases from other cases and is especially useful when relevant controls are difficult to identify. For example, controls for an investigation of Listeria illnesses typically are patients with immunocompromising conditions (e.g., cancer or corticosteroid use) who might be difficult to identify among the general population. Patients with Listeria isolates of a different subtype than the outbreak strain can serve as comparisons to help reduce bias when comparing food exposures. However, patients with similar illnesses can have similar exposures, which can introduce a bias, making identifying the source more difficult. Moreover, other considerations should influence the choice of a comparison group. If most outbreak-associated case-patients are from a single neighborhood or are of a certain race/ethnicity, other patients with listeriosis from across the country will serve as an inadequate comparison group.

Considerations for Selecting Controls

Selecting relevant controls is one of the most important considerations when designing a case–control study. Several key considerations are presented here; consult other resources for in-depth discussion ( 9,10 ). Ideally, controls should

  • Thoroughly reflect the source population from which case-patients arose, and
  • Provide a good estimate of the level of exposure one would expect from that population.

Sometimes the source population is not so obvious, and a case–control study using controls from the general population might be needed to implicate a general exposure (e.g., visiting a specific clinic, restaurant, or fair). The investigation can then focus on specific exposures among persons with the general exposure (see also next section).

Controls should be chosen independently of any specific exposure under evaluation. If you select controls on the basis of lack of exposure, you are likely to find an association between illness and that exposure regardless of whether one exists. Also important is selecting controls from a source population in a way that minimizes confounding (see Chapter 8 ), which is the existence of a factor (e.g., annual income) that, by being associated with both exposure and disease, can affect the associations you are trying to examine.

When trying to enroll controls who reflect the source population, try to avoid overmatching (i.e., enrolling controls who are too similar to case-patients, resulting in fewer differences among case-patients and controls than ought to exist and decreased ability to identify exposure–disease associations). When conducting case–control studies in hospitals and other healthcare settings, ensure that controls do not have other diseases linked to the exposure under study.

Commonly Used Control Selection Methods

When an outbreak does not affect a defined population (e.g., potluck dinner attendees) but rather the community at large, a range of options can be used to determine how to select controls from a large group of persons.

  • Random-digit dialing . This method, which involves selecting controls by using a system that randomly selects telephone numbers from a directory, has been a staple of US outbreak investigations. In recent years, however, declining response rates because of increasing use of caller identification and cellular phones and lack of readily available directory listings of cellular phone numbers by geographic area have made this method increasingly difficult. Even when this method was most useful, often 50 or more numbers needed to be dialed to reach one household or person who both answered and provided a usable match for the case-patient. Commercial databases that include cellular phone numbers have been used successfully to partially address this problem, but the method remains time-consuming ( 11 ).
  • Random or systematic sampling from a list . For investigations in settings where a roster is available (e.g., attendees at a resort on certain dates), controls can be selected by either random or systematic sampling. Government records (e.g., motor vehicle, voter, or tax records) can provide lists of possible controls, but they might not be representative of the population being studied ( 11 ). For random sampling, a table or computer-generated list of random numbers can be used to select the persons to contact; for systematic sampling, every n th person on the list is contacted (e.g., every 12th or 13th).
  • Neighborhood . Recruiting controls from the same neighborhood as case-patients (i.e., neighborhood matching) has commonly been used during case–control studies, particularly in low- and middle-income countries. For example, during an outbreak of typhoid fever in Tajikistan ( 12 ), investigators recruited controls by going door-to-door down a street, starting at a case-patient’s house; a study of cholera in Haiti used a similar method ( 13 ). Typically, the immediately neighboring households are skipped to prevent overmatching.
  • Patients’ friends or relatives . Using friends and relatives as controls can be an effective technique when the characteristics of case-patients (e.g., very young children) make finding controls by a random method difficult. Typically, the investigator interviews a patient or his or her parent, then asks for the names and contact information for more friends or relatives who are needed as controls. One advantage is that the friends of an ill person are usually willing to participate, knowing their cooperation can help solve the puzzle. However, because they can have similar personal habits and preferences as patients, their exposures might be similar. Such overmatching can decrease the likelihood of finding the source of the illness or condition.
  • Databases of persons with exposure information . Sources of data on persons with exposure information include survey data (e.g., FoodNet Population Survey [ 14 ]), public health databases of patients with other illnesses or a different subtype of the same illness, and previous studies. ( Chapter 4 describes additional sources.)
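Of the methods listed above, random and systematic sampling from a roster are the easiest to automate. The following Python sketch assumes the roster is available as a list; the function names, the roster contents, and the use of a random starting position for systematic sampling are illustrative assumptions, not a prescribed procedure.

```python
import random

def simple_random_sample(roster: list, k: int, seed=None) -> list:
    """Select k potential controls at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(roster, k)

def systematic_sample(roster: list, interval: int, seed=None) -> list:
    """Select every interval-th person, starting at a random position
    within the first interval."""
    rng = random.Random(seed)
    start = rng.randrange(interval)
    return roster[start::interval]

# Hypothetical roster of 120 resort attendees identified by ID number.
roster = [f"attendee-{i:03d}" for i in range(1, 121)]

print(simple_random_sample(roster, 10, seed=1))
print(systematic_sample(roster, 12, seed=1))  # every 12th attendee
```

Fixing the seed makes the selection reproducible, which helps when the sampling procedure needs to be documented or repeated.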

When considering outside data sources, investigators must determine whether those data provide an appropriate comparison group. For example, persons in surveys might differ from case-patients in ways that are impossible to determine. Other patients might be so similar to case-patients that risky exposures are unidentifiable, or they might be so different that exposures identified as risks are not true risks.

To help control for confounding, controls can be matched to case-patients on characteristics specified by investigators, including age group, sex, race/ethnicity, and neighborhood. Such matching does not itself reduce confounding, but it enables greater efficiency when matched analyses are performed that do ( 15 ). When deciding to match, however, be judicious. Matching on too many characteristics can make controls difficult to find (making a tough process even harder). Imagine calling hundreds of random telephone numbers trying to find a man of a particular ethnicity aged 50–54 years who is then willing to answer your questions. Also, remember not to match on the exposure of interest or on any other characteristic you wish to examine. Matched case–control study data typically necessitate a matched analysis (e.g., conditional logistic regression) ( 15 ).

Matching Types

The two main types of matching are pair matching and frequency matching.

Pair Matching

In pair matching, each control is matched to a specific case-patient. This method can be helpful logistically because it allows matching by friends or relatives, neighborhood, or telephone exchange, but finding controls who meet specific criteria can be burdensome.

Frequency Matching

In frequency matching, also called category matching, controls are matched to case-patients in proportion to the distribution of a characteristic among case-patients. For example, if 20% of case-patients are children aged 5–18 years, 50% are adults aged 19–49 years, and 30% are adults 50 years or older, controls should be enrolled in similar proportions. This method works best when most case-patients have been identified before control selection begins. It is more efficient than pair matching because a person identified as a possible control who might not meet the criteria for matching a particular case-patient might meet criteria for one of the case-patient groups.
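The proportional enrollment described above can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from the chapter: the function name `control_quotas`, the age-group labels, and the case counts are hypothetical, and the largest-remainder rounding is one assumed way to make the quotas sum to the target.

```python
def control_quotas(case_counts, total_controls):
    """Split a target number of controls across strata in proportion
    to the distribution of case-patients (frequency matching)."""
    total_cases = sum(case_counts.values())
    if total_cases == 0:
        raise ValueError("at least one case-patient is required")

    # Exact (possibly fractional) share of controls for each stratum.
    exact = {g: total_controls * n / total_cases for g, n in case_counts.items()}

    # Round down, then hand out the leftover slots to the strata with
    # the largest fractional remainders so the quotas sum to the target.
    quotas = {g: int(share) for g, share in exact.items()}
    shortfall = total_controls - sum(quotas.values())
    by_remainder = sorted(exact, key=lambda g: exact[g] - quotas[g], reverse=True)
    for g in by_remainder[:shortfall]:
        quotas[g] += 1
    return quotas


# Hypothetical outbreak: 20 case-patients split 20% / 50% / 30% by age group.
cases_by_age = {"5-18": 4, "19-49": 10, "50+": 6}
print(control_quotas(cases_by_age, total_controls=40))
```

With 40 controls the quotas follow the same 20% / 50% / 30% split as the case-patients; when the target does not divide evenly, the rounding step decides which groups receive the extra controls.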

Number of Controls

Most field case–control studies use control-to-case-patient ratios of 1:1, 2:1, or 3:1. Enrolling more than one control per case-patient can increase study power, which might be needed to detect a statistically significant difference in exposure between case-patients and controls, particularly when an outbreak involves a limited number of cases. The incremental gain of adding more controls beyond three or four is small because study power begins to plateau. Note that not all case-patients need to have the same number of controls. Sample size calculations can help in estimating a target number of controls to enroll, although sample sizes in certain field investigations are limited more by time and resource constraints. Still, estimating study power under a range of scenarios is wise because an analytic study might not be worth doing if you have little chance of detecting a statistically significant association. Sample size calculators for unmatched case–control studies are available at http://www.openepi.com and in the StatCalc function of Epi Info ( https://www.cdc.gov/epiinfo ).
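The plateau in power can be illustrated with a common rule of thumb that is not stated in this chapter: when the true association is weak, the statistical efficiency of using m controls per case-patient, relative to an unlimited number of controls, is approximately m / (m + 1). The sketch below assumes that approximation; the function name is hypothetical.

```python
def relative_efficiency(controls_per_case):
    """Approximate efficiency of m controls per case-patient relative to
    an unlimited number of controls (rule of thumb: m / (m + 1))."""
    if controls_per_case < 1:
        raise ValueError("need at least one control per case-patient")
    return controls_per_case / (controls_per_case + 1)


previous = 0.0
for m in range(1, 7):
    current = relative_efficiency(m)
    # Show the efficiency at each ratio and the gain over the previous ratio.
    print(f"{m}:1  efficiency={current:.0%}  gain={current - previous:+.0%}")
    previous = current
```

Under this approximation each additional control per case-patient buys less than the one before it, which matches the text's point that ratios beyond three or four add little.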

More than One Control Group

Sometimes the choice of a control group is so vexing that investigators decide to use more than one type of control group (e.g., a hospital-based group and a community group). If the two control groups provide similar results and conclusions about risk factors for disease, the credibility of the findings is increased. In contrast, if the two control groups yield conflicting results, interpretation becomes more difficult.

Since the 1940s, field epidemiology students have studied a classic outbreak of gastrointestinal illness at a church potluck dinner in Oswego, New York ( 16 ). However, the case study presented here, used to illustrate study designs, is a different potluck dinner.

In April 2015, an astute neurologist in Lancaster, Ohio, contacted the local health department about a patient in the emergency department with a suspected case of botulism. Within 2 hours, four more patients arrived with similar symptoms, including blurred vision and shortness of breath. Health officials immediately recognized this as a botulism outbreak.

  • If the source is a widely distributed commercial product, then the population to study is persons across the United States and possibly abroad.
  • If the source is airborne, then the population to study is residents of a single city or area.
  • If the source is food from a restaurant, then the population to study is predominantly local residents and some travelers.
  • If the source is a meal at a workplace or social setting, then the population to study is meal attendees.
  • If the source is a meal at home, then the population to study is household members and any guests.

Descriptive epidemiology and questioning of the case-patients revealed that all had eaten at the same church potluck dinner and had no other common exposures, making the potluck the likely exposure site and attendees the likely source population. Thus, an analytic study would be targeted at potluck attendees, although investigators must remain alert to case-patients among nonattendees. As initial interviews were conducted, more cases of botulism were being diagnosed, quickly increasing to more than 25. The source of the outbreak needed to be identified rapidly to halt further exposure and illness.

  • List of foods served at the potluck.
  • Approximate number of attendees.
  • A case definition.
  • Information from 5–10 hypothesis-generating interviews with a few case-patients or their family members.
  • A cohort study would be a reasonable option because a defined group exists (i.e., a cohort) of exposed persons who could be interviewed in a reasonable amount of time. The study would be retrospective because the outcome (i.e., botulism) has already occurred, and investigators could assess exposures retrospectively (i.e., foods eaten at the potluck) by interviewing attendees.
  • In a cohort study, investigators can calculate the attack rate for botulism among potluck attendees who reported having eaten each food and for those who had not. For example, if 20 of the 30 attendees who had eaten a particular food (e.g., potato salad) had botulism, you would calculate the attack rate by dividing 20 (corresponding to cell a in Handout 7.1) by 30 (total exposed, or a + b), yielding approximately 67%. If 5 of the 45 attendees who had not eaten potato salad had botulism, the attack rate among the unexposed (5/45, corresponding to c/(c + d)) would be approximately 11%. The risk ratio would be 6, which is calculated by dividing the attack rate among the exposed (67%) by the attack rate among the unexposed (11%).
  • A case–control study would be the most feasible option because the entire cohort could not be identified and because the large number of attendees could make interviewing them all difficult. Rather than interview all non-ill persons, a subset could be interviewed as control subjects.
  • The method of control subject selection should be considered carefully. If not all attendees are interviewed, determining the risk for botulism among the exposed and unexposed is impossible because investigators would not know the exposures for all non-ill attendees. Instead of risk, investigators calculate the odds of exposure, which can approximate risk. For example, if 20 (80%) of 25 case-patients had eaten potato salad, the odds of potato salad exposure among case-patients would be 20/5 = 4 (exposed/unexposed, or a/c in Handout 7.2). If 10 (20%) of 50 selected controls had eaten potato salad, the odds of exposure among control subjects would be 10/40 = 0.25 (or b/d in Handout 7.2). Dividing the odds of exposure among the case-patients (a/c) by the odds of exposure among control subjects (b/d) yields an odds ratio of 16 (4/0.25). The odds ratio is not a true measure of risk, but it can be used to implicate a food. An odds ratio can approximate a risk ratio when the outcome or disease is rare (e.g., roughly <5% of a population). In such cases, a/b is similar to a/(a + b). The odds ratio is typically higher than the risk ratio when >5% of exposed persons in the analysis have the illness.
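The arithmetic in the cohort and case–control examples above can be reproduced with a short Python sketch. The function names are illustrative; the cell labels a, b, c, and d follow the handouts' two-by-two layout (a = exposed and ill, b = exposed and not ill, c = unexposed and ill, d = unexposed and not ill), and the counts are the hypothetical potato salad figures from the text.

```python
def attack_rate(ill, total):
    """Proportion of a group that became ill."""
    return ill / total


def risk_ratio(a, b, c, d):
    """Attack rate among the exposed divided by attack rate among the unexposed."""
    return attack_rate(a, a + b) / attack_rate(c, c + d)


def odds_ratio(a, b, c, d):
    """Odds of exposure among cases (a/c) divided by odds among controls (b/d)."""
    return (a * d) / (b * c)


# Cohort example: 20 of 30 exposed were ill; 5 of 45 unexposed were ill.
print(f"Attack rate, exposed:   {attack_rate(20, 30):.0%}")
print(f"Attack rate, unexposed: {attack_rate(5, 45):.0%}")
print(f"Risk ratio: {risk_ratio(a=20, b=10, c=5, d=40):.1f}")

# Case-control example: 20 of 25 case-patients and 10 of 50 controls were exposed.
print(f"Odds ratio: {odds_ratio(a=20, b=10, c=5, d=40):.1f}")
```

The two examples happen to share the same cell counts, so applying `odds_ratio` to the cohort data also gives 16 against a risk ratio of 6. This is consistent with the remark that the odds ratio exceeds the risk ratio when illness is common among the exposed.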

In the actual outbreak, 29 (38%) of 77 potluck attendees had botulism. The investigators performed a cohort study, interviewing 75 of the 77 attendees about 52 foods served ( 17 ). The attack rate among persons who had eaten potato salad was significantly and substantially higher than the attack rate among those who had not, with a risk ratio of 14 (95% confidence interval 5–42). One of the potato salads served was made with incorrectly home-canned potatoes (a known source of botulinum toxin), and samples of discarded potato salad tested positive for botulinum toxin, supporting the findings of the analytic study. (Of note, persons often blame potato salad for causing illness when, in fact, it rarely is a source. This outbreak was a notable exception.)

In field epidemiology, the link between exposure and illness is often so strong that it is evident despite such inherent study limitations as small sample size and exposure misclassification. In this outbreak, a few of the patients with botulism reported not having eaten potato salad, and some of the attendees without botulism reported having eaten it. In epidemiologic studies, you rarely find 100% concordance between exposure and outcome for various reasons, including incomplete or erroneous recall because remembering everything eaten is difficult. Here, cross-contamination of potato salad with other foods might have helped explain cases among patients who had not eaten potato salad because only a small amount of botulinum toxin is needed to produce illness.

Two-by-Two Table to Calculate the Relative Risk, or Risk Ratio, in Cohort Studies

Two-by-two tables are covered in more detail in Chapter 8.

$$\text{Risk ratio} = \frac{\text{Incidence in exposed}}{\text{Incidence in unexposed}} = \frac{a/(a+b)}{c/(c+d)}$$

Two-by-Two Table to Calculate the Odds Ratio in Case–Control Studies

A risk ratio cannot be calculated from a case–control study because true attack rates cannot be calculated.

$$\text{Odds ratio} = \frac{\text{Odds of exposure in cases}}{\text{Odds of exposure in controls}} = \frac{a/c}{b/d} = \frac{ad}{bc}$$

What kind of study would you design if your hypothesis-generating interviews lead you to believe that everyone, or nearly everyone, was exposed to the same suspected infection source? How would you test hypotheses if all barbecue attendees, ill and non-ill, had eaten the chicken or if all town residents had drunk municipal tap water, and no unexposed group exists for comparison? A few factors that might be of help are the exposure timing (e.g., a particularly undercooked batch of barbecue), the exposure place (e.g., a section of the water system more contaminated than others), and the exposure dose (e.g., number of chicken pieces eaten or glasses of water drunk). Including questions about the time, place, and frequency of highly suspected exposures in a questionnaire can improve the chances of detecting a difference (18).

Cohort, case–control, and case–case studies are the types of analytic studies that field epidemiologists use most often. They are best used as mechanisms for evaluating—quantifying and testing—hypotheses identified in earlier phases of the investigation. Cohort studies, which are oriented conceptually from exposure to disease, are appropriate in settings in which an entire population is well-defined and available for enrollment (e.g., guests at a wedding reception). Cohort studies are also appropriate when well-defined groups can be enrolled by exposure status (e.g., employees working in different parts of a manufacturing plant). Case–control studies, in contrast, are useful when the population is less clearly defined. Case–control studies, oriented from disease to exposure, identify persons with disease and a comparable group of persons without disease (controls). Then the exposure experiences of the two groups are compared. Case–case studies are similar to case–control studies, except that controls have an illness not linked to the outbreak. Case–control studies are probably the type most often appropriate for field investigations. Although conceptually straightforward, the design of an effective epidemiologic study requires many careful decisions. Taking the time needed to develop good hypotheses can result in a questionnaire that is useful for identifying risk factors. The choice of an appropriate comparison group, how many controls per case-patient to enroll, whether to match, and how best to avoid potential biases are all crucial decisions for a successful study.

This chapter relies heavily on the work of Richard C. Dicker, who authored this chapter in the previous edition.

  1. Gupta N, Hocevar SN, Moulton-Meissner HA, et al. Outbreak of Serratia marcescens bloodstream infections in patients receiving parenteral nutrition prepared by a compounding pharmacy. Clin Infect Dis. 2014;59:1–8.
  2. Angelo K, Conrad A, Saupe A, et al. Multistate outbreak of Listeria monocytogenes infections linked to whole apples used in commercially produced, prepackaged caramel apples: United States, 2014–2015. Epidemiol Infect. 2017;145:848–56.
  3. Neil KP, Biggerstaff G, MacDonald JK, et al. A novel vehicle for transmission of Escherichia coli O157:H7 to humans: multistate outbreak of E. coli O157:H7 infections associated with consumption of ready-to-bake commercial prepackaged cookie dough—United States, 2009. Clin Infect Dis. 2012;54:511–8.
  4. Vasquez AM, Lake J, Ngai S, et al. Notes from the field: fungal bloodstream infections associated with a compounded intravenous medication at an outpatient oncology clinic—New York City, 2016. MMWR. 2016;65:1274–5.
  5. Gottlieb SL, Newbern EC, Griffin PM, et al. Multistate outbreak of listeriosis linked to turkey deli meat and subsequent changes in US regulatory policy. Clin Infect Dis. 2006;42:29–36.
  6. Framingham Heart Study: A Project of the National Heart, Lung, and Blood Institute and Boston University. Framingham, MA: Framingham Heart Study; 2017. https://www.framinghamheartstudy.org/
  7. Jordan HT, Brackbill RM, Cone JE, et al. Mortality among survivors of the Sept 11, 2001, World Trade Center disaster: results from the World Trade Center Health Registry cohort. Lancet. 2011;378:879–87.
  8. McCarthy N, Giesecke J. Case–case comparisons to study causation of common infectious diseases. Int J Epidemiol. 1999;28:764–8.
  9. Rothman KJ, Greenland S. Modern epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins; 2008.
  10. Wacholder S, McLaughlin JK, Silverman DT, Mandel JS. Selection of controls in case–control studies. I. Principles. Am J Epidemiol. 1992;135:1019–28.
  11. Chintapalli S, Goodman M, Allen M, et al. Assessment of a commercial searchable population directory as a means of selecting controls for case–control studies. Public Health Rep. 2009;124:378–83.
  12. Centers for Disease Control and Prevention. Epidemiologic case studies: typhoid in Tajikistan. http://www.cdc.gov/epicasestudies/classroom_typhoid.html
  13. Dunkle SE, Mba-Jonas A, Loharikar A, Fouche B, Peck M, Ayers T. Epidemic cholera in a crowded urban environment, Port-au-Prince, Haiti. Emerg Infect Dis. 2011;17:2143–6.
  14. Centers for Disease Control and Prevention. Foodborne Diseases Active Surveillance Network (FoodNet): population survey. http://www.cdc.gov/foodnet/surveys/population.html
  15. Pearce N. Analysis of matched case–control studies. BMJ. 2016;352:1969.
  16. Centers for Disease Control and Prevention. Case studies in applied epidemiology: Oswego: an outbreak of gastrointestinal illness following a church supper. http://www.cdc.gov/eis/casestudies.html
  17. McCarty CL, Angelo K, Beer KD, et al. Notes from the field: large outbreak of botulism associated with a church potluck meal—Ohio, 2015. MMWR. 2015;64:802–3.
  18. Tostmann A, Bousema JT, Oliver I. Investigation of outbreaks complicated by universal exposure. Emerg Infect Dis. 2012;18:1717–22.

Indian J Dermatol, v.61(2); Mar-Apr 2016

Methodology Series Module 2: Case-control Studies

Maninder Singh Setia

Epidemiologist, MGM Institute of Health Sciences, Navi Mumbai, Maharashtra, India

The case-control study is a type of observational study. In this design, participants are selected for the study based on their outcome status. Thus, some participants have the outcome of interest (referred to as cases), whereas others do not have the outcome of interest (referred to as controls). The investigator then assesses the exposure in both these groups. The investigator should define the cases as specifically as possible. Sometimes, the definition of a disease may be based on multiple criteria; thus, all these points should be explicitly stated in the case definition. An important aspect of selecting controls is that they should be from the same ‘study base’ as the cases. We can select controls from a variety of groups, including the general population, relatives or friends, and hospital patients. Matching is often used in case-control studies to ensure that the cases and controls are similar in certain characteristics, and it is a useful technique to increase the efficiency of the study. Case-control studies can usually be conducted relatively quickly and inexpensively, particularly when compared with prospective cohort studies. The design is useful for studying rare outcomes and outcomes with long latent periods, but it is not very useful for studying rare exposures. Furthermore, case-control studies may be prone to certain biases, notably selection bias and recall bias.

Introduction

Case-Control study design is a type of observational study design. In an observational study, the investigator does not alter the exposure status. The investigator measures the exposure and outcome in study participants, and studies their association.

In a case-control study, participants are selected for the study based on their outcome status. Thus, some participants have the outcome of interest (referred to as cases), whereas others do not have the outcome of interest (referred to as controls). The investigator then assesses the exposure in both these groups. Thus, by design, in a case-control study the outcome has to occur in some of the participants that have been included in the study.

As seen in Figure 1, at the time of entry into the study (sampling of participants), some of the study participants have the outcome (cases) and others do not have the outcome (controls). During the study procedures, we will examine the exposure of interest in cases as well as controls. We will then study the association between the exposure and outcome in these study participants.

Figure 1. Example of a case-control study

Examples of Case-Control Studies

Smoking and lung cancer study.

In their landmark study, Doll and Hill (1950) evaluated the association between smoking and lung cancer. They included 709 patients with lung carcinoma (defined as cases). They also included 709 controls from general medical and surgical patients. The selected controls were similar to the cases with respect to age and sex. Thus, they included 649 males and 60 females among cases as well as controls.

They found that only 0.3% of male cases were non-smokers, whereas the proportion of non-smokers among male controls was 4.2%; the difference was statistically significant (P = 0.00000064). Similarly, about 31.7% of female cases were non-smokers compared with 53.3% of female controls; this difference was also statistically significant (0.01 < P < 0.02).

Melanoma and tanning (Lazovich et al., 2010)

The authors conducted a case-control study to examine the association between melanoma and tanning. The 1167 cases (individuals with invasive cutaneous melanoma) were selected from the Minnesota Cancer Surveillance System. The 1101 controls were selected randomly from the Minnesota State Driver's License list; they were matched for age (+/- 5 years) and sex.

The data were collected by self-administered questionnaires and telephone interviews. The investigators assessed the use of tanning devices (using photographs), the number of years of use, and the frequency of use of these devices. They also collected information on other variables (such as sun exposure; presence of freckles and moles; and colour of skin and hair, among other exposures).

They found that the risk of melanoma was higher in individuals who had used UVB-enhanced and primarily UVA-emitting devices. The risk of melanoma also increased with increasing years of use, hours of use, and number of sessions.

Risk factors for erysipelas (Pitché et al., 2015)

Pitché et al. (2015) conducted a case-control study to assess the factors associated with leg erysipelas in sub-Saharan Africa. This was a multi-centre study; the cases and controls were recruited from eight countries in sub-Saharan Africa.

They recruited cases of acute leg cellulitis in these eight countries. They recruited two controls for each case; these were matched for age (+/- 5 years) and sex. Thus, the final study had 364 cases and 728 controls. They found that leg erysipelas was associated with obesity, lymphoedema, neglected traumatic wound, toe-web intertrigo, and voluntary cosmetic depigmentation.

We have provided details of all three studies in the bibliography. We strongly encourage readers to read the papers to understand some practical aspects of case-control studies.

Selection of Cases and Controls

Selection of cases and controls is an important part of this design. Wacholder and colleagues (1992 a, b, and c) have published excellent manuscripts on the design and conduct of case-control studies in the American Journal of Epidemiology. The discussion in the next few sections is based on these manuscripts.

Selection of case

The investigator should define the cases as specifically as possible. Sometimes, definition of a disease may be based on multiple criteria; thus, all these points should be explicitly stated in case definition.

For example, in the above-mentioned melanoma and tanning study, the researchers defined cases as individuals with any histologic variety of invasive cutaneous melanoma. However, they added another important criterion: these individuals should have a driver's license or State identity card. This is probably not directly related to the clinical condition, so why did they add this criterion? We will discuss this in detail in the next few paragraphs.

Selection of a control

The next important point in designing a case-control study is the selection of control patients.

In fact, Wacholder and colleagues have extensively discussed aspects of design of case control studies and selection of controls in their article.

According to them, an important aspect of selecting controls is that they should be from the same ‘study base’ as the cases. Thus, the population pool from which the cases and controls are enrolled should be the same. For instance, in the tanning and melanoma study, the researchers recruited cases from the Minnesota Cancer Surveillance System; however, it was also required that these cases should have either a State identity card or a driver's license. This was important because controls were randomly selected from the Minnesota State Driver's License list (which also included individuals who have the State identity card).

Another important aspect of a case-control study is that we should measure the exposure similarly in cases and controls. For instance, if we design a research protocol to study the association between metabolic syndrome (exposure) and psoriasis (outcome), we should ensure that we use the same criteria (clinically and biochemically) for evaluating metabolic syndrome in cases and controls. If we use different criteria to measure the metabolic syndrome, then it may cause information bias.

Types of Controls

We can select controls from a variety of groups. Some of them are: General population; relatives or friends; or hospital patients.

Hospital controls

An important source of controls is patients attending the hospital for diseases other than the outcome of interest. These controls are easy to recruit and are more likely to have similar quality of medical records.

However, we have to be careful while recruiting these controls. In the above example of metabolic syndrome and psoriasis, we recruit psoriasis patients from the Dermatology department of the hospital as cases. We recruit patients who do not have psoriasis and present to the Dermatology department as controls. Some of these individuals have presented to the Dermatology department with tinea pedis. Do we recruit these individuals as controls for the study? What is the problem if we recruit these patients? Some studies have suggested that diabetes mellitus and obesity are predisposing factors for tinea pedis. As we know, fasting plasma glucose of >100 mg/dl and raised triglycerides (≥150 mg/dl) are criteria for diagnosis of metabolic syndrome. Thus, it is quite likely that if we recruit many of these tinea pedis patients, the exposure of interest may turn out to be similar in cases and controls; this exposure distribution may not reflect the truth in the population.

Relative and friend controls

Relative controls are relatively easy to recruit. They can be particularly useful when we are interested in trying to ensure that some of the measurable and non-measurable confounders are relatively equally distributed in cases and controls (such as home environment, socio-economic status, or genetic factors).

Another source of controls is a list of friends referred by the cases. These controls are easy to recruit and they are also more likely to be similar to the cases in socio-economic status and other demographic factors. However, they are also more likely to have similar behaviours (alcohol use, smoking etc.); thus, it may not be prudent to use these as controls if we want to study the effect of these exposures on the outcome.

Population controls

These controls can be recruited easily if a list of all individuals in the population is available, for example, a list of state identity card holders or a voters' registration list. In the tanning and melanoma study, the researchers used population controls, who were identified from the Minnesota state driver's license list.

We may have to use sampling methods (such as random digit dialing or multistage sampling methods) to recruit controls from the population. A main advantage is that these controls are likely to satisfy the ‘study-base’ principle (described above) as suggested by Wacholder and colleagues. However, they can be expensive and time consuming. Furthermore, many of these controls will not be inclined to participate in the study; thus, the response rate may be very low.

Matching in a Case-Control Study

Matching is often used in case-control studies to ensure that the cases and controls are similar in certain characteristics. For example, in the smoking and lung cancer study, the authors selected controls that were similar in age and sex to the carcinoma cases. Matching is a useful technique to increase the efficiency of a study.

‘Individual matching’ is one common technique used in case-control studies. For example, in the above-mentioned metabolic syndrome and psoriasis study, we can decide that for each case enrolled in the study, we will enroll a control matched for sex and age (+/- 2 years). Thus, if a 40-year-old male patient with psoriasis is enrolled as a case, we will enroll a 38- to 42-year-old male patient without psoriasis (and who is not excluded for any other reason) as a control.
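The individual matching rule in this example (same sex, age within 2 years) can be sketched in Python. This is only an illustration under simple assumptions: the `Person` record, the identifiers, and the first-come greedy assignment are hypothetical and are not a procedure described in the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    person_id: str
    sex: str
    age: int


def match_controls(cases, candidates, age_tolerance=2):
    """Assign each case the first unused candidate of the same sex whose
    age is within the tolerance. Returns {case_id: control_id}."""
    available = list(candidates)
    pairs = {}
    for case in cases:
        for candidate in available:
            same_sex = candidate.sex == case.sex
            close_age = abs(candidate.age - case.age) <= age_tolerance
            if same_sex and close_age:
                pairs[case.person_id] = candidate.person_id
                available.remove(candidate)  # each control is used only once
                break
    return pairs


# Hypothetical data: a 40-year-old male case and three potential controls.
cases = [Person("case-1", "M", 40)]
candidates = [
    Person("ctl-1", "F", 40),  # wrong sex
    Person("ctl-2", "M", 43),  # outside the +/- 2 year window
    Person("ctl-3", "M", 38),  # eligible
]
print(match_controls(cases, candidates))
```

A case with no eligible candidate is simply absent from the returned mapping, which mirrors the practical difficulty that some cases may need many candidates to be screened before a match is found.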

If the study has used ‘individual matching’ procedures, then the data should also reflect the same. For instance, if you have 45 males among cases, you should also have 45 males among controls. If you show 60 males among controls, you should explain the discrepancy.

Even though matching is used to increase efficiency in case-control studies, it has its own problems. It may be difficult to find an exactly matching control; we may have to screen many potential enrollees before we are able to recruit one control for each case recruited. Thus, it may increase the time and cost of the study.

Nonetheless, matching may be useful to control for certain types of confounders. For instance, environment variables may be accounted for by matching controls for neighbourhood or area of residence. Household environment and genetic factors may be accounted for by enrolling siblings as controls.

If we use controls from the past (a time period when cases did not occur), then the controls are sometimes referred to as historical controls. Such controls may be recruited from past hospital records.

Strengths of a Case-Control Study

  • Case-control studies can usually be conducted relatively quickly and inexpensively, particularly when compared with prospective cohort studies
  • The design is useful for studying rare outcomes and outcomes with long latent periods. For example, if we wish to study the factors associated with melanoma in India, it will be useful to conduct a case-control study. We will recruit patients with melanoma as cases at one or multiple study sites. If we were to conduct a cohort study for this research question, we may have to follow individuals (with the exposure under study) for many years before the occurrence of the outcome
  • The design is also useful for studying multiple exposures for the same outcome. For example, in the metabolic syndrome and psoriasis study, we can study other factors such as Vitamin D levels or genetic markers
  • Case-control studies are useful for studying the association between risk factors and outcomes in outbreak investigations. For instance, Freeman and colleagues (2015) conducted a case-control study to evaluate the role of proton pump inhibitors in an outbreak of non-typhoidal salmonellosis.

Limitations of a Case-control Study

  • The design, in general, is not useful to study rare exposures. It may be prudent to conduct a cohort study for rare exposures

Since the investigator chooses the number of cases and controls, the proportion of cases in the study may not be representative of the proportion in the population. For instance, if we choose 50 cases of psoriasis and 50 controls, the proportion of psoriasis cases in our study will be 50%. This is not the true prevalence. If we had chosen 50 cases of psoriasis and 100 controls, the proportion of cases would be 33%.

  • The design is not useful to study multiple outcomes. Since the cases are selected based on the outcome, we can only study the association between exposures and that particular outcome
  • Sometimes the temporality of the exposure and outcome may not be clearly established in case-control studies
  • The case-control studies are also prone to certain biases

If the cases and controls are not selected similarly from the study base, then it will lead to selection bias.

  • Odds Ratio: We are able to calculate the odds ratio (OR) from a case-control study. Since we are not able to measure incidence in a case-control study, the odds ratio is a reasonable estimate of the relative risk (under some assumptions). Additional details about the OR will be discussed in the biostatistics section.

In the hypothetical study of metabolic syndrome and psoriasis, the OR is 3.5. Since the OR is greater than 1, the outcome is more likely in those exposed (those who are diagnosed with metabolic syndrome) compared with those who are not exposed (those who are not diagnosed with metabolic syndrome). However, we will require confidence intervals to interpret the OR further (this will be discussed in detail in the biostatistics section).

  • Other analyses: We can use logistic regression models for multivariate analysis in case-control studies. It is important to note that conditional logistic regression may be useful for matched case-control studies.

Calculating an Odds Ratio (OR)

[Figure omitted: IJD-61-146-g002.jpg]

Hypothetical study of metabolic syndrome and psoriasis

[Figure omitted: IJD-61-146-g003.jpg]
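Since the figures are not reproduced here, the calculation can be sketched in code. The following is a minimal sketch, not the article's own table: the counts (30 exposed cases, 15 exposed controls, 20 unexposed cases, 35 unexposed controls) are hypothetical values chosen only so that the OR works out to 3.5, and the function name `odds_ratio` is our own. The confidence interval uses the standard log-based (Woolf) approximation.

```python
import math

def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table.

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    All four counts must be greater than zero.
    """
    or_ = (a * d) / (b * c)                      # cross-product ratio
    se_log = math.sqrt(1/a + 1/b + 1/c + 1/d)    # standard error of ln(OR)
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper

# Hypothetical counts (an assumption, not taken from the article):
# 50 psoriasis cases and 50 controls, exposure = metabolic syndrome.
or_, lower, upper = odds_ratio(a=30, b=15, c=20, d=35)
print(f"OR = {or_:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

If the interval excludes 1, as it does for these hypothetical counts, the association is unlikely to be explained by chance alone; this is why the OR should be reported with its confidence interval.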

Additional Points in A Case-Control Study

How many controls can I have for each case?

The optimal case-to-control ratio is 1:1. Jewell (2004) has suggested that, for a fixed sample size, the chi-square test for independence is most powerful when the number of cases equals the number of controls. However, in many situations we may not be able to recruit a large number of cases, and it may be easier to recruit more controls. When the number of cases is limited, increasing the number of controls can increase the statistical power of the study. If data are available at no extra cost, we may recruit multiple controls for each case. However, if it is expensive to collect exposure and outcome information from cases and controls, the optimal ratio is four controls to one case. It has been argued that, beyond this ratio, the gain in statistical power from additional controls is small relative to the cost of recruiting them.
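Both points can be illustrated with a rough calculation. This is a sketch under a simplifying assumption that is ours, not the article's: under the null hypothesis, the variance of the comparison between cases and controls is taken to be proportional to 1/(number of cases) + 1/(number of controls). The function names are hypothetical.

```python
def variance_factor(n_cases, n_controls):
    """Quantity proportional to the variance of the case-control
    comparison under the null (smaller means more power)."""
    return 1.0 / n_cases + 1.0 / n_controls

def relative_efficiency(controls_per_case):
    """Efficiency of m controls per case relative to an unlimited
    number of controls: (1/n) / (1/n + 1/(m*n)) = m / (m + 1)."""
    m = controls_per_case
    return m / (m + 1.0)

# Fixed total of 200 participants: an equal split gives the smallest variance.
print(variance_factor(100, 100), variance_factor(40, 160))

# Fixed number of cases: each extra control per case adds less than the last.
for m in (1, 2, 3, 4, 5, 10):
    print(m, round(relative_efficiency(m), 3))
```

Under this approximation, four controls per case already achieve 80% of the efficiency of an unlimited control group, and a fifth control adds only about three percentage points.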

I have conducted a randomised controlled trial. I have included a group which received the intervention and another group which did not receive the intervention. Can I call this a case-control study?

A randomised controlled trial is an experimental study. In contrast, case-control studies are observational studies. These are two different groups of studies. One should not use the term case-control study for a randomised controlled trial (even though you have a control group in the study). Not every study with a control group is a case-control study. For a study to be classified as a case-control study, it should be an observational study and the participants should be recruited based on their outcome status (some have the disease and some do not).

Should I call case-control studies prospective or retrospective studies?

In ‘The Dictionary of Epidemiology’ by Porta (2014), the authors have suggested that even though the term ‘retrospective’ was used for case-control studies, the study participants are often recruited prospectively. In fact, the study on risk factors for erysipelas (Pitché et al., 2015) was a prospective case-control study. Thus, it is important to remember that the nature of the study (case-control or cohort) depends on the sampling method. If we sample the study participants based on exposure and move towards the outcome, it is a cohort study. However, if we sample the participants based on the outcome (some with the outcome and some without) and study the exposures in both these groups, it is a case-control study.

In case-control studies, participants are recruited on the basis of disease status. Thus, some participants have the outcome of interest (referred to as cases), whereas others do not have the outcome of interest (referred to as controls). The investigator then assesses the exposure in both these groups. Case-control studies are less expensive and quicker to conduct (compared with prospective cohort studies at least). The measure of association in this type of study is an odds ratio. This type of design is useful for rare outcomes and those with long latent periods. However, it may also be prone to certain biases, namely selection bias and recall bias.

Financial support and sponsorship

Conflicts of interest.

There are no conflicts of interest.


Rectal cancer and occupational risk factors: a hypothesis-generating, exposure-based case-control study

Affiliation.

  • 1 INRS-Institut Armand-Frappier, Laval, Quebec, Canada.
  • PMID: 10956400
  • DOI: 10.1002/1097-0215(20000915)87:6<874::aid-ijc18>3.0.co;2-l

In 1979, a hypothesis-generating, population-based case-control study was undertaken in Montreal, Canada, to explore the association between occupational exposure to 294 substances, 130 occupations and industries, and various cancers. Interviews were carried out with 3,630 histologically confirmed cancer cases, of whom 257 had rectal cancer, and with 533 population controls, to obtain detailed job history and data on potential confounders. The job history of each subject was evaluated by a team of chemists and hygienists and translated into occupational exposures. Logistic regression analyses adjusted for age, education, cigarette smoking, beer consumption, body mass index, and respondent status were performed using population controls and cancer controls, e.g., 1,295 subjects with cancers at sites other than the rectum, lung, colon, rectosigmoid junction, small intestine, and peritoneum. We present here the results based on cancer controls. The following substances showed some association with rectal cancer: rubber dust, rubber pyrolysis products, cotton dust, wool fibers, rayon fibers, a group of solvents (carbon tetrachloride, methylene chloride, trichloroethylene, acetone, aliphatic ketones, aliphatic esters, toluene, styrene), polychloroprene, glass fibers, formaldehyde, extenders, and ionizing radiation. The independent effect of many of these substances could not be disentangled as many were highly correlated with each other.

Copyright 2000 Wiley-Liss, Inc.

Publication types

  • Research Support, Non-U.S. Gov't
  • Case-Control Studies
  • Dust / adverse effects
  • Middle Aged
  • Occupational Diseases / etiology*
  • Occupational Exposure / adverse effects*
  • Occupations
  • Rectal Neoplasms / etiology*
  • Regression Analysis
