

How to write a systematic review or meta-analysis protocol


Julien Al Shakarchi, How to write a systematic review or meta-analysis protocol, Journal of Surgical Protocols and Research Methodologies , Volume 2022, Issue 3, July 2022, snac015, https://doi.org/10.1093/jsprm/snac015


A protocol is an important document that specifies the research plan for a systematic review or meta-analysis. In this paper, we explain a simple and clear approach to writing a research study protocol for a systematic review or meta-analysis.

A study protocol is an essential part of any research project. It sets out in detail the research methodology to be used for the systematic review or meta-analysis, and it helps the research team stay focused on the question the study is meant to answer. PROSPERO, from the Centre for Reviews and Dissemination at the University of York, is an international prospective register of systematic reviews, and authors should consider registering their research there to reduce the potential for duplication of work. In this paper, we explain how to write a research protocol by describing what needs to be included.


This section sets out the need for the planned research and the context of the current evidence. It should be supported by an extensive background to the topic, with appropriate references to the literature, followed by a brief description of the condition and the target population. A clear explanation of the rationale and objective of the project is also expected, to justify the need for the study.

Methods and analysis

The protocol must describe a detailed search strategy. It should set out which databases are to be searched, the specific keywords to be used and the publication timeframe. The inclusion and exclusion criteria should be described for the type of studies, participants and interventions. The population, intervention, comparator and outcome (PICO) framework is a useful tool for this section.
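As an illustration of how a search strategy might be recorded in a protocol, the following is a minimal sketch that stores search terms by PICO element and derives a Boolean query from them. The class name `SearchStrategy`, its fields and all example terms are hypothetical and are not taken from any real protocol; individual databases use their own query syntax, so the output would need adapting.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class SearchStrategy:
    """Pre-specified search plan organised by PICO (all values illustrative)."""
    population: List[str]
    intervention: List[str]
    comparator: List[str] = field(default_factory=list)
    outcome: List[str] = field(default_factory=list)
    databases: List[str] = field(default_factory=list)
    published_from: Optional[date] = None
    published_to: Optional[date] = None

    def boolean_query(self) -> str:
        """OR the synonyms within each PICO element, then AND the elements."""
        clauses = []
        for terms in (self.population, self.intervention,
                      self.comparator, self.outcome):
            if terms:  # skip PICO elements that have no search terms
                clauses.append("(" + " OR ".join(f'"{t}"' for t in terms) + ")")
        return " AND ".join(clauses)


# Hypothetical example values, not taken from any real protocol.
strategy = SearchStrategy(
    population=["preterm infant", "premature infant"],
    intervention=["oxygen saturation target"],
    outcome=["mortality"],
    databases=["PubMed", "Scopus"],
    published_from=date(2000, 1, 1),
)
query = strategy.boolean_query()
print(query)
```

Writing the strategy down in a structured form like this makes it easier to report the same terms, databases and dates in the final review.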

The methodology of the data extraction should be detailed in this section, including how many reviewers will be involved and how any disagreement will be resolved. The methodology to be used for quality and bias assessment of included studies should also be described. Data analysis, including statistical methodology, needs to be established clearly in this section of the protocol. Finally, details of any planned subgroup analyses should be included.
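One way to make the disagreement process concrete is to compare the extraction forms of two independent reviewers and list every field on which they differ, so that those items can be passed to whatever resolution process the protocol specifies (for example, discussion or a third reviewer). The sketch below assumes each reviewer's extraction is a plain dictionary keyed by a study identifier; the function name, field names and data are hypothetical.

```python
def find_disagreements(reviewer_a: dict, reviewer_b: dict) -> dict:
    """Return {study_id: {field: (value_a, value_b)}} for every mismatch."""
    disagreements = {}
    for study_id in sorted(set(reviewer_a) | set(reviewer_b)):
        a = reviewer_a.get(study_id, {})
        b = reviewer_b.get(study_id, {})
        # A field counts as a disagreement if the two values differ,
        # including the case where only one reviewer recorded it.
        diffs = {
            name: (a.get(name), b.get(name))
            for name in sorted(set(a) | set(b))
            if a.get(name) != b.get(name)
        }
        if diffs:
            disagreements[study_id] = diffs
    return disagreements


# Hypothetical extraction data from two reviewers.
reviewer_a = {
    "study_01": {"sample_size": 120, "design": "RCT"},
    "study_02": {"sample_size": 85, "design": "cohort"},
}
reviewer_b = {
    "study_01": {"sample_size": 120, "design": "RCT"},
    "study_02": {"sample_size": 58, "design": "cohort"},
}
to_resolve = find_disagreements(reviewer_a, reviewer_b)
print(to_resolve)
```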

Ethics and dissemination

Any competing interests of the researchers should also be stated in this section. Authorship of any publication should follow clear and fair criteria, which should be described in this section of the protocol. Doing so will help resolve any issues arising at the publication stage.

Funding statement

It is important to explain who the sponsors and funders of the study are, and to state clearly the involvement and potential influence of any party. The protocol should explicitly outline the roles and responsibilities of any funder(s) in study design, data analysis and interpretation, manuscript writing and dissemination of results.

A protocol is an important document that specifies the research plan for a systematic review or meta-analysis. It should be written in detail and researchers should aim to publish their study protocols. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement provides a useful checklist on what should be included in a systematic review [1]. In this paper, we have explained a simple and clear approach to writing a research study protocol for a systematic review or meta-analysis.

Conflict of interest statement

None declared.

1. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 2021;372:n71.


How to Conduct a Meta-Analysis for Research


  • 6-minute read
  • 19th January 2024

Are you considering conducting a meta-analysis for your research paper? When applied to the right problem, meta-analyses can be useful. In this post, we will discuss what a meta-analysis is, when it’s the appropriate method to use, and how to perform one. Let’s jump in!

What is a Meta-Analysis?

Meta-analysis is a statistical technique that allows researchers to combine findings from multiple individual studies to reach more reliable and generalizable conclusions. It provides a systematic and objective way of synthesizing the results of different studies on a particular topic. There are several benefits of meta-analyses in academic research:

  • Synthesizing diverse evidence: Meta-analysis allows researchers to synthesize evidence from diverse studies, providing a more comprehensive understanding of a research question.
  • Statistical power enhancement: By pooling data from multiple studies, meta-analysis increases statistical power, enabling researchers to detect effects that may be missed in individual studies with smaller sample sizes.
  • Precision and reliability: Meta-analysis offers a more precise estimate of the true effect size, enhancing the reliability and precision of research findings.

When Should I Conduct a Meta-Analysis?

  Although some similarities exist between meta-analyses, literature reviews, and systematic reviews, these methods are distinct, and they serve different purposes. Here’s a breakdown of when to use each.


Meta-Analysis

Meta-analysis is a statistical method that combines and analyzes quantitative data from multiple independent studies to provide an overall estimate of an effect. You should conduct a meta-analysis:

  • When you want to quantitatively synthesize the results of multiple studies that have measured similar outcomes
  • When there is a sufficient number of studies with compatible data and statistical methods
  • When you’re interested in obtaining a more precise and generalizable estimate of an effect size

Systematic Review

A systematic review is a comprehensive, structured review of existing literature that follows a predefined protocol to identify, select, and critically appraise relevant research studies. You should perform a systematic review: 

  • When you want to provide a comprehensive overview of the existing evidence on a particular research question
  • When you need to assess the quality of available studies and identify gaps or limitations in the literature
  • When a quantitative synthesis (meta-analysis) is not feasible due to variability in study designs or outcomes

Literature Review

A literature review is a broader examination and narrative summary of existing research that may not follow the strict methodology of a systematic review. You should utilize a literature review: 

  • When you want to familiarize yourself with the existing research on a topic without the rigorous methodology required for a systematic review 
  • When you’re exploring a new research area and want to understand the key concepts, theories, and findings 
  • When a more narrative and qualitative synthesis of the literature is sufficient for your purpose

The nature of your research question and the available evidence will guide your choice. If you’re interested in a quantitative summary of results, a meta-analysis might be appropriate. For a comprehensive overview, you could use a systematic review. In many cases, researchers use a combination of these methods. For instance, a systematic review may precede a meta-analysis to identify and evaluate relevant studies before their results are pooled quantitatively. Always consider the specific goals of your research and the nature of the available evidence when deciding which type of data analysis to employ.

Steps to Perform a Meta-Analysis

If you’ve decided that a meta-analysis is the best approach for your research, follow the steps below to guide you through the process.

1. Define your research question and objective.

Clearly define the research objective of your meta-analysis. Doing this will help you narrow down your search and establish inclusion and exclusion criteria for selecting studies.

2. Conduct a comprehensive literature search.

Thoroughly search electronic databases, such as PubMed, Google Scholar, or Scopus, to identify all relevant studies on your research question. Use a combination of keywords, subject heading terms, and search strategies to ensure a comprehensive search.

3. Screen and select studies.

Carefully read the titles and abstracts of the identified studies to determine their relevance to your research question. Exclude studies that do not meet your inclusion criteria. Obtain the full text of potentially relevant studies and assess their eligibility based on predefined criteria.

4. Extract data from selected studies.

Develop a standardized data extraction form to record relevant information from each selected study. Extract data such as study characteristics, sample size, outcomes, and statistical measures. Doing this ensures consistency and reliability in data extraction.


5. Evaluate the study quality and biases.

Assess the quality and risk of bias in each study using established tools, such as the Cochrane Collaboration’s risk-of-bias tool. Consider factors such as study design, sample size, randomization, blinding, and the handling of missing data. This step helps identify potential sources of bias in the included studies.

6. Perform a statistical analysis.

Choose appropriate statistical methods to combine the results from the selected studies. Commonly used measures include odds ratios, risk ratios, and mean differences. Calculate the effect sizes and their associated confidence intervals. You might consider using statistical software to help you with this step.
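To make this step concrete, here is a minimal sketch of one of the measures named above: an odds ratio with a 95% confidence interval computed from a single study's 2×2 table. It uses the standard log odds ratio and its large-sample standard error, with a 0.5 continuity correction when any cell is zero. The function names and the example counts are hypothetical, and in practice you would more likely rely on dedicated statistical software.

```python
import math


def log_odds_ratio(events_t: int, n_t: int, events_c: int, n_c: int):
    """Return (log odds ratio, standard error) for one study's 2x2 table."""
    a, b = events_t, n_t - events_t   # treatment: events, non-events
    c, d = events_c, n_c - events_c   # control: events, non-events
    if 0 in (a, b, c, d):
        # Continuity correction: add 0.5 to every cell if any cell is zero.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    std_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, std_error


def odds_ratio_ci(events_t: int, n_t: int, events_c: int, n_c: int,
                  z: float = 1.96):
    """Return (odds ratio, lower, upper) using a normal approximation."""
    log_or, std_error = log_odds_ratio(events_t, n_t, events_c, n_c)
    return (math.exp(log_or),
            math.exp(log_or - z * std_error),
            math.exp(log_or + z * std_error))


# Hypothetical study: 10/100 events on treatment, 20/100 on control.
or_est, lower, upper = odds_ratio_ci(10, 100, 20, 100)
print(f"OR = {or_est:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Working on the log scale matters because the log odds ratio is approximately normally distributed, which is what makes the symmetric interval on that scale reasonable.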

7. Assess heterogeneity.

Assess the heterogeneity of the included studies to determine whether the results can be pooled. Use statistical tests, such as Cochran’s Q test or the I² statistic, to quantify the degree of heterogeneity.
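Here is a minimal sketch of how Cochran’s Q and I² can be computed from per-study effect estimates and their standard errors. It assumes an inverse-variance, fixed-effect pooled estimate as the reference point, and it defines I² as the share of Q in excess of its degrees of freedom, floored at zero. The function name and the example numbers are hypothetical.

```python
def heterogeneity(effects, std_errors):
    """Return (pooled effect, Cochran's Q, I-squared in percent).

    effects: per-study estimates on an additive scale (e.g. log odds ratios)
    std_errors: the matching standard errors
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching inputs")
    # Inverse-variance weights: more precise studies count for more.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: proportion of Q beyond what chance alone would produce.
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i_squared


# Hypothetical log odds ratios and standard errors from two studies.
pooled, q, i_squared = heterogeneity([0.1, 0.5], [0.1, 0.1])
print(f"pooled = {pooled:.2f}, Q = {q:.2f}, I2 = {i_squared:.1f}%")
```

To turn Q into a p-value, you would compare it against a chi-squared distribution with one fewer degree of freedom than the number of studies.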

8. Interpret and report the results.

Interpret the pooled effect size and its confidence interval in light of the research question. Provide a clear summary of the findings, including any limitations or caveats. Use forest plots or other graphical tools to present the results visually. Make sure to adhere to reporting guidelines, such as the PRISMA Statement.

9. Assess the publication bias.

Publication bias occurs when studies with positive results are more likely to be published, leading to an overestimation of the effect size. Assess the publication bias using methods such as funnel plots, Egger’s test, or the Begg and Mazumdar test. Consider exploring potential publication bias through a sensitivity analysis.
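As a rough illustration of Egger’s test, the sketch below regresses each study’s standardized effect (effect divided by its standard error) on its precision (one divided by the standard error) using ordinary least squares. An intercept far from zero suggests funnel plot asymmetry. The function name and the data are hypothetical, and the sketch stops at the t statistic: to obtain a p-value you would compare it against a t distribution with n − 2 degrees of freedom using statistical software.

```python
import math


def egger_regression(effects, std_errors):
    """Return (intercept, standard error of intercept, t statistic)."""
    n = len(effects)
    if n != len(std_errors) or n < 3:
        raise ValueError("need at least three studies with matching inputs")
    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("all studies have the same standard error")
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Residual variance with n - 2 degrees of freedom.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = sse / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + x_bar ** 2 / s_xx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat


# Hypothetical effects and standard errors from four studies.
intercept, se_intercept, t_stat = egger_regression(
    [0.9, 0.45, 0.475, 0.38], [1.0, 0.5, 0.25, 0.2]
)
print(f"intercept = {intercept:.3f}, t = {t_stat:.2f} (df = 2)")
```

With only a handful of studies, a test like this has little power, so it is best read alongside the funnel plot rather than on its own.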

10. Discuss the implications and limitations.

Finally, discuss the implications of the meta-analysis findings in the context of the existing literature. Identify any limitations or potential biases that may affect the validity of the results. You might also highlight areas for further research or recommendations for practice.

There you have it! Now that we’ve gone over what a meta-analysis is, when to use one in research, and what steps to take to conduct a robust meta-analysis, you’re well prepared to begin your research journey.



A guide to prospective meta-analysis

  • Anna Lene Seidler, research fellow 1
  • Kylie E Hunter, senior project officer 1
  • Saskia Cheyne, senior evidence analyst 1
  • Davina Ghersi, senior principal research scientist, adjunct professor 1, 2
  • Jesse A Berlin, vice president, global head of epidemiology 3
  • Lisa Askie, professor and director of systematic reviews and health technology assessment, manager of the Australian New Zealand Clinical Trials Registry 1
  • 1 NHMRC Clinical Trials Centre, University of Sydney, Locked bag 77, Camperdown NSW 1450, Australia
  • 2 National Health and Medical Research Council, Canberra, Australia
  • 3 Johnson & Johnson, Titusville, NJ, USA
  • Correspondence to: A L Seidler lene.seidler{at}ctc.usyd.edu.au
  • Accepted 8 August 2019

In a prospective meta-analysis (PMA), study selection criteria, hypotheses, and analyses are specified before the results of the studies related to the PMA research question are known, reducing many of the problems associated with a traditional (retrospective) meta-analysis. PMAs have many advantages: they can help reduce research waste and bias, and they are adaptive, efficient, and collaborative. Despite an increase in the number of health research articles labelled as PMAs, the methodology remains rare, novel, and often misunderstood. This paper provides detailed guidance on how to address the key elements for conducting a high quality PMA with a case study to illustrate each step.

Summary points

In a prospective meta-analysis (PMA), studies are identified and determined to be eligible for inclusion before the results of the studies related to the PMA research question are known

PMAs are applicable to high priority research questions where limited previous evidence exists and where new studies are expected to emerge

Compared with standard systematic review and meta-analysis protocols, key adaptations should be made to a PMA protocol, including search methods to identify planned and ongoing studies, details of studies that have already been identified for inclusion, core outcomes to be measured by all studies, collaboration management, and publication policy

A systematic search for planned and ongoing studies should precede a PMA, including a search of clinical trial registries and medical literature databases, and contacting relevant stakeholders in the specialty

PMAs are ideally conducted by a collaboration or consortium, including a central steering and data analysis committee, and representatives from each individual study

Usually PMAs collect individual participant data, but PMAs of aggregate data are also possible. PMAs can include interventional or observational studies

PMAs can enable harmonised collection of core outcomes, which can be particularly useful for rare but important outcomes, such as adverse side effects

Adaptive forms of PRISMA (preferred reporting items for systematic reviews and meta-analyses) and quality assessment approaches such as GRADE (grading of recommendations assessment, development, and evaluation) should be used to report and assess the quality of evidence for a PMA. The development of a standardised set of reporting guidelines and PMA specific evidence rating tools is highly desirable

PMAs can help to reduce research waste and bias, and they are adaptive, efficient, and collaborative

Systematic reviews and meta-analyses of the best available evidence are widely used to inform healthcare policy and practice. 1 2 Yet the retrospective nature of traditional systematic reviews and meta-analyses can be problematic. Positive results are more likely to be reported and published (phenomena known as selective outcome reporting and publication bias), and therefore including only published results in a meta-analysis can produce misleading results 3 and pose a threat to the validity of evidence based medicine. 4 In the planning stage of a traditional meta-analysis, knowledge of individual study results can influence the study selection process as choosing the key components of the review question and eligibility criteria might be based on one or more positive studies. 2 5 Meta-analyses on the same topic can reach conflicting conclusions because of different eligibility criteria. 2 Also, inconsistencies across individual studies in outcome measurement and analyses can make the combination of data difficult. 6

Prospective meta-analyses (PMAs, see box 1) have recently been described as next generation systematic reviews 7 that reduce the problems of traditional retrospective meta-analyses. Ioannidis and others even argue that “all primary original research may be designed, executed, and interpreted as prospective meta-analyses.” 8 9 For PMAs, studies are included prospectively, meaning before any individual study results related to the PMA research question are known. 10 This reduces the risk of publication bias and selective reporting bias and can enable better harmonisation of study outcomes.

Definition of a prospective meta-analysis

The key feature of a prospective meta-analysis (PMA) is that the studies or cohorts are identified as eligible for inclusion in the meta-analysis, and hypotheses and analysis strategies are specified, before the results of the studies or cohorts related to the PMA research question are known

The number of meta-analyses described as PMAs is increasing (fig 1). But the definition, methodology, and reporting of previous PMAs vary greatly, and guidance on how to conduct them is limited, outdated, and inconsistent. 11 12 With recent advancements in computing capabilities, and the ability to identify planned and ongoing studies through increased trial registration, the planning and conduct of PMAs have become more efficient and effective. For PMAs to be successfully implemented in future health research, a revised PMA definition and expanded guidance are required. In this article, we, the Cochrane PMA Methods Group, present a step by step guide on how to perform a PMA. Our aim is to provide up to date guidance on the key principles, rationale, methods, and challenges for each step, to enable more researchers to understand and use this methodology successfully. Figure 2 shows a summary of the steps needed to perform a PMA.

Fig 1

Number of prospective meta-analyses (PMAs) over time. Possible PMA describes studies that seem to fulfil the criteria for a PMA but not enough information was reported to make a definite decision on their status as a PMA. These data are based on a systematic search of the literature (see appendix 1 for methodology)


Fig 2

Steps in conducting a prospective meta-analysis (PMA)

Case study: Neonatal Oxygenation Prospective Meta-analysis (NeOProM)

We will illustrate each step with an example of a PMA of randomised controlled trials conducted by the Neonatal Oxygenation Prospective Meta-analysis (NeOProM) Collaboration. 13 In this PMA, five groups prospectively planned to conduct separate, but similar, trials assessing different target ranges for oxygen saturation in preterm infants, and combine their results on completion. Although no difference was found in the composite primary outcome of death or major disability, a statistically significant reduction in the secondary outcome of death alone was found for the higher oxygen target range, but no change in major disability. This PMA resolved a major debate in neonatology.

Steps for performing a prospective meta-analysis

Step 0: deciding if a PMA is the right methodology

PMA methodology should be considered for a high priority research question for which new studies are expected to emerge and limited previous evidence exists (fig 3):

Priority research question —PMAs should be planned for research questions that are a high priority for healthcare decision makers. Ideally, these questions should be identified using priority setting methods within consumer-clinician collaborations, and/or they should address priorities identified by guideline committees, funding bodies, or clinical and research associations. Often these questions are in areas where important new treatment or prevention strategies have recently emerged, or where practice varies because of insufficient evidence.

New studies expected —PMAs are only feasible if new studies are likely to be included—for example, if the research question is an explicit priority for funding bodies or research associations. Some PMAs have been initiated after researchers learnt they were planning or conducting similar studies, and so they decided to collaborate and prospectively plan to combine their data. In other cases, a research question is posed by a consortium of investigators who then decide to plan similar studies that are combined on completion. A research team planning a PMA can play an active role in motivating other researchers to conduct similar studies addressing the same research question. A PMA can therefore be a catalyst for initiating a programme of priority research to answer important questions. 8 Initiating a PMA rather than conducting a large multicentre study can be advantageous as PMAs allow flexibility for each study to answer additional local questions, and the studies can be funded independently, which circumvents the problem of funding a mega study.

Insufficient previous evidence —A PMA should only be conducted if insufficient evidence exists to answer the research question. If sufficient evidence is available (eg, based on a retrospective meta-analysis), no further studies and no PMA should be planned, to avoid research waste.

Fig 3

When to conduct a prospective meta-analysis (PMA)

If evidence is available, but is insufficient for clinical decision making, a nested PMA should be considered. A nested PMA integrates prospective evidence into a retrospective meta-analysis, making best use of existing and emerging evidence while also retaining some benefits of PMAs. A nested PMA allows the assessment of publication bias and selective reporting bias by comparing prospectively included evidence with retrospective evidence in a sensitivity analysis. Studies that are prospectively included can be harmonised with other ongoing studies, and with previous related retrospective studies, to optimise evidence synthesis (see step 5).

PMA methodology was chosen to determine the optimal target range for oxygen saturation in preterm infants for several reasons:

Priority research question —oxygen has been used to treat preterm infants for more than 60 years. The different oxygen saturation target ranges used in practice have been associated with clinically important outcomes, such as mortality, disability, and blindness. Changing the oxygen saturation target range would be relatively easy to implement in clinical practice.

Insufficient previous evidence —evidence was mainly observational, with no recent, high quality randomised controlled trials available.

New studies expected— a total sample size of about 5000 infants was needed to detect an absolute difference in death or major disability of 4%. The NeOProM PMA was originally proposed as one large multicentre, multinational trial. 14 But because expensive masked pulse oximeters were needed, one funder could not support a study of sufficient sample size to reliably answer the clinical question. Instead, a PMA collaboration was initiated. Each group of NeOProM investigators obtained funding to conduct their own trial (although alone each study was underpowered to answer the main clinical question), could choose their own focus, and publish their own results, but with agreement to contribute data to the PMA to ensure sufficient combined statistical power to reliably detect differences in important outcomes.
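The order of magnitude of such a sample size can be illustrated with the standard normal approximation formula for comparing two proportions. The sketch below is not a reconstruction of the NeOProM calculation: the control event rate, significance level, and power used here are assumptions chosen only for illustration, and only the 4% absolute difference comes from the text.

```python
import math
from statistics import NormalDist


def n_per_group(p_control: float, p_treatment: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Participants needed per group to detect p_control vs p_treatment.

    Uses n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1-p2)^2
    for a two-sided test with equal group sizes.
    """
    delta = p_control - p_treatment
    if delta == 0:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


# Assumed (hypothetical) event rates: 40% vs 36%, a 4% absolute difference.
n_80 = n_per_group(0.40, 0.36, power=0.80)
n_90 = n_per_group(0.40, 0.36, power=0.90)
print(f"total at 80% power: {2 * n_80}, total at 90% power: {2 * n_90}")
```

Under these assumed inputs the required total runs to several thousand participants, which illustrates why a single funder might struggle to support one trial of sufficient size.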

Step 1: defining the research question and the eligibility criteria

At the start of a PMA, a research question needs to be specified. Research questions for PMAs should be formed in a similar way to traditional retrospective systematic reviews. Guidance for formulating a review question is available in the Cochrane Handbook for Systematic Reviews of Interventions. 15 For PMAs of interventional studies, the PICO system (population, intervention, comparison, outcome) should be used. To avoid selective reporting bias, the PMA research question and hypotheses need to be specified before any study results related to the PMA research questions are known.

PMAs are possible for a wide range of different study types—their applicability reaches beyond randomised controlled trials. An interventional PMA includes interventional studies (eg, randomised controlled trials or non-randomised studies of interventions). For interventional PMAs, the key inclusion criterion of “no results being known” usually means that the analyses have not been conducted in any of the trials included in the PMA.

An observational PMA includes observational studies. For observational PMAs, “no results being known” would mean that no analyses related to the PMA research question have been done. As many observational studies collect data on different outcomes, a meta-analysis can be classified as a PMA if unrelated research questions have already been analysed before inclusion in the PMA. For instance, for a PMA on the risk of lung cancer for people exposed to air pollution, observational studies where the relation between cardiovascular disease and air pollution has already been analysed can be included in the PMA, but only if the analyses on the association between lung cancer and air pollution have not been done. In this case, however, little harmonisation of outcome collection is possible (unless the investigators agree to collect additional data).

The NeOProM PMA addressed the research question, does targeting a lower oxygen saturation range in extremely preterm infants, from birth or soon after, increase or decrease the composite outcome of death or major disability in survivors by 4% or more?

The PICOS system was applied to define the eligibility criteria:

• Participants=infants born before 28 weeks’ gestation and enrolled within 24 hours of birth

• Intervention=target a lower (85-89%) oxygen saturation (SpO2) range

• Comparator=target a higher (91-95%) SpO2 range

• Outcome=composite of death or major disability at a corrected age of 18-24 months

• Study type=double blinded, randomised controlled trial (making this an interventional PMA).
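The PICOS criteria listed above, together with the PMA requirement that no results are yet known, can be expressed as an explicit eligibility check. The following is a minimal sketch under stated assumptions: the record structure, field names, and the two candidate trials are hypothetical, and only the threshold values come from the criteria in the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CandidateTrial:
    """Hypothetical summary of a planned or ongoing trial."""
    name: str
    born_before_weeks: int                  # gestational age cut-off
    enrolled_within_hours: int              # enrolment window after birth
    lower_spo2_target: Tuple[int, int]      # percent range
    higher_spo2_target: Tuple[int, int]     # percent range
    death_or_disability_at_18_24_months: bool
    double_blinded_rct: bool
    results_known: bool


def reasons_for_exclusion(trial: CandidateTrial) -> List[str]:
    """Return every criterion the trial fails; an empty list means eligible."""
    reasons = []
    if trial.born_before_weeks > 28:
        reasons.append("participants: not limited to birth before 28 weeks")
    if trial.enrolled_within_hours > 24:
        reasons.append("participants: enrolment later than 24 hours")
    if trial.lower_spo2_target != (85, 89):
        reasons.append("intervention: lower target is not 85-89%")
    if trial.higher_spo2_target != (91, 95):
        reasons.append("comparator: higher target is not 91-95%")
    if not trial.death_or_disability_at_18_24_months:
        reasons.append("outcome: composite at 18-24 months not measured")
    if not trial.double_blinded_rct:
        reasons.append("study type: not a double blinded RCT")
    if trial.results_known:
        reasons.append("PMA: results already known")
    return reasons


# Two hypothetical candidate trials.
trial_a = CandidateTrial("Trial A", 28, 24, (85, 89), (91, 95), True, True, False)
trial_b = CandidateTrial("Trial B", 32, 24, (85, 89), (91, 95), True, False, False)
for trial in (trial_a, trial_b):
    print(trial.name, reasons_for_exclusion(trial) or "eligible")
```

Returning the full list of failed criteria, rather than a single yes or no, makes it easier to document why each candidate study was excluded.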

Step 2: writing the protocol

Key elements of the protocol need to be finalised for the PMA before any individual study results related to the PMA research question are known. These include specification of the research questions, eligibility criteria for inclusion of studies, hypotheses, outcomes, and the statistical analysis strategy. The preferred reporting items for systematic reviews and meta-analyses extension for protocols (PRISMA-P) 16 provides some guidance on what should be included. As these reporting items were created for retrospective meta-analyses, however, key adaptations need to be made for PMA protocols (see box 2).

Key additional reporting items for a PMA protocol

For a PMA, several key items should be reported in the protocol in addition to PRISMA-P items:

Search methods

The search methods need to include how planned and ongoing studies are identified and how potential collaborators will be or have been contacted to participate (see step 3)

Study details

Details for studies already identified for inclusion should be listed, along with a statement that their results related to the PMA research question are not yet known (see step 1)

Core outcomes

Any core outcomes that will be measured by all the included studies should be specified, along with details on how and why they should be measured, to facilitate outcome harmonisation (see step 5)

Type of data collected

PMAs often collect individual participant data (that is, row by row data for each participant) but they may also collect aggregate data (that is, summary data for each study), and some combine both (see step 6)

Collaboration management and publication policy

Collaboration management and publication policy (see steps 4 and 7) should be specified, including details of any central steering and data analysis committees

An initial PMA protocol should be drafted before the search for eligible studies, but it can be amended after searching and after all studies have been included if the results of the included studies are not known when the PMA protocol is finalised. The investigators of the included studies can agree on the collection and analysis of additional rare outcomes and these outcomes can be included in a revised version of the protocol.

The final PMA protocol should be publicly available on the international prospective register of systematic reviews, PROSPERO 17 (which supports registration of PMAs), before the results (relating to the PMA research question) of any of the included studies are known. A full version of the PMA protocol can be published in a peer reviewed journal or elsewhere.

For the NeOProM PMA, an initial protocol was drafted by the lead investigators and discussed and refined by collaborators from all the included trials. The PMA protocol was registered on ClinicalTrials.gov in 2010 (NCT01124331) because PROSPERO had not yet been launched. After the launch of PROSPERO in 2011, the protocol was registered there (CRD42015019508). The full version of the protocol was published in BMC Pediatrics. 18

Step 3: searching for studies

After the PMA protocol is finalised, a systematic literature search is conducted, similar to that of a systematic review for a high quality meta-analysis. The main resources available for identifying planned and ongoing studies are clinical trial registries. Currently, 17 global clinical trial registries provide data to the World Health Organization’s International Clinical Trials Registry Platform. 19 Views on the best strategies for searching trial registries differ. 20 Limiting the search by date can be useful (eg, only studies registered within a reasonable time frame, taking into account the expected study duration and follow-up times) to reduce the search burden and exclude studies registered earlier that would likely be completed and thus ineligible for a PMA. Ideally, searches should be repeated on a regular basis to identify new eligible studies.
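The date-limited search described above can be illustrated with a minimal sketch. It assumes a hypothetical registry export in which each record holds a registration identifier, a registration date, and a recruitment status; the record values, the function name `likely_ongoing`, and the extra filter on status are illustrative assumptions, not part of any registry's interface.

```python
from datetime import date

# Hypothetical registry export: (registration id, registration date, status).
# These records are invented for illustration only.
records = [
    ("TRIAL-001", date(2009, 3, 1), "Completed"),
    ("TRIAL-002", date(2017, 6, 15), "Recruiting"),
    ("TRIAL-003", date(2018, 1, 10), "Not yet recruiting"),
    ("TRIAL-004", date(2016, 11, 5), "Completed"),
]


def likely_ongoing(records, search_date, max_years):
    """Return ids of records registered within max_years before search_date
    that are not marked as completed (the status filter is an assumption)."""
    # Note: this simple cutoff would fail for a 29 February search date.
    cutoff = date(search_date.year - max_years, search_date.month, search_date.day)
    return [
        trial_id
        for trial_id, registered, status in records
        if registered >= cutoff and status != "Completed"
    ]


candidates = likely_ongoing(records, date(2019, 1, 1), max_years=5)
print(candidates)
```

The appropriate time window depends on the expected study duration and follow-up times, so `max_years` would need to be chosen for each research question.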

Prospective trial registration is mandated by various legislative, ethical, and regulatory bodies but compliance is not complete. 21 22 23 Observational studies are not required to be registered. Hence additional approaches to identifying planned and ongoing studies should be pursued, including searching bibliographic databases for conference abstracts, study protocols, and cohort descriptions, and approaching relevant stakeholders. The existence and possibility of joining the PMA can be publicised through the publication of PMA protocols, presentations at relevant conferences and research forums, and through an online presence (eg, a collaboration website).

For NeOProM, the Cochrane Central Register of Controlled Trials, Medline through PubMed, Embase, and CINAHL, clinical trial registries (using the WHO portal (www.who.int/ictrp/en/) and ClinicalTrials.gov), conference proceedings, and the reference lists of retrieved articles were searched. Key researchers in the specialty were contacted to inquire if they were aware of additional trials. The abstracts of the relevant perinatal meetings (including the Neonatal Register and the Society for Paediatric Research) were searched using the keywords “oxygen saturation”. Five planned or ongoing trials meeting the inclusion criteria for the NeOProM PMA were identified, based in Australia, New Zealand, Canada, the United Kingdom, and the United States. The trials completed enrolment and follow-up between 2005 and 2014 and recruited a total of 4965 preterm infants born before 28 weeks’ gestation. No results for any of the trials were known at the time each trial agreed to be included in the PMA. All the NeOProM trials were identified by discussion with collaborators, and no additional trials were identified from electronic database searches.

Step 4: forming a collaboration of study investigators

Ideally, PMAs are conducted by a collaboration or consortium, including a central steering committee (leading the PMA and managing the collaboration), a data analysis committee (responsible for data management, processing, and analysis), and representatives from each study (involved in decisions on the protocol, analysis, and interpretation of the results). Regular collaboration meetings can be beneficial for achieving consensus on disagreements and in keeping study investigators involved in the PMA process. Transparent processes and a priori agreements are crucial for building and maintaining trust within a PMA collaboration.

Investigators might refuse to collaborate. Refusal to collaborate is less likely in a PMA than in a retrospective individual participant data meta-analysis as reaching agreement to share data is easier if studies are in their planning phases and can still be amended and harmonised after internal discussions. Aggregate data can be included in the PMA even if investigators refuse to collaborate, if the relevant summary data can be extracted from the resulting publications when the studies are completed. The ability to harmonise studies (step 5), however, may be limited if eligible investigators refuse to participate.

The NeOProM Collaboration comprised at least one investigator and a statistician from each of the included trials, and a steering group. All investigators and the steering group agreed on key aspects of the protocol before the results of the trials were known, and they also developed and agreed on a common data collection form, coding sheet, and detailed analysis plan. The NeOProM Collaboration met regularly by teleconference, and at least once a year face to face, to reach consensus on disagreements and to discuss the progress of individual trials, funding, data harmonisation, analysis plans, and interpretation of the PMA findings.

Step 5: harmonisation of included study population, intervention/exposure, and outcome collection

When a collaboration of investigators of planned or ongoing studies has been formed, the investigators can work together to harmonise the design, conduct, and outcome collection of the included studies to facilitate a meta-analysis and interpretation. A common problem with retrospective meta-analyses is that interventions are administered slightly differently across studies, or to different populations, and outcome collection, measurement, or reporting can differ. These differences make it difficult, and sometimes impossible, to synthesise results that are directly relevant to the study outcomes, interventions, and populations. In a PMA, studies are included as they are being planned or are ongoing, allowing researchers to agree on how to conduct their studies and collect common core outcomes. The PMA design enables the generation of evidence that is directly relevant to the research questions and thus increases confidence in the strength of the statements and recommendations derived from the PMA.

The ability to harmonise varies depending on the time when the PMA is first planned (fig 4). In a de novo PMA, studies are planned as part of a PMA. For PMAs of interventional studies, a de novo PMA is similar to a multicentre trial: the included trials often share a common protocol, and usually the study population, interventions, and outcome collection are fully harmonised. In contrast, some PMAs identify studies for inclusion when data collection has already finished but no analyses related to the PMA research question have been conducted (outside of data safety monitoring committees). These types of PMAs allow little to no data harmonisation and are more similar to traditional retrospective meta-analyses. Yet they still have the advantage of reducing selection bias as the studies are deemed eligible for inclusion before their PMA specific results are known.

Fig 4

Different scenarios and time points when studies can be included in a prospective meta-analysis (PMA)

Harmonisation of studies in a PMA can occur for different elements of the included studies: study populations and settings; interventions or exposures (that is, independent variables); and outcome collection. For study populations, settings, and interventions/exposures, harmonisation of studies to some degree is often beneficial to enable their successful synthesis. But some variation in the individual study protocols, populations, and interventions/exposures is often desirable to improve the generalisability (that is, external validity) of the research findings beyond one study, one form of the intervention, or narrow study specific populations. The variation in populations also enables subgroup analyses, evaluating if differences in populations between and within the studies lead to differences in treatment effects. If particular subgroups appear in more than one study, additional statistical power for subgroup analyses is also achieved.

Harmonisation of outcome collection requires careful consideration of the amount of common data needed to answer the relevant research questions. These discussions should aim to minimise unnecessary burden on participants and reduce research waste by avoiding excessive data collection, while increasing the ability to answer important research questions. Researchers can also agree to collect and analyse rare outcomes, such as severe but rare adverse events, that their individual studies would not have had the statistical power to detect. Collaborations should be specific on exactly how shared outcomes will be measured to avoid heterogeneity in outcome collection and difficulties in combining data. The COMET (core outcome measures in effectiveness trials) initiative (www.comet-initiative.org/) has introduced methods for the development of core outcome sets, as detailed in its handbook. 24 These core outcome sets specify what and how outcomes should be measured by all studies of specific conditions to facilitate comparison and synthesis of the results. For health conditions with common core outcome sets, PMA collaborators should include the core outcomes, and also consider collecting other common outcomes that are particularly relevant for the specific research question posed. Not all outcomes have to be harmonised and collected by all studies: individual studies in a PMA have more autonomy than individual centres in a multicentre study and can collect study specific outcomes for their own purposes.

The improved availability of common core outcomes in a PMA has recently been shown in a PMA of childhood obesity interventions. 25 Harmonisation increased from 18% of core outcomes collected by all trials before the trial investigators agreed to collaborate, to 91% after the investigators decided to collaborate in a PMA.

Investigators of the five NeOProM trials first met in 2005 when the first trial was about to begin and the other four studies were in the early planning stages. With de novo PMA planning, all trials had the same intervention and comparator and collected similar outcome and subgroup variables. Some inconsistencies in outcome definitions and assessment methods across studies remained, however, and required substantial discussion to harmonise the final outcome collection and analyses.

Step 6: synthesising the evidence and assessing certainty of evidence

When all the individual studies have been completed, data can be synthesised in a PMA. For aggregate data PMA, results are extracted from publications or provided by the study authors. For individual participant data PMA, the line by line data from each participant in each study must be collated, harmonised, and analysed. This process is usually easier for PMAs than for traditional, retrospective individual participant data meta-analyses because if outcome collection and coding were previously harmonised, fewer inconsistencies should arise. If possible, plans to share data should be outlined in each study’s ethics application and consent form. For PMAs that are planned after the eligible studies have commenced, amendments to ethics applications may be necessary for data sharing. To assure independent data analysis, some PMAs appoint an independent data manager and statistician who have not been involved in any of the studies. The initial time intensive planning and harmonisation phase is followed by a waiting period when all the individual studies are completed before their data are made available and synthesised. During this middle period, PMAs usually demand little time and can run alongside other projects.

For studies where data safety monitoring committees are appropriate, it might be sensible for the committees to communicate and plan joint interim analyses to take account of all the available evidence when making recommendations to continue or stop a study. The PMA collaboration should consider establishing a joint data monitoring committee to synthesise data from all included studies at prespecified times. Methods for sequential meta-analysis and adaptive trial design could be considered in this context. 26

When all studies have been synthesised, the methodological quality of the included studies needs to be appraised with validated tools, such as those recommended by Cochrane. 27 28 The certainty of the evidence can be assessed with the grading of recommendations assessment, development and evaluation (GRADE) approach. 29

The NeOProM Collaboration was established in 2005, the first trial commenced in 2005, the last trial’s results were available in 2016, and the final combined analysis was published in 2018. At the request of two of the trials’ data monitoring committees, an interim analysis of data from these two trials was undertaken in 2011 and both trials were stopped early. 30 The five trials included in NeOProM were assessed for risk of bias with the Cochrane domains, 31 and consensus was reached by discussion with the full study group. The risk of bias assessments were more accurate and complete after detailed discussion of several domains (eg, allocation concealment and blinding) between the NeOProM Collaborators than would have been possible with their publications alone. GRADE assessments were performed and published in the Cochrane version of the meta-analysis. 32

Step 7: interpretation and reporting of results

Generally, the quality of the evidence derived from a PMA, and the extent to which causal inferences can be made, directly depend on the type and quality of the studies included in a PMA. The prospective nature of interventional PMAs makes them similar to large multicentre trials, allowing for causal conclusions to be drawn rather than only associations, as sometimes suggested for traditional retrospective meta-analyses. The results of observational PMAs should generally be interpreted as providing associations, not causal effects, as only the results of observational studies are included. But with modern methods for causal inference from observational studies, justification for supporting conclusions about causality can sometimes be found. 33

Currently no PMA specific reporting standards exist, but where applicable, PMA authors should follow the PRISMA-IPD (PRISMA of individual participant data) statement 34 if they are reporting an individual participant data PMA, or the PRISMA statement 35 if they are reporting an aggregate data PMA. As well as the PRISMA items, authors of PMAs need to report on identification of planned and ongoing studies, the PMA timeline, collaboration policies, and outcome harmonisation processes.

Discussions about methodology and interpretation of the results among all collaborators can sometimes be difficult to navigate, particularly if the results from the combination of the studies contradict the results of some of the individual studies. Although these discussions can be demanding and time consuming, robust discussion among experts can lead to well considered and high quality publications that can directly inform policy and practice.

For the successful management of a PMA collaboration, an explicit authorship policy should be in place. One model is to offer authorship to each member of the secretariat, and one investigator from each included study, for the main PMA publication, assuming they fulfil the authorship criteria of the International Committee of Medical Journal Editors (ICMJE). This model incentivises ongoing involvement and allows for multiple viewpoints to be integrated in the final publication. The collaborators usually agree that the final PMA results cannot be published until the results of each study are accepted for publication, but this is not essential.

At least one investigator from each of the participating trials was a co-author on the final publication for NeOProM. 13 Collaborators met regularly, face to face and by phone, to resolve opposing views and achieve consensus on the interpretation of the PMA findings. Face to face meetings were crucial in resolving major disagreements within the NeOProM Collaboration. The collaborators used the PRISMA-IPD checklist for reporting of the PMA.

PMAs have many advantages: they help reduce research waste and bias, while greatly improving use of data, and they are adaptive, efficient, and collaborative. PMAs increase the statistical power to detect effects of treatment and enable harmonised collection of core outcomes, while allowing enough variation to obtain greater generalisability of findings. Compared with a multicentre study, PMAs are more decentralised and allow greater flexibility in terms of funding and timelines. Compared with a retrospective meta-analysis, PMAs enable more data harmonisation and control. Planning a PMA can help a group of researchers prioritise a research question they can address collaboratively and determine the optimal sample size a priori. Disadvantages of PMAs include difficulties in searching for planned and ongoing studies, often long waiting periods for studies to be completed, and difficulties in reaching consensus on the interpretation of the results. Table 1 shows a detailed comparison of the features and advantages and disadvantages of PMAs, multicentre studies, and retrospective meta-analyses.

Advantages and disadvantages of a prospective meta-analysis (PMA) compared with a multicentre study and a retrospective meta-analysis

Integration of PMAs with other next generation systematic review methodologies

PMAs can be combined with other new systematic review methodologies. Living systematic reviews begin with a traditional systematic review but have continual updates with a predetermined frequency. Living systematic reviews address similar research questions as PMAs (high priority questions with inconclusive evidence in an active research field). 36 In some instances it might be beneficial to combine these two methodologies. If authors are considering a PMA in a discipline where evidence is expected to become available gradually, a living PMA is an option. In living PMAs, new studies are included as they are being planned (but importantly before any of the results related to the PMA research questions are known), until a definitive effect has been found or the maximum required statistical information has been reached to conclude that no clinically important effect has been found. 37 Appropriate statistical methods for multiple testing should be strongly considered in living PMAs, such as sequential meta-analysis methodology which controls for type 1 and type 2 errors and takes into account heterogeneity. 26 PMA methodology can also be combined with other methods, such as network meta-analysis or meta-analysis of prognostic models.

Future for PMAs

With the advancement of machine learning, artificial intelligence, and big data, new horizons are seen for PMAs. Several steps need to be taken to improve the feasibility and quality of PMAs. Firstly, the ability to identify planned and ongoing studies needs to be improved by introducing further mechanisms to promote and enforce study registration and providing guidance on the best search strategies. The ICMJE requirement for prospective registration of clinical trials, together with several other ethical and regulatory initiatives, has improved registration rates of clinical trials but more improvement is needed. 38 22 Possible solutions include the integration of data submitted to ethics committees, funding bodies, and clinical trial registries. 21 The Cochrane PMA Methods Group, in collaboration with several trial registries, is working on improving methods for identifying planned and ongoing studies. Future technologies might automate the searching and screening process for planned and ongoing studies and automatically connect researchers who are planning similar relevant studies. Furthermore, the reporting and quality of PMAs need to be improved. The reporting of PMAs would be greatly helped by the development of a standardised set of reporting guidelines to which PMA authors can adhere. Such guidelines are currently under development. Also, the development of PMA specific evidence rating tools (such as an extension to the GRADE approach) would be highly desirable. The Cochrane PMA Methods Group will publicise any new developments in this area on their website (https://methods.cochrane.org/pma/).

PMAs have many advantages, and mandating trial registration, development of core outcome sets, and improved data sharing abilities have increased opportunities for conducting PMAs. We hope this step by step guidance on PMAs will improve the understanding of PMAs in the research community and enable more researchers to conduct successful PMAs. The Cochrane PMA Methods Group can offer advice for researchers planning to undertake PMAs.

Contributors: ALS conceived the idea and facilitated the workshop and discussions. LA, DG, KEH, and ALS participated in the workshop, and JAB and SC contributed to further discussions after the workshop. ALS, SC, and KEH performed the searches for a scoping review that was conducted in preparation for this article, reviewing all prospective meta-analyses and methods papers on prospective meta-analyses in health research to date. LA was the coordinator of the NeOProM Collaboration and KEH was a member. ALS wrote the first draft of the manuscript. All authors contributed to and revised the manuscript. ALS is the guarantor.

Competing interests: We have read and understood the BMJ Group policy on declaration of interests and declare the following: all authors are convenors or members of the Cochrane PMA Methods Group and have been involved in numerous prospective meta-analyses. LA, DG, and JAB have published several methods articles on prospective meta-analyses and are authors of the prospective meta-analysis chapter in the Cochrane Handbook for Systematic Reviews of Interventions. LA manages the Australian New Zealand Clinical Trials Registry (ANZCTR). ALS and KEH work for the ANZCTR. JAB is a full time employee of Johnson & Johnson.

Provenance and peer review: Not commissioned; externally peer reviewed.

  • National Health and Medical Research Council
  • Ioannidis JP
  • Krleza-Jerić K ,
  • Berlin JA ,
  • Ioannidis J
  • Halpern SD ,
  • Karlawish JHT ,
  • Ghersi D, Berlin J, Askie L. Prospective meta‐analysis. In: Higgins JPT, Green S (eds), Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 (updated March 2011). The Cochrane Collaboration, 2011:559-70.
  • Margitić SE ,
  • Morgan TM ,
  • Probstfield J ,
  • Applegate WB
  • Darlow BA ,
  • Neonatal Oxygenation Prospective Meta-analysis (NeOProM) Collaboration
  • Wright KW ,
  • Tarnow-Mordi W ,
  • Phelps DL ,
  • Pulse Oximetry Saturation Trial for Prevention of Retinopathy of Prematurity Planning Study Group
  • Green S, Higgins JPT (eds). Preparing a Cochrane review. In: Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 (updated March 2011). The Cochrane Collaboration, 2011.
  • Shamseer L ,
  • PRISMA-P Group
  • Brocklehurst P ,
  • Schmidt B ,
  • NeOProM Collaborative Group
  • World Health Organization (WHO). WHO International Clinical Trials Registry Platform (ICTRP) Search Portal: http://apps.who.int/trialsearch/ [accessed 6 November 2018].
  • Isojarvi J ,
  • Lefebvre C ,
  • Glanville J
  • Hunter KE ,
  • Seidler AL ,
  • Harriman SL ,
  • Williamson PR ,
  • Altman DG ,
  • Seidler A ,
  • Mihrshahi S ,
  • Simmonds M ,
  • Salanti G ,
  • McKenzie J ,
  • Elliott J ,
  • Living Systematic Review Network
  • Higgins JPT ,
  • Sterne JAC ,
  • Savović J ,
  • Sterne JA ,
  • Hernán MA ,
  • Reeves BC ,
  • Stenson B ,
  • U.K. BOOST II trial ,
  • Australian BOOST II trial ,
  • New Zealand BOOST II trial
  • Higgins J ,
  • Stewart LA ,
  • PRISMA-IPD Development Group
  • Liberati A ,
  • Tetzlaff J ,
  • PRISMA Group
  • Elliott JH ,
  • Guyatt GH ,

Chapter 10: Analysing data and undertaking meta-analyses

Jonathan J Deeks, Julian PT Higgins, Douglas G Altman; on behalf of the Cochrane Statistical Methods Group

Key Points:

  • Meta-analysis is the statistical combination of results from two or more separate studies.
  • Potential advantages of meta-analyses include an improvement in precision, the ability to answer questions not posed by individual studies, and the opportunity to settle controversies arising from conflicting claims. However, they also have the potential to mislead seriously, particularly if specific study designs, within-study biases, variation across studies, and reporting biases are not carefully considered.
  • It is important to be familiar with the type of data (e.g. dichotomous, continuous) that result from measurement of an outcome in an individual study, and to choose suitable effect measures for comparing intervention groups.
  • Most meta-analysis methods are variations on a weighted average of the effect estimates from the different studies.
  • Studies with no events contribute no information about the risk ratio or odds ratio. For rare events, the Peto method has been observed to be less biased and more powerful than other methods.
  • Variation across studies (heterogeneity) must be considered, although most Cochrane Reviews do not have enough studies to allow for the reliable investigation of its causes. Random-effects meta-analyses allow for heterogeneity by assuming that underlying effects follow a normal distribution, but they must be interpreted carefully. Prediction intervals from random-effects meta-analyses are a useful device for presenting the extent of between-study variation.
  • Many judgements are required in the process of preparing a meta-analysis. Sensitivity analyses should be used to examine whether overall findings are robust to potentially influential decisions.

Cite this chapter as: Deeks JJ, Higgins JPT, Altman DG (editors). Chapter 10: Analysing data and undertaking meta-analyses. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.4 (updated August 2023). Cochrane, 2023. Available from www.training.cochrane.org/handbook.

10.1 Do not start here!

It can be tempting to jump prematurely into a statistical analysis when undertaking a systematic review. The production of a diamond at the bottom of a plot is an exciting moment for many authors, but results of meta-analyses can be very misleading if suitable attention has not been given to formulating the review question; specifying eligibility criteria; identifying and selecting studies; collecting appropriate data; considering risk of bias; planning intervention comparisons; and deciding what data would be meaningful to analyse. Review authors should consult the chapters that precede this one before a meta-analysis is undertaken.

10.2 Introduction to meta-analysis

An important step in a systematic review is the thoughtful consideration of whether it is appropriate to combine the numerical results of all, or perhaps some, of the studies. Such a meta-analysis yields an overall statistic (together with its confidence interval) that summarizes the effectiveness of an experimental intervention compared with a comparator intervention. Potential advantages of meta-analyses include the following:

  • To improve precision. Many studies are too small to provide convincing evidence about intervention effects in isolation. Estimation is usually improved when it is based on more information.
  • To answer questions not posed by the individual studies. Primary studies often involve a specific type of participant and explicitly defined interventions. A selection of studies in which these characteristics differ can allow investigation of the consistency of effect across a wider range of populations and interventions. It may also, if relevant, allow reasons for differences in effect estimates to be investigated.
  • To settle controversies arising from apparently conflicting studies or to generate new hypotheses. Statistical synthesis of findings allows the degree of conflict to be formally assessed, and reasons for different results to be explored and quantified.

Of course, the use of statistical synthesis methods does not guarantee that the results of a review are valid, any more than it does for a primary study. Moreover, like any tool, statistical methods can be misused.

This chapter describes the principles and methods used to carry out a meta-analysis for a comparison of two interventions for the main types of data encountered. The use of network meta-analysis to compare more than two interventions is addressed in Chapter 11. Formulae for most of the methods described are provided in the RevMan Web Knowledge Base under Statistical Algorithms and calculations used in Review Manager (documentation.cochrane.org/revman-kb/statistical-methods-210600101.html), and a longer discussion of many of the issues is available (Deeks et al 2001).

10.2.1 Principles of meta-analysis

The commonly used methods for meta-analysis follow the following basic principles:

  • Meta-analysis is typically a two-stage process. In the first stage, a summary statistic is calculated for each study, to describe the observed intervention effect in the same way for every study. For example, the summary statistic may be a risk ratio if the data are dichotomous, or a difference between means if the data are continuous (see Chapter 6).

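  • In the second stage, a summary (combined) intervention effect estimate is calculated as a weighted average of the intervention effects estimated in the individual studies, $\text{weighted average} = \frac{\sum Y_i W_i}{\sum W_i}$, where $Y_i$ is the intervention effect estimated in the $i$th study, $W_i$ is the weight given to the $i$th study, and the summation is across all studies (see Section 10.3).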

  • The combination of intervention effect estimates across studies may optionally incorporate an assumption that the studies are not all estimating the same intervention effect, but estimate intervention effects that follow a distribution across studies. This is the basis of a random-effects meta-analysis (see Section 10.10.4). Alternatively, if it is assumed that each study is estimating exactly the same quantity, then a fixed-effect meta-analysis is performed.
  • The standard error of the summary intervention effect can be used to derive a confidence interval, which communicates the precision (or uncertainty) of the summary estimate; and to derive a P value, which communicates the strength of the evidence against the null hypothesis of no intervention effect.
  • As well as yielding a summary quantification of the intervention effect, all methods of meta-analysis can incorporate an assessment of whether the variation among the results of the separate studies is compatible with random variation, or whether it is large enough to indicate inconsistency of intervention effects across studies (see Section 10.10).
  • The problem of missing data is one of the numerous practical considerations that must be thought through when undertaking a meta-analysis. In particular, review authors should consider the implications of missing outcome data from individual participants (due to losses to follow-up or exclusions from analysis) (see Section 10.12).
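As an illustration of how a confidence interval and a P value follow from a summary estimate and its standard error, the sketch below uses a normal approximation. The function name `ci_and_p` and the input values are hypothetical, and the example assumes the effect is expressed on a scale where the null value is zero (for example, a log risk ratio or a mean difference).

```python
import math


def ci_and_p(estimate, se, z_crit=1.959964):
    """Normal-approximation 95% confidence interval and two-sided P value.

    Assumes the null hypothesis corresponds to an effect of zero on the
    scale of `estimate` (e.g. a log risk ratio or a mean difference).
    """
    lower = estimate - z_crit * se
    upper = estimate + z_crit * se
    z = estimate / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return lower, upper, p_value


# Hypothetical summary log risk ratio and standard error.
lower, upper, p_value = ci_and_p(-0.35, 0.12)

# For ratio measures, results are usually reported after back-transformation.
risk_ratio_ci = (math.exp(lower), math.exp(upper))
```

Other approaches to the interval (for example, those based on a t distribution) exist; the normal approximation is used here only because it is the simplest to show.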

Meta-analyses are usually illustrated using a forest plot. An example appears in Figure 10.2.a. A forest plot displays effect estimates and confidence intervals for both individual studies and meta-analyses (Lewis and Clarke 2001). Each study is represented by a block at the point estimate of intervention effect with a horizontal line extending either side of the block. The area of the block indicates the weight assigned to that study in the meta-analysis while the horizontal line depicts the confidence interval (usually with a 95% level of confidence). The area of the block and the confidence interval convey similar information, but both make different contributions to the graphic. The confidence interval depicts the range of intervention effects compatible with the study’s result. The size of the block draws the eye towards the studies with larger weight (usually those with narrower confidence intervals), which dominate the calculation of the summary result, presented as a diamond at the bottom.

Figure 10.2.a Example of a forest plot from a review of interventions to promote ownership of smoke alarms (DiGuiseppi and Higgins 2001). Reproduced with permission of John Wiley & Sons


10.3 A generic inverse-variance approach to meta-analysis

A very common and simple version of the meta-analysis procedure is referred to as the inverse-variance method. This approach is implemented in its most basic form in RevMan, and is used behind the scenes in many meta-analyses of both dichotomous and continuous data.

The inverse-variance method is so named because the weight given to each study is chosen to be the inverse of the variance of the effect estimate (i.e. 1 over the square of its standard error). Thus, larger studies, which have smaller standard errors, are given more weight than smaller studies, which have larger standard errors. This choice of weights minimizes the imprecision (uncertainty) of the pooled effect estimate.

10.3.1 Fixed-effect method for meta-analysis

A fixed-effect meta-analysis using the inverse-variance method calculates a weighted average as:

\[
\text{generic inverse-variance weighted average} = \frac{\sum_i Y_i / SE_i^2}{\sum_i 1 / SE_i^2}
\]

where Y_i is the intervention effect estimated in the ith study, SE_i is the standard error of that estimate, and the summation is across all studies. The basic data required for the analysis are therefore an estimate of the intervention effect and its standard error from each study. A fixed-effect meta-analysis is valid under an assumption that all effect estimates are estimating the same underlying intervention effect, which is referred to variously as a ‘fixed-effect’ assumption, a ‘common-effect’ assumption or an ‘equal-effects’ assumption. However, the result of the meta-analysis can be interpreted without making such an assumption (Rice et al 2018).
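The weighted average can be illustrated with a minimal Python sketch. The function name and the example numbers are hypothetical, and the standard error of the pooled estimate (the square root of 1 over the sum of the weights) and the 1.96 multiplier for a 95% confidence interval are standard results assumed here rather than taken from the text above.

```python
import math

def fixed_effect_iv(estimates, std_errors):
    """Fixed-effect inverse-variance pooling of study effect estimates."""
    # Weight of each study = 1 / SE^2
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    # Weighted average of the study estimates
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total_weight
    # Assumed standard result: SE of the pooled estimate = sqrt(1 / sum of weights)
    se_pooled = math.sqrt(1.0 / total_weight)
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return pooled, se_pooled, ci

# Hypothetical effect estimates (e.g. mean differences) and standard errors
pooled, se_pooled, ci = fixed_effect_iv([0.2, 0.5, 0.3], [0.1, 0.2, 0.1])
print(pooled, se_pooled, ci)
```

For ratio measures the same function would be applied to log-transformed estimates, with the result exponentiated afterwards.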

10.3.2 Random-effects methods for meta-analysis

A variation on the inverse-variance method is to incorporate an assumption that the different studies are estimating different, yet related, intervention effects (Higgins et al 2009). This produces a random-effects meta-analysis, and the simplest version is known as the DerSimonian and Laird method (DerSimonian and Laird 1986). Random-effects meta-analysis is discussed in detail in Section 10.10.4.
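As a rough sketch of how the DerSimonian and Laird method modifies the inverse-variance weights, the Python example below uses the usual moment estimator of the between-study variance (tau squared). That estimator is not given in this section and is assumed here from the standard formulation; the function name and data are hypothetical.

```python
import math

def dersimonian_laird(estimates, std_errors):
    """Random-effects pooling with the DerSimonian and Laird moment estimator."""
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero (zero when only one study)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's variance
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, se_pooled, tau2

# Two hypothetical studies with clearly different estimates
pooled, se_pooled, tau2 = dersimonian_laird([0.1, 0.9], [0.1, 0.1])
print(pooled, se_pooled, tau2)
```

When tau squared is estimated as zero, the weights reduce to the fixed-effect weights and the two analyses coincide.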

10.3.3 Performing inverse-variance meta-analyses

Most meta-analysis programs perform inverse-variance meta-analyses. Usually the user provides summary data from each intervention arm of each study, such as a 2×2 table when the outcome is dichotomous (see Chapter 6, Section 6.4 ), or means, standard deviations and sample sizes for each group when the outcome is continuous (see Chapter 6, Section 6.5 ). This avoids the need for the author to calculate effect estimates, and allows the use of methods targeted specifically at different types of data (see Sections 10.4 and 10.5 ).

When the data are conveniently available as summary statistics from each intervention group, the inverse-variance method can be implemented directly. For example, estimates and their standard errors may be entered directly into RevMan under the ‘Generic inverse variance’ outcome type. For ratio measures of intervention effect, the data must be entered into RevMan as natural logarithms (for example, as a log odds ratio and the standard error of the log odds ratio). However, it is straightforward to instruct the software to display results on the original (e.g. odds ratio) scale. It is possible to supplement or replace this with a column providing the sample sizes in the two groups. Note that the ability to enter estimates and standard errors creates a high degree of flexibility in meta-analysis. It facilitates the analysis of properly analysed crossover trials, cluster-randomized trials and non-randomized trials (see Chapter 23 ), as well as outcome data that are ordinal, time-to-event or rates (see Chapter 6 ).
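To make the log-scale entry concrete, the following Python sketch converts a reported odds ratio and its 95% confidence interval into a log odds ratio and standard error of the kind that would be entered under a generic inverse-variance outcome, and then back-transforms the result. The function names are hypothetical, and the sketch assumes the reported interval is symmetric on the log scale and was computed with a 1.96 multiplier.

```python
import math

def log_or_and_se_from_ci(odds_ratio, lower, upper, z=1.96):
    """Log odds ratio and its SE from a reported OR and 95% CI."""
    log_or = math.log(odds_ratio)
    # CI width on the log scale is 2 * z * SE (assumes a symmetric log-scale CI)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return log_or, se

def back_transform(log_or, se, z=1.96):
    """Return the odds ratio and its CI on the original scale."""
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))

# Hypothetical reported result: OR 0.5 (95% CI 0.25 to 1.0)
log_or, se = log_or_and_se_from_ci(0.5, 0.25, 1.0)
print(log_or, se)
print(back_transform(log_or, se))
```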

10.4 Meta-analysis of dichotomous outcomes

There are four widely used methods of meta-analysis for dichotomous outcomes, three fixed-effect methods (Mantel-Haenszel, Peto and inverse variance) and one random-effects method (DerSimonian and Laird inverse variance). All of these methods are available as analysis options in RevMan. The Peto method can only combine odds ratios, whilst the other three methods can combine odds ratios, risk ratios or risk differences. Formulae for all of the meta-analysis methods are available elsewhere (Deeks et al 2001).

Note that having no events in one group (sometimes referred to as ‘zero cells’) causes problems with computation of estimates and standard errors with some methods: see Section 10.4.4 .

10.4.1 Mantel-Haenszel methods

When data are sparse, either in terms of event risks being low or study size being small, the estimates of the standard errors of the effect estimates that are used in the inverse-variance methods may be poor. Mantel-Haenszel methods are fixed-effect meta-analysis methods using a different weighting scheme that depends on which effect measure (e.g. risk ratio, odds ratio, risk difference) is being used (Mantel and Haenszel 1959, Greenland and Robins 1985). They have been shown to have better statistical properties when there are few events. As this is a common situation in Cochrane Reviews, the Mantel-Haenszel method is generally preferable to the inverse variance method in fixed-effect meta-analyses. In other situations the two methods give similar estimates.
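As an illustration of the different weighting scheme, here is a minimal Python sketch of the Mantel-Haenszel pooled odds ratio. The formula used (the sum of a×d/n divided by the sum of b×c/n across studies) is the standard Mantel-Haenszel estimator, assumed here because the text does not spell it out; the function name and counts are hypothetical, and the variance estimate needed for a confidence interval is omitted.

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio.

    Each table is (a, b, c, d):
      a = events, experimental      b = non-events, experimental
      c = events, comparator        d = non-events, comparator
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

# Hypothetical data; the first study has no events in the experimental arm
studies = [(0, 50, 5, 45), (10, 90, 20, 80)]
print(mantel_haenszel_or(studies))
```

Because the counts are summed across studies before dividing, a single zero cell in one study does not by itself break the calculation.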

10.4.2 Peto odds ratio method

Peto’s method can only be used to combine odds ratios (Yusuf et al 1985). It uses an inverse-variance approach, but uses an approximate method of estimating the log odds ratio, and uses different weights. An alternative way of viewing the Peto method is as a sum of ‘O – E’ statistics. Here, O is the observed number of events and E is an expected number of events in the experimental intervention group of each study under the null hypothesis of no intervention effect.
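The ‘O – E’ view can be sketched in Python as follows. The expected count and the hypergeometric variance used below are the standard forms for the Peto one-step method and are assumed here, since the text does not give them; the function name and data are hypothetical.

```python
import math

def peto_log_odds_ratio(tables):
    """Peto one-step log odds ratio and its standard error.

    Each table is (a, b, c, d): events and non-events in the experimental
    arm, then events and non-events in the comparator arm.
    """
    sum_o_minus_e = 0.0
    sum_v = 0.0
    for a, b, c, d in tables:
        n1, n2 = a + b, c + d          # arm sizes
        m1, m2 = a + c, b + d          # total events, total non-events
        n = n1 + n2
        expected = n1 * m1 / n         # E under the null hypothesis
        variance = n1 * n2 * m1 * m2 / (n ** 2 * (n - 1))  # hypergeometric
        sum_o_minus_e += a - expected
        sum_v += variance
    log_or = sum_o_minus_e / sum_v
    se = 1.0 / math.sqrt(sum_v)
    return log_or, se

# Hypothetical study with no events in the experimental arm
log_or, se = peto_log_odds_ratio([(0, 50, 5, 45)])
print(math.exp(log_or), se)
```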

The approximation used in the computation of the log odds ratio works well when intervention effects are small (odds ratios are close to 1), events are not particularly common and the studies have similar numbers in experimental and comparator groups. In other situations it has been shown to give biased answers. As these criteria are not always fulfilled, Peto’s method is not recommended as a default approach for meta-analysis.

Corrections for zero cell counts are not necessary when using Peto’s method. Perhaps for this reason, this method performs well when events are very rare (Bradburn et al 2007); see Section . Also, Peto’s method can be used to combine studies with dichotomous outcome data with studies using time-to-event analyses where log-rank tests have been used (see Section 10.9 ).

10.4.3 Which effect measure for dichotomous outcomes?

Effect measures for dichotomous data are described in Chapter 6, Section 6.4.1 . The effect of an intervention can be expressed as either a relative or an absolute effect. The risk ratio (relative risk) and odds ratio are relative measures, while the risk difference and number needed to treat for an additional beneficial outcome are absolute measures. A further complication is that there are, in fact, two risk ratios. We can calculate the risk ratio of an event occurring or the risk ratio of no event occurring. These give different summary results in a meta-analysis, sometimes dramatically so.

The selection of a summary statistic for use in meta-analysis depends on balancing three criteria (Deeks 2002). First, we desire a summary statistic that gives values that are similar for all the studies in the meta-analysis and subdivisions of the population to which the interventions will be applied. The more consistent the summary statistic, the greater is the justification for expressing the intervention effect as a single summary number. Second, the summary statistic must have the mathematical properties required to perform a valid meta-analysis. Third, the summary statistic would ideally be easily understood and applied by those using the review. The summary intervention effect should be presented in a way that helps readers to interpret and apply the results appropriately. Among effect measures for dichotomous data, no single measure is uniformly best, so the choice inevitably involves a compromise.

Consistency

Empirical evidence suggests that relative effect measures are, on average, more consistent than absolute measures (Engels et al 2000, Deeks 2002, Rücker et al 2009). For this reason, it is wise to avoid performing meta-analyses of risk differences, unless there is a clear reason to suspect that risk differences will be consistent in a particular clinical situation. On average there is little difference between the odds ratio and risk ratio in terms of consistency (Deeks 2002). When the study aims to reduce the incidence of an adverse event, there is empirical evidence that risk ratios of the adverse event are more consistent than risk ratios of the non-event (Deeks 2002). Selecting an effect measure based on what is the most consistent in a particular situation is not a generally recommended strategy, since it may lead to a selection that spuriously maximizes the precision of a meta-analysis estimate.

Mathematical properties

The most important mathematical criterion is the availability of a reliable variance estimate. The number needed to treat for an additional beneficial outcome does not have a simple variance estimator and cannot easily be used directly in meta-analysis, although it can be computed from the meta-analysis result afterwards (see Chapter 15, Section 15.4.2). There is no consensus regarding the importance of two other often-cited mathematical properties: the fact that the behaviour of the odds ratio and the risk difference does not rely on which of the two outcome states is coded as the event, and the odds ratio being the only statistic which is unbounded (see Chapter 6, Section 6.4.1).

Ease of interpretation

The odds ratio is the hardest summary statistic to understand and to apply in practice, and many practising clinicians report difficulties in using it. There are many published examples where authors have misinterpreted odds ratios from meta-analyses as risk ratios. Although odds ratios can be re-expressed for interpretation (as discussed here), there must be some concern that routine presentation of the results of systematic reviews as odds ratios will lead to frequent over-estimation of the benefits and harms of interventions when the results are applied in clinical practice. Absolute measures of effect are thought to be more easily interpreted by clinicians than relative effects (Sinclair and Bracken 1994), and allow trade-offs to be made between likely benefits and likely harms of interventions. However, they are less likely to be generalizable.

It is generally recommended that meta-analyses are undertaken using risk ratios (taking care to make a sensible choice over which category of outcome is classified as the event) or odds ratios. This is because it seems important to avoid using summary statistics for which there is empirical evidence that they are unlikely to give consistent estimates of intervention effects (the risk difference), and it is impossible to use statistics for which meta-analysis cannot be performed (the number needed to treat for an additional beneficial outcome). It may be wise to plan to undertake a sensitivity analysis to investigate whether choice of summary statistic (and selection of the event category) is critical to the conclusions of the meta-analysis (see Section 10.14 ).

It is often sensible to use one statistic for meta-analysis and to re-express the results using a second, more easily interpretable statistic. For example, often meta-analysis may be best performed using relative effect measures (risk ratios or odds ratios) and the results re-expressed using absolute effect measures (risk differences or numbers needed to treat for an additional beneficial outcome; see Chapter 15, Section 15.4). This is one of the key motivations for ‘Summary of findings’ tables in Cochrane Reviews (see Chapter 14). If odds ratios are used for meta-analysis they can also be re-expressed as risk ratios (see Chapter 15, Section 15.4). In all cases the same formulae can be used to convert upper and lower confidence limits. However, all of these transformations require specification of a value of baseline risk that indicates the likely risk of the outcome in the ‘control’ population to which the experimental intervention will be applied. Where the chosen value for this assumed comparator group risk is close to the typical observed comparator group risks across the studies, similar estimates of absolute effect will be obtained regardless of whether odds ratios or risk ratios are used for meta-analysis. Where the assumed comparator risk differs from the typical observed comparator group risk, the predictions of absolute benefit will differ according to which summary statistic was used for meta-analysis.
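The re-expression step can be sketched in Python. The conversions below are the standard relationships between a relative effect, an assumed comparator risk (ACR) and the corresponding risk with the intervention; the function names and the example values are hypothetical.

```python
def risk_from_rr(acr, rr):
    """Risk with the intervention implied by a risk ratio and an assumed comparator risk."""
    return acr * rr

def risk_from_or(acr, odds_ratio):
    """Risk with the intervention implied by an odds ratio and an assumed comparator risk."""
    return odds_ratio * acr / (1 - acr + odds_ratio * acr)

def absolute_effect(acr, intervention_risk):
    """Risk difference and number needed to treat (undefined if the difference is zero)."""
    rd = intervention_risk - acr
    nnt = 1 / abs(rd) if rd != 0 else float("inf")
    return rd, nnt

# Hypothetical assumed comparator risk of 20% and a pooled ratio of 0.5
acr = 0.2
rd_rr, nnt_rr = absolute_effect(acr, risk_from_rr(acr, 0.5))
rd_or, nnt_or = absolute_effect(acr, risk_from_or(acr, 0.5))
print(rd_rr, nnt_rr)
print(rd_or, nnt_or)
```

Applying the same functions to the lower and upper confidence limits of the pooled ratio gives the corresponding limits on the absolute scale.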

10.4.4 Meta-analysis of rare events

For rare outcomes, meta-analysis may be the only way to obtain reliable evidence of the effects of healthcare interventions. Individual studies are usually under-powered to detect differences in rare outcomes, but a meta-analysis of many studies may have adequate power to investigate whether interventions do have an impact on the incidence of the rare event. However, many methods of meta-analysis are based on large sample approximations, and are unsuitable when events are rare. Thus authors must take care when selecting a method of meta-analysis (Efthimiou 2018).

There is no single risk at which events are classified as ‘rare’. Certainly risks of 1 in 1000 constitute rare events, and many would classify risks of 1 in 100 the same way. However, the performance of methods when risks are as high as 1 in 10 may also be affected by the issues discussed in this section. What is typical is that a high proportion of the studies in the meta-analysis observe no events in one or more study arms.

Studies with no events in one or more arms

Computational problems can occur when no events are observed in one or both groups in an individual study. Inverse variance meta-analytical methods involve computing an intervention effect estimate and its standard error for each study. For studies where no events were observed in one or both arms, these computations often involve dividing by a zero count, which yields a computational error. Most meta-analytical software routines (including those in RevMan) automatically check for problematic zero counts, and add a fixed value (typically 0.5) to all cells of a 2×2 table where the problems occur. The Mantel-Haenszel methods require zero-cell corrections only if the same cell is zero in all the included studies, and hence need to use the correction less often. However, in many software applications the same correction rules are applied for Mantel-Haenszel methods as for the inverse-variance methods. Odds ratio and risk ratio methods require zero cell corrections more often than difference methods, except for the Peto odds ratio method, which encounters computation problems only in the extreme situation of no events occurring in all arms of all studies.
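The fixed zero-cell correction described above can be sketched in Python as follows. This is an illustrative sketch rather than a description of how RevMan or any other package implements the check; the function names are hypothetical, and the standard error formula for the log odds ratio is the usual large-sample one.

```python
import math

def apply_zero_cell_correction(a, b, c, d, correction=0.5):
    """Add a fixed value to every cell of a 2x2 table, only if some cell is zero."""
    if 0 in (a, b, c, d):
        return (a + correction, b + correction, c + correction, d + correction)
    return (a, b, c, d)

def log_odds_ratio(a, b, c, d):
    """Log odds ratio and large-sample SE, after any zero-cell correction."""
    a, b, c, d = apply_zero_cell_correction(a, b, c, d)
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se

# Hypothetical study with no events in the experimental arm
print(apply_zero_cell_correction(0, 50, 5, 45))
print(log_odds_ratio(0, 50, 5, 45))
```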

Whilst the fixed correction meets the objective of avoiding computational errors, it usually has the undesirable effect of biasing study estimates towards no difference and over-estimating variances of study estimates (consequently down-weighting inappropriately their contribution to the meta-analysis). Where the sizes of the study arms are unequal (which occurs more commonly in non-randomized studies than randomized trials), the correction will introduce a directional bias in the treatment effect. Alternative non-fixed zero-cell corrections have been explored by Sweeting and colleagues, including a correction proportional to the reciprocal of the size of the contrasting study arm, which they found preferable to the fixed 0.5 correction when arm sizes were not balanced (Sweeting et al 2004).

Studies with no events in either arm

The standard practice in meta-analysis of odds ratios and risk ratios is to exclude studies from the meta-analysis where there are no events in both arms. This is because such studies do not provide any indication of either the direction or magnitude of the relative treatment effect. Whilst it may be clear that events are very rare on both the experimental intervention and the comparator intervention, no information is provided as to which group is likely to have the higher risk, or on whether the risks are of the same or different orders of magnitude (when risks are very low, they are compatible with very large or very small ratios). Whilst one might be tempted to infer that the risk would be lowest in the group with the larger sample size (as the upper limit of the confidence interval would be lower), this is not justified as the sample size allocation was determined by the study investigators and is not a measure of the incidence of the event.

Risk difference methods superficially appear to have an advantage over odds ratio methods in that the risk difference is defined (as zero) when no events occur in either arm. Such studies are therefore included in the estimation process. Bradburn and colleagues undertook simulation studies which revealed that all risk difference methods yield confidence intervals that are too wide when events are rare, and have associated poor statistical power, which make them unsuitable for meta-analysis of rare events (Bradburn et al 2007). This is especially relevant when outcomes that focus on treatment safety are being studied, as the ability to identify correctly (or attempt to refute) serious adverse events is a key issue in drug development.

It is likely that outcomes for which no events occur in either arm may not be mentioned in reports of many randomized trials, precluding their inclusion in a meta-analysis. It is unclear, though, when working with published results, whether failure to mention a particular adverse event means there were no such events, or simply that such events were not included as a measured endpoint. Whilst the results of risk difference meta-analyses will be affected by non-reporting of outcomes with no events, odds and risk ratio based methods naturally exclude these data whether or not they are published, and are therefore unaffected.

Validity of methods of meta-analysis for rare events

Simulation studies have revealed that many meta-analytical methods can give misleading results for rare events, which is unsurprising given their reliance on asymptotic statistical theory. Their performance has been judged suboptimal either through results being biased, confidence intervals being inappropriately wide, or statistical power being too low to detect substantial differences.

In the following we consider the choice of statistical method for meta-analyses of odds ratios. Appropriate choices appear to depend on the comparator group risk, the likely size of the treatment effect and consideration of balance in the numbers of experimental and comparator participants in the constituent studies. We are not aware of research that has evaluated risk ratio measures directly, but their performance is likely to be very similar to corresponding odds ratio measurements. When events are rare, estimates of odds and risks are near identical, and results of both can be interpreted as ratios of probabilities.

Bradburn and colleagues found that many of the most commonly used meta-analytical methods were biased when events were rare (Bradburn et al 2007). The bias was greatest in inverse variance and DerSimonian and Laird odds ratio and risk difference methods, and the Mantel-Haenszel odds ratio method using a 0.5 zero-cell correction. As already noted, risk difference meta-analytical methods tended to show conservative confidence interval coverage and low statistical power when risks of events were low.

At event rates below 1% the Peto one-step odds ratio method was found to be the least biased and most powerful method, and provided the best confidence interval coverage, provided there was no substantial imbalance between treatment and comparator group sizes within studies, and treatment effects were not exceptionally large. This finding was consistently observed across three different meta-analytical scenarios, and was also observed by Sweeting and colleagues (Sweeting et al 2004).

This finding was noted despite the method producing only an approximation to the odds ratio. For very large effects (e.g. risk ratio=0.2) when the approximation is known to be poor, treatment effects were under-estimated, but the Peto method still had the best performance of all the methods considered for event risks of 1 in 1000, and the bias was never more than 6% of the comparator group risk.

In other circumstances (i.e. event risks above 1%, very large effects at event risks around 1%, and meta-analyses where many studies were substantially imbalanced) the best performing methods were the Mantel-Haenszel odds ratio without zero-cell corrections, logistic regression and an exact method. None of these methods is available in RevMan.

Methods that should be avoided with rare events are the inverse-variance methods (including the DerSimonian and Laird random-effects method) (Efthimiou 2018). These directly incorporate the study’s variance in the estimation of its contribution to the meta-analysis, but these are usually based on a large-sample variance approximation, which was not intended for use with rare events. We would suggest that incorporation of heterogeneity into an estimate of a treatment effect should be a secondary consideration when attempting to produce estimates of effects from sparse data – the primary concern is to discern whether there is any signal of an effect in the data.

10.5 Meta-analysis of continuous outcomes

An important assumption underlying standard methods for meta-analysis of continuous data is that the outcomes have a normal distribution in each intervention arm in each study. This assumption may not always be met, although it is unimportant in very large studies. It is useful to consider the possibility of skewed data (see Section 10.5.3 ).

10.5.1 Which effect measure for continuous outcomes?

The two summary statistics commonly used for meta-analysis of continuous data are the mean difference (MD) and the standardized mean difference (SMD). Other options are available, such as the ratio of means (see Chapter 6, Section 6.5.1). Selection of summary statistics for continuous data is principally determined by whether studies all report the outcome using the same scale (when the mean difference can be used) or using different scales (when the standardized mean difference is usually used). The ratio of means can be used in either situation, but is appropriate only when outcome measurements are strictly greater than zero. Further considerations in deciding on an effect measure that will facilitate interpretation of the findings appear in Chapter 15, Section 15.5.

The different roles played in MD and SMD approaches by the standard deviations (SDs) of outcomes observed in the two groups should be understood.

For the mean difference approach, the SDs are used together with the sample sizes to compute the weight given to each study. Studies with small SDs are given relatively higher weight whilst studies with larger SDs are given relatively smaller weights. This is appropriate if variation in SDs between studies reflects differences in the reliability of outcome measurements, but is probably not appropriate if the differences in SD reflect real differences in the variability of outcomes in the study populations.

For the standardized mean difference approach, the SDs are used to standardize the mean differences to a single scale, as well as in the computation of study weights. Thus, studies with small SDs lead to relatively higher estimates of SMD, whilst studies with larger SDs lead to relatively smaller estimates of SMD. For this to be appropriate, it must be assumed that between-study variation in SDs reflects only differences in measurement scales and not differences in the reliability of outcome measures or variability among study populations, as discussed in Chapter 6, Section .

These assumptions of the methods should be borne in mind when unexpected variation of SDs is observed across studies.
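The two roles of the SD can be seen in a short Python sketch. The standard error of the MD and the pooled-SD standardization are the usual formulae and are assumed here; the SMD shown is the simple uncorrected form (no small-sample adjustment), and all function names and numbers are hypothetical.

```python
import math

def mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Mean difference, its standard error and its inverse-variance weight."""
    md = m1 - m2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return md, se, 1 / se ** 2

def standardized_mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Uncorrected SMD: mean difference divided by the pooled SD."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Two hypothetical studies with the same means and sample sizes but different SDs
small_sd = (12.0, 5.0, 50, 10.0, 5.0, 50)
large_sd = (12.0, 10.0, 50, 10.0, 10.0, 50)

# MD approach: the smaller SD changes only the weight
print(mean_difference(*small_sd), mean_difference(*large_sd))
# SMD approach: the smaller SD also changes the estimate itself
print(standardized_mean_difference(*small_sd), standardized_mean_difference(*large_sd))
```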

10.5.2 Meta-analysis of change scores

In some circumstances an analysis based on changes from baseline will be more efficient and powerful than comparison of post-intervention values, as it removes a component of between-person variability from the analysis. However, calculation of a change score requires measurement of the outcome twice and in practice may be less efficient for outcomes that are unstable or difficult to measure precisely, where the measurement error may be larger than true between-person baseline variability. Change-from-baseline outcomes may also be preferred if they have a less skewed distribution than post-intervention measurement outcomes. Although sometimes used as a device to ‘correct’ for unlucky randomization, this practice is not recommended.

The preferred statistical approach to accounting for baseline measurements of the outcome variable is to include the baseline outcome measurements as a covariate in a regression model or analysis of covariance (ANCOVA). These analyses produce an ‘adjusted’ estimate of the intervention effect together with its standard error. These analyses are the least frequently encountered, but as they give the most precise and least biased estimates of intervention effects they should be included in the analysis when they are available. However, they can only be included in a meta-analysis using the generic inverse-variance method, since means and SDs are not available for each intervention group separately.

In practice an author is likely to discover that the studies included in a review include a mixture of change-from-baseline and post-intervention value scores. However, mixing of outcomes is not a problem when it comes to meta-analysis of MDs. There is no statistical reason why studies with change-from-baseline outcomes should not be combined in a meta-analysis with studies with post-intervention measurement outcomes when using the (unstandardized) MD method. In a randomized study, MD based on changes from baseline can usually be assumed to be addressing exactly the same underlying intervention effects as analyses based on post-intervention measurements. That is to say, the difference in mean post-intervention values will on average be the same as the difference in mean change scores. If the use of change scores does increase precision, appropriately, the studies presenting change scores will be given higher weights in the analysis than they would have received if post-intervention values had been used, as they will have smaller SDs.

When combining the data on the MD scale, authors must be careful to use the appropriate means and SDs (either of post-intervention measurements or of changes from baseline) for each study. Since the mean values and SDs for the two types of outcome may differ substantially, it may be advisable to place them in separate subgroups to avoid confusion for the reader, but the results of the subgroups can legitimately be pooled together.

In contrast, post-intervention value and change scores should not in principle be combined using standard meta-analysis approaches when the effect measure is an SMD. This is because the SDs used in the standardization reflect different things. The SD when standardizing post-intervention values reflects between-person variability at a single point in time. The SD when standardizing change scores reflects variation in between-person changes over time, so will depend on both within-person and between-person variability; within-person variability in turn is likely to depend on the length of time between measurements. Nevertheless, an empirical study of 21 meta-analyses in osteoarthritis did not find a difference between combined SMDs based on post-intervention values and combined SMDs based on change scores (da Costa et al 2013). One option is to standardize SMDs using post-intervention SDs rather than change score SDs. This would lead to valid synthesis of the two approaches, but we are not aware that an appropriate standard error for this has been derived.

A common practical problem associated with including change-from-baseline measures is that the SD of changes is not reported. Imputation of SDs is discussed in Chapter 6, Section .

10.5.3 Meta-analysis of skewed data

Analyses based on means are appropriate for data that are at least approximately normally distributed, and for data from very large trials. If the true distribution of outcomes is asymmetrical, then the data are said to be skewed. Review authors should consider the possibility and implications of skewed data when analysing continuous outcomes (see MECIR Box 10.5.a ). Skew can sometimes be diagnosed from the means and SDs of the outcomes. A rough check is available, but it is only valid if a lowest or highest possible value for an outcome is known to exist. Thus, the check may be used for outcomes such as weight, volume and blood concentrations, which have lowest possible values of 0, or for scale outcomes with minimum or maximum scores, but it may not be appropriate for change-from-baseline measures. The check involves calculating the observed mean minus the lowest possible value (or the highest possible value minus the observed mean), and dividing this by the SD. A ratio less than 2 suggests skew (Altman and Bland 1996). If the ratio is less than 1, there is strong evidence of a skewed distribution.
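The rough check for skew is simple enough to express as a small Python function. The function name, the wording of the verdicts and the example values are hypothetical; the thresholds of 2 and 1 are those stated above.

```python
def skew_check(mean, sd, lowest=None, highest=None):
    """Rough skew check for an outcome with a known lowest or highest possible value."""
    if (lowest is None) == (highest is None):
        raise ValueError("provide exactly one of 'lowest' or 'highest'")
    # Distance from the observed mean to the bound, in SD units
    distance = mean - lowest if lowest is not None else highest - mean
    ratio = distance / sd
    if ratio < 1:
        verdict = "strong evidence of skew"
    elif ratio < 2:
        verdict = "skew suggested"
    else:
        verdict = "no skew flagged by this check"
    return ratio, verdict

# Hypothetical blood concentration: mean 3.0, SD 2.5, lowest possible value 0
print(skew_check(3.0, 2.5, lowest=0.0))
```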

Transformation of the original outcome data may reduce skew substantially. Reports of trials may present results on a transformed scale, usually a log scale. Collection of appropriate data summaries from the trialists, or acquisition of individual patient data, is currently the approach of choice. Appropriate data summaries and analysis strategies for the individual patient data will depend on the situation. Consultation with a knowledgeable statistician is advised.

Where data have been analysed on a log scale, results are commonly presented as geometric means and ratios of geometric means. A meta-analysis may be then performed on the scale of the log-transformed data; an example of the calculation of the required means and SD is given in Chapter 6, Section . This approach depends on being able to obtain transformed data for all studies; methods for transforming from one scale to the other are available (Higgins et al 2008b). Log-transformed and untransformed data should not be mixed in a meta-analysis.

MECIR Box 10.5.a Relevant expectations for conduct of intervention reviews

10.6 Combining dichotomous and continuous outcomes

Occasionally authors encounter a situation where data for the same outcome are presented in some studies as dichotomous data and in other studies as continuous data. For example, scores on depression scales can be reported as means, or as the percentage of patients who were depressed at some point after an intervention (i.e. with a score above a specified cut-point). This type of information is often easier to understand, and more helpful, when it is dichotomized. However, deciding on a cut-point may be arbitrary, and information is lost when continuous data are transformed to dichotomous data.

There are several options for handling combinations of dichotomous and continuous data. Generally, it is useful to summarize results from all the relevant, valid studies in a similar way, but this is not always possible. It may be possible to collect missing data from investigators so that this can be done. If not, it may be useful to summarize the data in three ways: by entering the means and SDs as continuous outcomes, by entering the counts as dichotomous outcomes and by entering all of the data in text form as ‘Other data’ outcomes.

There are statistical approaches available that will re-express odds ratios as SMDs (and vice versa), allowing dichotomous and continuous data to be combined (Anzures-Cabrera et al 2011). A simple approach is as follows. Based on an assumption that the underlying continuous measurements in each intervention group follow a logistic distribution (which is a symmetrical distribution similar in shape to the normal distribution, but with more data in the distributional tails), and that the variability of the outcomes is the same in both experimental and comparator participants, the odds ratios can be re-expressed as a SMD according to the following simple formula (Chinn 2000):

$$\mathrm{SMD} = \frac{\sqrt{3}}{\pi}\,\ln \mathrm{OR}$$

The standard error of the log odds ratio can be converted to the standard error of a SMD by multiplying by the same constant (√3/π = 0.5513). Alternatively, SMDs can be re-expressed as log odds ratios by multiplying by π/√3 = 1.814. Once SMDs (or log odds ratios) and their standard errors have been computed for all studies in the meta-analysis, they can be combined using the generic inverse-variance method. Standard errors can be computed for all studies by entering the data as dichotomous and continuous outcome type data, as appropriate, and converting the confidence intervals for the resulting log odds ratios and SMDs into standard errors (see Chapter 6, Section 6.3).
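As a minimal sketch of the conversion described above, the following Python applies the constant √3/π to a log odds ratio and its standard error, and recovers a standard error from a 95% confidence interval. The function names and the example odds ratio are illustrative assumptions, not values from the source, and the confidence-interval step assumes a normal approximation on the log scale.

```python
import math

# Constant implied by the logistic-distribution assumption (Chinn 2000).
SQRT3_OVER_PI = math.sqrt(3) / math.pi  # approximately 0.5513


def log_or_to_smd(log_or, se_log_or):
    """Re-express a log odds ratio and its standard error as an SMD."""
    return log_or * SQRT3_OVER_PI, se_log_or * SQRT3_OVER_PI


def smd_to_log_or(smd, se_smd):
    """Re-express an SMD and its standard error as a log odds ratio."""
    return smd / SQRT3_OVER_PI, se_smd / SQRT3_OVER_PI


def se_from_ci(lower, upper, z=1.96):
    """Standard error from a 95% CI given on the analysis scale."""
    return (upper - lower) / (2 * z)


# Hypothetical study: odds ratio 2.0 with 95% CI 1.25 to 3.2.
log_or = math.log(2.0)
se_log_or = se_from_ci(math.log(1.25), math.log(3.2))
smd, se_smd = log_or_to_smd(log_or, se_log_or)
print(f"SMD = {smd:.3f}, SE = {se_smd:.3f}")
```

The resulting SMDs and standard errors could then be pooled with estimates from studies that reported continuous data, using the generic inverse-variance method.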

10.7 Meta-analysis of ordinal outcomes and measurement scales

Ordinal and measurement scale outcomes are most commonly meta-analysed as dichotomous data (if so, see Section 10.4) or continuous data (if so, see Section 10.5) depending on the way that the study authors performed the original analyses.

Occasionally it is possible to analyse the data using proportional odds models. This is the case when ordinal scales have a small number of categories, the numbers falling into each category for each intervention group can be obtained, and the same ordinal scale has been used in all studies. This approach may make more efficient use of all available data than dichotomization, but requires access to statistical software and results in a summary statistic for which it is challenging to find a clinical meaning.

The proportional odds model uses the proportional odds ratio as the measure of intervention effect (Agresti 1996) (see Chapter 6, Section 6.6 ), and can be used for conducting a meta-analysis in advanced statistical software packages (Whitehead and Jones 1994). Estimates of log odds ratios and their standard errors from a proportional odds model may be meta-analysed using the generic inverse-variance method (see Section 10.3.3 ). If the same ordinal scale has been used in all studies, but in some reports has been presented as a dichotomous outcome, it may still be possible to include all studies in the meta-analysis. In the context of the three-category model, this might mean that for some studies category 1 constitutes a success, while for others both categories 1 and 2 constitute a success. Methods are available for dealing with this, and for combining data from scales that are related but have different definitions for their categories (Whitehead and Jones 1994).

10.8 Meta-analysis of counts and rates

Results may be expressed as count data when each participant may experience an event, and may experience it more than once (see Chapter 6, Section 6.7 ). For example, ‘number of strokes’, or ‘number of hospital visits’ are counts. These events may not happen at all, but if they do happen there is no theoretical maximum number of occurrences for an individual. Count data may be analysed using methods for dichotomous data if the counts are dichotomized for each individual (see Section 10.4 ), continuous data (see Section 10.5 ) and time-to-event data (see Section 10.9 ), as well as being analysed as rate data.

Rate data occur if counts are measured for each participant along with the time over which they are observed. This is particularly appropriate when the events being counted are rare. For example, a woman may experience two strokes during a follow-up period of two years. Her rate of strokes is one per year of follow-up (or, equivalently 0.083 per month of follow-up). Rates are conventionally summarized at the group level. For example, participants in the comparator group of a clinical trial may experience 85 strokes during a total of 2836 person-years of follow-up. An underlying assumption associated with the use of rates is that the risk of an event is constant across participants and over time. This assumption should be carefully considered for each situation. For example, in contraception studies, rates have been used (known as Pearl indices) to describe the number of pregnancies per 100 women-years of follow-up. This is now considered inappropriate since couples have different risks of conception, and the risk for each woman changes over time. Pregnancies are now analysed more often using life tables or time-to-event methods that investigate the time elapsing before the first pregnancy.

Analysing count data as rates is not always the most appropriate approach and is uncommon in practice. This is because:

  • the assumption of a constant underlying risk may not be suitable; and
  • the statistical methods are not as well developed as they are for other types of data.

The results of a study may be expressed as a rate ratio, that is the ratio of the rate in the experimental intervention group to the rate in the comparator group. The (natural) logarithms of the rate ratios may be combined across studies using the generic inverse-variance method (see Section 10.3.3). Alternatively, Poisson regression approaches can be used (Spittal et al 2015).
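The sketch below shows one way a log rate ratio and its standard error might be computed for a single study before inverse-variance pooling. The standard error uses the usual large-sample approximation √(1/E₁ + 1/E₂), which is not stated in the text above. The comparator figures (85 strokes in 2836 person-years) come from the earlier example, while the experimental-group figures and the function name are hypothetical.

```python
import math


def log_rate_ratio(events_exp, time_exp, events_comp, time_comp):
    """Log rate ratio (experimental vs comparator) and its approximate SE.

    The SE is the common large-sample approximation sqrt(1/E1 + 1/E2),
    where E1 and E2 are the event counts in the two groups.
    """
    rate_exp = events_exp / time_exp
    rate_comp = events_comp / time_comp
    log_rr = math.log(rate_exp / rate_comp)
    se = math.sqrt(1 / events_exp + 1 / events_comp)
    return log_rr, se


# Comparator group from the text: 85 strokes in 2836 person-years.
# Experimental group (hypothetical): 60 strokes in 2900 person-years.
log_rr, se = log_rate_ratio(60, 2900, 85, 2836)
print(f"rate ratio = {math.exp(log_rr):.2f}, SE(log rate ratio) = {se:.3f}")
```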

In a randomized trial, rate ratios may often be very similar to risk ratios obtained after dichotomizing the participants, since the average period of follow-up should be similar in all intervention groups. Rate ratios and risk ratios will differ, however, if an intervention affects the likelihood of some participants experiencing multiple events.

It is also possible to focus attention on the rate difference (see Chapter 6, Section 6.7.1). The analysis can again be performed using the generic inverse-variance method (Hasselblad and McCrory 1995, Guevara et al 2004).

10.9 Meta-analysis of time-to-event outcomes

Two approaches to meta-analysis of time-to-event outcomes are readily available to Cochrane Review authors. The choice of which to use will depend on the type of data that have been extracted from the primary studies, or obtained from re-analysis of individual participant data.

If ‘O – E’ and ‘V’ statistics have been obtained (see Chapter 6, Section 6.8.2 ), either through re-analysis of individual participant data or from aggregate statistics presented in the study reports, then these statistics may be entered directly into RevMan using the ‘O – E and Variance’ outcome type. There are several ways to calculate these ‘O – E’ and ‘V’ statistics. Peto’s method applied to dichotomous data (Section 10.4.2 ) gives rise to an odds ratio; a log-rank approach gives rise to a hazard ratio; and a variation of the Peto method for analysing time-to-event data gives rise to something in between (Simmonds et al 2011). The appropriate effect measure should be specified. Only fixed-effect meta-analysis methods are available in RevMan for ‘O – E and Variance’ outcomes.

Alternatively, if estimates of log hazard ratios and standard errors have been obtained from results of Cox proportional hazards regression models, study results can be combined using generic inverse-variance methods (see Section 10.3.3).

If a mixture of log-rank and Cox model estimates is obtained from the studies, all results can be combined using the generic inverse-variance method, as the log-rank estimates can be converted into log hazard ratios and standard errors using the approaches discussed in Chapter 6, Section 6.8.
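To illustrate how log-rank and Cox results might be put on a common footing, the sketch below converts 'O – E' and 'V' statistics to a log hazard ratio using the standard relationship ln(HR) ≈ (O − E)/V with variance 1/V, and then pools all studies with fixed-effect inverse-variance weights. This relationship is an assumption brought in for the example rather than something stated in the text above, and the study values and function names are hypothetical.

```python
import math


def log_hr_from_oe_v(o_minus_e, v):
    """Approximate log hazard ratio and SE from log-rank O-E and V."""
    return o_minus_e / v, 1 / math.sqrt(v)


def inverse_variance_pool(estimates, ses):
    """Fixed-effect inverse-variance summary estimate and its SE."""
    weights = [1 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total
    return pooled, math.sqrt(1 / total)


# Hypothetical data: one study reporting O-E and V, two reporting Cox results.
log_hr_1, se_1 = log_hr_from_oe_v(-5.0, 20.0)
estimates = [log_hr_1, -0.30, -0.10]
ses = [se_1, 0.15, 0.25]

pooled, se_pooled = inverse_variance_pool(estimates, ses)
print(f"pooled HR = {math.exp(pooled):.2f}, SE(log HR) = {se_pooled:.3f}")
```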

10.10 Heterogeneity

10.10.1 What is heterogeneity?

Inevitably, studies brought together in a systematic review will differ. Any kind of variability among studies in a systematic review may be termed heterogeneity. It can be helpful to distinguish between different types of heterogeneity. Variability in the participants, interventions and outcomes studied may be described as clinical diversity (sometimes called clinical heterogeneity), and variability in study design, outcome measurement tools and risk of bias may be described as methodological diversity (sometimes called methodological heterogeneity). Variability in the intervention effects being evaluated in the different studies is known as statistical heterogeneity, and is a consequence of clinical or methodological diversity, or both, among the studies. Statistical heterogeneity manifests itself in the observed intervention effects being more different from each other than one would expect due to random error (chance) alone. We will follow convention and refer to statistical heterogeneity simply as heterogeneity.

Clinical variation will lead to heterogeneity if the intervention effect is affected by the factors that vary across studies; most obviously, the specific interventions or patient characteristics. In other words, the true intervention effect will be different in different studies.

Differences between studies in terms of methodological factors, such as use of blinding and concealment of allocation sequence, or if there are differences between studies in the way the outcomes are defined and measured, may be expected to lead to differences in the observed intervention effects. Significant statistical heterogeneity arising from methodological diversity or differences in outcome assessments suggests that the studies are not all estimating the same quantity, but does not necessarily suggest that the true intervention effect varies. In particular, heterogeneity associated solely with methodological diversity would indicate that the studies suffer from different degrees of bias. Empirical evidence suggests that some aspects of design can affect the result of clinical trials, although this is not always the case. Further discussion appears in Chapter 7 and Chapter 8 .

The scope of a review will largely determine the extent to which studies included in a review are diverse. Sometimes a review will include studies addressing a variety of questions, for example when several different interventions for the same condition are of interest (see also Chapter 11 ) or when the differential effects of an intervention in different populations are of interest. Meta-analysis should only be considered when a group of studies is sufficiently homogeneous in terms of participants, interventions and outcomes to provide a meaningful summary (see MECIR Box 10.10.a. ). It is often appropriate to take a broader perspective in a meta-analysis than in a single clinical trial. A common analogy is that systematic reviews bring together apples and oranges, and that combining these can yield a meaningless result. This is true if apples and oranges are of intrinsic interest on their own, but may not be if they are used to contribute to a wider question about fruit. For example, a meta-analysis may reasonably evaluate the average effect of a class of drugs by combining results from trials where each evaluates the effect of a different drug from the class.

MECIR Box 10.10.a Relevant expectations for conduct of intervention reviews

There may be specific interest in a review in investigating how clinical and methodological aspects of studies relate to their results. Where possible these investigations should be specified a priori (i.e. in the protocol for the systematic review). It is legitimate for a systematic review to focus on examining the relationship between some clinical characteristic(s) of the studies and the size of intervention effect, rather than on obtaining a summary effect estimate across a series of studies (see Section 10.11 ). Meta-regression may best be used for this purpose, although it is not implemented in RevMan (see Section 10.11.4 ).

10.10.2 Identifying and measuring heterogeneity

It is essential to consider the extent to which the results of studies are consistent with each other (see MECIR Box 10.10.b). If confidence intervals for the results of individual studies (generally depicted graphically using horizontal lines) have poor overlap, this generally indicates the presence of statistical heterogeneity. More formally, a statistical test for heterogeneity is available. This Chi² (χ², or chi-squared) test is included in the forest plots in Cochrane Reviews. It assesses whether observed differences in results are compatible with chance alone. A low P value (or a large Chi² statistic relative to its degrees of freedom) provides evidence of heterogeneity of intervention effects (variation in effect estimates beyond chance).

MECIR Box 10.10.b Relevant expectations for conduct of intervention reviews

Care must be taken in the interpretation of the Chi² test, since it has low power in the (common) situation of a meta-analysis when studies have small sample size or are few in number. This means that while a statistically significant result may indicate a problem with heterogeneity, a non-significant result must not be taken as evidence of no heterogeneity. This is also why a P value of 0.10, rather than the conventional level of 0.05, is sometimes used to determine statistical significance. A further problem with the test, which seldom occurs in Cochrane Reviews, is that when there are many studies in a meta-analysis, the test has high power to detect a small amount of heterogeneity that may be clinically unimportant.

Some argue that, since clinical and methodological diversity always occur in a meta-analysis, statistical heterogeneity is inevitable (Higgins et al 2003). Thus, the test for heterogeneity is irrelevant to the choice of analysis; heterogeneity will always exist whether or not we happen to be able to detect it using a statistical test. Methods have been developed for quantifying inconsistency across studies that move the focus away from testing whether heterogeneity is present to assessing its impact on the meta-analysis. A useful statistic for quantifying inconsistency is:

$$I^2 = \left(\frac{Q - df}{Q}\right) \times 100\%$$

In this equation, Q is the Chi² statistic and df is its degrees of freedom (Higgins and Thompson 2002, Higgins et al 2003). I² describes the percentage of the variability in effect estimates that is due to heterogeneity rather than sampling error (chance).
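A minimal Python sketch of this calculation follows, taking I² = (Q − df)/Q × 100% and setting negative values to zero, which is the usual convention. Q is computed here as the inverse-variance weighted sum of squared deviations from the fixed-effect summary estimate. The function names and the study data are hypothetical.

```python
def cochran_q(estimates, ses):
    """Chi-squared heterogeneity statistic Q with inverse-variance weights."""
    weights = [1 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))


def i_squared(q, df):
    """I-squared as a percentage, truncated at zero when Q <= df."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Hypothetical effect estimates (e.g. log odds ratios) and standard errors.
estimates = [0.10, 0.35, -0.05, 0.60]
ses = [0.12, 0.15, 0.20, 0.18]

q = cochran_q(estimates, ses)
df = len(estimates) - 1
print(f"Q = {q:.2f} on {df} df, I2 = {i_squared(q, df):.0f}%")
```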

Thresholds for the interpretation of the I² statistic can be misleading, since the importance of inconsistency depends on several factors. A rough guide to interpretation in the context of meta-analyses of randomized trials is as follows:

  • 0% to 40%: might not be important;
  • 30% to 60%: may represent moderate heterogeneity*;
  • 50% to 90%: may represent substantial heterogeneity*;
  • 75% to 100%: considerable heterogeneity*.

*The importance of the observed value of I² depends on (1) magnitude and direction of effects, and (2) strength of evidence for heterogeneity (e.g. P value from the Chi² test, or a confidence interval for I²: uncertainty in the value of I² is substantial when the number of studies is small).

10.10.3 Strategies for addressing heterogeneity

Review authors must take into account any statistical heterogeneity when interpreting results, particularly when there is variation in the direction of effect (see MECIR Box 10.10.c ). A number of options are available if heterogeneity is identified among a group of studies that would otherwise be considered suitable for a meta-analysis.

MECIR Box 10.10.c Relevant expectations for conduct of intervention reviews

  • Check again that the data are correct. Severe apparent heterogeneity can indicate that data have been incorrectly extracted or entered into meta-analysis software. For example, if standard errors have mistakenly been entered as SDs for continuous outcomes, this could manifest itself in overly narrow confidence intervals with poor overlap and hence substantial heterogeneity. Unit-of-analysis errors may also be causes of heterogeneity (see Chapter 6, Section 6.2 ).  
  • Do not do a meta-analysis. A systematic review need not contain any meta-analyses. If there is considerable variation in results, and particularly if there is inconsistency in the direction of effect, it may be misleading to quote an average value for the intervention effect.
  • Explore heterogeneity. It is clearly of interest to determine the causes of heterogeneity among results of studies. This process is problematic since there are often many characteristics that vary across studies from which one may choose. Heterogeneity may be explored by conducting subgroup analyses (see Section 10.11.3 ) or meta-regression (see Section 10.11.4 ). Reliable conclusions can only be drawn from analyses that are truly pre-specified before inspecting the studies’ results, and even these conclusions should be interpreted with caution. Explorations of heterogeneity that are devised after heterogeneity is identified can at best lead to the generation of hypotheses. They should be interpreted with even more caution and should generally not be listed among the conclusions of a review. Also, investigations of heterogeneity when there are very few studies are of questionable value.  
  • Ignore heterogeneity. Fixed-effect meta-analyses ignore heterogeneity. The summary effect estimate from a fixed-effect meta-analysis is normally interpreted as being the best estimate of the intervention effect. However, the existence of heterogeneity suggests that there may not be a single intervention effect but a variety of intervention effects. Thus, the summary fixed-effect estimate may be an intervention effect that does not actually exist in any population, and therefore have a confidence interval that is meaningless as well as being too narrow (see Section 10.10.4 ).  
  • Perform a random-effects meta-analysis. A random-effects meta-analysis may be used to incorporate heterogeneity among studies. This is not a substitute for a thorough investigation of heterogeneity. It is intended primarily for heterogeneity that cannot be explained. An extended discussion of this option appears in Section 10.10.4 .  
  • Reconsider the effect measure. Heterogeneity may be an artificial consequence of an inappropriate choice of effect measure. For example, when studies collect continuous outcome data using different scales or different units, extreme heterogeneity may be apparent when using the mean difference but not when the more appropriate standardized mean difference is used. Furthermore, choice of effect measure for dichotomous outcomes (odds ratio, risk ratio, or risk difference) may affect the degree of heterogeneity among results. In particular, when comparator group risks vary, homogeneous odds ratios or risk ratios will necessarily lead to heterogeneous risk differences, and vice versa. However, it remains unclear whether homogeneity of intervention effect in a particular meta-analysis is a suitable criterion for choosing between these measures (see also Section 10.4.3 ).  
  • Exclude studies. Heterogeneity may be due to the presence of one or two outlying studies with results that conflict with the rest of the studies. In general it is unwise to exclude studies from a meta-analysis on the basis of their results as this may introduce bias. However, if an obvious reason for the outlying result is apparent, the study might be removed with more confidence. Since usually at least one characteristic can be found for any study in any meta-analysis which makes it different from the others, this criterion is unreliable because it is all too easy to fulfil. It is advisable to perform analyses both with and without outlying studies as part of a sensitivity analysis (see Section 10.14 ). Whenever possible, potential sources of clinical diversity that might lead to such situations should be specified in the protocol.

10.10.4 Incorporating heterogeneity into random-effects models

The random-effects meta-analysis approach incorporates an assumption that the different studies are estimating different, yet related, intervention effects (DerSimonian and Laird 1986, Borenstein et al 2010). The approach allows us to address heterogeneity that cannot readily be explained by other factors. A random-effects meta-analysis model involves an assumption that the effects being estimated in the different studies follow some distribution. The model represents our lack of knowledge about why real, or apparent, intervention effects differ, by considering the differences as if they were random. The centre of the assumed distribution describes the average of the effects, while its width describes the degree of heterogeneity. The conventional choice of distribution is a normal distribution. It is difficult to establish the validity of any particular distributional assumption, and this is a common criticism of random-effects meta-analyses. The importance of the assumed shape for this distribution has not been widely studied.

To undertake a random-effects meta-analysis, the standard errors of the study-specific estimates (SE_i in Section 10.3.1) are adjusted to incorporate a measure of the extent of variation, or heterogeneity, among the intervention effects observed in different studies (this variation is often referred to as Tau-squared, τ², or Tau²). The amount of variation, and hence the adjustment, can be estimated from the intervention effects and standard errors of the studies included in the meta-analysis.

In a heterogeneous set of studies, a random-effects meta-analysis will award relatively more weight to smaller studies than such studies would receive in a fixed-effect meta-analysis. This is because small studies are more informative for learning about the distribution of effects across studies than for learning about an assumed common intervention effect.
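The shift in weight towards smaller studies can be seen in a short sketch. It assumes the usual inverse-variance form in which each study's weight is 1/(SE² + Tau²), with Tau² = 0 giving the fixed-effect weights. The standard errors, the value of Tau² and the function name are hypothetical.

```python
def relative_weights(ses, tau2=0.0):
    """Relative inverse-variance weights; tau2 = 0 gives fixed-effect weights."""
    raw = [1 / (se ** 2 + tau2) for se in ses]
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical large study (SE 0.1) and small study (SE 0.4).
ses = [0.1, 0.4]

fixed = relative_weights(ses)
random_effects = relative_weights(ses, tau2=0.04)

for label, weights in (("fixed", fixed), ("random", random_effects)):
    shares = ", ".join(f"{100 * w:.0f}%" for w in weights)
    print(f"{label}-effect(s) weights: {shares}")
```

Because the same Tau² is added to every study's variance, it dominates the variance of large studies but changes that of small studies proportionally less, so the weights become more equal.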

Note that a random-effects model does not ‘take account’ of the heterogeneity, in the sense that it is no longer an issue. It is always preferable to explore possible causes of heterogeneity, although there may be too few studies to do this adequately (see Section 10.11).

Fixed or random effects?

A fixed-effect meta-analysis provides a result that may be viewed as a ‘typical intervention effect’ from the studies included in the analysis. In order to calculate a confidence interval for a fixed-effect meta-analysis the assumption is usually made that the true effect of intervention (in both magnitude and direction) is the same value in every study (i.e. fixed across studies). This assumption implies that the observed differences among study results are due solely to the play of chance (i.e. that there is no statistical heterogeneity).

A random-effects model provides a result that may be viewed as an ‘average intervention effect’, where this average is explicitly defined according to an assumed distribution of effects across studies. Instead of assuming that the intervention effects are the same, we assume that they follow (usually) a normal distribution. The assumption implies that the observed differences among study results are due to a combination of the play of chance and some genuine variation in the intervention effects.

The random-effects method and the fixed-effect method will give identical results when there is no heterogeneity among the studies.

When heterogeneity is present, a confidence interval around the random-effects summary estimate is wider than a confidence interval around a fixed-effect summary estimate. This will happen whenever the I² statistic is greater than zero, even if the heterogeneity is not detected by the Chi² test for heterogeneity (see Section 10.10.2).

Sometimes the central estimate of the intervention effect is different between fixed-effect and random-effects analyses. In particular, if results of smaller studies are systematically different from results of larger ones, which can happen as a result of publication bias or within-study bias in smaller studies (Egger et al 1997, Poole and Greenland 1999, Kjaergard et al 2001), then a random-effects meta-analysis will exacerbate the effects of the bias (see also Chapter 13, Section ). A fixed-effect analysis will be affected less, although strictly it will also be inappropriate.

The decision between fixed- and random-effects meta-analyses has been the subject of much debate, and we do not provide a universal recommendation. Some considerations in making this choice are as follows:

  • Many have argued that the decision should be based on an expectation of whether the intervention effects are truly identical, preferring the fixed-effect model if this is likely and a random-effects model if this is unlikely (Borenstein et al 2010). Since it is generally considered to be implausible that intervention effects across studies are identical (unless the intervention has no effect at all), this leads many to advocate use of the random-effects model.
  • Others have argued that a fixed-effect analysis can be interpreted in the presence of heterogeneity, and that it makes fewer assumptions than a random-effects meta-analysis. They then refer to it as a ‘fixed-effects’ meta-analysis (Peto et al 1995, Rice et al 2018).
  • Under any interpretation, a fixed-effect meta-analysis ignores heterogeneity. If the method is used, it is therefore important to supplement it with a statistical investigation of the extent of heterogeneity (see Section 10.10.2 ).
  • In the presence of heterogeneity, a random-effects analysis gives relatively more weight to smaller studies and relatively less weight to larger studies. If there is additionally some funnel plot asymmetry (i.e. a relationship between intervention effect magnitude and study size), then this will push the results of the random-effects analysis towards the findings in the smaller studies. In the context of randomized trials, this is generally regarded as an unfortunate consequence of the model.
  • A pragmatic approach is to plan to undertake both a fixed-effect and a random-effects meta-analysis, with an intention to present the random-effects result if there is no indication of funnel plot asymmetry. If there is an indication of funnel plot asymmetry, then both methods are problematic. It may be reasonable to present both analyses or neither, or to perform a sensitivity analysis in which small studies are excluded or addressed directly using meta-regression (see Chapter 13, Section ).
  • The choice between a fixed-effect and a random-effects meta-analysis should never be made on the basis of a statistical test for heterogeneity.

Interpretation of random-effects meta-analyses

The summary estimate and confidence interval from a random-effects meta-analysis refer to the centre of the distribution of intervention effects, but do not describe the width of the distribution. Often the summary estimate and its confidence interval are quoted in isolation and portrayed as a sufficient summary of the meta-analysis. This is inappropriate. The confidence interval from a random-effects meta-analysis describes uncertainty in the location of the mean of systematically different effects in the different studies. It does not describe the degree of heterogeneity among studies, as may be commonly believed. For example, when there are many studies in a meta-analysis, we may obtain a very tight confidence interval around the random-effects estimate of the mean effect even when there is a large amount of heterogeneity. A solution to this problem is to consider a prediction interval (see Section ).

Methodological diversity creates heterogeneity through biases variably affecting the results of different studies. The random-effects summary estimate will only correctly estimate the average intervention effect if the biases are symmetrically distributed, leading to a mixture of over-estimates and under-estimates of effect, which is unlikely to be the case. In practice it can be very difficult to distinguish whether heterogeneity results from clinical or methodological diversity, and in most cases it is likely to be due to both, so these distinctions are hard to draw in the interpretation.

When there is little information, either because there are few studies or if the studies are small with few events, a random-effects analysis will provide poor estimates of the amount of heterogeneity (i.e. of the width of the distribution of intervention effects). Fixed-effect methods such as the Mantel-Haenszel method will provide more robust estimates of the average intervention effect, but at the cost of ignoring any heterogeneity.

Prediction intervals from a random-effects meta-analysis

An estimate of the between-study variance in a random-effects meta-analysis is typically presented as part of its results. The square root of this number (i.e. Tau) is the estimated standard deviation of underlying effects across studies. Prediction intervals are a way of expressing this value in an interpretable way.

To motivate the idea of a prediction interval, note that for absolute measures of effect (e.g. risk difference, mean difference, standardized mean difference), an approximate 95% range of normally distributed underlying effects can be obtained by creating an interval from 1.96 × Tau below the random-effects mean, to 1.96 × Tau above it. (For relative measures such as the odds ratio and risk ratio, an equivalent interval needs to be based on the natural logarithm of the summary estimate.) In reality, both the summary estimate and the value of Tau are associated with uncertainty. A prediction interval seeks to present the range of effects in a way that acknowledges this uncertainty (Higgins et al 2009). A simple 95% prediction interval can be calculated as:

$$M \pm t_{k-2} \times \sqrt{\mathrm{Tau}^2 + \mathrm{SE}(M)^2}$$

where M is the summary mean from the random-effects meta-analysis, t_{k−2} is the 95% percentile of a t-distribution with k−2 degrees of freedom, k is the number of studies, Tau² is the estimated amount of heterogeneity and SE(M) is the standard error of the summary mean.
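A small sketch of the calculation follows, reading the definitions above as the interval M ± t × √(Tau² + SE(M)²). The critical value is passed in by the caller because the Python standard library has no t-distribution quantile function. The example assumes it is the two-sided 95% value (the 97.5th percentile) for k − 2 degrees of freedom, which is 2.228 for k = 12. All numerical inputs and the function name are hypothetical.

```python
import math


def prediction_interval(m, se_m, tau2, t_crit):
    """Approximate prediction interval around a random-effects mean.

    t_crit is the t critical value on k - 2 degrees of freedom,
    supplied by the caller (for example from statistical tables).
    """
    half_width = t_crit * math.sqrt(tau2 + se_m ** 2)
    return m - half_width, m + half_width


# Hypothetical meta-analysis of k = 12 studies on the SMD scale.
m, se_m, tau2 = 0.30, 0.10, 0.04
lower, upper = prediction_interval(m, se_m, tau2, t_crit=2.228)
print(f"95% prediction interval: {lower:.2f} to {upper:.2f}")
```

For ratio measures, the same calculation would be applied on the log scale and the limits exponentiated afterwards.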

The term ‘prediction interval’ relates to the use of this interval to predict the possible underlying effect in a new study that is similar to the studies in the meta-analysis. A more useful interpretation of the interval is as a summary of the spread of underlying effects in the studies included in the random-effects meta-analysis.

Prediction intervals have proved a popular way of expressing the amount of heterogeneity in a meta-analysis (Riley et al 2011). They are, however, strongly based on the assumption of a normal distribution for the effects across studies, and can be very problematic when the number of studies is small, in which case they can appear spuriously wide or spuriously narrow. Nevertheless, we encourage their use when the number of studies is reasonable (e.g. more than ten) and there is no clear funnel plot asymmetry.

Implementing random-effects meta-analyses

As introduced in Section 10.3.2 , the random-effects model can be implemented using an inverse-variance approach, incorporating a measure of the extent of heterogeneity into the study weights. RevMan implements a version of random-effects meta-analysis that is described by DerSimonian and Laird, making use of a ‘moment-based’ estimate of the between-study variance (DerSimonian and Laird 1986). The attraction of this method is that the calculations are straightforward, but it has a theoretical disadvantage in that the confidence intervals are slightly too narrow to encompass full uncertainty resulting from having estimated the degree of heterogeneity.
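The moment-based calculation can be sketched in a few lines of Python. This is an illustrative implementation of the DerSimonian and Laird estimator as it is commonly written, not a reproduction of RevMan's code. The function name and the example data are hypothetical.

```python
import math


def dersimonian_laird(estimates, ses):
    """Random-effects summary using the moment-based estimate of Tau-squared.

    Returns (pooled estimate, its standard error, tau2).
    """
    w = [1 / se ** 2 for se in ses]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)  # negative estimates are set to zero

    # Re-weight each study with tau2 added to its within-study variance.
    w_re = [1 / (se ** 2 + tau2) for se in ses]
    sw_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sw_re
    return pooled, math.sqrt(1 / sw_re), tau2


# Hypothetical log odds ratios and standard errors from four studies.
pooled, se, tau2 = dersimonian_laird(
    [0.10, 0.35, -0.05, 0.60], [0.12, 0.15, 0.20, 0.18]
)
print(f"pooled = {pooled:.3f}, SE = {se:.3f}, Tau2 = {tau2:.3f}")
```

The confidence interval built from this standard error treats Tau² as known, which is the source of the slight under-coverage noted above.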

For many years, RevMan has implemented two random-effects methods for dichotomous data: a Mantel-Haenszel method and an inverse-variance method. Both use the moment-based approach to estimating the amount of between-studies variation. The difference between the two is subtle: the former estimates the between-study variation by comparing each study’s result with a Mantel-Haenszel fixed-effect meta-analysis result, whereas the latter estimates it by comparing each study’s result with an inverse-variance fixed-effect meta-analysis result. In practice, the difference is likely to be trivial.

There are alternative methods for performing random-effects meta-analyses that have better technical properties than the DerSimonian and Laird approach with a moment-based estimate (Veroniki et al 2016). Most notable among these is an adjustment to the confidence interval proposed by Hartung and Knapp and by Sidik and Jonkman (Hartung and Knapp 2001, Sidik and Jonkman 2002). This adjustment widens the confidence interval to reflect uncertainty in the estimation of between-study heterogeneity, and it should be used if available to review authors. An alternative option to encompass full uncertainty in the degree of heterogeneity is to take a Bayesian approach (see Section 10.13).
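A minimal sketch of the adjusted interval follows, assuming the usual form of the Hartung-Knapp-Sidik-Jonkman variance (a weighted residual sum of squares divided by k − 1 times the sum of the random-effects weights) combined with a t distribution on k − 1 degrees of freedom. The between-study variance and the t critical value are supplied by the caller, and all names and data are hypothetical.

```python
import math

def hartung_knapp_interval(effects, variances, tau2, t_crit):
    """Random-effects estimate with a Hartung-Knapp style 95% interval.

    tau2   : between-study variance from any estimator (assumed given)
    t_crit : 97.5th percentile of a t distribution with k - 1 degrees of freedom
    """
    k = len(effects)
    w = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Adjusted variance: weighted residual sum of squares scaled by (k - 1) * sum of weights.
    var_hk = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects)) / ((k - 1) * sum(w))
    half_width = t_crit * math.sqrt(var_hk)
    return pooled, pooled - half_width, pooled + half_width

# Three hypothetical studies; with 2 degrees of freedom the critical value is 4.303.
pooled, low, high = hartung_knapp_interval([0.1, 0.3, 0.5], [0.01] * 3, tau2=0.03, t_crit=4.303)
print(f"pooled = {pooled:.3f}, adjusted 95% CI = ({low:.3f}, {high:.3f})")
```

With so few studies the t critical value is much larger than 1.96, which is what drives the wider interval in this example.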

An empirical comparison of different ways to estimate between-study variation in Cochrane meta-analyses has shown that they can lead to substantial differences in estimates of heterogeneity, but seldom have major implications for estimating summary effects (Langan et al 2015). Several simulation studies have concluded that an approach proposed by Paule and Mandel should be recommended (Langan et al 2017); whereas a comprehensive recent simulation study recommended a restricted maximum likelihood approach, although noted that no single approach is universally preferable (Langan et al 2019). Review authors are encouraged to select one of these options if it is available to them.
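For orientation, the Paule and Mandel approach can be sketched as a root-finding problem: choose the between-study variance at which the generalised Q statistic equals its expected value, k − 1. The bisection search, function name and data below are assumptions made for this illustration; statistical packages use their own iterative schemes.

```python
def paule_mandel(effects, variances, tol=1e-10, max_iter=200):
    """Estimate tau^2 by solving generalised Q(tau^2) = k - 1 with bisection."""
    k = len(effects)

    def generalised_q(tau2):
        w = [1.0 / (v + tau2) for v in variances]
        mu = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
        return sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, effects))

    if generalised_q(0.0) <= k - 1:
        return 0.0                      # no excess variation: estimate truncated at zero
    low, high = 0.0, 1.0
    while generalised_q(high) > k - 1:  # expand until the root is bracketed
        high *= 2.0
    for _ in range(max_iter):
        mid = (low + high) / 2.0
        if generalised_q(mid) > k - 1:  # Q decreases as tau^2 increases
            low = mid
        else:
            high = mid
        if high - low < tol:
            break
    return (low + high) / 2.0

tau2_pm = paule_mandel([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
print(f"Paule-Mandel tau^2 = {tau2_pm:.4f}")
```

When all studies have the same variance, as in this hypothetical example, the result coincides with the moment-based estimate; the estimators differ when study precisions vary.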

10.11 Investigating heterogeneity

10.11.1 Interaction and effect modification

Does the intervention effect vary with different populations or intervention characteristics (such as dose or duration)? Such variation is known as interaction by statisticians and as effect modification by epidemiologists. Methods to search for such interactions include subgroup analyses and meta-regression. All methods have considerable pitfalls.

10.11.2 What are subgroup analyses?

Subgroup analyses involve splitting all the participant data into subgroups, often in order to make comparisons between them. Subgroup analyses may be done for subsets of participants (such as males and females), or for subsets of studies (such as different geographical locations). Subgroup analyses may be done as a means of investigating heterogeneous results, or to answer specific questions about particular patient groups, types of intervention or types of study.

Subgroup analyses of subsets of participants within studies are uncommon in systematic reviews based on published literature because sufficient details to extract data about separate participant types are seldom published in reports. By contrast, such subsets of participants are easily analysed when individual participant data have been collected (see Chapter 26). The methods we describe in the remainder of this chapter are for subgroups of studies.

Findings from multiple subgroup analyses may be misleading. Subgroup analyses are observational by nature and are not based on randomized comparisons. False negative and false positive significance tests increase in likelihood rapidly as more subgroup analyses are performed. If their findings are presented as definitive conclusions there is clearly a risk of people being denied an effective intervention or treated with an ineffective (or even harmful) intervention. Subgroup analyses can also generate misleading recommendations about directions for future research that, if followed, would waste scarce resources.

It is useful to distinguish between the notions of ‘qualitative interaction’ and ‘quantitative interaction’ (Yusuf et al 1991). Qualitative interaction exists if the direction of effect is reversed, that is if an intervention is beneficial in one subgroup but is harmful in another. Qualitative interaction is rare. This may be used as an argument that the most appropriate result of a meta-analysis is the overall effect across all subgroups. Quantitative interaction exists when the size of the effect varies but not the direction, that is if an intervention is beneficial to different degrees in different subgroups.

10.11.3 Undertaking subgroup analyses

Meta-analyses can be undertaken in RevMan both within subgroups of studies and across all studies irrespective of their subgroup membership. It is tempting to compare effect estimates in different subgroups by considering the meta-analysis results from each subgroup separately. This should only be done informally by comparing the magnitudes of effect. Noting that either the effect or the test for heterogeneity in one subgroup is statistically significant whilst that in the other subgroup is not statistically significant does not indicate that the subgroup factor explains heterogeneity. Since different subgroups are likely to contain different amounts of information and thus have different abilities to detect effects, it is extremely misleading simply to compare the statistical significance of the results.

Is the effect different in different subgroups?

Valid investigations of whether an intervention works differently in different subgroups involve comparing the subgroups with each other. It is a mistake to compare within-subgroup inferences such as P values. If one subgroup analysis is statistically significant and another is not, then the latter may simply reflect a lack of information rather than a smaller (or absent) effect. When there are only two subgroups, non-overlap of the confidence intervals indicates statistical significance, but note that the confidence intervals can overlap to a small degree and the difference still be statistically significant.
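The point about overlapping confidence intervals can be checked numerically. The sketch below compares two independent subgroup estimates directly with a z-test on their difference; the subgroup values are hypothetical and the function names are chosen for this example only.

```python
import math
from statistics import NormalDist

def ci95(estimate, se):
    """Conventional 95% confidence interval based on the normal distribution."""
    half_width = NormalDist().inv_cdf(0.975) * se
    return estimate - half_width, estimate + half_width

def compare_two_subgroups(est_a, se_a, est_b, se_b):
    """z-test for the difference between two independent subgroup estimates."""
    z = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical pooled log risk ratios for two independent subgroups.
ci_a = ci95(-0.60, 0.15)
ci_b = ci95(-0.10, 0.15)
z, p = compare_two_subgroups(-0.60, 0.15, -0.10, 0.15)
print(f"subgroup A 95% CI: ({ci_a[0]:.3f}, {ci_a[1]:.3f})")
print(f"subgroup B 95% CI: ({ci_b[0]:.3f}, {ci_b[1]:.3f})")
print(f"difference: z = {z:.2f}, P = {p:.3f}")
```

Here the two intervals overlap slightly, yet the direct comparison gives a P value below 0.05, which is why the comparison should be made between subgroups rather than by inspecting each interval on its own.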

A formal statistical approach should be used to examine differences among subgroups (see MECIR Box 10.11.a). A simple significance test to investigate differences between two or more subgroups can be performed (Borenstein and Higgins 2013). This procedure consists of undertaking a standard test for heterogeneity across subgroup results rather than across individual study results. When the meta-analysis uses a fixed-effect inverse-variance weighted average approach, the method is exactly equivalent to the test described by Deeks and colleagues (Deeks et al 2001). An I² statistic is also computed for subgroup differences. This describes the percentage of the variability in effect estimates from the different subgroups that is due to genuine subgroup differences rather than sampling error (chance). Note that these methods for examining subgroup differences should be used only when the data in the subgroups are independent (i.e. they should not be used if the same study participants contribute to more than one of the subgroups in the forest plot).
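A minimal sketch of this procedure is given below: the pooled result of each subgroup is treated as a single 'study', a heterogeneity statistic Q is computed across subgroups, and an I² for subgroup differences is derived from it. The helper for the chi-squared tail probability, the function names and the input values are assumptions made for this illustration, and the subgroups are assumed to be independent.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared distribution with integer df."""
    z = x / 2.0
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-z)
    else:
        a, tail = 0.5, math.erfc(math.sqrt(z))
    while a < df / 2.0:
        # Recurrence for the regularised upper incomplete gamma function.
        tail += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return tail

def subgroup_difference_test(estimates, std_errors):
    """Heterogeneity test across subgroup results: returns (Q, df, P, I^2 in %)."""
    w = [1.0 / se ** 2 for se in std_errors]
    overall = sum(wi * ti for wi, ti in zip(w, estimates)) / sum(w)
    q = sum(wi * (ti - overall) ** 2 for wi, ti in zip(w, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, chi2_sf(q, df), i2

# Hypothetical pooled log risk ratios and standard errors for two subgroups.
q, df, p, i2 = subgroup_difference_test([-0.60, -0.10], [0.15, 0.15])
print(f"Q = {q:.2f} on {df} df, P = {p:.3f}, I^2 = {i2:.0f}%")
```

With only two subgroups, Q is the square of the z statistic for their difference, so the P value matches a direct two-subgroup comparison.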

If fixed-effect models are used for the analysis within each subgroup, then these statistics relate to differences in typical effects across different subgroups. If random-effects models are used for the analysis within each subgroup, then the statistics relate to variation in the mean effects in the different subgroups.

An alternative method for testing for differences between subgroups is to use meta-regression techniques, in which case a random-effects model is generally preferred (see Section 10.11.4). Tests for subgroup differences based on random-effects models may be regarded as preferable to those based on fixed-effect models, due to the high risk of false-positive results when a fixed-effect model is used to compare subgroups (Higgins and Thompson 2004).

MECIR Box 10.11.a Relevant expectations for conduct of intervention reviews

10.11.4 Meta-regression

If studies are divided into subgroups (see Section 10.11.2), this may be viewed as an investigation of how a categorical study characteristic is associated with the intervention effects in the meta-analysis. For example, studies in which allocation sequence concealment was adequate may yield different results from those in which it was inadequate. Here, allocation sequence concealment, being either adequate or inadequate, is a categorical characteristic at the study level. Meta-regression is an extension to subgroup analyses that allows the effect of continuous, as well as categorical, characteristics to be investigated, and in principle allows the effects of multiple factors to be investigated simultaneously (although this is rarely possible due to inadequate numbers of studies) (Thompson and Higgins 2002). Meta-regression should generally not be considered when there are fewer than ten studies in a meta-analysis.

Meta-regressions are similar in essence to simple regressions, in which an outcome variable is predicted according to the values of one or more explanatory variables. In meta-regression, the outcome variable is the effect estimate (for example, a mean difference, a risk difference, a log odds ratio or a log risk ratio). The explanatory variables are characteristics of studies that might influence the size of intervention effect. These are often called ‘potential effect modifiers’ or covariates. Meta-regressions usually differ from simple regressions in two ways. First, larger studies have more influence on the relationship than smaller studies, since studies are weighted by the precision of their respective effect estimate. Second, it is wise to allow for the residual heterogeneity among intervention effects not modelled by the explanatory variables. This gives rise to the term ‘random-effects meta-regression’, since the extra variability is incorporated in the same way as in a random-effects meta-analysis (Thompson and Sharp 1999).

The regression coefficient obtained from a meta-regression analysis will describe how the outcome variable (the intervention effect) changes with a unit increase in the explanatory variable (the potential effect modifier). The statistical significance of the regression coefficient is a test of whether there is a linear relationship between intervention effect and the explanatory variable. If the intervention effect is a ratio measure, the log-transformed value of the intervention effect should always be used in the regression model (see Chapter 6, Section ), and the exponential of the regression coefficient will give an estimate of the relative change in intervention effect with a unit increase in the explanatory variable.
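To make the interpretation of the coefficient concrete, the sketch below fits a precision-weighted straight line to log risk ratios against a single study-level covariate and exponentiates the slope. It is a simplified illustration: the residual between-study variance is supplied as a known value rather than estimated (dedicated meta-regression software estimates it), and the function name and the ten hypothetical studies are assumptions.

```python
import math

def meta_regression(effects, variances, covariate, tau2=0.0):
    """Weighted least squares line: effect = intercept + slope * covariate.

    Weights are 1 / (within-study variance + tau2). tau2 is assumed known here.
    """
    w = [1.0 / (v + tau2) for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, covariate)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, covariate))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, covariate, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1.0 / sxx)     # variances are treated as known
    return intercept, slope, se_slope

# Ten hypothetical studies: log risk ratio falls by 0.01 per unit of dose.
doses = list(range(10, 110, 10))
log_rr = [-0.01 * d for d in doses]
intercept, slope, se_slope = meta_regression(log_rr, [0.04] * 10, doses)
print(f"slope on log scale = {slope:.4f} (SE {se_slope:.4f})")
print(f"relative change in risk ratio per unit of dose = {math.exp(slope):.4f}")
```

The exponentiated slope is a ratio of risk ratios: in this made-up example each additional unit of dose multiplies the risk ratio by about 0.99.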

Meta-regression can also be used to investigate differences for categorical explanatory variables as done in subgroup analyses. If there are J subgroups, membership of particular subgroups is indicated by using J minus 1 dummy variables (which can only take values of zero or one) in the meta-regression model (as in standard linear regression modelling). The regression coefficients will estimate how the intervention effect in each subgroup differs from a nominated reference subgroup. The P value of each regression coefficient will indicate the strength of evidence against the null hypothesis that the characteristic is not associated with the intervention effect.
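The coding of J subgroups into J minus 1 dummy variables can be sketched as follows. The function name and the subgroup labels are hypothetical; statistical packages usually perform this coding automatically when a covariate is declared as categorical.

```python
def dummy_code(labels, reference):
    """Code subgroup labels as J - 1 zero/one variables relative to a reference subgroup."""
    levels = sorted(set(labels) - {reference})      # every subgroup except the reference
    rows = [[1 if label == level else 0 for level in levels] for label in labels]
    return levels, rows

# Four hypothetical studies classified into three dose subgroups.
levels, rows = dummy_code(["low", "high", "medium", "low"], reference="low")
print(levels)
for row in rows:
    print(row)
```

Studies in the reference subgroup have zeros on every dummy variable, so each regression coefficient estimates the difference in intervention effect between one subgroup and the reference.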

Meta-regression may be performed using the ‘metareg’ macro available for the Stata statistical package, or using the ‘metafor’ package for R, as well as other packages.

10.11.5 Selection of study characteristics for subgroup analyses and meta-regression

Authors need to be cautious about undertaking subgroup analyses, and interpreting any that they do. Some considerations are outlined here for selecting characteristics (also called explanatory variables, potential effect modifiers or covariates) that will be investigated for their possible influence on the size of the intervention effect. These considerations apply similarly to subgroup analyses and to meta-regressions. Further details may be obtained elsewhere (Oxman and Guyatt 1992, Berlin and Antman 1994).

Ensure that there are adequate studies to justify subgroup analyses and meta-regressions

It is very unlikely that an investigation of heterogeneity will produce useful findings unless there is a substantial number of studies. Typical advice for undertaking simple regression analyses is that at least ten observations (i.e. ten studies in a meta-analysis) should be available for each characteristic modelled. However, even this will be too few when the covariates are unevenly distributed across studies.

Specify characteristics in advance

Authors should, whenever possible, pre-specify characteristics in the protocol that later will be subject to subgroup analyses or meta-regression. The plan specified in the protocol should then be followed (data permitting), without undue emphasis on any particular findings (see MECIR Box 10.11.b). Pre-specifying characteristics reduces the likelihood of spurious findings, first by limiting the number of subgroups investigated, and second by preventing knowledge of the studies’ results influencing which subgroups are analysed. True pre-specification is difficult in systematic reviews, because the results of some of the relevant studies are often known when the protocol is drafted. If a characteristic was overlooked in the protocol, but is clearly of major importance and justified by external evidence, then authors should not be reluctant to explore it. However, such post-hoc analyses should be identified as such.

MECIR Box 10.11.b Relevant expectations for conduct of intervention reviews

Select a small number of characteristics

The likelihood of a false-positive result among subgroup analyses and meta-regression increases with the number of characteristics investigated. It is difficult to suggest a maximum number of characteristics to look at, especially since the number of available studies is unknown in advance. If more than one or two characteristics are investigated it may be sensible to adjust the level of significance to account for making multiple comparisons.

Ensure there is scientific rationale for investigating each characteristic

Selection of characteristics should be motivated by biological and clinical hypotheses, ideally supported by evidence from sources other than the included studies. Subgroup analyses using characteristics that are implausible or clinically irrelevant are not likely to be useful and should be avoided. For example, a relationship between intervention effect and year of publication is seldom in itself clinically informative, and if identified runs the risk of initiating a post-hoc data dredge of factors that may have changed over time.

Prognostic factors are those that predict the outcome of a disease or condition, whereas effect modifiers are factors that influence how well an intervention works in affecting the outcome. Confusion between prognostic factors and effect modifiers is common in planning subgroup analyses, especially at the protocol stage. Prognostic factors are not good candidates for subgroup analyses unless they are also believed to modify the effect of intervention. For example, being a smoker may be a strong predictor of mortality within the next ten years, but there may not be reason for it to influence the effect of a drug therapy on mortality (Deeks 1998). Potential effect modifiers may include participant characteristics (age, setting), the precise interventions (dose of active intervention, choice of comparison intervention), how the study was done (length of follow-up) or methodology (design and quality).

Be aware that the effect of a characteristic may not always be identified

Many characteristics that might have important effects on how well an intervention works cannot be investigated using subgroup analysis or meta-regression. These are characteristics of participants that might vary substantially within studies, but that can only be summarized at the level of the study. An example is age. Consider a collection of clinical trials involving adults ranging from 18 to 60 years old. There may be a strong relationship between age and intervention effect that is apparent within each study. However, if the mean ages for the trials are similar, then no relationship will be apparent by looking at trial mean ages and trial-level effect estimates. The problem is one of aggregating individuals’ results and is variously known as aggregation bias, ecological bias or the ecological fallacy (Morgenstern 1982, Greenland 1987, Berlin et al 2002). It is even possible for the direction of the relationship across studies to be the opposite of the direction of the relationship observed within each study.

Think about whether the characteristic is closely related to another characteristic (confounded)

The problem of ‘confounding’ complicates interpretation of subgroup analyses and meta-regressions and can lead to incorrect conclusions. Two characteristics are confounded if their influences on the intervention effect cannot be disentangled. For example, if those studies implementing an intensive version of a therapy happened to be the studies that involved patients with more severe disease, then one cannot tell which aspect is the cause of any difference in effect estimates between these studies and others. In meta-regression, co-linearity between potential effect modifiers leads to similar difficulties (Berlin and Antman 1994). Computing correlations between study characteristics will give some information about which study characteristics may be confounded with each other.

10.11.6 Interpretation of subgroup analyses and meta-regressions

Appropriate interpretation of subgroup analyses and meta-regressions requires caution (Oxman and Guyatt 1992).

  • Subgroup comparisons are observational. It must be remembered that subgroup analyses and meta-regressions are entirely observational in their nature. These analyses investigate differences between studies. Even if individuals are randomized to one group or other within a clinical trial, they are not randomized to go in one trial or another. Hence, subgroup analyses suffer the limitations of any observational investigation, including possible bias through confounding by other study-level characteristics. Furthermore, even a genuine difference between subgroups is not necessarily due to the classification of the subgroups. As an example, a subgroup analysis of bone marrow transplantation for treating leukaemia might show a strong association between the age of a sibling donor and the success of the transplant. However, this probably does not mean that the age of donor is important. In fact, the age of the recipient is probably a key factor and the subgroup finding would simply be due to the strong association between the age of the recipient and the age of their sibling.  
  • Was the analysis pre-specified or post hoc? Authors should state whether subgroup analyses were pre-specified or undertaken after the results of the studies had been compiled (post hoc). More reliance may be placed on a subgroup analysis if it was one of a small number of pre-specified analyses. Performing numerous post-hoc subgroup analyses to explain heterogeneity is a form of data dredging. Data dredging is condemned because it is usually possible to find an apparent, but false, explanation for heterogeneity by considering lots of different characteristics.  
  • Is there indirect evidence in support of the findings? Differences between subgroups should be clinically plausible and supported by other external or indirect evidence, if they are to be convincing.  
  • Is the magnitude of the difference practically important? If the magnitude of a difference between subgroups will not result in different recommendations for different subgroups, then it may be better to present only the overall analysis results.  
  • Is there a statistically significant difference between subgroups? To establish whether there is a different effect of an intervention in different situations, the magnitudes of effects in different subgroups should be compared directly with each other. In particular, statistical significance of the results within separate subgroup analyses should not be compared (see Section ).  
  • Are analyses looking at within-study or between-study relationships? For patient and intervention characteristics, differences in subgroups that are observed within studies are more reliable than analyses of subsets of studies. If such within-study relationships are replicated across studies then this adds confidence to the findings.

10.11.7 Investigating the effect of underlying risk

One potentially important source of heterogeneity among a series of studies is when the underlying average risk of the outcome event varies between the studies. The underlying risk of a particular event may be viewed as an aggregate measure of case-mix factors such as age or disease severity. It is generally measured as the observed risk of the event in the comparator group of each study (the comparator group risk, or CGR). The notion is controversial in its relevance to clinical practice since underlying risk represents a summary of both known and unknown risk factors. Problems also arise because comparator group risk will depend on the length of follow-up, which often varies across studies. However, underlying risk has received particular attention in meta-analysis because the information is readily available once dichotomous data have been prepared for use in meta-analyses. Sharp provides a full discussion of the topic (Sharp 2001).

Intuition would suggest that participants are more or less likely to benefit from an effective intervention according to their risk status. However, the relationship between underlying risk and intervention effect is a complicated issue. For example, suppose an intervention is equally beneficial in the sense that for all patients it reduces the risk of an event, say a stroke, to 80% of the underlying risk. Then it is not equally beneficial in terms of absolute differences in risk in the sense that it reduces a 50% stroke rate by 10 percentage points to 40% (number needed to treat=10), but a 20% stroke rate by 4 percentage points to 16% (number needed to treat=25).
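The arithmetic in this example can be reproduced directly. The short sketch below applies a constant risk ratio of 0.8 to the two underlying risks and computes the absolute risk reduction and the number needed to treat; the function name is chosen for this illustration.

```python
def absolute_benefit(underlying_risk, risk_ratio):
    """Return (risk with intervention, absolute risk reduction, number needed to treat)."""
    treated_risk = underlying_risk * risk_ratio
    risk_reduction = underlying_risk - treated_risk
    return treated_risk, risk_reduction, 1.0 / risk_reduction

for risk in (0.50, 0.20):
    treated, reduction, nnt = absolute_benefit(risk, risk_ratio=0.8)
    print(f"underlying risk {risk:.0%}: risk with intervention {treated:.0%}, "
          f"reduction {reduction * 100:.0f} percentage points, NNT {round(nnt)}")
```

The same relative effect therefore corresponds to very different absolute effects, which is why the choice of summary statistic matters when underlying risk varies between studies.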

Use of different summary statistics (risk ratio, odds ratio and risk difference) will demonstrate different relationships with underlying risk. Summary statistics that show close to no relationship with underlying risk are generally preferred for use in meta-analysis (see Section 10.4.3).

Investigating any relationship between effect estimates and the comparator group risk is also complicated by a technical phenomenon known as regression to the mean. This arises because the comparator group risk forms an integral part of the effect estimate. A high risk in a comparator group, observed entirely by chance, will on average give rise to a higher than expected effect estimate, and vice versa. This phenomenon results in a false correlation between effect estimates and comparator group risks. There are methods, which require sophisticated software, that correct for regression to the mean (McIntosh 1996, Thompson et al 1997). These should be used for such analyses, and statistical expertise is recommended.
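The spurious correlation can be demonstrated with a small simulation. In the sketch below every simulated study has the same true risk in both groups, so there is no true intervention effect and no true relationship with underlying risk; the study sizes, the true risk and the random seed are arbitrary assumptions. This is a demonstration of the problem only, not one of the correction methods cited above.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simulate_null_studies(n_studies=2000, n_per_arm=50, true_risk=0.3, seed=1):
    """Correlation between observed comparator group risk and observed risk difference."""
    rng = random.Random(seed)
    comparator_risks, risk_differences = [], []
    for _ in range(n_studies):
        comparator = sum(rng.random() < true_risk for _ in range(n_per_arm)) / n_per_arm
        experimental = sum(rng.random() < true_risk for _ in range(n_per_arm)) / n_per_arm
        comparator_risks.append(comparator)
        risk_differences.append(experimental - comparator)
    return pearson(comparator_risks, risk_differences)

r = simulate_null_studies()
print(f"correlation between comparator group risk and risk difference: {r:.2f}")
```

The correlation is strongly negative even though no intervention effect exists: a comparator group risk that is high by chance makes the experimental group look better by comparison.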

10.11.8 Dose-response analyses

The principles of meta-regression can be applied to the relationships between intervention effect and dose (commonly termed dose-response), treatment intensity or treatment duration (Greenland and Longnecker 1992, Berlin et al 1993). Conclusions about differences in effect due to differences in dose (or similar factors) are on stronger ground if participants are randomized to one dose or another within a study and a consistent relationship is found across similar studies. While authors should consider these effects, particularly as a possible explanation for heterogeneity, they should be cautious about drawing conclusions based on between-study differences. Authors should be particularly cautious about claiming that a dose-response relationship does not exist, given the low power of many meta-regression analyses to detect genuine relationships.

10.12 Missing data

10.12.1 Types of missing data

There are many potential sources of missing data in a systematic review or meta-analysis (see Table 10.12.a). For example, a whole study may be missing from the review, an outcome may be missing from a study, summary data may be missing for an outcome, and individual participants may be missing from the summary data. Here we discuss a variety of potential sources of missing data, highlighting where more detailed discussions are available elsewhere in the Handbook.

Whole studies may be missing from a review because they are never published, are published in obscure places, are rarely cited, or are inappropriately indexed in databases. Thus, review authors should always be aware of the possibility that they have failed to identify relevant studies. There is a strong possibility that such studies are missing because of their ‘uninteresting’ or ‘unwelcome’ findings (that is, in the presence of publication bias). This problem is discussed at length in Chapter 13. Details of comprehensive search methods are provided in Chapter 4.

Some studies might not report any information on outcomes of interest to the review. For example, there may be no information on quality of life, or on serious adverse effects. It is often difficult to determine whether this is because the outcome was not measured or because the outcome was not reported. Furthermore, failure to report that outcomes were measured may be dependent on the unreported results (selective outcome reporting bias; see Chapter 7, Section ). Similarly, summary data for an outcome, in a form that can be included in a meta-analysis, may be missing. A common example is missing standard deviations (SDs) for continuous outcomes. This is often a problem when change-from-baseline outcomes are sought. We discuss imputation of missing SDs in Chapter 6, Section . Other examples of missing summary data are missing sample sizes (particularly those for each intervention group separately), numbers of events, standard errors, follow-up times for calculating rates, and sufficient details of time-to-event outcomes. Inappropriate analyses of studies, for example of cluster-randomized and crossover trials, can lead to missing summary data. It is sometimes possible to approximate the correct analyses of such studies, for example by imputing correlation coefficients or SDs, as discussed in Chapter 23, Section 23.1, for cluster-randomized studies and Chapter 23, Section 23.2, for crossover trials. As a general rule, most methodologists believe that missing summary data (e.g. ‘no usable data’) should not be used as a reason to exclude a study from a systematic review. It is more appropriate to include the study in the review, and to discuss the potential implications of its absence from a meta-analysis.

It is likely that in some, if not all, included studies, there will be individuals missing from the reported results. Review authors are encouraged to consider this problem carefully (see MECIR Box 10.12.a). We provide further discussion of this problem in Section 10.12.3; see also Chapter 8, Section 8.5.

Missing data can also affect subgroup analyses. If subgroup analyses or meta-regressions are planned (see Section 10.11), they require details of the study-level characteristics that distinguish studies from one another. If these are not available for all studies, review authors should consider asking the study authors for more information.

Table 10.12.a Types of missing data in a meta-analysis

MECIR Box 10.12.a Relevant expectations for conduct of intervention reviews

10.12.2 General principles for dealing with missing data

There is a large literature of statistical methods for dealing with missing data. Here we briefly review some key concepts and make some general recommendations for Cochrane Review authors. It is important to think why data may be missing. Statisticians often use the terms ‘missing at random’ and ‘not missing at random’ to represent different scenarios.

Data are said to be ‘missing at random’ if the fact that they are missing is unrelated to actual values of the missing data. For instance, if some quality-of-life questionnaires were lost in the postal system, this would be unlikely to be related to the quality of life of the trial participants who completed the forms. In some circumstances, statisticians distinguish between data ‘missing at random’ and data ‘missing completely at random’, although in the context of a systematic review the distinction is unlikely to be important. Data that are missing at random may not be important. Analyses based on the available data will often be unbiased, although based on a smaller sample size than the original data set.

Data are said to be ‘not missing at random’ if the fact that they are missing is related to the actual missing data. For instance, in a depression trial, participants who had a relapse of depression might be less likely to attend the final follow-up interview, and more likely to have missing outcome data. Such data are ‘non-ignorable’ in the sense that an analysis of the available data alone will typically be biased. Publication bias and selective reporting bias lead by definition to data that are ‘not missing at random’, and attrition and exclusions of individuals within studies often do as well.

The principal options for dealing with missing data are:

  1. analysing only the available data (i.e. ignoring the missing data);
  2. imputing the missing data with replacement values, and treating these as if they were observed (e.g. last observation carried forward, imputing an assumed outcome such as assuming all were poor outcomes, imputing the mean, imputing based on predicted values from a regression analysis);
  3. imputing the missing data and accounting for the fact that these were imputed with uncertainty (e.g. multiple imputation, simple imputation methods (as point 2) with adjustment to the standard error); and
  4. using statistical models to allow for missing data, making assumptions about their relationships with the available data.

Option 2 is practical in most circumstances and very commonly used in systematic reviews. However, it fails to acknowledge uncertainty in the imputed values and results, typically, in confidence intervals that are too narrow. Options 3 and 4 would require involvement of a knowledgeable statistician.

Five general recommendations for dealing with missing data in Cochrane Reviews are as follows:

  • Whenever possible, contact the original investigators to request missing data.
  • Make explicit the assumptions of any methods used to address missing data: for example, that the data are assumed missing at random, or that missing values were assumed to have a particular value such as a poor outcome.
  • Follow the guidance in Chapter 8 to assess risk of bias due to missing outcome data in randomized trials.
  • Perform sensitivity analyses to assess how sensitive results are to reasonable changes in the assumptions that are made (see Section 10.14).
  • Address the potential impact of missing data on the findings of the review in the Discussion section.

10.12.3 Dealing with missing outcome data from individual participants

Review authors may undertake sensitivity analyses to assess the potential impact of missing outcome data, based on assumptions about the relationship between missingness in the outcome and its true value. Several methods are available (Akl et al 2015). For dichotomous outcomes, Higgins and colleagues propose a strategy involving different assumptions about how the risk of the event among the missing participants differs from the risk of the event among the observed participants, taking account of uncertainty introduced by the assumptions (Higgins et al 2008a). Akl and colleagues propose a suite of simple imputation methods, including a similar approach to that of Higgins and colleagues based on relative risks of the event in missing versus observed participants. Similar ideas can be applied to continuous outcome data (Ebrahim et al 2013, Ebrahim et al 2014). Particular care is required to avoid double counting events, since it can be unclear whether reported numbers of events in trial reports apply to the full randomized sample or only to those who did not drop out (Akl et al 2016).

Although there is a tradition of implementing ‘worst case’ and ‘best case’ analyses clarifying the extreme boundaries of what is theoretically possible, such analyses may not be informative for the most plausible scenarios (Higgins et al 2008a).
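As a rough illustration of such sensitivity analyses for a dichotomous outcome, the sketch below recomputes a risk ratio under different assumed event risks among participants with missing outcomes. The trial counts, the function names and the scenario labels are hypothetical, and the event is assumed to be undesirable. It is a simple point-estimate imputation only: it does not account for the extra uncertainty that the assumptions introduce, so it should not be read as the method of Higgins and colleagues or of Akl and colleagues.

```python
def imputed_risk(events, observed, missing, missing_risk):
    """Arm-level risk after assigning `missing_risk` to participants without outcome data."""
    return (events + missing * missing_risk) / (observed + missing)


def risk_ratio(treat, control, missing_risk_t, missing_risk_c):
    """Risk ratio (treatment vs control) under assumed risks among missing participants.

    `treat` and `control` are (events, observed, missing) tuples.
    """
    return imputed_risk(*treat, missing_risk_t) / imputed_risk(*control, missing_risk_c)


# Hypothetical trial: (events among observed, participants observed, participants missing)
treat = (20, 90, 10)
control = (30, 85, 15)
p_t = treat[0] / treat[1]      # observed risk, treatment arm
p_c = control[0] / control[1]  # observed risk, control arm

# Each scenario gives the assumed event risk among missing participants in each arm.
scenarios = {
    "missing have the observed risk": (p_t, p_c),
    "missing have twice the observed risk": (min(1.0, 2 * p_t), min(1.0, 2 * p_c)),
    "worst case for treatment": (1.0, 0.0),
    "best case for treatment": (0.0, 1.0),
}
for name, (m_t, m_c) in scenarios.items():
    print(f"{name:40s} RR = {risk_ratio(treat, control, m_t, m_c):.2f}")
```

The two extreme scenarios bracket what is theoretically possible, while the intermediate ones correspond to the more plausible assumptions discussed above.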

10.13 Bayesian approaches to meta-analysis

Bayesian statistics is an approach to statistics based on a different philosophy from that which underlies significance tests and confidence intervals. It is essentially about updating of evidence. In a Bayesian analysis, initial uncertainty is expressed through a prior distribution about the quantities of interest. Current data and assumptions concerning how they were generated are summarized in the likelihood. The posterior distribution for the quantities of interest can then be obtained by combining the prior distribution and the likelihood. The likelihood summarizes both the data from studies included in the meta-analysis (for example, 2×2 tables from randomized trials) and the meta-analysis model (for example, assuming a fixed effect or random effects). The result of the analysis is usually presented as a point estimate and 95% credible interval from the posterior distribution for each quantity of interest, which look much like classical estimates and confidence intervals. Potential advantages of Bayesian analyses are summarized in Box 10.13.a. Bayesian analysis may be performed using WinBUGS software (Smith et al 1995, Lunn et al 2000), within R (Röver 2017), or – for some applications – using standard meta-regression software with a simple trick (Rhodes et al 2016).

A difference between Bayesian analysis and classical meta-analysis is that the interpretation is directly in terms of belief: a 95% credible interval for an odds ratio is that region in which we believe the odds ratio to lie with probability 95%. This is how many practitioners actually interpret a classical confidence interval, but strictly in the classical framework the 95% refers to the long-term frequency with which 95% intervals contain the true value. The Bayesian framework also allows a review author to calculate the probability that the odds ratio has a particular range of values, which cannot be done in the classical framework. For example, we can determine the probability that the odds ratio is less than 1 (which might indicate a beneficial effect of an experimental intervention), or that it is no larger than 0.8 (which might indicate a clinically important effect). It should be noted that these probabilities are specific to the choice of the prior distribution. Different meta-analysts may analyse the same data using different prior distributions and obtain different results. It is therefore important to carry out sensitivity analyses to investigate how the results depend on any assumptions made.
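The following sketch shows how such posterior probabilities might be computed in the simplest possible setting. It assumes a normal prior on the log odds ratio and treats a summary estimate of the log odds ratio as normally distributed with a known standard error, so the posterior is also normal. The numbers and function names are hypothetical, and this conjugate approximation is a stand-in for, not an implementation of, the full models fitted by WinBUGS or the R packages mentioned above.

```python
import math
from statistics import NormalDist


def posterior_log_or(estimate, se, prior_mean=0.0, prior_sd=10.0):
    """Normal posterior for a log odds ratio from a normal prior and a normal likelihood."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / se ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * estimate)
    return NormalDist(post_mean, math.sqrt(post_var))


def summarise(posterior):
    """Posterior median odds ratio, 95% credible interval and two tail probabilities."""
    z = NormalDist().inv_cdf(0.975)
    low = math.exp(posterior.mean - z * posterior.stdev)
    high = math.exp(posterior.mean + z * posterior.stdev)
    return {
        "or": math.exp(posterior.mean),
        "ci": (low, high),
        "p_or_below_1": posterior.cdf(0.0),             # P(OR < 1)
        "p_or_at_most_0.8": posterior.cdf(math.log(0.8)),  # P(OR <= 0.8)
    }


# Hypothetical data summary: odds ratio 0.75, standard error of the log odds ratio 0.12.
estimate, se = math.log(0.75), 0.12
for label, prior_sd in [("vague prior (sd 10)", 10.0), ("sceptical prior (sd 0.2)", 0.2)]:
    s = summarise(posterior_log_or(estimate, se, prior_sd=prior_sd))
    print(f"{label:26s} OR={s['or']:.2f} "
          f"95% CrI {s['ci'][0]:.2f} to {s['ci'][1]:.2f} "
          f"P(OR<1)={s['p_or_below_1']:.3f} P(OR<=0.8)={s['p_or_at_most_0.8']:.3f}")
```

Running the same data under two priors is a small instance of the sensitivity analysis recommended above: the probabilities change with the prior, and the size of that change is itself informative.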

In the context of a meta-analysis, prior distributions are needed for the particular intervention effect being analysed (such as the odds ratio or the mean difference) and – in the context of a random-effects meta-analysis – on the amount of heterogeneity among intervention effects across studies. Prior distributions may represent subjective belief about the size of the effect, or may be derived from sources of evidence not included in the meta-analysis, such as information from non-randomized studies of the same intervention or from randomized trials of other interventions. The width of the prior distribution reflects the degree of uncertainty about the quantity. When there is little or no information, a ‘non-informative’ prior can be used, in which all values across the possible range are equally likely.

Most Bayesian meta-analyses use non-informative (or very weakly informative) prior distributions to represent beliefs about intervention effects, since many regard it as controversial to combine objective trial data with subjective opinion. However, prior distributions are increasingly used for the extent of among-study variation in a random-effects analysis. This is particularly advantageous when the number of studies in the meta-analysis is small, say fewer than five or ten. Libraries of data-based prior distributions are available that have been derived from re-analyses of many thousands of meta-analyses in the Cochrane Database of Systematic Reviews (Turner et al 2012).

Box 10.13.a Some potential advantages of Bayesian meta-analysis

Statistical expertise is strongly recommended for review authors who wish to carry out Bayesian analyses. There are several good texts (Sutton et al 2000, Sutton and Abrams 2001, Spiegelhalter et al 2004).

10.14 Sensitivity analyses

The process of undertaking a systematic review involves a sequence of decisions. Whilst many of these decisions are clearly objective and non-contentious, some will be somewhat arbitrary or unclear. For instance, if eligibility criteria involve a numerical value, the choice of value is usually arbitrary: for example, defining groups of older people may reasonably have lower limits of 60, 65, 70 or 75 years, or any value in between. Other decisions may be unclear because a study report fails to include the required information. Some decisions are unclear because the included studies themselves never obtained the information required: for example, the outcomes of those who were lost to follow-up. Further decisions are unclear because there is no consensus on the best statistical method to use for a particular problem.

It is highly desirable to prove that the findings from a systematic review are not dependent on such arbitrary or unclear decisions by using sensitivity analysis (see MECIR Box 10.14.a). A sensitivity analysis is a repeat of the primary analysis or meta-analysis in which alternative decisions or ranges of values are substituted for decisions that were arbitrary or unclear. For example, if the eligibility of some studies in the meta-analysis is dubious because they do not contain full details, sensitivity analysis may involve undertaking the meta-analysis twice: the first time including all studies and, second, including only those that are definitely known to be eligible. A sensitivity analysis asks the question, ‘Are the findings robust to the decisions made in the process of obtaining them?’
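The eligibility example can be sketched as follows: the same pooling routine is run once on all studies and once on the subset that is definitely eligible, and the two results are printed side by side. The study data, the `definitely_eligible` flag and the function names are hypothetical, and a fixed-effect inverse-variance model is assumed purely to keep the example short.

```python
import math


def pool_fixed_effect(studies):
    """Inverse-variance fixed-effect pooling of log risk ratios; returns (estimate, SE)."""
    weights = [1.0 / s["se"] ** 2 for s in studies]
    total = sum(weights)
    pooled = sum(w * s["log_rr"] for w, s in zip(weights, studies)) / total
    return pooled, math.sqrt(1.0 / total)


def summary_row(label, studies):
    """One line of a sensitivity-analysis summary table."""
    pooled, se = pool_fixed_effect(studies)
    low, high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
    return f"{label:26s} k={len(studies)}  RR={math.exp(pooled):.2f} ({low:.2f} to {high:.2f})"


# Hypothetical studies; Study C lacks full details, so its eligibility is dubious.
studies = [
    {"name": "Study A", "log_rr": -0.35, "se": 0.15, "definitely_eligible": True},
    {"name": "Study B", "log_rr": -0.20, "se": 0.20, "definitely_eligible": True},
    {"name": "Study C", "log_rr": -0.60, "se": 0.30, "definitely_eligible": False},
    {"name": "Study D", "log_rr": -0.10, "se": 0.25, "definitely_eligible": True},
]
definite = [s for s in studies if s["definitely_eligible"]]

print(summary_row("All studies", studies))
print(summary_row("Definitely eligible only", definite))
```

If the two rows lead to the same conclusion, the finding is robust to the eligibility decision; if not, the decision needs to be reported and interpreted with care.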

MECIR Box 10.14.a Relevant expectations for conduct of intervention reviews

There are many decision nodes within the systematic review process that can generate a need for a sensitivity analysis. Examples include:

Searching for studies:

  • Should abstracts whose results cannot be confirmed in subsequent publications be included in the review?

Eligibility criteria:

  • Characteristics of participants: where a majority but not all people in a study meet an age range, should the study be included?
  • Characteristics of the intervention: what range of doses should be included in the meta-analysis?
  • Characteristics of the comparator: what criteria are required to define usual care to be used as a comparator group?
  • Characteristics of the outcome: what time point or range of time points are eligible for inclusion?
  • Study design: should blinded and unblinded outcome assessment be included, or should study inclusion be restricted by other aspects of methodological criteria?

What data should be analysed?

  • Time-to-event data: what assumptions of the distribution of censored data should be made?
  • Continuous data: where standard deviations are missing, when and how should they be imputed? Should analyses be based on change scores or on post-intervention values?
  • Ordinal scales: what cut-point should be used to dichotomize short ordinal scales into two groups?
  • Cluster-randomized trials: what values of the intraclass correlation coefficient should be used when trial analyses have not been adjusted for clustering?
  • Crossover trials: what values of the within-subject correlation coefficient should be used when this is not available in primary reports?
  • All analyses: what assumptions should be made about missing outcomes? Should adjusted or unadjusted estimates of intervention effects be used?

Analysis methods:

  • Should fixed-effect or random-effects methods be used for the analysis?
  • For dichotomous outcomes, should odds ratios, risk ratios or risk differences be used?
  • For continuous outcomes, where several scales have assessed the same dimension, should results be analysed as a standardized mean difference across all scales or as mean differences individually for each scale?

Some sensitivity analyses can be pre-specified in the study protocol, but many issues suitable for sensitivity analysis are only identified during the review process where the individual peculiarities of the studies under investigation are identified. When sensitivity analyses show that the overall result and conclusions are not affected by the different decisions that could be made during the review process, the results of the review can be regarded with a higher degree of certainty. Where sensitivity analyses identify particular decisions or missing information that greatly influence the findings of the review, greater resources can be deployed to try and resolve uncertainties and obtain extra information, possibly through contacting trial authors and obtaining individual participant data. If this cannot be achieved, the results must be interpreted with an appropriate degree of caution. Such findings may generate proposals for further investigations and future research.

Reporting of sensitivity analyses in a systematic review may best be done by producing a summary table. Rarely is it informative to produce individual forest plots for each sensitivity analysis undertaken.

Sensitivity analyses are sometimes confused with subgroup analysis. Although some sensitivity analyses involve restricting the analysis to a subset of the totality of studies, the two methods differ in two ways. First, sensitivity analyses do not attempt to estimate the effect of the intervention in the group of studies removed from the analysis, whereas in subgroup analyses, estimates are produced for each subgroup. Second, in sensitivity analyses, informal comparisons are made between different ways of estimating the same thing, whereas in subgroup analyses, formal statistical comparisons are made across the subgroups.

10.15 Chapter information

Editors: Jonathan J Deeks, Julian PT Higgins, Douglas G Altman; on behalf of the Cochrane Statistical Methods Group

Contributing authors: Douglas Altman, Deborah Ashby, Jacqueline Birks, Michael Borenstein, Marion Campbell, Jonathan Deeks, Matthias Egger, Julian Higgins, Joseph Lau, Keith O’Rourke, Gerta Rücker, Rob Scholten, Jonathan Sterne, Simon Thompson, Anne Whitehead

Acknowledgements: We are grateful to the following for commenting helpfully on earlier drafts: Bodil Als-Nielsen, Deborah Ashby, Jesse Berlin, Joseph Beyene, Jacqueline Birks, Michael Bracken, Marion Campbell, Chris Cates, Wendong Chen, Mike Clarke, Albert Cobos, Esther Coren, Francois Curtin, Roberto D’Amico, Keith Dear, Heather Dickinson, Diana Elbourne, Simon Gates, Paul Glasziou, Christian Gluud, Peter Herbison, Sally Hollis, David Jones, Steff Lewis, Tianjing Li, Joanne McKenzie, Philippa Middleton, Nathan Pace, Craig Ramsey, Keith O’Rourke, Rob Scholten, Guido Schwarzer, Jack Sinclair, Jonathan Sterne, Simon Thompson, Andy Vail, Clarine van Oel, Paula Williamson and Fred Wolf.

Funding: JJD received support from the National Institute for Health Research (NIHR) Birmingham Biomedical Research Centre at the University Hospitals Birmingham NHS Foundation Trust and the University of Birmingham. JPTH is a member of the NIHR Biomedical Research Centre at University Hospitals Bristol NHS Foundation Trust and the University of Bristol. JPTH received funding from National Institute for Health Research Senior Investigator award NF-SI-0617-10145. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.

10.16 References

Agresti A. An Introduction to Categorical Data Analysis. New York (NY): John Wiley & Sons; 1996.

Akl EA, Kahale LA, Agoritsas T, Brignardello-Petersen R, Busse JW, Carrasco-Labra A, Ebrahim S, Johnston BC, Neumann I, Sola I, Sun X, Vandvik P, Zhang Y, Alonso-Coello P, Guyatt G. Handling trial participants with missing outcome data when conducting a meta-analysis: a systematic survey of proposed approaches. Systematic Reviews 2015; 4 : 98.

Akl EA, Kahale LA, Ebrahim S, Alonso-Coello P, Schünemann HJ, Guyatt GH. Three challenges described for identifying participants with missing data in trials reports, and potential solutions suggested to systematic reviewers. Journal of Clinical Epidemiology 2016; 76 : 147-154.

Altman DG, Bland JM. Detecting skewness from summary information. BMJ 1996; 313 : 1200.

Anzures-Cabrera J, Sarpatwari A, Higgins JPT. Expressing findings from meta-analyses of continuous outcomes in terms of risks. Statistics in Medicine 2011; 30 : 2967-2985.

Berlin JA, Longnecker MP, Greenland S. Meta-analysis of epidemiologic dose-response data. Epidemiology 1993; 4 : 218-228.

Berlin JA, Antman EM. Advantages and limitations of metaanalytic regressions of clinical trials data. Online Journal of Current Clinical Trials 1994; Doc No 134.

Berlin JA, Santanna J, Schmid CH, Szczech LA, Feldman KA, Group A-LAITS. Individual patient- versus group-level data meta-regressions for the investigation of treatment effect modifiers: ecological bias rears its ugly head. Statistics in Medicine 2002; 21 : 371-387.

Borenstein M, Hedges LV, Higgins JPT, Rothstein HR. A basic introduction to fixed-effect and random-effects models for meta-analysis. Research Synthesis Methods 2010; 1 : 97-111.

Borenstein M, Higgins JPT. Meta-analysis and subgroups. Prevention Science 2013; 14 : 134-143.

Bradburn MJ, Deeks JJ, Berlin JA, Russell Localio A. Much ado about nothing: a comparison of the performance of meta-analytical methods with rare events. Statistics in Medicine 2007; 26 : 53-77.

Chinn S. A simple method for converting an odds ratio to effect size for use in meta-analysis. Statistics in Medicine 2000; 19 : 3127-3131.

da Costa BR, Nuesch E, Rutjes AW, Johnston BC, Reichenbach S, Trelle S, Guyatt GH, Jüni P. Combining follow-up and change data is valid in meta-analyses of continuous outcomes: a meta-epidemiological study. Journal of Clinical Epidemiology 2013; 66 : 847-855.

Deeks JJ. Systematic reviews of published evidence: Miracles or minefields? Annals of Oncology 1998; 9 : 703-709.

Deeks JJ, Altman DG, Bradburn MJ. Statistical methods for examining heterogeneity and combining results from several studies in meta-analysis. In: Egger M, Davey Smith G, Altman DG, editors. Systematic Reviews in Health Care: Meta-analysis in Context. 2nd ed. London (UK): BMJ Publication Group; 2001. p. 285-312.

Deeks JJ. Issues in the selection of a summary statistic for meta-analysis of clinical trials with binary outcomes. Statistics in Medicine 2002; 21 : 1575-1600.

DerSimonian R, Laird N. Meta-analysis in clinical trials. Controlled Clinical Trials 1986; 7 : 177-188.

DiGuiseppi C, Higgins JPT. Interventions for promoting smoke alarm ownership and function. Cochrane Database of Systematic Reviews 2001; 2 : CD002246.

Ebrahim S, Akl EA, Mustafa RA, Sun X, Walter SD, Heels-Ansdell D, Alonso-Coello P, Johnston BC, Guyatt GH. Addressing continuous data for participants excluded from trial analysis: a guide for systematic reviewers. Journal of Clinical Epidemiology 2013; 66 : 1014-1021 e1011.

Ebrahim S, Johnston BC, Akl EA, Mustafa RA, Sun X, Walter SD, Heels-Ansdell D, Alonso-Coello P, Guyatt GH. Addressing continuous data measured with different instruments for participants excluded from trial analysis: a guide for systematic reviewers. Journal of Clinical Epidemiology 2014; 67 : 560-570.

Efthimiou O. Practical guide to the meta-analysis of rare events. Evidence-Based Mental Health 2018; 21 : 72-76.

Egger M, Davey Smith G, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. BMJ 1997; 315 : 629-634.

Engels EA, Schmid CH, Terrin N, Olkin I, Lau J. Heterogeneity and statistical significance in meta-analysis: an empirical study of 125 meta-analyses. Statistics in Medicine 2000; 19 : 1707-1728.

Greenland S, Robins JM. Estimation of a common effect parameter from sparse follow-up data. Biometrics 1985; 41 : 55-68.

Greenland S. Quantitative methods in the review of epidemiologic literature. Epidemiologic Reviews 1987; 9 : 1-30.

Greenland S, Longnecker MP. Methods for trend estimation from summarized dose-response data, with applications to meta-analysis. American Journal of Epidemiology 1992; 135 : 1301-1309.

Guevara JP, Berlin JA, Wolf FM. Meta-analytic methods for pooling rates when follow-up duration varies: a case study. BMC Medical Research Methodology 2004; 4 : 17.

Hartung J, Knapp G. A refined method for the meta-analysis of controlled clinical trials with binary outcome. Statistics in Medicine 2001; 20 : 3875-3889.

Hasselblad V, McCrory DC. Meta-analytic tools for medical decision making: A practical guide. Medical Decision Making 1995; 15 : 81-96.

Higgins JPT, Thompson SG. Quantifying heterogeneity in a meta-analysis. Statistics in Medicine 2002; 21 : 1539-1558.

Higgins JPT, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ 2003; 327 : 557-560.

Higgins JPT, Thompson SG. Controlling the risk of spurious findings from meta-regression. Statistics in Medicine 2004; 23 : 1663-1682.

Higgins JPT, White IR, Wood AM. Imputation methods for missing outcome data in meta-analysis of clinical trials. Clinical Trials 2008a; 5 : 225-239.

Higgins JPT, White IR, Anzures-Cabrera J. Meta-analysis of skewed data: combining results reported on log-transformed or raw scales. Statistics in Medicine 2008b; 27 : 6072-6092.

Higgins JPT, Thompson SG, Spiegelhalter DJ. A re-evaluation of random-effects meta-analysis. Journal of the Royal Statistical Society: Series A (Statistics in Society) 2009; 172 : 137-159.

Kjaergard LL, Villumsen J, Gluud C. Reported methodologic quality and discrepancies between large and small randomized trials in meta-analyses. Annals of Internal Medicine 2001; 135 : 982-989.

Langan D, Higgins JPT, Simmonds M. An empirical comparison of heterogeneity variance estimators in 12 894 meta-analyses. Research Synthesis Methods 2015; 6 : 195-205.

Langan D, Higgins JPT, Simmonds M. Comparative performance of heterogeneity variance estimators in meta-analysis: a review of simulation studies. Research Synthesis Methods 2017; 8 : 181-198.

Langan D, Higgins JPT, Jackson D, Bowden J, Veroniki AA, Kontopantelis E, Viechtbauer W, Simmonds M. A comparison of heterogeneity variance estimators in simulated random-effects meta-analyses. Research Synthesis Methods 2019; 10 : 83-98.

Lewis S, Clarke M. Forest plots: trying to see the wood and the trees. BMJ 2001; 322 : 1479-1480.

Lunn DJ, Thomas A, Best N, Spiegelhalter D. WinBUGS - A Bayesian modelling framework: Concepts, structure, and extensibility. Statistics and Computing 2000; 10 : 325-337.

Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute 1959; 22 : 719-748.

McIntosh MW. The population risk as an explanatory variable in research synthesis of clinical trials. Statistics in Medicine 1996; 15 : 1713-1728.

Morgenstern H. Uses of ecologic analysis in epidemiologic research. American Journal of Public Health 1982; 72 : 1336-1344.

Oxman AD, Guyatt GH. A consumer's guide to subgroup analyses. Annals of Internal Medicine 1992; 116 : 78-84.

Peto R, Collins R, Gray R. Large-scale randomized evidence: large, simple trials and overviews of trials. Journal of Clinical Epidemiology 1995; 48 : 23-40.

Poole C, Greenland S. Random-effects meta-analyses are not always conservative. American Journal of Epidemiology 1999; 150 : 469-475.

Rhodes KM, Turner RM, White IR, Jackson D, Spiegelhalter DJ, Higgins JPT. Implementing informative priors for heterogeneity in meta-analysis using meta-regression and pseudo data. Statistics in Medicine 2016; 35 : 5495-5511.

Rice K, Higgins JPT, Lumley T. A re-evaluation of fixed effect(s) meta-analysis. Journal of the Royal Statistical Society Series A (Statistics in Society) 2018; 181 : 205-227.

Riley RD, Higgins JPT, Deeks JJ. Interpretation of random effects meta-analyses. BMJ 2011; 342 : d549.

Röver C. Bayesian random-effects meta-analysis using the bayesmeta R package 2017. https://arxiv.org/abs/1711.08683 .

Rücker G, Schwarzer G, Carpenter J, Olkin I. Why add anything to nothing? The arcsine difference as a measure of treatment effect in meta-analysis with zero cells. Statistics in Medicine 2009; 28 : 721-738.

Sharp SJ. Analysing the relationship between treatment benefit and underlying risk: precautions and practical recommendations. In: Egger M, Davey Smith G, Altman DG, editors. Systematic Reviews in Health Care: Meta-analysis in Context. 2nd ed. London (UK): BMJ Publication Group; 2001. p. 176-188.

Sidik K, Jonkman JN. A simple confidence interval for meta-analysis. Statistics in Medicine 2002; 21 : 3153-3159.

Simmonds MC, Tierney J, Bowden J, Higgins JPT. Meta-analysis of time-to-event data: a comparison of two-stage methods. Research Synthesis Methods 2011; 2 : 139-149.

Sinclair JC, Bracken MB. Clinically useful measures of effect in binary analyses of randomized trials. Journal of Clinical Epidemiology 1994; 47 : 881-889.

Smith TC, Spiegelhalter DJ, Thomas A. Bayesian approaches to random-effects meta-analysis: a comparative study. Statistics in Medicine 1995; 14 : 2685-2699.

Spiegelhalter DJ, Abrams KR, Myles JP. Bayesian Approaches to Clinical Trials and Health-Care Evaluation. Chichester (UK): John Wiley & Sons; 2004.

Spittal MJ, Pirkis J, Gurrin LC. Meta-analysis of incidence rate data in the presence of zero events. BMC Medical Research Methodology 2015; 15 : 42.

Sutton AJ, Abrams KR, Jones DR, Sheldon TA, Song F. Methods for Meta-analysis in Medical Research. Chichester (UK): John Wiley & Sons; 2000.

Sutton AJ, Abrams KR. Bayesian methods in meta-analysis and evidence synthesis. Statistical Methods in Medical Research 2001; 10 : 277-303.

Sweeting MJ, Sutton AJ, Lambert PC. What to add to nothing? Use and avoidance of continuity corrections in meta-analysis of sparse data. Statistics in Medicine 2004; 23 : 1351-1375.

Thompson SG, Smith TC, Sharp SJ. Investigating underlying risk as a source of heterogeneity in meta-analysis. Statistics in Medicine 1997; 16 : 2741-2758.

Thompson SG, Sharp SJ. Explaining heterogeneity in meta-analysis: a comparison of methods. Statistics in Medicine 1999; 18 : 2693-2708.

Thompson SG, Higgins JPT. How should meta-regression analyses be undertaken and interpreted? Statistics in Medicine 2002; 21 : 1559-1574.

Turner RM, Davey J, Clarke MJ, Thompson SG, Higgins JPT. Predicting the extent of heterogeneity in meta-analysis, using empirical data from the Cochrane Database of Systematic Reviews. International Journal of Epidemiology 2012; 41 : 818-827.

Veroniki AA, Jackson D, Viechtbauer W, Bender R, Bowden J, Knapp G, Kuss O, Higgins JPT, Langan D, Salanti G. Methods to estimate the between-study variance and its uncertainty in meta-analysis. Research Synthesis Methods 2016; 7 : 55-79.

Whitehead A, Jones NMB. A meta-analysis of clinical trials involving different classifications of response into ordered categories. Statistics in Medicine 1994; 13 : 2503-2515.

Yusuf S, Peto R, Lewis J, Collins R, Sleight P. Beta blockade during and after myocardial infarction: an overview of the randomized trials. Progress in Cardiovascular Diseases 1985; 27 : 335-371.

Yusuf S, Wittes J, Probstfield J, Tyroler HA. Analysis and interpretation of treatment effects in subgroups of patients in randomized clinical trials. JAMA 1991; 266 : 93-98.


Meta-Analysis – Guide with Definition, Steps & Examples

Published by Owen Ingram on April 26th, 2023. Revised on April 26, 2023.

“A meta-analysis is a formal, epidemiological, quantitative study design that uses statistical methods to generalise the findings of the selected independent studies.”

Meta-analysis and systematic review are two of the most trusted strategies in research. When researchers start looking for the best available evidence concerning their research work, they are advised to begin from the top of the evidence pyramid. Evidence in the form of meta-analyses or systematic reviews that address important questions matters in academia because it informs decision-making.

What is Meta-Analysis?

Meta-analysis estimates the absolute effect across individual, independent research studies by systematically synthesising their results. It is not only about reaching a wider population by combining several smaller studies. It also uses systematic methods to evaluate differences in participants, variability between studies (also known as heterogeneity), and how sensitive the findings are to the selected systematic review protocol.

When Should you Conduct a Meta-Analysis?

Meta-analysis has become a widely used research method in medical sciences and other fields of work for several reasons. The technique involves summarising the results of the independent studies identified in a systematic review.

The Cochrane Handbook explains that “an important step in a systematic review is the thoughtful consideration of whether it is appropriate to combine the numerical results of all, or perhaps some, of the studies. Such a meta-analysis yields an overall statistic (together with its confidence interval) that summarizes the effectiveness of an experimental intervention compared with a comparator intervention” (section 10.2).

A researcher or a practitioner should choose meta-analysis when the following outcomes are desirable. 

To generate new hypotheses or settle controversies arising from different research studies. Meta-analysis makes it possible to quantify and evaluate variable results and to identify the extent of conflict in the literature.

To find research gaps left unfilled and address questions not posed by individual studies. Primary research studies involve specific types of participants and interventions. A review of these studies with variable characteristics and methodologies can allow the researcher to gauge the consistency of findings across a wider range of participants and interventions. With the help of meta-analysis, the reasons for differences in the effect can also be explored. 

To provide convincing evidence. Estimating effects from a larger combined sample and a wider range of interventions can provide convincing evidence. Many academic studies are based on a very small dataset, so their estimated intervention effects are not fully reliable in isolation.

Elements of a Meta-Analysis

Deeks et al. (2019), Haidich (2010), and Grant & Booth (2009) explored the characteristics, strengths, and weaknesses of conducting a meta-analysis. They are briefly explained below.

Characteristics

  • A systematic review must be completed before conducting the meta-analysis because it provides a summary of the findings of the individual studies synthesised.
  • You can only conduct a meta-analysis by synthesising studies in a systematic review.
  • The studies selected for statistical analysis for the purpose of meta-analysis should be similar in terms of comparison, intervention, and population.

Strengths

  • A meta-analysis takes place after the systematic review. The end product is a comprehensive quantitative analysis that is complicated but reliable.
  • It gives more value and weight to existing studies that do not hold practical value on their own.
  • Policy-makers and academicians cannot base their decisions on individual research studies. Meta-analysis provides them with a complex and solid analysis of evidence to make informed decisions.

Weaknesses

  • The meta-analysis uses studies exploring similar topics. Finding similar studies for the meta-analysis can be challenging.
  • When and if biases in the individual studies or those related to reporting and specific research methodologies are involved, the meta-analysis results could be misleading.

Steps of Conducting the Meta-Analysis 

The process of conducting the meta-analysis has remained a topic of debate among researchers and scientists. However, the following 5-step process is widely accepted. 

Step 1: Research Question

The first step in conducting clinical research involves identifying a research question and proposing a hypothesis. The potential clinical significance of the research question is then explained, and the study design and analytical plan are justified.

Step 2: Systematic Review 

The purpose of a systematic review (SR) is to address a research question by identifying all relevant studies that meet the required quality standards for inclusion. While established journals typically serve as the primary source for identified studies, it is important to also consider unpublished data to avoid publication bias or the exclusion of studies with negative results.

While some meta-analyses may limit their focus to randomized controlled trials (RCTs) for the sake of obtaining the highest quality evidence, other experimental and quasi-experimental studies may be included if they meet the specific inclusion/exclusion criteria established for the review.

Step 3: Data Extraction

After selecting studies for the meta-analysis, researchers extract summary data or outcomes, as well as sample sizes and measures of data variability for both intervention and control groups. The choice of outcome measures depends on the research question and the type of study, and may include numerical or categorical measures.

For instance, numerical means may be used to report differences in scores on a questionnaire or changes in a measurement, such as blood pressure. In contrast, risk measures like odds ratios (OR) or relative risks (RR) are typically used to report differences in the probability of belonging to one category or another, such as vaginal birth versus cesarean birth.

Step 4: Standardisation and Weighting Studies

After gathering all the required data, the fourth step involves computing suitable summary measures from each study for further examination. These measures are typically referred to as Effect Sizes and indicate the difference in average scores between the control and intervention groups. For instance, it could be the variation in blood pressure changes between study participants who used drug X and those who used a placebo.

Since the units of measurement often differ across the included studies, standardization is necessary to create comparable effect size estimates. Standardization is accomplished by determining, for each study, the average score for the intervention group, subtracting the average score for the control group, and dividing the result by the relevant measure of variability in that dataset.
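A minimal sketch of this standardization is shown below. It assumes that the "relevant measure of variability" is the pooled standard deviation of the two groups (often called Cohen's d); other choices exist, and the function name and the example numbers are hypothetical.

```python
import math


def standardized_mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """(intervention mean - control mean) divided by the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


# Hypothetical study reporting blood pressure change in mmHg.
d_mmhg = standardized_mean_difference(-12.0, 10.0, 50, -4.0, 10.0, 50)

# The same hypothetical data expressed in kPa (1 mmHg is about 0.1333 kPa).
k = 0.1333
d_kpa = standardized_mean_difference(-12.0 * k, 10.0 * k, 50, -4.0 * k, 10.0 * k, 50)

print(f"SMD from mmHg data: {d_mmhg:.2f}")
print(f"SMD from kPa data:  {d_kpa:.2f}")
```

Because the numerator and the denominator are in the same units, the result is unit-free, which is what makes estimates from studies using different scales comparable.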

In some cases, the results of certain studies must carry more weight than others. Larger studies, as measured by their sample sizes, are deemed to produce more precise estimates of effect size than smaller studies. Additionally, studies with less variability in their data, such as a smaller standard deviation or narrower confidence intervals, also yield more precise estimates. A weighting statistic that aims to incorporate both of these factors, known as inverse variance, is commonly employed.
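To make the idea concrete, the sketch below computes inverse-variance weights for studies that each report a difference in means. The variance formula used (the sum of each group's squared standard deviation divided by its sample size) is the usual one for a raw mean difference; the study data and names are hypothetical.

```python
def inverse_variance_weights(studies):
    """Relative weights (summing to 1) proportional to 1 / variance of each mean difference."""
    variances = [
        s["sd_t"] ** 2 / s["n_t"] + s["sd_c"] ** 2 / s["n_c"]
        for s in studies
    ]
    raw = [1.0 / v for v in variances]
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical studies: one large, one small, and one small with less variable data.
studies = [
    {"name": "Large study",           "n_t": 500, "n_c": 500, "sd_t": 10.0, "sd_c": 10.0},
    {"name": "Small study",           "n_t": 50,  "n_c": 50,  "sd_t": 10.0, "sd_c": 10.0},
    {"name": "Small, low variability", "n_t": 50,  "n_c": 50,  "sd_t": 5.0,  "sd_c": 5.0},
]
for study, weight in zip(studies, inverse_variance_weights(studies)):
    print(f"{study['name']:24s} weight = {weight:.1%}")
```

Both factors named above enter through the variance: a larger sample size and a smaller standard deviation each shrink it, and so each increases the weight.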

Step 5: Absolute Effect Estimation

The final step in conducting a meta-analysis is to choose and apply an appropriate model for combining Effect Sizes from the different studies. Two popular models for this purpose are the Fixed Effects and Random Effects models. The Fixed Effects model relies on the premise that each study is evaluating a common treatment effect, implying that the observed Effect Sizes differ only because of sampling variability within each study.

Conversely, the Random Effects model posits that the true treatment effects in individual studies may vary from each other, and endeavors to consider this additional source of interstudy variation in Effect Sizes. The existence and magnitude of this latter variability is usually evaluated within the meta-analysis through a test for ‘heterogeneity.’
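The difference between the two models can be sketched in a few lines. The text does not name a specific estimator for the between-study variance, so the DerSimonian-Laird method used below is an assumption chosen because it is widely used; the function names and input values are hypothetical.

```python
import math


def fixed_effect(effects, variances):
    """Inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))


def dersimonian_laird_tau2(effects, variances):
    """Between-study variance (tau squared), truncated at zero."""
    weights = [1.0 / v for v in variances]
    pooled, _ = fixed_effect(effects, variances)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    return max(0.0, (q - df) / c)


def random_effects(effects, variances):
    """Pooled estimate, standard error and tau squared under a random effects model."""
    tau2 = dersimonian_laird_tau2(effects, variances)
    pooled, se = fixed_effect(effects, [v + tau2 for v in variances])
    return pooled, se, tau2


# Hypothetical effect sizes and within-study variances.
effects = [0.0, 1.0]
variances = [0.01, 0.01]
print(fixed_effect(effects, variances))
print(random_effects(effects, variances))
```

When the studies disagree more than sampling error alone would explain, the random effects standard error is larger, so the pooled confidence interval widens.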

Forest Plot

The results of a meta-analysis are often visually presented using a “Forest Plot”. This type of plot displays, for each study included in the analysis, a box marking the Effect Size estimate (here, a risk ratio) and a horizontal line spanning its 95% confidence interval. Figure A provides an example of a hypothetical Forest Plot in which drug X reduces the risk of death in all three studies.

However, the first study was larger than the other two, and only its estimate was statistically significant; the estimates from the two smaller studies were not. This is indicated by the lines extending from their boxes, which cross the value of 1 (no effect). The size of the boxes represents the relative weights assigned to each study by the meta-analysis. The combined estimate of the drug’s effect, represented by the diamond, is more precise than the estimate from any single study; the diamond indicates both the combined risk ratio estimate and the limits of its 95% confidence interval.
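To make the link between study size and statistical significance concrete, the sketch below computes a risk ratio with a 95% confidence interval on the log scale, a common large-sample approximation. The event counts are hypothetical and are not taken from Figure A.

```python
import math


def risk_ratio_ci(events_treat, n_treat, events_ctrl, n_ctrl, z=1.96):
    """Risk ratio with an approximate 95% confidence interval (log method)."""
    rr = (events_treat / n_treat) / (events_ctrl / n_ctrl)
    se_log = math.sqrt(1 / events_treat - 1 / n_treat
                       + 1 / events_ctrl - 1 / n_ctrl)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


def crosses_one(lower, upper):
    """True when the interval includes 1, i.e. the result is not significant."""
    return lower <= 1.0 <= upper


# Same risk ratio (0.5) in a small and a large hypothetical study.
small = risk_ratio_ci(10, 100, 20, 100)
large = risk_ratio_ci(100, 1000, 200, 1000)
print(crosses_one(small[1], small[2]))  # True
print(crosses_one(large[1], large[2]))  # False
```

Both studies estimate the same halving of risk, but only the larger one has an interval narrow enough to exclude 1.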

Figure A: Hypothetical Forest Plot

Relevance to Practice and Research 

  Evidence Based Nursing commentaries often include recently published systematic reviews and meta-analyses, as they can provide new insights and strengthen recommendations for effective healthcare practices. Additionally, they can identify gaps or limitations in current evidence and guide future research directions.

The quality of the data available for synthesis is a critical factor in the strength of conclusions drawn from meta-analyses, and this is influenced by the quality of individual studies and the systematic review itself. However, meta-analysis cannot overcome issues related to underpowered or poorly designed studies.

Therefore, clinicians may still encounter situations where the evidence is weak or uncertain, and where higher-quality research is required to improve clinical decision-making. While such findings can be frustrating, they remain important for informing practice and highlighting the need for further research to fill gaps in the evidence base.

Methods and Assumptions in Meta-Analysis 

Ensuring the credibility of findings is imperative in all types of research, including meta-analyses. To validate the outcomes of a meta-analysis, the researcher must confirm that the research techniques used were accurate in measuring the intended variables. Typically, researchers establish the validity of a meta-analysis by testing the outcomes for homogeneity or the degree of similarity between the results of the combined studies.

Homogeneity is preferred in meta-analyses as it allows the data to be combined without needing adjustments to suit the study’s requirements. To determine homogeneity, researchers assess heterogeneity, the opposite of homogeneity. Two widely used statistics for evaluating heterogeneity in research results are Cochran’s Q and I-squared (I²), also known as the I² index.
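As an illustration, both statistics can be computed from the study effect sizes and their variances. The sketch below uses the standard definitions (Q as the weighted sum of squared deviations from the fixed effect estimate, and I² as the share of Q in excess of its degrees of freedom); the function names and inputs are hypothetical.

```python
def cochran_q(effects, variances):
    """Weighted sum of squared deviations from the inverse-variance pooled estimate."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def i_squared(effects, variances):
    """Percentage of total variation attributed to heterogeneity (0 to 100)."""
    q = cochran_q(effects, variances)
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical effect sizes with equal within-study variances.
print(round(cochran_q([0.0, 1.0], [0.01, 0.01]), 1))  # 50.0
print(round(i_squared([0.0, 1.0], [0.01, 0.01]), 1))  # 98.0
```

A Q value close to its degrees of freedom (number of studies minus one) gives an I² near zero, which suggests the studies are reasonably homogeneous.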

Difference Between Meta-Analysis and Systematic Reviews

Meta-analysis and systematic reviews are both research methods used to synthesise evidence from multiple studies on a particular topic. However, there are some key differences between the two.

Systematic reviews involve a comprehensive and structured approach to identifying, selecting, and critically appraising all available evidence relevant to a specific research question. This process involves searching multiple databases, screening the identified studies for relevance and quality, and summarizing the findings in a narrative report.

Meta-analysis, on the other hand, involves using statistical methods to combine and analyze the data from multiple studies, with the aim of producing a quantitative summary of the overall effect size. Meta-analysis requires the studies to be similar enough in terms of their design, methodology, and outcome measures to allow for meaningful comparison and analysis.

Therefore, systematic reviews are broader in scope and summarize the findings of all studies on a topic, while meta-analyses are more focused on producing a quantitative estimate of the effect size of an intervention across multiple studies that meet certain criteria. In some cases, a systematic review may be conducted without a meta-analysis if the studies are too diverse or the quality of the data is not sufficient to allow for statistical pooling.

Software Packages For Meta-Analysis

Meta-analysis can be done through software packages, including free and paid options. One of the most commonly used software packages for meta-analysis is RevMan by the Cochrane Collaboration.

Assessing the Quality of Meta-Analysis 

Assessing the quality of a meta-analysis involves evaluating the methods used to conduct the analysis and the quality of the studies included. Here are some key factors to consider:

  • Study selection: The studies included in the meta-analysis should be relevant to the research question and meet predetermined criteria for quality.
  • Search strategy: The search strategy should be comprehensive and transparent, including databases and search terms used to identify relevant studies.
  • Study quality assessment: The quality of included studies should be assessed using appropriate tools, and this assessment should be reported in the meta-analysis.
  • Data extraction: The data extraction process should be systematic and clearly reported, including any discrepancies that arose.
  • Analysis methods: The meta-analysis should use appropriate statistical methods to combine the results of the included studies, and these methods should be transparently reported.
  • Publication bias: The potential for publication bias should be assessed and reported in the meta-analysis, including any efforts to identify and include unpublished studies.
  • Interpretation of results: The results should be interpreted in the context of the study limitations and the overall quality of the evidence.
  • Sensitivity analysis: Sensitivity analysis should be conducted to evaluate the impact of study quality, inclusion criteria, and other factors on the overall results.

Overall, a high-quality meta-analysis should be transparent in its methods and clearly report the included studies’ limitations and the evidence’s overall quality.
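One simple form of the sensitivity analysis listed above is a leave-one-out analysis, in which the pooled estimate is recomputed with each study removed in turn. The sketch below assumes inverse-variance (fixed effect) pooling; the function names and data are hypothetical.

```python
def pooled_estimate(effects, variances):
    """Inverse-variance weighted average of the study effects."""
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


def leave_one_out(effects, variances):
    """Pooled estimate recomputed with each study omitted in turn."""
    results = []
    for i in range(len(effects)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_variances = variances[:i] + variances[i + 1:]
        results.append(pooled_estimate(rest_effects, rest_variances))
    return results


# Hypothetical effect sizes; the third study is an outlier.
effects = [0.2, 0.4, 0.9]
variances = [0.01, 0.01, 0.01]
print([round(x, 2) for x in leave_one_out(effects, variances)])  # [0.65, 0.55, 0.3]
```

If omitting a single study changes the pooled estimate substantially, as with the third study here, the conclusions depend heavily on that study and this should be reported.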

Examples of Meta-Analysis

  • STANLEY T.D. and JARRELL S.B. (1989), “Meta-regression analysis: a quantitative method of literature surveys”, Journal of Economic Surveys, vol. 3, no. 2, pp. 161-170.
  • DATTA D.K., PINCHES G.E. and NARAYANAN V.K. (1992), “Factors influencing wealth creation from mergers and acquisitions: a meta-analysis”, Strategic Management Journal, vol. 13, pp. 67-84.
  • GLASS G. (1983), “Synthesising empirical research: Meta-analysis”, in S.A. Ward and L.J. Reed (Eds), Knowledge structure and use: Implications for synthesis and interpretation, Philadelphia: Temple University Press.
  • WOLF F.M. (1986), Meta-analysis: Quantitative methods for research synthesis, Sage University Paper no. 59.
  • HUNTER J.E., SCHMIDT F.L. and JACKSON G.B. (1982), “Meta-analysis: cumulating research findings across studies”, Beverly Hills, CA: Sage.

Frequently Asked Questions

What is a meta-analysis in research?

Meta-analysis is a statistical method used to combine results from multiple studies on a specific topic. By pooling data from various sources, meta-analysis can provide a more precise estimate of the effect size of a treatment or intervention and identify areas for future research.

Why is meta-analysis important?

Meta-analysis is important because it combines and summarizes results from multiple studies to provide a more precise and reliable estimate of the effect of a treatment or intervention. This helps clinicians and policymakers make evidence-based decisions and identify areas for further research.

What is an example of a meta-analysis?

A meta-analysis of studies evaluating physical exercise’s effect on depression in adults is an example. Researchers gathered data from 49 studies involving a total of 2669 participants. The studies used different types of exercise and measures of depression, which made it difficult to compare the results.

Through meta-analysis, the researchers calculated an overall effect size and determined that exercise was associated with a statistically significant reduction in depression symptoms. The study also identified that moderate-intensity aerobic exercise, performed three to five times per week, was the most effective. The meta-analysis provided a more comprehensive understanding of the impact of exercise on depression than any single study could provide.

What is the definition of meta-analysis in clinical research?

Meta-analysis in clinical research is a statistical technique that combines data from multiple independent studies on a particular topic to generate a summary or “meta” estimate of the effect of a particular intervention or exposure.

This type of analysis allows researchers to synthesise the results of multiple studies, potentially increasing the statistical power and providing more precise estimates of treatment effects. Meta-analyses are commonly used in clinical research to evaluate the effectiveness and safety of medical interventions and to inform clinical practice guidelines.

Is meta-analysis qualitative or quantitative?

Meta-analysis is a quantitative method used to combine and analyze data from multiple studies. It involves the statistical synthesis of results from individual studies to obtain a pooled estimate of the effect size of a particular intervention or treatment. Therefore, meta-analysis is considered a quantitative approach to research synthesis.

Systematic Reviews and Meta-Analysis: A Guide for Beginners


  • 1 Department of Pediatrics, Advanced Pediatrics Centre, PGIMER, Chandigarh. Correspondence to: Prof Joseph L Mathew, Department of Pediatrics, Advanced Pediatrics Centre, PGIMER Chandigarh.
  • PMID: 34183469
  • PMCID: PMC9065227
  • DOI: 10.1007/s13312-022-2500-y

Systematic reviews involve the application of scientific methods to reduce bias in review of literature. The key components of a systematic review are a well-defined research question, comprehensive literature search to identify all studies that potentially address the question, systematic assembly of the studies that answer the question, critical appraisal of the methodological quality of the included studies, data extraction and analysis (with and without statistics), and considerations towards applicability of the evidence generated in a systematic review. These key features can be remembered as six 'A'; Ask, Access, Assimilate, Appraise, Analyze and Apply. Meta-analysis is a statistical tool that provides pooled estimates of effect from the data extracted from individual studies in the systematic review. The graphical output of meta-analysis is a forest plot which provides information on individual studies and the pooled effect. Systematic reviews of literature can be undertaken for all types of questions, and all types of study designs. This article highlights the key features of systematic reviews, and is designed to help readers understand and interpret them. It can also help to serve as a beginner's guide for both users and producers of systematic reviews and to appreciate some of the methodological issues.

Publication types

  • Meta-Analysis
  • Meta-Analysis as Topic*
  • Research Design
  • Systematic Reviews as Topic*
  • Open access
  • Published: 01 August 2019

A step by step guide for conducting a systematic review and meta-analysis with simulation data

  • Gehad Mohamed Tawfik,
  • Kadek Agus Surya Dila,
  • Muawia Yousif Fadlelmola Mohamed,
  • Dao Ngoc Hien Tam,
  • Nguyen Dang Kien,
  • Ali Mahmoud Ahmed &
  • Nguyen Tien Huy

Tropical Medicine and Health volume 47, Article number: 46 (2019)

The number of studies relating to tropical medicine and health has increased strikingly over the last few decades. In this field, a well-conducted systematic review and meta-analysis (SR/MA) is considered a feasible solution for keeping clinicians abreast of current evidence-based medicine. Understanding the steps of an SR/MA is of paramount importance for conducting one, yet the process is not easy, as researchers can face many obstacles. To address these obstacles, this methodology study aimed to provide a step-by-step approach, mainly for beginners and junior researchers in tropical medicine and other health care fields, on how to properly conduct an SR/MA. All the steps described here reflect our experience and expertise combined with well-known and accepted international guidance.

We suggest that all steps of an SR/MA should be done independently by 2–3 reviewers, with disagreements resolved by discussion, to ensure data quality and accuracy.

SR/MA steps include the development of research question, forming criteria, search strategy, searching databases, protocol registration, title, abstract, full-text screening, manual searching, extracting data, quality assessment, data checking, statistical analysis, double data checking, and manuscript writing.


The number of studies published in the biomedical literature, especially in tropical medicine and health, has increased strikingly over the last few decades. This abundance of literature makes clinical medicine increasingly complex, and knowledge from various studies is often needed to inform a particular clinical decision. However, available studies are often heterogeneous with regard to their design, operational quality, and subjects under study, and may handle the research question in different ways, which adds to the complexity of synthesizing evidence and conclusions [ 1 ].

Systematic reviews and meta-analyses (SR/MAs) provide a high level of evidence, as represented by the evidence-based pyramid. Therefore, a well-conducted SR/MA is considered a feasible solution for keeping clinicians up to date with contemporary evidence-based medicine.

Unlike a systematic review, an unsystematic narrative review tends to be descriptive: the authors often select articles based on their own point of view, which leads to poor quality. A systematic review, on the other hand, is defined as a review that uses a systematic method to summarize evidence on a question according to a detailed and comprehensive study plan. Despite the growing number of guidelines for conducting a systematic review effectively, the basic steps are usually the same: framing the question, identifying relevant work (which consists of developing criteria and searching for articles), appraising the quality of included studies, summarizing the evidence, and interpreting the results [ 2 , 3 ]. However, these simple steps are not easy to carry out in practice. A researcher can run into many difficulties for which no detailed guidance exists.

Conducting an SR/MA in tropical medicine and health may be difficult, especially for young researchers; therefore, understanding its essential steps is crucial. To address the obstacles researchers may face, we recommend a flow diagram (Fig. 1 ) that illustrates the stages of SR/MA studies in detail, step by step. This methodology study aimed to provide a step-by-step approach, mainly for beginners and junior researchers in tropical medicine and other health care fields, on how to properly and succinctly conduct an SR/MA; all the steps described here reflect our experience and expertise combined with well-known and accepted international guidance.

Fig. 1 Detailed flow diagram guideline for systematic review and meta-analysis steps. Note: Star icon refers to “2–3 reviewers screen independently”

Methods and results

Detailed steps for conducting any systematic review and meta-analysis.

We searched the methods reported in published SR/MAs in tropical medicine and other healthcare fields, as well as published guidelines such as the Cochrane guidelines [ 4 ], to collect the best low-bias method for each step of conducting an SR/MA. Furthermore, we used the guidelines that we apply in our own studies for all SR/MA steps. We combined these methods to produce a detailed flow diagram that shows how the SR/MA steps are conducted.

Any SR/MA must follow the widely accepted Preferred Reporting Items for Systematic Review and Meta-analysis statement (PRISMA checklist 2009) (Additional file 5 : Table S1) [ 5 ].

We illustrate our methods with a simulated example on the topic of “evaluating safety of Ebola vaccine,” as Ebola is known to be a very rare but fatal tropical disease. All the methods explained follow international standards, complemented by our own experience in conducting SRs, which we think lends them some validity. This SR is being conducted by a couple of researchers working together in a research group. We chose the topic because the Ebola outbreak that took place in Africa (2013–2016) resulted in significant mortality and morbidity. Furthermore, since there are many published and ongoing trials assessing the safety of Ebola vaccines, we thought this would provide a great opportunity to tackle this hotly debated issue. Moreover, Ebola has flared up again: a new fatal outbreak began in the Democratic Republic of Congo in August 2018, which has infected more than 1000 people according to the World Health Organization and killed 629 people so far. Hence, it is considered the second worst Ebola outbreak, after the first one in West Africa in 2014, which infected more than 26,000 and killed about 11,300 people over the course of the outbreak.

Research question and objectives

Like other study designs, the research question of an SR/MA should be feasible, interesting, novel, ethical, and relevant. Therefore, a clear, logical, and well-defined research question should be formulated. Usually, one of two common tools is used: PICO or SPIDER. PICO (Population, Intervention, Comparison, Outcome) is used mostly in quantitative evidence synthesis. Previous authors demonstrated that PICO is more sensitive than the more specific SPIDER approach [ 6 ]. SPIDER (Sample, Phenomenon of Interest, Design, Evaluation, Research type) was proposed as a method for qualitative and mixed-methods searches.

We recommend using either one or both of the SPIDER and PICO tools to retrieve a comprehensive set of results, depending on time and resource limitations. For a research topic of a qualitative nature, the SPIDER approach is more valid.

PICO is usually used for systematic reviews and meta-analyses of clinical trials. For observational studies (without an intervention or comparator), as in many tropical and epidemiological questions, it is usually enough to use P (Patient) and O (Outcome) only to formulate a research question. We must clearly indicate the population (P), then the intervention (I) or exposure. Next, it is necessary to compare (C) the indicated intervention with other interventions, e.g., placebo. Finally, we need to clarify which outcomes (O) are relevant.

To facilitate comprehension, we choose Ebola virus disease (EVD) as an example. Currently, vaccines for EVD are being developed and are in phase I, II, and III clinical trials; we want to know whether such a vaccine is safe and can induce sufficient immunogenicity in the subjects.

An example of a research question for an SR/MA based on PICO for this issue is as follows: What are the safety and immunogenicity of the Ebola vaccine in humans? (P: healthy subjects (human), I: vaccination, C: placebo, O: safety or adverse effects)

Preliminary research and idea validation

We recommend a preliminary search to identify relevant articles, ensure the validity of the proposed idea, avoid duplicating previously addressed questions, and make sure that there are enough articles to conduct the analysis. Moreover, themes should focus on relevant and important health-care issues, consider global needs and values, reflect the current science, and be consistent with the adopted review methods. Gaining familiarity with, and a deep understanding of, the study field through relevant videos and discussions is of paramount importance for better retrieval of results. If we skip this step, the study may have to be abandoned once we discover that a similar study has already been published, meaning that time was wasted on a problem that had already been tackled.

To do this, we can start with a simple search in PubMed or Google Scholar using the search terms Ebola AND vaccine. While doing this step, we identify a systematic review and meta-analysis of determinant factors influencing the antibody response to Ebola vaccination in non-human primates and humans [ 7 ], which is a relevant paper to read to gain deeper insight and identify gaps for better formulation of our research question or purpose. We can still conduct a systematic review and meta-analysis of the Ebola vaccine because we evaluate a different outcome (safety) and a different population (humans only).

Inclusion and exclusion criteria

Eligibility criteria are based on the PICO approach, study design, and date. Exclusion criteria mostly cover unrelated or duplicated papers, papers with unavailable full texts, and abstract-only papers. These exclusions should be stated in advance to protect the researcher from bias. The inclusion criteria cover articles with the target patients, the investigated interventions, or the comparison between two studied interventions. Briefly, these are articles that contain information answering our research question. Most importantly, the information should be clear and sufficient, whether positive or negative, to answer the question.

For the topic we have chosen, the inclusion criteria can be: (1) any clinical trial evaluating the safety of Ebola vaccine and (2) no restriction regarding country, patient age, race, gender, publication language, or date. Exclusion criteria are as follows: (1) studies of Ebola vaccine in non-human subjects or in vitro studies; (2) studies with data that cannot be reliably extracted, or with duplicate or overlapping data; (3) abstract-only papers such as preceding papers, conference papers, editorials, author responses, theses, and books; (4) articles without available full text; and (5) case reports, case series, and systematic review studies. The PRISMA flow diagram template that is used in SR/MA studies can be found in Fig. 2 .

Fig. 2 PRISMA flow diagram of studies’ screening and selection

Search strategy

A standard search strategy is first built for PubMed and then modified for each specific database to get the most relevant results. The basic search strategy is built on the research question formulation (i.e., PICO or PICOS). Search strategies are constructed to include free-text terms (e.g., in the title and abstract) and any appropriate subject indexing (e.g., MeSH) expected to retrieve eligible studies, with the help of an expert in the review topic field or an information specialist. Additionally, we advise against using terms for the Outcomes, as their inclusion might prevent the database from retrieving eligible studies when the outcome of interest is not explicitly mentioned in the articles.

The search terms are improved by running trial searches and looking for other relevant terms within each concept in the retrieved papers. To search for clinical trials, we can use these descriptors in PubMed: “clinical trial”[Publication Type] OR “clinical trials as topic”[MeSH terms] OR “clinical trial”[All Fields]. After some rounds of trial and refinement, we formulate the final search term for PubMed as follows: (ebola OR ebola virus OR ebola virus disease OR EVD) AND (vaccine OR vaccination OR vaccinated OR immunization) AND (“clinical trial”[Publication Type] OR “clinical trials as topic”[MeSH Terms] OR “clinical trial”[All Fields]). Because the number of studies on this topic is limited, we do not include outcome terms (safety and immunogenicity) in the search term, in order to capture more studies.
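The structure of such a search term (synonyms joined with OR inside each concept, concepts joined with AND) can be sketched as a small helper. This is only an illustration for assembling and revising the string during trial searches; the function names are hypothetical, straight quotation marks are used for the field-tagged phrases, and the script does not query PubMed.

```python
def or_block(terms):
    """Join the synonyms of one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*concepts):
    """Join the concept blocks with AND."""
    return " AND ".join(or_block(terms) for terms in concepts)


# Concepts taken from the Ebola vaccine example in the text.
disease = ["ebola", "ebola virus", "ebola virus disease", "EVD"]
intervention = ["vaccine", "vaccination", "vaccinated", "immunization"]
design = [
    '"clinical trial"[Publication Type]',
    '"clinical trials as topic"[MeSH Terms]',
    '"clinical trial"[All Fields]',
]

query = build_query(disease, intervention, design)
print(query)
```

Keeping each concept as a separate list makes it easy to add a newly discovered synonym after a trial search without retyping the whole term.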

Searching databases, importing all results to a library, and exporting to an Excel sheet

According to the AMSTAR guidelines, at least two databases have to be searched in an SR/MA [ 8 ], but the more databases you search, the greater the yield and the more accurate and comprehensive the results. The ordering of the databases depends mostly on the review question; in a study of clinical trials, you will rely mostly on Cochrane, mRCTs, or the International Clinical Trials Registry Platform (ICTRP). Here, we propose 12 databases (PubMed, Scopus, Web of Science, EMBASE, GHL, VHL, Cochrane, Google Scholar, ClinicalTrials.gov, mRCTs, POPLINE, and SIGLE), which help to cover almost all published articles in tropical medicine and other health-related fields. Among those databases, POPLINE focuses on reproductive health. Researchers should choose relevant databases according to the research topic. Some databases do not support Boolean operators or quotation marks, and others have their own special search syntax. Therefore, we need to modify the initial search terms for each database to get appropriate results; guides for searching each online database are presented in Additional file 5 : Table S2. The detailed search strategy for each database is found in Additional file 5 : Table S3. The search term that we created in PubMed needs customization based on the specific characteristics of each database. An example of a Google Scholar advanced search for our topic is as follows:

With all of the words: ebola virus

With at least one of the words: vaccine vaccination vaccinated immunization

Where my words occur: in the title of the article

With all of the words: EVD

Finally, all records are collected into one EndNote library in order to delete duplicates, and the library is then exported into an Excel sheet. The remove-duplicates function must be used with two options. All references that have (1) the same title and author and were published in the same year, or (2) the same title and author and were published in the same journal, are deleted. References remaining after this step should be exported to an Excel file with the essential information for screening. This could include the authors’ names, publication year, journal, DOI, URL link, and abstract.
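The two duplicate rules can also be expressed in a few lines of code, which may help when records are handled outside a reference manager. The sketch below is not the EndNote function itself; the record fields (title, author, year, journal), the normalization step, and the sample records are assumptions made for illustration.

```python
def normalize(text):
    """Lower-case and collapse whitespace so trivial differences do not matter."""
    return " ".join(str(text).lower().split())


def deduplicate(records):
    """Keep the first record of each group of duplicates.

    A record counts as a duplicate if an earlier record has
    (1) the same title, author and year, or
    (2) the same title, author and journal.
    """
    seen = set()
    unique = []
    for rec in records:
        title, author = normalize(rec["title"]), normalize(rec["author"])
        keys = []
        if rec.get("year"):
            keys.append(("year", title, author, normalize(rec["year"])))
        if rec.get("journal"):
            keys.append(("journal", title, author, normalize(rec["journal"])))
        if any(key in seen for key in keys):
            continue
        seen.update(keys)
        unique.append(rec)
    return unique


# Hypothetical records, as they might look after export from several databases.
records = [
    {"title": "Ebola vaccine trial", "author": "Smith", "year": 2015, "journal": "Journal A"},
    {"title": "Ebola  Vaccine Trial", "author": "smith", "year": 2015, "journal": "Journal B"},
    {"title": "Ebola vaccine trial", "author": "Smith", "year": 2016, "journal": "Journal A"},
    {"title": "Another study", "author": "Lee", "year": 2015, "journal": "Journal A"},
]
print(len(deduplicate(records)))  # 2
```

Automatic matching of this kind will miss duplicates with differently spelled titles or author names, so manual checking during screening remains necessary.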

Protocol writing and registration

Protocol registration at an early stage guarantees transparency in the research process and protects against duplication problems. Besides, it serves as documented proof of the team’s plan of action, research question, eligibility criteria, intervention/exposure, quality assessment, and pre-analysis plan. It is recommended that researchers send the protocol to the principal investigator (PI) for revision and then upload it to a registry site. There are many registry sites available for SR/MAs, like those proposed by the Cochrane and Campbell collaborations; however, we recommend registering the protocol in PROSPERO as it is easier. The layout of a protocol template, according to PROSPERO, can be found in Additional file 5 : File S1.

Title and abstract screening

Decisions to select retrieved articles for further assessment are based on the eligibility criteria, to minimize the chance of including non-relevant articles. According to the Cochrane guidance, at least two reviewers are required for this step, but for beginners and junior researchers this might be tiresome; thus, based on our experience, we propose that at least three reviewers should work independently to reduce the chance of error, particularly in teams with a large number of authors, to add more scrutiny and ensure proper conduct. In most cases, the quality with three reviewers is better than with two: when two reviewers hold different opinions they cannot reach a decision, so the third opinion is crucial. Examples of systematic reviews that were conducted following the same strategy (by a different group of researchers in our research group) and published successfully, on topics relevant to tropical medicine and disease, can be found in [ 9 , 10 , 11 ].

In this step, duplicates are removed manually whenever the reviewers find them. When there is doubt about the decision on an article, the team should be inclusive rather than exclusive until the main leader or PI makes a decision after discussion and consensus. All excluded records should be given exclusion reasons.
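One possible way to combine the independent votes is sketched below. It assumes each reviewer records either "include" or "exclude" for a record; unanimous votes are accepted, and any disagreement keeps the record in for discussion with the PI, in line with the inclusive approach described above. The labels and the function name are hypothetical.

```python
def screening_decision(votes):
    """Combine independent reviewer votes ("include" or "exclude") for one record."""
    if not votes:
        raise ValueError("at least one reviewer vote is required")
    if any(v not in ("include", "exclude") for v in votes):
        raise ValueError("votes must be 'include' or 'exclude'")
    if all(v == "include" for v in votes):
        return "include"
    if all(v == "exclude" for v in votes):
        return "exclude"
    # Reviewers disagree: keep the record until the PI decides after discussion.
    return "discuss"


print(screening_decision(["include", "include", "include"]))  # include
print(screening_decision(["exclude", "include", "exclude"]))  # discuss
```

Records marked "exclude" would still need a recorded exclusion reason, which this sketch does not store.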

Full text downloading and screening

Many search engines provide free links to full-text articles. If the full text is not found, we can search research websites such as ResearchGate, which offers the option of requesting the full text directly from the authors. Other options are exploring the archives of the relevant journals or contacting the PI to purchase the article, if available. Similarly, 2–3 reviewers work independently to decide which full texts to include according to the eligibility criteria, reporting the reasons for excluding articles. If any disagreement occurs, the final decision has to be made by discussion.

Manual search

One has to exhaust all possibilities to reduce bias by performing explicit hand-searching to retrieve reports that may have been missed by the first search [ 12 ]. We apply five methods of manual searching: searching references from included studies/reviews, contacting authors and experts, and looking at related articles/cited articles in PubMed and Google Scholar.

We describe here three consecutive methods to increase and refine the yield of manual searching: firstly, searching the reference lists of included articles; secondly, performing what is known as citation tracking, in which the reviewers track all the articles that cite each of the included articles, which might involve electronic searching of databases; and thirdly, similar to citation tracking, following all “related to” or “similar” articles. Each of the abovementioned methods can be performed by 2–3 independent reviewers, and every possibly relevant article must undergo further scrutiny against the inclusion criteria, following the same process as the records yielded from electronic databases, i.e., title/abstract and full-text screening.

We propose independent reviewing in which each member of the team is assigned a “tag” and a distinct method; all the results are compiled at the end to compare differences and discuss them, which maximizes retrieval and minimizes bias. Similarly, the number of articles included through manual searching has to be stated before they are added to the overall included records.

Data extraction and quality assessment

This step entails collecting data from the included full texts in a structured Excel extraction sheet, which has previously been pilot-tested using some random studies. We recommend extracting both adjusted and non-adjusted data, because this allows the largest set of confounding factors to be used in the analysis when the data are pooled later [ 13 ]. The extraction should be carried out by 2–3 independent reviewers. The sheet is usually divided into study and patient characteristics, outcomes, and the quality assessment (QA) tool.

Data presented in graphs should be extracted using software tools such as WebPlotDigitizer [ 14 ]. Most of the equations that can be used in extraction prior to analysis, including those for estimating the standard deviation (SD) from other variables, are found in Additional file 5 : File S2 with their references: Hozo et al. [ 15 ], Wan et al. [ 16 ], and Van Rijkom et al. [ 17 ]. A variety of tools are available for QA, depending on the study design: the Cochrane ROB-2 tool for randomized controlled trials [ 18 ], which is presented in Additional file 1 : Figure S1 and Additional file 2 : Figure S2 using data from a previously published article [ 19 ]; the NIH tool for observational and cross-sectional studies [ 20 ]; the ROBINS-I tool for non-randomized trials [ 21 ]; the QUADAS-2 tool for diagnostic studies; the QUIPS tool for prognostic studies; the CARE tool for case reports; and ToxRTool for in vivo and in vitro studies. We recommend that 2–3 reviewers independently assess the quality of the studies and add the results to the data extraction form before inclusion in the analysis, to reduce the risk of bias. In the NIH tool for observational studies (cohort and cross-sectional), as in this EBOLA case, reviewers evaluate the risk of bias by rating each of the 14 items as yes, no, or not applicable (NA). An overall score is calculated by adding all the item scores, where yes equals one and no and NA equal zero. Each paper is then classified as a poor, fair, or good study: a score of 0–5 is considered poor, 6–9 fair, and 10–14 good.
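The scoring rule described above for the 14-item NIH tool can be sketched in a few lines of Python. This is only an illustration of the arithmetic stated in the text (yes = 1, no and NA = 0; 0–5 poor, 6–9 fair, 10–14 good); the function name and the answer labels are assumptions made for the example, not part of the NIH tool itself.

```python
from typing import Iterable, Tuple

def nih_quality_rating(answers: Iterable[str]) -> Tuple[int, str]:
    """Score a 14-item checklist: 'yes' counts 1, 'no' and 'na' count 0."""
    items = [a.strip().lower() for a in answers]
    if len(items) != 14:
        raise ValueError("expected ratings for exactly 14 items")
    if not set(items) <= {"yes", "no", "na"}:
        raise ValueError("each rating must be 'yes', 'no', or 'na'")

    score = sum(1 for a in items if a == "yes")

    # Cut-offs as given in the text: 0-5 poor, 6-9 fair, 10-14 good.
    if score <= 5:
        rating = "poor"
    elif score <= 9:
        rating = "fair"
    else:
        rating = "good"
    return score, rating

# Hypothetical paper: 10 items rated yes, 3 rated no, 1 not applicable.
print(nih_quality_rating(["yes"] * 10 + ["no"] * 3 + ["na"]))
```

Keeping the raw item ratings in the extraction sheet, and deriving the score from them, makes it easier for a second reviewer to see where two assessments differ.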

In the EBOLA case example above, authors can extract the following information: name of authors, country of patients, year of publication, study design (case report, cohort study, or clinical trial or RCT), sample size, the infected point of time after EBOLA infection, follow-up interval after vaccination time, efficacy, safety, adverse effects after vaccinations, and QA sheet (Additional file 6 : Data S1).

Data checking

Because human error and bias are to be expected, we recommend a data checking step in which every included article is compared with its counterpart in the extraction sheet, using evidence photos, to detect mistakes in the data. We advise assigning articles to 2–3 independent reviewers, ideally not the ones who extracted those articles. When resources are limited, each reviewer is assigned a different article from the one they extracted in the previous stage.
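One simple way to guarantee that nobody checks their own extraction is to rotate articles to the next reviewer in a fixed order. The sketch below assumes a hypothetical mapping from article to the reviewer who extracted it; the names are placeholders, and a team may well prefer a different allocation scheme.

```python
from typing import Dict

def assign_checkers(extracted_by: Dict[str, str]) -> Dict[str, str]:
    """Map each article to a checker who is not the reviewer that extracted it."""
    reviewers = sorted(set(extracted_by.values()))
    if len(reviewers) < 2:
        raise ValueError("at least two reviewers are needed for cross-checking")

    # Each reviewer hands their articles to the next reviewer in the list.
    next_reviewer = {
        r: reviewers[(i + 1) % len(reviewers)] for i, r in enumerate(reviewers)
    }
    return {article: next_reviewer[r] for article, r in extracted_by.items()}

# Hypothetical allocation from the extraction stage.
extraction = {"study1": "R1", "study2": "R2", "study3": "R3", "study4": "R1"}
print(assign_checkers(extraction))
```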

Statistical analysis

Investigators use different methods for combining and summarizing the findings of included studies. Before analysis, there is an important step called data cleaning, in which the analyst organizes the extraction sheet data into a form that can be read by analytical software. The analysis is of two types: qualitative and quantitative. Qualitative analysis mostly describes data in SR studies, while quantitative analysis consists of two main types: MA and network meta-analysis (NMA). Subgroup, sensitivity, and cumulative analyses and meta-regression are appropriate for testing whether the results are consistent, investigating the effect of certain confounders on the outcome, and finding the best predictors. Publication bias should be assessed to investigate the presence of missing studies, which can affect the summary.

To illustrate basic meta-analysis, we provide imaginary data for the research question about Ebola vaccine safety (in terms of adverse events, 14 days after injection) and immunogenicity (rise in Ebola virus antibody geometric mean titer, 6 months after injection). Assume that, after searching and data extraction, we decided to do an analysis to evaluate the safety and immunogenicity of Ebola vaccine “A”. Other Ebola vaccines were not meta-analyzed because of the limited number of studies (instead, they will be included in a narrative review). The imaginary data for the vaccine safety meta-analysis can be accessed in Additional file 7 : Data S2. To do the meta-analysis, we can use free software, such as RevMan [ 22 ] or the R package meta [ 23 ]. In this example, we will use the R package meta. The tutorial for the meta package can be accessed through the “General Package for Meta-Analysis” tutorial pdf [ 23 ]. The R code and its guidance for the meta-analysis can be found in Additional file 5 : File S3.

For the analysis, we assume that the studies are heterogeneous in nature; therefore, we choose a random effect model. We analyzed the safety of Ebola vaccine A. From the data table, we can see some adverse events occurring after intramuscular injection of vaccine A in the study subjects. Suppose that we include six studies that fulfill our inclusion criteria. We can do a meta-analysis for each of the adverse events extracted from the studies, for example arthralgia, using the random effect meta-analysis of the R meta package.
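To make the random effect pooling less of a black box, the following Python sketch computes a pooled odds ratio from 2×2 counts with the DerSimonian-Laird estimator, a commonly used random effect method. It is not the code from Additional file 5 and does not claim to reproduce the defaults of the R meta package; the six studies and their counts are invented for illustration and are not the data in Additional file 7.

```python
import math

def random_effects_or(studies, z_crit=1.96):
    """Pool odds ratios with the DerSimonian-Laird random effect estimator.

    studies: iterable of (events_vaccine, n_vaccine, events_placebo, n_placebo).
    """
    log_or, var = [], []
    for a, n1, c, n2 in studies:
        b, d = n1 - a, n2 - c
        if 0 in (a, b, c, d):  # continuity correction when a cell is empty
            a, b, c, d = (x + 0.5 for x in (a, b, c, d))
        log_or.append(math.log((a * d) / (b * c)))
        var.append(1 / a + 1 / b + 1 / c + 1 / d)

    # Fixed effect (inverse variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in var]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_or)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_or))
    df = len(log_or) - 1

    # Between-study variance (tau^2) and I^2, both truncated at zero.
    c_term = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c_term) if c_term > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random effect weights add tau^2 to each within-study variance.
    w_re = [1 / (v + tau2) for v in var]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_or)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    p = math.erfc(abs(pooled / se) / math.sqrt(2))  # two-sided normal p value

    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - z_crit * se), math.exp(pooled + z_crit * se)),
        "p": p,
        "tau2": tau2,
        "i2": i2,
    }

# Invented counts for six hypothetical studies (A to F).
example = [
    (12, 100, 11, 100), (20, 150, 19, 150), (8, 80, 9, 80),
    (15, 120, 13, 120), (30, 200, 29, 200), (5, 60, 5, 60),
]
print(random_effects_or(example))
```

When the estimated between-study variance is zero, the random effect weights reduce to the usual inverse variance weights, so the two models give the same pooled estimate.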

From the results shown in Additional file 3 : Figure S3, we can see that the odds ratio (OR) of arthralgia is 1.06 (0.79; 1.42), p value = 0.71. This indicates no statistically significant association between the intramuscular injection of Ebola vaccine A and arthralgia: the OR is close to one, the 95% confidence interval includes one, and the p value is non-significant (> 0.05).
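As a quick plausibility check on a reported result, the standard error and p value can be approximately recovered from the OR and its 95% confidence interval, assuming the interval was built on the log scale with a normal approximation. The sketch below applies this to the values quoted above; the function name is hypothetical, and small differences from the reported p value are expected because the OR and interval are rounded.

```python
import math

def p_from_or_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Approximate SE, z, and two-sided p from an OR and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return se, z, p

se, z, p = p_from_or_ci(1.06, 0.79, 1.42)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.2f}")
```

For these inputs the recovered p value comes out at about 0.70, in line with the reported 0.71.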

In the meta-analysis, we can also visualize the results in a forest plot. Fig. 3 shows an example of a forest plot from the simulated analysis.

Fig. 3. Random effect model forest plot for comparison of vaccine A versus placebo

From the forest plot, we can see six studies (A to F) and their respective ORs (95% CI). The green box represents the effect size (in this case, the OR) of each study. The bigger the box, the more weight the study carries (i.e., a bigger sample size). The blue diamond represents the pooled OR of the six studies. The blue diamond crosses the vertical line at OR = 1, with roughly equal portions on both sides, which indicates that the association is not significant. We can also confirm this from the 95% confidence interval, which includes one, and from the p value > 0.05.

For heterogeneity, we see that I² = 0%, which means that no heterogeneity is detected; the studies are relatively homogeneous (this is rare in real studies). To evaluate publication bias related to the meta-analysis of arthralgia, we can use the metabias function from the R meta package (Additional file 4 : Figure S4) and visualize the result using a funnel plot. The results of the publication bias assessment are shown in Fig. 4 . The p value associated with this test is 0.74, which gives no evidence of funnel plot asymmetry. We can confirm this by looking at the funnel plot.

Fig. 4. Publication bias funnel plot for comparison of vaccine A versus placebo

Looking at the funnel plot, the number of studies on the left and right sides is the same; therefore, the plot is symmetric, indicating that no publication bias is detected.
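The linear regression test of funnel plot asymmetry can be sketched as an ordinary least squares regression of the standardized effect (effect divided by its standard error) on precision (one divided by the standard error), in the style of Egger's test. An intercept far from zero suggests asymmetry. This is a minimal illustration under that assumption and is not the metabias implementation; the function name is hypothetical, and the p value, which needs a t distribution with k − 2 degrees of freedom, is deliberately left out.

```python
import math

def egger_regression(effects, std_errors):
    """Regress effect/SE on 1/SE; return intercept, slope, and the intercept's t."""
    k = len(effects)
    if k < 3:
        raise ValueError("at least three studies are needed")
    x = [1 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]  # standardized effect

    x_bar, y_bar = sum(x) / k, sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual variance and standard error of the intercept.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt(sse / (k - 2) * (1 / k + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")

    return {"intercept": intercept, "slope": slope, "t": t_stat, "df": k - 2}

# Invented log odds ratios and standard errors for six hypothetical studies.
print(egger_regression(
    [0.10, 0.06, -0.12, 0.16, 0.04, 0.00],
    [0.44, 0.34, 0.50, 0.40, 0.28, 0.66],
))
```

With only a handful of studies, as in this example, such a test has little power, so the result is best read alongside the funnel plot.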

Sensitivity analysis is a procedure used to discover how different values of an independent variable influence the significance of a particular dependent variable; in MA, it is done by removing one study at a time and repeating the analysis. If all included studies have p values < 0.05, removing any one study is not expected to change the significant association. Sensitivity analysis is only performed when there is a significant association, so because the p value of the MA in this case study example is 0.7, which is more than 0.05, sensitivity analysis is not needed here. If there are 2 studies with p values > 0.05, removing either of the two studies can result in a loss of significance.
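The leave-one-out idea can be sketched as follows: drop each study in turn, pool the rest, and see whether the conclusion changes. The sketch works from log odds ratios and their variances and reuses a compact DerSimonian-Laird pooling step; the function names and the input values are assumptions made for the example, not taken from the supplementary files.

```python
import math

def dl_pool(log_effects, variances, z_crit=1.96):
    """DerSimonian-Laird pooled estimate with a 95% CI, on the log scale."""
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    c_term = sw - sum(wi ** 2 for wi in w) / sw
    df = len(log_effects) - 1
    tau2 = max(0.0, (q - df) / c_term) if c_term > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    est = sum(wi * yi for wi, yi in zip(w_re, log_effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return est, est - z_crit * se, est + z_crit * se

def leave_one_out(labels, log_effects, variances):
    """Re-pool after omitting each study; flag whether the CI excludes OR = 1."""
    rows = []
    for i, label in enumerate(labels):
        rest_y = log_effects[:i] + log_effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        est, lo, hi = dl_pool(rest_y, rest_v)
        significant = not (lo <= 0.0 <= hi)
        rows.append((label, math.exp(est), math.exp(lo), math.exp(hi), significant))
    return rows

# Invented inputs for six hypothetical studies.
labels = ["A", "B", "C", "D", "E", "F"]
log_ors = [0.10, 0.06, -0.12, 0.16, 0.04, 0.00]
variances = [0.19, 0.12, 0.25, 0.16, 0.08, 0.44]
for row in leave_one_out(labels, log_ors, variances):
    print("omitting %s: OR %.2f (%.2f; %.2f), significant=%s" % row)
```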

Double data checking

For greater assurance of the quality of the results, the analyzed data should be rechecked against the full-text data using evidence photos, so that the PI of the study can check them easily.

Manuscript writing, revision, and submission to a journal

The manuscript is written in four scientific sections: introduction, methods, results, and discussion, usually followed by a conclusion. Preparing a table of study and patient characteristics is a mandatory step; a template can be found in Additional file 5 : Table S3.

After finishing the manuscript, the characteristics table, and the PRISMA flow diagram, the team should send them to the PI for thorough revision and reply to the PI's comments. Finally, the team should choose a suitable journal for the manuscript, one with a considerable impact factor and a fitting field. The author guidelines of the journal should be read carefully before submitting the manuscript.

The role of evidence-based medicine in biomedical research is rapidly growing. SR/MAs are also increasing in the medical literature. This paper has sought to provide a comprehensive approach to enable reviewers to produce high-quality SR/MAs. We hope that readers could gain general knowledge about how to conduct a SR/MA and have the confidence to perform one, although this kind of study requires complex steps compared to narrative reviews.

Beyond the basic steps for conducting an MA, there are many advanced steps that are applied for specific purposes. One of these is meta-regression, which is performed to investigate the association between any confounder and the results of the MA. Furthermore, there are other types besides the standard MA, such as NMA and mega MA. In NMA, we investigate the difference between several comparisons when there are not enough data to enable standard meta-analysis. It uses both direct and indirect comparisons to conclude which of the competitors is best. On the other hand, mega MA, or MA of patients, summarizes the results of independent studies using their individual subject data. Because a more detailed analysis can be done, it is useful for repeated measure analysis and time-to-event analysis. Moreover, it can perform analysis of variance and multiple regression analysis; however, it requires a homogeneous dataset and is time-consuming to conduct [ 24 ].


Systematic review/meta-analysis steps include developing and validating the research question, forming criteria, building the search strategy, searching databases, importing all results to a library and exporting them to an Excel sheet, protocol writing and registration, title and abstract screening, full-text screening, manual searching, extracting data and assessing its quality, data checking, conducting statistical analysis, double data checking, manuscript writing, revising, and submitting to a journal.

Availability of data and materials

Not applicable.


NMA: Network meta-analysis

PI: Principal investigator

PICO: Population, Intervention, Comparison, Outcome

PRISMA: Preferred Reporting Items for Systematic Review and Meta-analysis statement

QA: Quality assessment

SPIDER: Sample, Phenomenon of Interest, Design, Evaluation, Research type

SR/MAs: Systematic review and meta-analyses

1. Bello A, Wiebe N, Garg A, Tonelli M. Evidence-based decision-making 2: systematic reviews and meta-analysis. Methods Mol Biol (Clifton, NJ). 2015;1281:397–416.

2. Khan KS, Kunz R, Kleijnen J, Antes G. Five steps to conducting a systematic review. J R Soc Med. 2003;96(3):118–21.

3. Rys P, Wladysiuk M, Skrzekowska-Baran I, Malecki MT. Review articles, systematic reviews and meta-analyses: which can be trusted? Polskie Archiwum Medycyny Wewnetrznej. 2009;119(3):148–56.

4. Higgins JPT, Green S. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 [updated March 2011]. 2011.

5. Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. BMJ. 2009;339:b2535.

6. Methley AM, Campbell S, Chew-Graham C, McNally R, Cheraghi-Sohi S. PICO, PICOS and SPIDER: a comparison study of specificity and sensitivity in three search tools for qualitative systematic reviews. BMC Health Serv Res. 2014;14:579.

7. Gross L, Lhomme E, Pasin C, Richert L, Thiebaut R. Ebola vaccine development: systematic review of pre-clinical and clinical studies, and meta-analysis of determinants of antibody response variability after vaccination. Int J Infect Dis. 2018;74:83–96.

8. Shea BJ, Reeves BC, Wells G, Thuku M, Hamel C, Moran J, ... Henry DA. AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ. 2017;358:j4008.

9. Giang HTN, Banno K, Minh LHN, Trinh LT, Loc LT, Eltobgy A, et al. Dengue hemophagocytic syndrome: a systematic review and meta-analysis on epidemiology, clinical signs, outcomes, and risk factors. Rev Med Virol. 2018;28(6):e2005.

10. Morra ME, Altibi AMA, Iqtadar S, Minh LHN, Elawady SS, Hallab A, et al. Definitions for warning signs and signs of severe dengue according to the WHO 2009 classification: systematic review of literature. Rev Med Virol. 2018;28(4):e1979.

11. Morra ME, Van Thanh L, Kamel MG, Ghazy AA, Altibi AMA, Dat LM, et al. Clinical outcomes of current medical approaches for Middle East respiratory syndrome: a systematic review and meta-analysis. Rev Med Virol. 2018;28(3):e1977.

12. Vassar M, Atakpo P, Kash MJ. Manual search approaches used by systematic reviewers in dermatology. Journal of the Medical Library Association: JMLA. 2016;104(4):302.

13. Naunheim MR, Remenschneider AK, Scangas GA, Bunting GW, Deschler DG. The effect of initial tracheoesophageal voice prosthesis size on postoperative complications and voice outcomes. Ann Otol Rhinol Laryngol. 2016;125(6):478–84.

14. Rohatgi A. WebPlotDigitizer. 2014.

15. Hozo SP, Djulbegovic B, Hozo I. Estimating the mean and variance from the median, range, and the size of a sample. BMC Med Res Methodol. 2005;5(1):13.

16. Wan X, Wang W, Liu J, Tong T. Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range. BMC Med Res Methodol. 2014;14(1):135.

17. Van Rijkom HM, Truin GJ, Van’t Hof MA. A meta-analysis of clinical studies on the caries-inhibiting effect of fluoride gel treatment. Caries Res. 1998;32(2):83–92.

18. Higgins JP, Altman DG, Gotzsche PC, Juni P, Moher D, Oxman AD, et al. The Cochrane Collaboration's tool for assessing risk of bias in randomised trials. BMJ. 2011;343:d5928.

19. Tawfik GM, Tieu TM, Ghozy S, Makram OM, Samuel P, Abdelaal A, et al. Speech efficacy, safety and factors affecting lifetime of voice prostheses in patients with laryngeal cancer: a systematic review and network meta-analysis of randomized controlled trials. J Clin Oncol. 2018;36(15_suppl):e18031.

20. Wannemuehler TJ, Lobo BC, Johnson JD, Deig CR, Ting JY, Gregory RL. Vibratory stimulus reduces in vitro biofilm formation on tracheoesophageal voice prostheses. Laryngoscope. 2016;126(12):2752–7.

21. Sterne JAC, Hernán MA, Reeves BC, Savović J, Berkman ND, Viswanathan M, et al. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ. 2016;355.

22. The Cochrane Collaboration. Review Manager (RevMan). Version 5.0. Copenhagen: The Nordic Cochrane Centre; 2008.

23. Schwarzer G. meta: An R package for meta-analysis. R News. 2007;7(3):40–45.

24. Simms LLH. Meta-analysis versus mega-analysis: is there a difference? Oral budesonide for the maintenance of remission in Crohn’s disease: Faculty of Graduate Studies, University of Western Ontario; 1998.


This study was conducted (in part) at the Joint Usage/Research Center on Tropical Disease, Institute of Tropical Medicine, Nagasaki University, Japan.

Author information

Authors and affiliations.

Faculty of Medicine, Ain Shams University, Cairo, Egypt

Gehad Mohamed Tawfik

Online research Club http://www.onlineresearchclub.org/

Gehad Mohamed Tawfik, Kadek Agus Surya Dila, Muawia Yousif Fadlelmola Mohamed, Dao Ngoc Hien Tam, Nguyen Dang Kien & Ali Mahmoud Ahmed

Pratama Giri Emas Hospital, Singaraja-Amlapura street, Giri Emas village, Sawan subdistrict, Singaraja City, Buleleng, Bali, 81171, Indonesia

Kadek Agus Surya Dila

Faculty of Medicine, University of Khartoum, Khartoum, Sudan

Muawia Yousif Fadlelmola Mohamed

Nanogen Pharmaceutical Biotechnology Joint Stock Company, Ho Chi Minh City, Vietnam

Dao Ngoc Hien Tam

Department of Obstetrics and Gynecology, Thai Binh University of Medicine and Pharmacy, Thai Binh, Vietnam

Nguyen Dang Kien

Faculty of Medicine, Al-Azhar University, Cairo, Egypt

Ali Mahmoud Ahmed

Evidence Based Medicine Research Group & Faculty of Applied Sciences, Ton Duc Thang University, Ho Chi Minh City, 70000, Vietnam

Nguyen Tien Huy

Faculty of Applied Sciences, Ton Duc Thang University, Ho Chi Minh City, 70000, Vietnam

Department of Clinical Product Development, Institute of Tropical Medicine (NEKKEN), Leading Graduate School Program, and Graduate School of Biomedical Sciences, Nagasaki University, 1-12-4 Sakamoto, Nagasaki, 852-8523, Japan


NTH and GMT were responsible for the idea and its design. The figure was done by GMT. All authors contributed to the manuscript writing and approval of the final version.

Corresponding author

Correspondence to Nguyen Tien Huy .

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional files

Additional file 1:.

Figure S1. Risk of bias assessment graph of included randomized controlled trials. (TIF 20 kb)

Additional file 2:

Figure S2. Risk of bias assessment summary. (TIF 69 kb)

Additional file 3:

Figure S3. Arthralgia results of random effect meta-analysis using R meta package. (TIF 20 kb)

Additional file 4:

Figure S4. Arthralgia linear regression test of funnel plot asymmetry using R meta package. (TIF 13 kb)

Additional file 5:

Table S1. PRISMA 2009 Checklist. Table S2. Manipulation guides for online database searches. Table S3. Detailed search strategy for twelve database searches. Table S4. Baseline characteristics of the patients in the included studies. File S1. PROSPERO protocol template file. File S2. Extraction equations that can be used prior to analysis to get missed variables. File S3. R codes and its guidance for meta-analysis done for comparison between EBOLA vaccine A and placebo. (DOCX 49 kb)

Additional file 6:

Data S1. Extraction and quality assessment data sheets for EBOLA case example. (XLSX 1368 kb)

Additional file 7:

Data S2. Imaginary data for EBOLA case example. (XLSX 10 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article.

Tawfik, G.M., Dila, K.A.S., Mohamed, M.Y.F. et al. A step by step guide for conducting a systematic review and meta-analysis with simulation data. Trop Med Health 47 , 46 (2019). https://doi.org/10.1186/s41182-019-0165-6

Received : 30 January 2019

Accepted : 24 May 2019

Published : 01 August 2019

DOI : https://doi.org/10.1186/s41182-019-0165-6

Tropical Medicine and Health

ISSN: 1349-4147

  • Open access
  • Published: 13 December 2018

Protocol for a systematic review and meta-analysis of research on the associations between workplace bullying and sleep

  • Morten Birkeland Nielsen 1 , 2 ,
  • Ståle Pallesen 2 ,
  • Anette Harris 2 &
  • Ståle Valvatne Einarsen 2  

Systematic Reviews volume  7 , Article number:  232 ( 2018 ) Cite this article

Existing evidence on the association between exposure to bullying and sleep is limited and inconclusive. The aims of this planned systematic review and meta-analysis are therefore (1) to determine whether exposure to workplace bullying is related to changes in sleep function and (2) to establish mediating and moderating factors that govern the relationship between bullying and sleep.

A systematic review and meta-analysis will be conducted. Electronic databases will be searched using predefined search terms to identify relevant studies. Eligible studies should report empirical findings on the association between exposure to workplace bullying and at least one indicator of sleep. Primary observational studies with cross-sectional or prospective research design, case-control studies, and studies with experimental designs will be included. Qualitative interviews and case studies will be excluded. The methodological quality of the included studies will be assessed with a previously established checklist for studies on workplace bullying. The quality of evidence for an association between bullying and sleep problems will be evaluated in accordance with the GRADE system. A random effects meta-analysis will be conducted with the Comprehensive Meta-Analysis software, version 3.

This review and meta-analysis will be among the first to systematically explore and integrate the available evidence on the association between exposure to bullying and sleep, as well as on the mediating and moderating factors that can govern this association. By gathering and summarizing information about potential factors that can explain when and how bullying is related to sleep, the findings from this study will provide directions for future research, give practitioners and clinicians an understanding of the nature and consequences of workplace bullying, and point to directions for relevant interventions.

Systematic review registration

The protocol has been registered at the International Prospective Register of Systematic Reviews (PROSPERO; registration number: CRD42018082192 ).

Quality of sleep is highly important with regard to everyday functioning, mental and physical health, and job performance [ 1 ]. A 2011 US study estimated the socioeconomic costs of troubled sleep to be between 63 and 91 billion dollars per year [ 2 ]. To reduce these costs, knowledge about the antecedents and risk factors for sleep problems is of major significance. To date, we know that physical and psychosocial working conditions are associated with a range of negative outcomes, including sleep problems [ 3 ]. Although negative social interactions at the workplace, such as bullying, may be especially distressing for those exposed [ 4 , 5 ], previous research on psychosocial work environment factors and sleep has mainly been limited to examining the impact of job demands and control [ 6 , 7 , 8 ]. Consequently, there is a shortage of knowledge about how and when social problems, such as exposure to workplace bullying, influence sleep. Workplace bullying, defined as a situation wherein an employee persistently and systematically is exposed to harassment and mistreatment at work and wherein this employee finds it difficult to defend him- or herself against this prolonged and unwanted treatment [ 9 ], has been established as a precursor to a range of health complaints including depression and anxiety [ 10 , 11 , 12 ], somatic problems [ 11 ], and even symptoms of posttraumatic stress [ 13 ]. Bullying has also been highlighted as a potential cause of sleep problems [ 5 , 14 , 15 ]. For instance, in a mixed-method study on workplace bullying among university employees, insomnia was reported by practically all cases interviewed [ 16 ]. In a study among victims of bullying, findings revealed a high prevalence of sleep problems, specifically difficulties falling asleep, interrupted sleep, fatigue during the day, and early morning awakening [ 17 ]. 
Still, existing evidence is limited, and in a meta-analysis published in 2012, only four studies, encompassing 14,584 respondents, were included [ 10 ]. Going against expectations about an association between bullying and sleep problems, a non-significant Pearson correlation of .10 (95% CI = − .29–.45) was reported in the meta-analysis. It should be noted that this latter meta-analysis focused on outcomes of bullying in general and did therefore not include a systematic review of the literature on bullying and sleep.

The Cognitive Activation Theory of Stress (CATS) [ 18 ] has been suggested as a theoretical framework for how exposure to bullying could influence sleep [ 4 ]. According to CATS, cognitive activation is a key factor in the cycle of emotional and physiologic arousal on sleep, and an extensive body of research has shown that cognitive activation is associated with persistent high stress levels and pathology, such as decreased sleep quality, increased cortisol levels, elevated heart rate, and increased mortality [ 18 , 19 ]. In short, persistent exposure to stressors leads to increased arousal. This increase in arousal as a response to stressful situations may prolong physiological activation which subsequently can be manifested through difficulties in sleep initiation or returning to sleep after awakenings during the night. However, there may be individual differences in responses to bullying, and some workers may be more able to cope with the exposure compared to others [ 20 ]. It is also possible that different workers respond with different profiles to stressors [ 21 ]. Overall, this suggests that the impact of workplace bullying on sleep should be both mediated (activation) and moderated (individual differences) by other variables.

To add to our knowledge about the potential impact of bullying on sleep, this planned meta-analytic study will provide a systematic review of all available research literature on the associations between the variables. The first aim is to determine whether exposure to workplace bullying is related to levels of sleep among employees. The overall magnitude of this association will be established by means of a meta-analytic synthesis. Theoretically, bullying may have indirect (mediated), conditional (moderated), and reverse associations with sleep parameters. In addition, sleep may function as a mediator between bullying and other indicators of health and well-being. A secondary aim of the study is therefore to review and summarize research on such mediating and moderating factors affecting the relationship between workplace bullying and sleep. Hence, this study will extend existing reviews on bullying and sleep by including a larger number of studies, thus increasing the statistical power of the meta-analysis, and by examining moderating and mediating factors that determine when and how exposure to bullying relates to sleep.

The following protocol has been written according to the MOOSE Guidelines for Meta-Analyses and Systematic Reviews of Observational Studies and the PRISMA-P (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines [ 22 , 23 ]. The PRISMA-P checklist is given in the Additional file  1 . The protocol has been registered at the International Prospective Register of Systematic Reviews (PROSPERO; registration number: CRD42018082192).

The proposed review and meta-analysis is part of a larger project entitled “Bullying in the workplace: from mechanisms and moderators to problem treatment.” The aims of that project are (1) to improve our understanding of the workplace bullying phenomenon through determining mechanisms (mediating and moderating factors) that influence and explain how and when workplace bullying occurs, develops, and impacts those targeted and (2) to provide information that can be used to develop sound and effective interventions and rehabilitation approaches for targeted individuals and organizations.

Data sources search terms and search strategy

This literature review and meta-analysis will be based on systematic searches in multiple literature databases, including MEDLINE/PubMed, ProQuest, Web of Science, Taylor & Francis Online Journals, PsycINFO, and Wiley Online Library. Additional searches will be performed in Scopus and Google Scholar. All search terms are included in Table  1 . Systematic searches will be conducted using every possible combination of the three categories of keywords. Reference lists of key full-text articles included in the review will be checked to identify any potentially eligible studies. The searches will not be limited by publication date. The systematic procedure is intended to ensure that the literature search covers all published studies on the relationship between workplace bullying and different sleep parameters. The search strategy is considered adequate to reduce the risk of selection and detection bias. The search results will be exported to EndNote, where duplicates will be excluded. Included studies will be manually screened in order to identify other relevant studies.
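Combining every possible combination of three keyword categories amounts to taking the Cartesian product of the three term lists. The sketch below shows one way to generate such query strings. The terms are placeholders chosen for illustration, since Table 1 is not reproduced here, and the exact Boolean syntax differs between databases.

```python
from itertools import product
from typing import List, Sequence

def build_queries(categories: Sequence[Sequence[str]]) -> List[str]:
    """Return one AND-query per combination of one term from each category."""
    return [
        " AND ".join(f'"{term}"' for term in combo)
        for combo in product(*categories)
    ]

# Placeholder terms only; the protocol's actual terms are listed in its Table 1.
exposure_terms = ["bullying", "harassment"]
setting_terms = ["workplace", "work"]
outcome_terms = ["sleep", "insomnia"]

queries = build_queries([exposure_terms, setting_terms, outcome_terms])
print(len(queries))
print(queries[0])
```

In many databases the same coverage can be obtained with a single query that joins the terms within each category with OR and the categories with AND, which is usually easier to document in the search log.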

Inclusion and exclusion criteria

Eligible studies should report empirical findings on the association between exposure to workplace bullying (or any overlapping concept) and an indicator of sleep (e.g., disturbed sleep, early awakening, etc.). Primary observational studies with cross-sectional or prospective research design, case-control studies, and studies with experimental designs will be included. Cross-sectional data will be used to determine the magnitude of the association between bullying and sleep, whereas prospective data will be used to determine directions of associations. As associations based on prospective data are dependent upon the time-lag between measurement points [ 24 ], it is important to also include cross-sectional data. Qualitative interview studies, single-case studies, and series of single-case studies will not be included in the meta-analysis. To be included in the meta-analytic part of the study, studies should provide the zero-order associations between bullying and sleep or provide sufficient information for these associations (effect sizes) to be calculated. Studies that lack this information, or that report effect sizes that cannot be transformed into odds ratios, will be excluded from the meta-analyses. To avoid double-counting data, the sample in a given study should not have been used in a previous study among those included in the review. In cases of overlap, we will use data from the largest sample. The review will be limited to articles published in peer-reviewed journals in English, German, French, or the Scandinavian languages (Danish, Norwegian, and Swedish). Hence, this will be a review of published peer-reviewed studies only. Accordingly, data based on conference abstracts, dissertations, and gray literature (e.g., reports, etc.) will not be included. As a first step, relevant articles will be considered on the basis of their title and abstract. 
At the second step, full-text versions of selected papers will be examined and assessed with regard to effect sizes and methodological quality.

A professional librarian will conduct the search. The primary investigator will oversee the search strategy and remove duplicates using EndNote X7. Following the above inclusion and exclusion criteria, two reviewers will independently assess studies for potential inclusion, without regard to the studies' results. Any differences in opinion will be resolved through discussion until a consensus is reached. A third reviewer may be consulted if necessary. This process is intended to minimize bias when deciding whether to include or exclude a study. The two reviewers will independently conduct the data extraction from each study using a pre-defined data extraction sheet. Following the description by Lipsey and Wilson [ 25 ], the coding form will assess information about bullying and sleep, demographic characteristics of participants (age, gender, job type, employment status, educational level, etc.), study characteristics (country of origin, sample size, effect sizes, response rate, year study published, sampling method, measurement inventories, etc.), and other relevant variables (health indicators, other exposures).


The study population will be adults (18 years or older) with a current or previous employment in a full or part-time position. No restrictions will be placed on participants’ gender, ethnicity, or other demographic characteristics. Since the aim of the study is to determine associations between bullying and sleep, indicators of mental and somatic health complaints will be recorded and used as correlates and/or moderators in meta-analyses (conditioned by enough relevant studies). A minimum of two studies is considered sufficient to perform a meta-analysis [ 26 ].

Assessment of methodological quality (risk of bias)

As displayed in Table 2, the methodological quality of the included studies will be assessed with an adapted version of a previously established checklist for research on workplace bullying comprising 14 items related to sampling, representativeness, measurement issues, and confounders [ 27 ]. This checklist comprises selected and adapted items from the Risk of Bias Assessment Tool for Nonrandomized Studies [ 28 ] and the Quality Assessment Tool [ 29 ]. The quality of the reviewed studies will be scored on a scale from 0 (lowest possible quality) to 13 (highest possible quality). Kappa will be calculated to quantify the level of inter-rater agreement.

The quality of evidence for an association between bullying and sleep problems will be evaluated in accordance with the GRADE system [ 30 ]. This system grades quality of evidence at four levels: high (4), moderate (3), low (2), and very low (1). For high evidence, the requirements are a randomized, double-blinded study design with no selection biases. For observational studies, moderate evidence, i.e., exceptionally strong evidence from unbiased studies, is considered the strongest possible level of proof for an association.

Meta-analytic approach

The meta-analysis will be conducted with the Comprehensive Meta-Analysis (version 3) software developed by Biostat [ 31 ]. Odds ratio (OR) with 95% confidence intervals (95% CI) will be reported as an overall synthesized measure of effect size. The mean of the combined effect sizes will be calculated in studies where several effect sizes were reported from the same sample (e.g., models with different control variables). An overall estimate will be calculated for studies with overlapping samples. In studies reporting effect sizes from independent subgroups (e.g., moderators), each subgroup will be included as a unique sample in the meta-analysis. Moderation analyses will also be used to compare associations from cross-sectional and prospective data. In contrast to some other meta-analytic methods, such as the Hunter and Schmidt approach [ 32 ], which weights studies by sample size, the Comprehensive Meta-analysis program weights studies by inverse variance. Inverse-variance weighting is a method of aggregating two or more random variables where each random variable is weighted in inverse proportion to its variance in order to minimize the variance of the weighted average. The inverse variance is roughly proportional to sample size, but is a more nuanced measure, and serves to minimize the variance of the combined effect [ 33 ].
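The weighting described above can be illustrated with a short Python sketch. It is not the Comprehensive Meta-Analysis implementation; it is a fixed-effect illustration that assumes each study reports an odds ratio with a 95% confidence interval, and the two studies are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def pool_odds_ratios(studies):
    """Fixed-effect inverse-variance pooling of odds ratios.

    `studies` is a list of (odds_ratio, ci_lower, ci_upper) tuples with 95%
    confidence limits. Pooling is done on the log scale.
    Returns (pooled_or, ci_lower, ci_upper).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for odds_ratio, lower, upper in studies:
        log_or = math.log(odds_ratio)
        # Recover the standard error of log(OR) from the width of the CI.
        se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
        weight = 1.0 / se ** 2  # inverse of the sampling variance
        weighted_sum += weight * log_or
        weight_total += weight
    pooled = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (
        math.exp(pooled),
        math.exp(pooled - Z_95 * pooled_se),
        math.exp(pooled + Z_95 * pooled_se),
    )


# Hypothetical studies: a precise one (narrow CI) and an imprecise one (wide CI).
example = [(1.5, 1.2, 1.875), (3.0, 0.5, 18.0)]
pooled_or, pooled_lower, pooled_upper = pool_odds_ratios(example)
print(round(pooled_or, 2), round(pooled_lower, 2), round(pooled_upper, 2))
```

Because the first study has a much smaller variance, the pooled odds ratio stays close to 1.5 even though the second study reports an odds ratio of 3.0.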

As the individual studies included cannot be expected to come from the same population of studies, the pooled mean effect size will be calculated using the random effects model. Random effects models are recommended when accumulating data from a series of studies where the effect size is assumed to vary from one study to the next and where it is unlikely that studies are functionally equivalent [ 33 ]. Random effects models allow statistical inferences to be made to a population of studies beyond those included in the meta-analysis [ 34 ]. The Q_within statistic will be used to assess the heterogeneity of studies. A significant Q_within value rejects the null hypothesis of homogeneity. An I² statistic will be computed as an indicator of heterogeneity in terms of percentages. Increasing values show increasing heterogeneity, with values of 0% indicating no heterogeneity, 50% indicating moderate heterogeneity, and 75% indicating high heterogeneity [ 35 ]. The “one-study-removed” procedure will be used as a sensitivity analysis to determine whether the overall estimates of the association between bullying and sleep are influenced by outlier studies. Using this approach, effect sizes that fall outside the 95% confidence interval of the average effect size will be considered outliers. Four indicators of publication bias are to be examined: funnel plot, Rosenthal’s Fail-Safe N, Duval and Tweedie’s trim and fill procedure, and Egger’s regression intercept [ 36 ].
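The heterogeneity statistics and the sensitivity analysis can be sketched in Python as follows. The protocol does not name the between-study variance estimator, so the DerSimonian and Laird estimator used here is an assumption, as are the function names and the hypothetical log odds ratios.

```python
import math


def random_effects(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator.

    Returns (pooled_effect, standard_error, Q, I2_percent, tau2).
    """
    k = len(effects)
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = k - 1
    c = total - sum(w ** 2 for w in weights) / total
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I^2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    re_total = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / re_total
    return pooled, math.sqrt(1.0 / re_total), q, i2, tau2


def one_study_removed(effects, variances):
    """Pooled estimate recomputed with each study left out in turn."""
    estimates = []
    for i in range(len(effects)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_variances = variances[:i] + variances[i + 1:]
        estimates.append(random_effects(rest_effects, rest_variances)[0])
    return estimates


# Hypothetical log odds ratios; the last study is a deliberate outlier.
log_ors = [0.2, 0.4, 0.6, 1.5]
var_log_ors = [0.04, 0.04, 0.04, 0.04]
pooled, se, q, i2, tau2 = random_effects(log_ors, var_log_ors)
leave_one_out = one_study_removed(log_ors, var_log_ors)
print(round(pooled, 3), round(q, 2), round(i2, 1), [round(x, 3) for x in leave_one_out])
```

Dropping the outlying fourth study moves the pooled log odds ratio from 0.675 to 0.4, which is the kind of shift the one-study-removed procedure is meant to reveal.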

This planned review and meta-analysis will systematically explore the evidence available on the association between exposure to bullying and sleep. By gathering and summarizing information about potential mediating and moderating factors that can explain when and how bullying is related to sleep, the findings from this study will provide directions for future research and provide practitioners with an understanding about the nature and consequences of workplace bullying. This knowledge can be used to develop stronger countermeasures and interventions. Both bullying and sleep can be considered as modifiable factors that, if assessed and promptly recognized, can be addressed, potentially preventing the development of further health problems.


As data will be extracted using full-text articles only, and excluding data from gray literature, this review will build on published studies and doctoral dissertations exclusively, whereas unpublished studies and non-peer-reviewed literature (e.g., reports) are to be excluded. Although it has been suggested that researchers should aim at including unpublished literature in meta-analyses and systematic reviews, the inclusion of data from unpublished studies can itself introduce bias [ 37 ]. First, the unpublished studies that can be located are likely to be an unrepresentative sample of all unpublished studies. For instance, the identification of unpublished studies may depend on the willingness of investigators of unpublished studies to provide data. This may in turn depend upon the findings of the study, with more favorable results being provided more readily [ 37 ]. Second, unpublished studies may be of lower methodological quality than published studies. A study of 60 meta-analyses that included published and unpublished studies found that unpublished studies were less likely to conceal intervention allocation adequately and to blind outcome assessments [ 38 ]. As the planned review will be based on a comprehensive literature search of studies published in peer-reviewed journals, the scientific quality of the included studies should be ensured, while the studies should at the same time be representative of the published literature on workplace bullying and sleep. Furthermore, the robustness of the findings will also be indicated by publication bias analyses.

It is likely that most associations reported in primary studies will be based on self-report data collected with self-administered questionnaires. This kind of data is prone to be influenced by common method bias as well as response set bias such as expectations, previous experiences, or health status. This may cause both non-differential and differential misclassification, resulting in under- and overestimations of effects [ 39 ]. However, sleep data based on actigraphy and polysomnography can be regarded as unbiased.

The meta-analysis will include studies with cross-sectional designs, and the aggregated effect sizes will therefore not account for the cause and effect relationship between the included variables. However, separate analyses will be conducted for studies based on time-lagged data in order to determine direction of associations over time.

Ethics and dissemination

Ethical approval is not required for this systematic review and meta-analysis as only a secondary analysis of data already available in scientific databases will be conducted. The results of this review will be submitted for peer-reviewed publication and will be presented at relevant conferences.

Review status

The project team has commenced searching relevant studies in the relevant databases. This review is expected to be complete by April 2019.

Krueger GP. Sustained work, fatigue, sleep loss and performance: a review of the issues. Work Stress. 1989;3(2):129–41.

Kessler RC, et al. Insomnia and the performance of US workers: results from the America insomnia survey. Sleep. 2011;34(9):1161–71.

Vleeshouwers J, Knardahl S, Christensen JO. Effects of psychological and social work factors on self-reported sleep disturbance and difficulties initiating sleep. Sleep. 2015;39(4):833–46.

Rodriguez-Munoz A, Notelaers G, Moreno-Jimenez B. Workplace bullying and sleep quality: the mediating role of worry and need for recovery. Behav Psychol Psicologia Conductual. 2011;19(2):453–68.

Hansen ÅM, et al. Workplace bullying and sleep difficulties: a 2-year follow-up study. Int Arch Occup Environ Health. 2014;87(3):285–94.

de Lange AH, et al. A hard day’s night: a longitudinal study on the relationships among job demands and job control, sleep quality and fatigue. J Sleep Res. 2009;18(3):374–83.

Kalimo R, et al. Job stress and sleep disorders: findings from the Helsinki Heart Study. Stress Medicine. 2000;16(2):65–75.

Litwiller B, et al. The relationship between sleep and work: a meta-analysis. J Appl Psychol. 2017;102(4):682–99.

Einarsen S, et al. In: Einarsen S, et al., editors. The concept of bullying and harassment at work: the European tradition, in bullying and harassment in the workplace. Developments in theory, research, and practice. Boca Raton: CRC Press; 2011. p. 3–40.

Nielsen MB, Einarsen S. Outcomes of workplace bullying: a meta-analytic review. Work Stress. 2012;26(4):309–32.

Nielsen MB, et al. Workplace bullying and subsequent health problems. Tidsskr Nor Legeforen. 2014;134(12/13):1233–8.

Verkuil B, Atasayi S, Molendijk ML. Workplace bullying and mental health: a meta-analysis on cross-sectional and longitudinal data. Plos One. 2015;10(8):1–16. https://doi.org/10.1371/journal.pone.0135225 .

Nielsen MB, et al. Post-traumatic stress disorder as a consequence of bullying at work and at school. A literature review and meta-analysis. Aggress Violent Behav. 2015;21(1):17–24.

Hansen ÅM, et al. Workplace bullying, sleep problems and leisure-time physical activity: a prospective cohort study. Scand J Work Environ Health. 2016;42(1):26–33.

Lallukka T, Rahkonen O, Lahelma E. Workplace bullying and subsequent sleep problems--the Helsinki Health Study. Scand J Work Environ Health. 2011;37(3):204–12.

Björkqvist K, Österman K, Hjeltbäck M. Aggression among university employees. Aggress Behav. 1994;20:173–84.

Leymann H, Gustafsson A. Mobbing at work and the development of post-traumatic stress disorders. Eur J Work Organ Psychol. 1996;5:251–75.

Ursin H, Eriksen HR. The cognitive activation theory of stress. Psychoneuroendocrinology. 2004;29(5):567–92.

Meurs JA, Perrewe PL. Cognitive activation theory of stress: an integrative theoretical approach to work stress. J Manag. 2011;37(4):1043–68.

Nielsen MB, et al. Exposure to aggression in the workplace. In: The Wiley Blackwell Handbook of the Psychology of Occupational Safety and Workplace Health. Chichester: Wiley-Blackwell; 2016. p. 205–27.

Rudolph KD, Troop-Gordon W, Granger DA. Individual differences in biological stress responses moderate the contribution of early peer victimization to subsequent depressive symptoms. Psychopharmacology. 2011;214(1):209–19.

Moher D, et al. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. J Clin Epidemiol. 2009;62(10):1006–12.

Stroup DF, et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA. 2000;283(15):2008–12.

Ford MT, et al. How do occupational stressor-strain effects vary with time? A review and meta-analysis of the relevance of time lags in longitudinal studies. Work Stress. 2014;28(1):9–30.

Lipsey MW, Wilson DB. Practical meta-analysis. In: Applied Social Research Methods Series. Vol. 49. Thousand Oaks: Sage; 2001.

Valentine JC, Pigott TD, Rothstein HR. How many studies do you need? A primer on statistical power for meta-analysis. J Educ Behav Stat. 2010;35(2):215–47.

Nielsen MB, Indregard AM, Øverland S. Workplace bullying and sickness absence – a systematic review and meta-analysis of the research literature. Scand J Work Environ Health. 2016;42(5):359–70.

Kim SY, et al. Testing a tool for assessing the risk of bias for nonrandomized studies showed moderate reliability and promising validity. J Clin Epidemiol. 2013;66(4):408–14.

National Collaborating Centre for Methods and Tools. Quality assessment tool for quantitative studies. Hamilton: McMaster University; 2008.

Guyatt G, et al. GRADE guidelines: 1. Introduction-GRADE evidence profiles and summary of findings tables. J Clin Epidemiol. 2011;64(4):383–94.

Borenstein M, et al. Comprehensive meta-analysis version 2. Englewood: Biostat; 2005.

Hunter JE, Schmidt FL. Methods of meta-analysis. Correcting error and bias in research findings. 2nd ed. Thousand Oaks: Sage; 2004.

Borenstein M, Hedges L, Rothstein H. Meta-analysis. Fixed effects vs. random effects. Englewood: Biostat; 2007.

Berkeljon A, Baldwin SA. An introduction to meta-analysis for psychotherapy outcome research. Psychother Res. 2009;19(4–5):511–8.

Higgins JPT, et al. Measuring inconsistency in meta-analyses. BMJ. 2003;327(7414):557–60.

Borenstein M, et al. Introduction to meta-analysis. Chichester: Wiley; 2009.

Higgins JPT, Green S. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 [updated March 2011]. The Cochrane Collaboration; 2011. Available from http://www.training.cochrane.org/handbook .

Egger M, et al. How important are comprehensive literature searches and the assessment of trial quality in systematic reviews? Empirical study. Health Technol Assess. 2003;7(1):1–76.

Rugulies R. Studying the effect of the psychosocial work environment on risk of ill-health: towards a more comprehensive assessment of working conditions. Scand J Work Environ Health. 2012;38(3):187–91.


Not applicable.

The study is a part of a larger project entitled “Workplace bullying: From mechanisms and moderators to problem treatment” funded by The Norwegian Research Council. Grant no: 250127. The funding body played no role in developing the protocol.

Availability of data and materials

The studies included in the review will be available upon request.

Protocol amendments

If the present protocol is substantially amended after an initiation that may impact on the conduct of the study (including eligibility criteria, study objectives, study design, study procedures, and analysis), then this amendment will be agreed upon by all collaborators prior to the implementation and will be documented in a note to a later publication or a report (section “Differences between protocol and review”).

Author information

Authors and affiliations.

National Institute of Occupational Health, Pb 5330 Majorstuen, N- 0304, Oslo, Norway

Morten Birkeland Nielsen

Department of Psychosocial Science, University of Bergen, Bergen, Norway

Morten Birkeland Nielsen, Ståle Pallesen, Anette Harris & Ståle Valvatne Einarsen


MBN was the initiator of the project and has been responsible for the writing of the protocol. SP, AH, and SE contributed to the idea development and the development of the project. All authors have read and approved the protocol. MBN is the guarantor of the review.

Corresponding author

Correspondence to Morten Birkeland Nielsen .

Ethics declarations

Authors’ information.

MBN is a research professor at the National Institute of Occupational Health, Oslo Norway, and a professor in work and organizational psychology at the Department of Psychosocial Science at the University of Bergen, Norway.

SP is a professor in psychology at the Department of Psychosocial Science at the University of Bergen, Norway.

AH is a professor in psychology at the Department of Psychosocial Science at the University of Bergen, Norway.

SE is a professor in psychology at the Department of Psychosocial Science at the University of Bergen, Norway.

Ethics approval and consent to participate

Consent for publication, competing interests.

The authors declare no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional file

Additional file 1:.

PRISMA-P 2015 checklist. (DOCX 30 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article.

Nielsen, M.B., Pallesen, S., Harris, A. et al. Protocol for a systematic review and meta-analysis of research on the associations between workplace bullying and sleep. Syst Rev 7 , 232 (2018). https://doi.org/10.1186/s13643-018-0898-z

Received : 05 January 2018

Accepted : 26 November 2018

Published : 13 December 2018

DOI : https://doi.org/10.1186/s13643-018-0898-z

Systematic Reviews

ISSN: 2046-4053

  • Open access
  • Published: 24 May 2024

Systematic review and meta-analysis of hepatitis E seroprevalence in Southeast Asia: a comprehensive assessment of epidemiological patterns

  • Ulugbek Khudayberdievich Mirzaev 1 , 2 ,
  • Serge Ouoba 1 , 3 ,
  • Zayar Phyo 1 ,
  • Chanroth Chhoung 1 ,
  • Akuffo Golda Ataa 1 ,
  • Aya Sugiyama 1 ,
  • Tomoyuki Akita 1 &
  • Junko Tanaka 1  

BMC Infectious Diseases volume  24 , Article number:  525 ( 2024 ) Cite this article

The burden of hepatitis E in Southeast Asia is substantial, influenced by its distinct socio-economic and environmental factors, as well as variations in healthcare systems. The aim of this study was to assess the pooled seroprevalence of hepatitis E across the countries of the Southeast Asian region as defined by the United Nations.

The study analyzed 66 papers across PubMed, Web of Science, and Scopus databases, encompassing data from 44,850 individuals and focusing on anti-HEV seroprevalence. The investigation spanned nine countries, excluding Brunei and East Timor due to lack of data. The pooled prevalence of anti-HEV IgG was determined to be 21.03%, with the highest prevalence observed in Myanmar (33.46%) and the lowest in Malaysia (5.93%). IgM prevalence was highest in Indonesia (12.43%) and lowest in Malaysia (0.91%). The study stratified populations into high-risk (farm workers, chronic patients) and low-risk groups (general population, blood donors, pregnant women, hospital patients). It revealed a higher prevalence in the former group (IgG 28.9%, IgM 4.42%), while the latter group exhibited figures of 17.86% and 3.15%, respectively, indicating occupational and health-related vulnerabilities to HEV.

A temporal analysis (1987–2023) indicated an upward trend in both IgG and IgM prevalence, suggesting an escalating HEV burden.

These findings contribute to a better understanding of HEV seroprevalence in Southeast Asia, shedding light on important public health implications and suggesting directions for further research and intervention strategies.

Research Question

Investigate the seroprevalence of hepatitis E virus (HEV) in Southeast Asian countries focusing on different patterns, timelines, and population cohorts.

Sporadic Transmission of IgG and IgM Prevalence:

• Pooled anti-HEV IgG prevalence: 21.03%

• Pooled anti-HEV IgM prevalence: 3.49%

Seroprevalence among specific groups:

High-risk group (farm workers and chronic patients):

• anti-HEV IgG: 28.9%

• anti-HEV IgM: 4.42%

Low-risk group (general population, blood donors, pregnant women, hospital patients):

• anti-HEV IgG: 17.86%

• anti-HEV IgM: 3.15%

Temporal Seroprevalence of HEV:

Anti-HEV IgG prevalence increased over the decades (1987–1999; 2000–2010; 2011–2023): 12.47%, 18.43%, 29.17%, as did anti-HEV IgM prevalence: 1.92%, 2.44%, 5.27%

Provides a comprehensive overview of HEV seroprevalence in Southeast Asia.

Highlights variation in seroprevalence among different population groups.

Reveals increasing trend in HEV seroprevalence over the years.

Distinguishes between sporadic and epidemic cases for a better understanding of transmission dynamics.


Hepatitis E is a major global health concern caused by the hepatitis E virus (HEV), which is a small, nonenveloped, single-stranded, positive-sense RNA virus belonging to the Paslahepevirus genus in the Hepeviridae family. There are eight genotypes of HEV: HEV-1 and HEV-2 infect only humans, HEV-3, HEV-4, and HEV-7 infect both humans and animals, while HEV-5, HEV-6, and HEV-8 infect only animals [ 1 ].

HEV infections affect millions of people worldwide each year, resulting in a significant number of symptomatic cases and deaths. In 2015, the World Health Organization (WHO) reported approximately 44,000 deaths from hepatitis E, accounting for 3.3% of overall mortality attributed to viral hepatitis [ 2 ]. The primary mode of transmission for hepatitis E is the fecal–oral route. Outbreaks of the disease are often associated with heavy rainfall and flooding [ 3 , 4 ]. Additionally, sporadic cases can occur due to poor sanitation, vertical transmission, blood transfusion or close contact with infected animals, which serve as hosts for the virus [ 5 ]. Southeast Asia carries a substantial burden of hepatitis E, influenced by its unique socio-economic and environmental factors as well as variations in healthcare systems. Understanding the seroprevalence of hepatitis E in this region is crucial for implementing targeted public health interventions and allocating resources. Effective control and prevention of HEV requires addressing waterborne transmission and considering the specific characteristics of each region. By taking these measures, healthcare authorities can work towards reducing the global impact of hepatitis E on public health. Systematic reviews and meta-analyses on hepatitis E play a crucial role in synthesizing and integrating existing research findings, providing comprehensive insights into the epidemiology, transmission, and burden of the disease, thereby aiding evidence-based decision-making and public health strategies [ 6 , 7 ].

Recent systematic reviews and meta-analyses conducted on hepatitis E have varied in their scope or were limited by a smaller number of source materials [ 8 , 9 ]. The objective of this study was to determine the pooled seroprevalence of hepatitis E in countries within Southeast Asia by aggregating findings from a multitude of primary studies conducted across the region.

For this systematic review and meta-analysis, we adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines and used the PRISMA assessment checklist [Supplementary Table 1]. The study included pertinent research conducted within the population of Southeast Asian countries, as outlined by the United Nations [ 10 ], and performed a meta-analysis on the seroprevalence of hepatitis E in this specific region.

PICOT assessment

In this systematic review and meta-analysis, the eligible population comprised individuals from the Southeast Asia region, irrespective of age, gender, ethnic characteristics, or specific chronic diseases. However, studies involving populations outside the designated countries, travelers, migrants, animal species studies, and those lacking clear descriptions of the study population were excluded.

Intervention and comparison

Intervention and comparison are not applicable to the prevalence studies.

The outcome assessed was anti-HEV antibody positivity (total antibodies, IgG, or IgM) among the population of the Southeast Asian countries.

All studies conducted between 1987 and 2023 were included in this meta-analysis.

Search strategy

To conduct the data search, we utilized three databases, namely “PubMed”, “Scopus”, and “Web of Science”. The search terms comprised keywords related to the Hepatitis E virus, such as “Hepatitis E virus” OR “Hepatitis E” OR “HEV” AND names of each country “Brunei”, “Cambodia”, “Timor-Leste” OR “East-Timor”, “Laos” OR “Lao PDR”, “Indonesia”, “Malaysia”, “Myanmar” OR “Burma”, “Philippines”, “Singapore”, “Thailand”, “Vietnam” and “Southeast Asia”.
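As an illustration of how these terms could be combined into a single Boolean query, here is a minimal Python sketch. The paper does not give the exact query syntax used in each database, so joining the place names with OR and pairing them with the virus terms through a single AND is an assumption, and `build_query` is a hypothetical helper.

```python
def build_query(topic_terms, place_terms):
    """Join two keyword groups as (topic OR ...) AND (place OR ...)."""
    topic = " OR ".join(f'"{term}"' for term in topic_terms)
    places = " OR ".join(f'"{term}"' for term in place_terms)
    return f"({topic}) AND ({places})"


# Search terms as listed in the text.
virus_terms = ["Hepatitis E virus", "Hepatitis E", "HEV"]
location_terms = [
    "Brunei", "Cambodia", "Timor-Leste", "East-Timor", "Laos", "Lao PDR",
    "Indonesia", "Malaysia", "Myanmar", "Burma", "Philippines", "Singapore",
    "Thailand", "Vietnam", "Southeast Asia",
]
query = build_query(virus_terms, location_terms)
print(query)
```

Each database has its own field tags and quoting rules, so a string like this would normally be adapted before use.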

The database searches were completed on May 29th, 2023, with two members of the study team conducting independent searches. Subsequently, the search results were unified. A grey literature search was performed from June 25th to 30th, 2023, by examining the references of review manuscripts and conference materials, along with using specific keywords in the Google Scholar database. Notably, during the grey literature search, additional studies from the Philippines that were missing from the first search were identified and included. Moreover, due to the diverse language expertise of the team, studies in Russian and French related to Cambodia and Vietnam were also considered for inclusion.

After applying the inclusion and exclusion criteria, each article selected for this systematic review (SR) was considered relevant. The quality assessment of each article was conducted using specific JBI critical appraisal instruments [ 11 ] [Supplementary Table  2 ].

Sporadic transmission of HEV infection

For the systematic review and meta-analysis of sporadic HEV infection, we divided the study population into cohorts by country and by risk of acquiring HEV (low and high). The low-risk cohort included the general population (apparently healthy individuals, students, some ethnic populations, or individuals included in original studies as “general population”), blood donors, pregnant women, and hospital patients, while the high-risk cohort included pig farmers, those with chronic hepatitis, HIV-positive patients, and solid organ transplant patients.

Lastly, we analyzed data in three decades—1987–1999, 2000–2010, and 2011–2023—to reveal seroprevalence rates over time.

Epidemic outbreaks of HEV infection

We separated epidemic outbreaks from sporadic cases because of the distinct patterns and scale of transmission in epidemics. Epidemics are characterized by rapid and widespread transmission, affecting a large population within a short period and often following a specific pattern or route of propagation.

Statistical analysis

A meta-analysis of proportions was conducted using the 'meta' and 'metafor' packages in the R statistical software. To account for small proportions, the Freeman-Tukey double arcsine method was applied to transform the data. The DerSimonian and Laird method, which employs a random-effects model, was utilized for the meta-analysis, and the results were presented in a forest plot. Confidence intervals (CIs) for the proportions of individual studies were computed using the Clopper-Pearson method.
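The transformation step can be illustrated with a short Python sketch. It is not the authors' R code; it uses one common convention for the Freeman-Tukey double arcsine transform (the half-sum form with sampling variance 1/(4n + 2)), which is an assumption here, and the counts are hypothetical.

```python
import math


def freeman_tukey(events, n):
    """Freeman-Tukey double arcsine transform of a proportion events/n.

    Returns (transformed_value, sampling_variance). The variance depends only
    on n, which is what stabilizes studies with very small proportions.
    """
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need 0 <= events <= n and n > 0")
    value = 0.5 * (
        math.asin(math.sqrt(events / (n + 1)))
        + math.asin(math.sqrt((events + 1) / (n + 1)))
    )
    return value, 1.0 / (4 * n + 2)


# Hypothetical studies: (seropositive, tested).
for positive, tested in [(0, 50), (3, 120), (25, 50)]:
    t, var = freeman_tukey(positive, tested)
    # sin(t)^2 is only an approximate back-transformation, shown as a sanity check.
    print(positive, tested, round(t, 4), round(var, 5), round(math.sin(t) ** 2, 4))
```

A study with zero seropositive participants still gets a finite transformed value and variance, so it can be weighted and pooled like any other study.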

Heterogeneity was evaluated using the Cochran Q test and quantified by the I² index. Heterogeneity was considered significant if the p-value of the Cochran Q test was below 0.05.

For the assessment of publication bias, a funnel plot displaying the transformed proportions against the sample size was created. The symmetry of the plot was examined using the Egger test ( p  < 0.1).
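The regression behind the Egger test can be sketched in Python as follows. This is the classic formulation, which regresses the standardized effect on precision and examines the intercept; whether the R routine used in the study was configured the same way is not stated, so treat this as an assumption. The data are hypothetical and the p-value is left out, since it would require a t distribution with k - 2 degrees of freedom.

```python
import math


def egger_regression(effects, std_errors):
    """Ordinary least squares of effect/SE on 1/SE (classic Egger regression).

    Returns (intercept, slope, t_statistic_of_intercept). An intercept far
    from zero suggests funnel plot asymmetry.
    """
    k = len(effects)
    if k < 3:
        raise ValueError("at least three studies are needed")
    x = [1.0 / s for s in std_errors]                   # precision
    y = [e / s for e, s in zip(effects, std_errors)]    # standardized effect
    mean_x = sum(x) / k
    mean_y = sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residual = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt(residual / (k - 2) * (1.0 / k + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else math.nan
    return intercept, slope, t_stat


# Hypothetical data in which smaller (less precise) studies report larger effects.
ses = [0.1, 0.2, 0.4, 0.5]
effs = [0.3 + 2.0 * s for s in ses]
intercept, slope, t_stat = egger_regression(effs, ses)
print(round(intercept, 3), round(slope, 3))
```

In this constructed example the intercept recovers the built-in small-study effect of 2, while the slope recovers the underlying effect of 0.3.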

The initial search yielded 1641 articles, which covered 9 out of 11 Southeast Asian countries. We could not find any information on hepatitis E from Brunei. We excluded a study from East Timor because it focused on the wrong population (US Army troops). The final screening resulted in the selection of 57 relevant studies, and the grey literature search added 9 more papers that met our inclusion criteria (Fig. 1). Of the 9 papers identified through the grey literature search, two were relevant studies from the Philippines [ 12 , 13 ], one each came from Indonesia [ 14 ] and Lao PDR [ 15 ], one covered both Vietnam and Cambodia [ 16 ], one provided HEV seroepidemiology information for Myanmar, Thailand, and Vietnam [ 17 ], two were reported in Russian [ 18 , 19 ] (from Vietnam), and one was reported in French [ 16 ] (from Vietnam and Cambodia). In total, our analysis included 66 papers from which we extracted data. This involved a total of 44,850 individuals (Table 1).

figure 1

Flowchart of the identification, inclusion, and exclusion of studies. The table under the flowchart lists the studies found by the initial search in the databases

Sporadic transmission IgG and IgM prevalence in Southeast Asian countries (excluding outbreak settings)

Among the sporadic cases, which involved 42,248 of the 44,850 participants (the remaining 2,602 are considered in the “Epidemic outbreaks” section), the pooled prevalence of IgG in Southeast Asian countries was 21.03%, while the pooled prevalence of IgM was 3.49% among the 34,480 individuals who were tested (Fig. 2). Among these countries, Myanmar registered the highest pooled prevalence of IgG at 33.46%, while Malaysia had the lowest at 5.93%. For IgM prevalence, Indonesia had the highest rate at 12.43%, and Malaysia again had the lowest at 0.91% (Table 2) [Supplementary Figures 1 and 6].

figure 2

Forest plot of meta-analysis of the prevalence of anti-HEV IgG ( A ) and anti-HEV IgM ( B ) in Southeast Asian countries. The plot includes the number of study participants for each country

Seroprevalence among specific groups

High risk of acquiring HEV

The high-risk group, which included farm workers and chronic patients, demonstrated a pooled anti-HEV IgG prevalence of 28.9%, with IgM prevalence at 4.42% [Supplementary Figures  2 and 8 ].

Chronic patients

This group, comprising individuals with chronic liver disease, HIV infection, or solid organ transplantation, exhibited the highest prevalence of pooled IgG among all cohorts, standing at 29.2%. Additionally, IgM prevalence was 3.9% [Supplementary Figures  2 and 7 ].

Farm workers

Farm workers were divided into several subgroups based on exposure to animals (reservoirs of HEV), including pig or ruminant farmers, slaughterhouse workers, butchers, and meat retailers. In this group, the pooled IgG prevalence was 28.4%, while the pooled IgM prevalence was 6.21% [Supplementary Figures  2 and 7 ].

Low risk of acquiring HEV

The low-risk group, comprising the general population, blood donors, pregnant women, and hospital patients, exhibited anti-HEV IgG and IgM prevalence of 17.86% and 3.15%, respectively. [Supplementary Figures  2 and 9 ].

General population

In the general population of Southeast Asian countries, represented by 22,571 individuals, IgG was present in 21.4%. IgM was tested in 10,304 participants, of whom 2.63% were positive, indicating acute infection [Supplementary Figures  2 and 7 ].

Blood donors

Blood donors, as a selected subgroup of the general population, exhibit differences in health status, age, gender distribution, and representativeness, warranting separate assessment. Among blood donors in Southeast Asian countries, the pooled prevalences of IgG and IgM were 11.77% and 0.83%, respectively [Supplementary Figures  2 and 7 ].

Pregnant women

Pregnant women, considered a vulnerable group regarding disease consequences, demonstrated an anti-HEV IgG prevalence of 18.56% among 1,670 individuals included in the study. Furthermore, 1.54% of them tested positive for anti-HEV IgM [Supplementary Figures  2 and 7 ].

Hospital patients

A group of 18,792 patients who visited hospitals with clinical signs of acute infection, jaundice, high temperature, and elevated liver enzymes showed anti-HEV IgG and IgM prevalence of 16.3% and 4.45%, respectively [Supplementary Figures  2 and 7 ].

Temporal seroprevalence of HEV

Given the studies' long duration, the data was presented by decades: 1987–1999, 2000–2010, and 2011–2023. The prevalence of IgG showed an upward trend over these decades, with rates of 12.47%, 18.43%, and 29.17%. Similarly, for IgM, the prevalence rates were 1.92%, 2.44%, and 5.27% for the first, second, and third decades, respectively (Fig. 3 ).

figure 3

The prevalence of anti-HEV IgG and IgM in Southeast Asian countries throughout the decades
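The grouping of studies into the three periods can be sketched as follows. This is a minimal illustration, assuming each study record carries a single sampling year; the record layout, the function names, and the example counts are hypothetical, and the crude proportion computed here (total positives over total tested) is a simplification, not the meta-analytic pooling reported in the paper.

```python
# Period boundaries as stated in the text.
PERIODS = [
    ("1987–1999", 1987, 1999),
    ("2000–2010", 2000, 2010),
    ("2011–2023", 2011, 2023),
]

def assign_period(year):
    """Return the period label for a sampling year, or None if unusable."""
    if year is None:
        return None
    for label, start, end in PERIODS:
        if start <= year <= end:
            return label
    return None

def crude_prevalence_by_period(records):
    """records: iterable of (sampling_year, positives, tested) tuples."""
    totals = {}
    for year, positives, tested in records:
        label = assign_period(year)
        if label is None:
            continue  # studies without a usable sampling year are set aside
        pos, n = totals.get(label, (0, 0))
        totals[label] = (pos + positives, n + tested)
    return {label: pos / n for label, (pos, n) in totals.items()}

# Hypothetical records: (sampling year, seropositive, tested)
example_records = [
    (1995, 10, 100),
    (1998, 20, 100),
    (2005, 40, 200),
    (2019, 90, 300),
    (None, 5, 50),  # sampling year not reported
]
by_period = crude_prevalence_by_period(example_records)
```

Records with no sampling year are skipped here, mirroring the separate treatment of studies that lacked information on sample collection time.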

Evaluating the trend of seroprevalence over decades within the same population and country proved challenging due to the limited availability of research papers. Consequently, we assessed anti-HEV antibody prevalence over decades, considering population cohorts and individual countries.

In Fig.  4 , we can see that all population groups show a consistent increase in the prevalence of both IgG and IgM antibodies over the decades. In Fig.  5 , we analyze the prevalence of anti-HEV antibodies in different countries over time; an increase in prevalence is observed in all countries except Indonesia and Malaysia.

figure 4

The epidemiological data regarding the occurrence of anti-HEV IgG ( A ) and anti-HEV IgM ( B ) antibodies within population cohorts across Southeast Asian nations divided by decades. The population cohorts shown with dashed lines in the figure lack complete data, as they provide information for only two out of three decades. The blood donor group has anti-HEV IgM data only for the last decade

figure 5

The epidemiological data regarding the occurrence of anti-HEV IgG ( A ) and anti-HEV IgM ( B ) antibodies within countries of Southeast Asia divided by decades. The countries shown with dashed lines in the figure lack complete data, as they provide information for only two out of three decades. The Philippines has anti-HEV IgG data only for the first decade. The Philippines, Myanmar, and Singapore have anti-HEV IgM data for only a single decade

Some studies lacked information on the collection time of the samples [ 13 , 19 , 41 , 48 , 59 , 62 , 64 , 82 ]. In these studies, the pooled IgG and IgM prevalence was 26.5% and 4.75%, respectively [Supplementary Figures  3 , 4 , 5 , 10 , 11 , 12 ].

Epidemic outbreaks

We separated epidemic outbreaks from sporadic cases because of the distinct patterns and scale of transmission in epidemics. Epidemics are characterized by rapid and widespread transmission, affecting a large population within a short period and often following a specific pattern or route of propagation. The outbreaks occurred between 1987 and 1998 in several Southeast Asian countries, namely Indonesia [ 31 , 33 , 34 ], Vietnam [ 77 ], and Myanmar [ 54 ] [Supplementary Figure  13 ]. These outbreak investigations involved a total of 2,602 individuals, with most participants from Indonesia (2,292 individuals). The studies were mainly conducted using a case–control design. Among the participants, 876 were considered controls, while 1,726 were classified as cases. The pooled prevalence of total anti-HEV immunoglobulins was estimated as 61.6% (95% CI 57.1–66) (Table  2 ).
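The participant and study counts reported so far can be cross-checked with simple arithmetic. The short Python sketch below uses only numbers stated in the text; the dictionary keys and the function name are illustrative choices, not part of the original analysis.

```python
# Counts as reported in the text.
reported = {
    "studies_from_databases": 57,
    "studies_from_grey_literature": 9,
    "studies_total": 66,
    "participants_total": 44_850,
    "participants_sporadic": 42_248,
    "participants_outbreak": 2_602,
    "outbreak_cases": 1_726,
    "outbreak_controls": 876,
    "outbreak_from_indonesia": 2_292,
}

def consistency_checks(r):
    """Return a mapping of check description to True/False."""
    return {
        "database + grey literature studies = total studies":
            r["studies_from_databases"] + r["studies_from_grey_literature"]
            == r["studies_total"],
        "sporadic + outbreak participants = all participants":
            r["participants_sporadic"] + r["participants_outbreak"]
            == r["participants_total"],
        "outbreak cases + controls = outbreak participants":
            r["outbreak_cases"] + r["outbreak_controls"]
            == r["participants_outbreak"],
    }

checks = consistency_checks(reported)

# Share of outbreak participants who came from Indonesia.
indonesia_share = reported["outbreak_from_indonesia"] / reported["participants_outbreak"]
```

All three checks hold for the reported figures, and Indonesia contributes roughly 88% of the outbreak participants.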

Assessment of publication bias

We checked for publication bias using a funnel plot and Egger's test. The funnel plots for both anti-HEV IgG and anti-HEV IgM studies showed asymmetry, and Egger's test gave a p -value of less than 0.001 in both cases (Fig. 6 ).

figure 6

Funnel plot of anti-HEV IgG ( A ) and anti-HEV IgM ( B ) prevalence. Double arcsine transformed proportion of individual studies is plotted against the sample size. The distribution of studies in the funnel plot revealed the presence of publication bias
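For readers unfamiliar with Egger's test, the sketch below shows the regression behind it in plain Python: the standardized effect (effect divided by its standard error) is regressed on precision (one over the standard error), and an intercept far from zero suggests funnel plot asymmetry. The function name and the example effects and standard errors are hypothetical, and the code returns the t statistic and degrees of freedom rather than a p-value; it is an illustration of the general technique, not the authors' implementation.

```python
import math

def egger_intercept(effects, std_errors):
    """Egger regression of standardized effect on precision.

    Returns (intercept, t_statistic, degrees_of_freedom). The p-value
    would be read from a t distribution with the returned degrees of
    freedom; that step is left out to avoid extra dependencies.
    """
    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n

    # Ordinary least squares for y = intercept + slope * x.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Standard error of the intercept from the residual variance.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    df = n - 2
    residual_var = sum(r * r for r in residuals) / df
    se_intercept = math.sqrt(residual_var * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, intercept / se_intercept, df

# Hypothetical data in which smaller studies (larger SE) report larger effects.
example_effects = [0.90, 0.70, 0.60, 0.50, 0.45]
example_ses = [0.30, 0.20, 0.15, 0.10, 0.05]
intercept, t_stat, df = egger_intercept(example_effects, example_ses)
```

In the hypothetical data, the smaller studies report larger effects, so the intercept is positive and the t statistic is large, which is the pattern Egger's test flags as asymmetry.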

The literature search yielded varying numbers of manuscripts across Southeast Asian countries. The Philippines had the fewest studies, while Thailand had the most, with 15 studies. No data on humans were found for Brunei Darussalam or East Timor (Timor-Leste).

The results of this study provide valuable insights into the seroprevalence of IgG and IgM antibodies against HEV in different populations across Southeast Asian countries. Understanding the prevalence of these antibodies is essential for assessing the burden of HEV infection and identifying high-risk groups.

The extensive analysis of anti-HEV IgG prevalence in this study covered a wide range of population groups in Southeast Asia, including the general population, blood donors, pregnant women, hospital patients, farm workers, and chronic patients. The results unveiled an overall pooled prevalence of 21.03%, indicating significant exposure to the Hepatitis E virus among individuals in the region at some point in their lives. Moreover, a consistent increase in IgG prevalence was observed over the years, with the highest prevalence occurring in the most recent decade (2011–2023). This suggests a progressive rise in HEV exposure within the region.

Upon examining the prevalence data across different decades and population cohorts, a uniform upward trend in HEV antibody prevalence became apparent across all groups. Several factors could be assessed as potential contributors to this trend:

Notably, the expanding population in Southeast Asian nations during this timeframe increased the number of individuals at risk of Hepatitis E infection.

The rapid urbanization, characterized by the migration from rural to urban areas, led to higher population density and conditions conducive to Hepatitis E virus transmission [ 84 ].

Access to clean drinking water and adequate sanitation facilities emerged as critical factors in preventing Hepatitis E. Regions with inadequate infrastructure, particularly in water and sanitation, faced an elevated risk due to contaminated water sources.

Climate-related events, such as heavy rainfall and flooding, significantly impacted waterborne diseases like Hepatitis E. The increasing frequency and severity of such events emphasized the importance of considering climate-related factors in assessing prevalence trends [ 85 ].

Consumption of contaminated or undercooked meat, particularly pork, was identified as a source of Hepatitis E transmission. Changes in food consumption habits over time may have contributed to changes in seroprevalence [ 86 ].

Limited access to healthcare facilities in certain areas exacerbated the spread of Hepatitis E. Increased awareness together with advances in medical research and the establishment of robust surveillance systems likely improved the detection and reporting of Hepatitis E cases, contributing to the observed increase in seroprevalence [ 87 , 88 , 89 ].

These multifaceted factors have likely played a collective role in shaping the changing landscape of Hepatitis E seroprevalence in Southeast Asian nations over the past decades. The upward trend emphasizes the importance of continued monitoring, intervention, and public health measures to mitigate the spread of Hepatitis E in the region.

Among specific populations, pregnant women exhibited an IgG prevalence of 18.56%, indicating that a considerable number of pregnant individuals have been exposed to HEV. Pregnant women are particularly vulnerable to the consequences of HEV infection, as it can lead to severe outcomes for both the mother and the foetus.

Hospital patients with clinical signs of acute infection showed an IgG prevalence of 16.3%, suggesting that HEV is still a significant cause of acute hepatitis cases in the hospital setting. Similarly, farm workers, especially those exposed to animals (reservoirs of HEV), had a high prevalence of IgG (28.4%), highlighting the occupational risk associated with zoonotic transmission.

Chronic patients, including individuals with chronic liver disease, HIV infection, or solid organ transplantation, exhibited the highest pooled IgG prevalence among all cohorts at 29.2%. This finding underscores the importance of monitoring HEV infection in immunocompromised individuals, as they may develop chronic HEV infection, which can lead to severe liver complications.

The prevalence of IgM antibodies, which are indicative of recent or acute HEV infection, was lower overall compared to IgG. The general population showed an IgM prevalence of 2.63%. Among hospital patients exhibiting clinical signs of acute infection, the IgM prevalence was higher at 4.45%.

Farm workers, particularly those exposed to animals, demonstrated the highest IgM prevalence at 6.21%. This finding highlights the occupational risk of acquiring acute HEV infection in this population due to direct or indirect contact with infected animals.

The study also identified a high-risk group, consisting of farm workers and chronic patients, with a pooled IgG prevalence of 28.9% and an IgM prevalence of 4.42%. This group is particularly susceptible to HEV infection and requires targeted interventions to reduce transmission and prevent severe outcomes.

Overall, this study provides valuable data on the seroprevalence of HEV antibodies in different populations in Southeast Asian countries. It highlights the importance of continued surveillance and public health interventions to control HEV transmission, especially in vulnerable groups. Understanding the prevalence trends over time can aid in developing effective strategies for the prevention and management of HEV infections in the region. However, further research is warranted to explore the underlying factors contributing to the observed seroprevalence trends and to design targeted interventions to reduce HEV transmission in specific populations. Among the countries of Southeast Asia, Myanmar registered the highest seroprevalence of HEV infection, while Malaysia registered the lowest.

This study has some limitations that we should be aware of. We looked at studies in three languages (English, Russian, and French), but we couldn't find data from two out of the 11 countries. This means we might not have a complete picture of the disease's prevalence in the whole region.

The way we divided the groups based on occupation or status could be questioned. Different criteria might give different results, so this needs to be considered. Another challenge is that the study draws on research published over a long period, from 1989 to 2023, and involves many different countries. This makes it difficult to compare the results because the tests used and their diagnostic performance might have changed over time and vary across countries.

Despite these limitations, our study presents a detailed epidemiologic report of combined seroprevalence data for HEV in Southeast Asian countries following the UN division. It gives us a basic understanding of the disease's prevalence in the region and offers some insights into potential risk factors. However, to get a more accurate picture, future research should address these limitations and include data from all countries in the region. Furthermore, certain countries such as Myanmar and the Philippines have not reported HEV prevalence data since 2006 and 2015, respectively. The absence of recent HEV prevalence reports from certain countries raises concerns about the availability of up-to-date epidemiological data for assessing the current status of hepatitis E virus infections in these regions.

Our comprehensive analysis of Southeast Asian countries provides significant insights into the seroprevalence of hepatitis E virus (HEV) infection in this region and in various populations. The rates of anti-HEV antibodies observed among different groups, as well as the increasing trend in seroprevalence over decades, emphasize the dynamic nature of HEV transmission in the region. These findings contribute to a better understanding of HEV prevalence across countries, populations, and time periods in Southeast Asia, shedding light on important public health implications and suggesting directions for further research and intervention strategies.

Availability of data and materials

All data generated or analyzed during this study were included in this paper either in the results or supplementary information.


HEV: Hepatitis E virus

PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses

ELISA: Enzyme-linked immunosorbent assay

HEV IgG: Hepatitis E virus immunoglobulin G

HEV IgM: Hepatitis E virus immunoglobulin M

Smith DB, Izopet J, Nicot F, Simmonds P, Jameel S, Meng XJ, et al. Update: proposed reference sequences for subtypes of hepatitis E virus (species Orthohepevirus A). J Gen Virol. 2020 [cited 2023 Aug 3];101(7):692. Available from: /pmc/articles/PMC7660235/.


Hepatitis E. Available from: https://www.who.int/news-room/fact-sheets/detail/hepatitis-e . Accessed 22 July 2023.

Viswanathan R. A review of the literature on the epidemiology of infectious hepatitis. Indian J Med Res. 1957;45:145–55.


Naik SR, Aggarwal R, Salunke PN, Mehrotra NN. A large waterborne viral hepatitis E epidemic in Kanpur, India. Bull World Health Organ. 1992 [cited 2023 Jul 20];70(5):597. Available from: /pmc/articles/PMC2393368/?report=abstract.


Aslan AT, Balaban HY. Hepatitis E virus: epidemiology, diagnosis, clinical manifestations, and treatment. World J Gastroenterol. 2020;26(37):5543–60.

Mulrow CD. Rationale for systematic reviews. BMJ. 1994 [cited 2023 Jul 20];309(6954):597–9. Available from: https://pubmed.ncbi.nlm.nih.gov/8086953/ .

Cook DJ, Mulrow CD, Haynes RB. Systematic reviews: synthesis of best evidence for clinical decisions. Ann Intern Med. 1997;126(5):376–80.


Wasuwanich P, Thawillarp S, Ingviya T, Karnsakul W. Hepatitis E in Southeast Asia. Siriraj Med J. 2020 [cited 2023 Jul 20];72(3):259–64. Available from: https://he02.tci-thaijo.org/index.php/sirirajmedj/article/view/240129 .


Raji YE, Peck Toung O, Mohd N, Zamberi T, Sekawi B, MohdTaib N, et al. A systematic review of the epidemiology of hepatitis E virus infection in South – Eastern Asia. Virulence. 2021 [cited 2023 Jul 20];12(1):114. Available from: /pmc/articles/PMC7781573/.

South East Asia. Available from: https://www.unep.org/ozonaction/south-east-asia . Accessed 28 May 2023.

Chapter 5: Systematic reviews of prevalence and incidence - JBI Manual for Evidence Synthesis - JBI Global Wiki. [accessed 2023 May 20]. Available from: https://jbi-global-wiki.refined.site/space/MANUAL/4688607/Chapter+5%3A+Systematic+reviews+of+prevalence+and+incidence .

Gloriani-Barzaga N, Cabanban A, Graham RR, Florese RH. Hepatitis E virus infection diagnosed by serology: a report of cases at the San Lazaro Hospital Manila. Phil J Microbiol Infect Dis. 1997;26(4):169–72.


Lorenzo AA, De Guzman TS, Su GLS. Detection of IgM and IgG antibodies against hepatitis E virus in donated blood bags from a national voluntary blood bank in Metro Manila, Philippines. Asian Pac J Trop Dis. 2015;5(8):604–5.


Jennings GB, Lubis I, Listiyaningsih E, Burans JP, Hyams KC. Hepatitis E virus in Indonesia. Trans R Soc Trop Med Hyg. 1994 [cited 2023 Jul 24];88(1):57. Available from: https://pubmed.ncbi.nlm.nih.gov/8154003/ .

Pauly M, Muller CP, Black AP, Snoeck CJ. Intense human-animal interaction and limited capacity for the surveillance of zoonoses as drivers for hepatitis E virus infections among animals and humans in Lao PDR. Int J Infect Dis. 2016 [cited 2023 Jul 24];53:18. Available from: http://www.ijidonline.com/article/S1201971216312693/fulltext .

Buchy P, Monchy D, An TT, Srey CT, Tri DV, Son S, Glaziou P, Chien BT. Prévalence de marqueurs d’infection des hépatites virales A, B, C et E chez des patients ayant une hypertransaminasémie a Phnom Penh (Cambodge) et Nha Trang (Centre Vietnam) [Prevalence of hepatitis A, B, C and E virus markers among patients with elevated levels of Alanine aminotransferase and Aspartate aminotransferase in Phnom Penh (Cambodia) and Nha Trang (Central Vietnam)]. Bull Soc Pathol Exot. 2004;97(3):165–71.

Abe K, Li T, Ding X, Win KM, Shrestha PK, Quang VX, Ngoc TT, Taltavull TC, Smirnov AV, Uchaikin VF, Luengrojanakul P. International collaborative survey on epidemiology of hepatitis E virus in 11 countries. Southeast Asian J Trop Med Public Health. 2006;37(1):90–5.


Lichnaia EV, Pham THG, Petrova OA, Tran TN, Bui TTN, Nguyen TT, et al. Hepatitis E virus seroprevalence in indigenous residents of the Hà Giang northern province of Vietnam. Russ J Infect Immun. 2021;11(4):692–700.

Ostankova YuV, Semenov AV, Valutite DE, Zueva EB, Serikova EN, Shchemelev AN, et al. Enteric viral hepatitis in the Socialist Republic of Vietnam (Southern Vietnam). Jurnal Infektologii. 2021;13(4):72–8. Available from: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85123451810&doi=10.22625%2f2072-6732-2021-13-4-72-78&partnerID=40&md5=968b7e50231d40b12aded0e0da133936 .

Kasper MR, Blair PJ, Touch S, Sokhal B, Yasuda CY, Williams M, et al. Infectious etiologies of acute febrile illness among patients seeking health care in south-central Cambodia. Am J Trop Med Hyg. 2012;86(2):246–53. Available from: https://pubmed.ncbi.nlm.nih.gov/22302857/ .


Nouhin J, Madec Y, Prak S, Ork M, Kerleguer A, Froehlich Y, et al. Declining hepatitis E virus antibody prevalence in Phnom Penh, Cambodia during 1996–2017. Epidemiol Infect. 2018;147:e26. Available from: https://pubmed.ncbi.nlm.nih.gov/30309396/ .

Nouhin J, Prak S, Madec Y, Barennes H, Weissel R, Hok K, et al. Hepatitis E virus antibody prevalence, RNA frequency, and genotype among blood donors in Cambodia (Southeast Asia). Transfusion (Paris). 2016;56(10):2597–601. Available from: https://pubmed.ncbi.nlm.nih.gov/27480100/ .

Nouhin J, Barennes H, Madec Y, Prak S, Hou SV, Kerleguer A, et al. Low frequency of acute hepatitis E virus (HEV) infections but high past HEV exposure in subjects from Cambodia with mild liver enzyme elevations, unexplained fever or immunodeficiency due to HIV-1 infection. J Clin Virol. 2015;71:22–7. Available from: https://pubmed.ncbi.nlm.nih.gov/26370310/ .


Yamada H, Takahashi K, Lim O, Svay S, Chuon C, Hok S, et al. Hepatitis E Virus in Cambodia: prevalence among the general population and complete genome sequence of genotype 4. PLoS One. 2015;10:e0136903. Available from: https://pubmed.ncbi.nlm.nih.gov/26317620/ .

Chhour YM, Ruble G, Hong R, Minn K, Kdan Y, Sok T, et al. Hospital-based diagnosis of hemorrhagic fever, encephalitis, and hepatitis in Cambodian children. Emerg Infect Dis. 2002;8(5):485–9. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2732496/pdf/01-0236-FinalR.pdf .

Utsumi T, Hayashi Y, Lusida MI, Amin M, Soetjipto, Hendra A, et al. Prevalence of hepatitis E virus among swine and humans in two different ethnic communities in Indonesia. Arch Virol. 2011;156(4):689–93. Available from: https://pubmed.ncbi.nlm.nih.gov/21191625/ .

Mizuo H, Suzuki K, Takikawa Y, Sugai Y, Tokita H, Akahane Y, et al. Polyphyletic strains of hepatitis E virus are responsible for sporadic cases of acute hepatitis in Japan. J Clin Microbiol. 2002 [cited 2023 Jul 24];40(9):3209. Available from: /pmc/articles/PMC130758/.

Achwan WA, Muttaqin Z, Zakaria E, Depamede SA, Mulyanto, Sumoharjo S, et al. Epidemiology of hepatitis B, C, and E viruses and human immunodeficiency virus infections in Tahuna, Sangihe-Talaud Archipelago, Indonesia. Intervirology. 2007;50(6):408–11. Available from: https://pubmed.ncbi.nlm.nih.gov/18185013/ .

Surya IG, Kornia K, Suwardewa TG, Mulyanto, Tsuda F, Mishiro S. Serological markers of hepatitis B, C, and E viruses and human immunodeficiency virus type-1 infections in pregnant women in Bali, Indonesia. J Med Virol. 2005;75(4):499–503. Available from: https://pubmed.ncbi.nlm.nih.gov/15714491/ .

Wibawa ID, Suryadarma IG, Mulyanto, Tsuda F, Matsumoto Y, Ninomiya M, et al. Identification of genotype 4 hepatitis E virus strains from a patient with acute hepatitis E and farm pigs in Bali, Indonesia. J Med Virol. 2007;79(8):1138–46. Available from: https://pubmed.ncbi.nlm.nih.gov/17596841/ .

Sedyaningsih-Mamahit ER, Larasati RP, Laras K, Sidemen A, Sukri N, Sabaruddin N, et al. First documented outbreak of hepatitis E virus transmission in Java, Indonesia. Trans R Soc Trop Med Hyg. 2002;96(4):398–404. Available from: https://pubmed.ncbi.nlm.nih.gov/12497976/ .

Widasari DI, Yano Y, Utsumi T, Heriyanto DS, Anggorowati N, Rinonce HT, et al. Hepatitis E virus infection in two different regions of Indonesia with identification of swine HEV genotype 3. Microbiol Immunol. 2013;57(10):692–703. Available from: https://pubmed.ncbi.nlm.nih.gov/23865729/ .

Corwin A, Putri MP, Winarno J, Lubis I, Suparmanto S, Sumardiati A, et al. Epidemic and sporadic hepatitis E virus transmission in West Kalimantan (Borneo), Indonesia. Am J Trop Med Hyg. 1997;57(1):62–5. Available from: https://pubmed.ncbi.nlm.nih.gov/9242320/ .

Corwin A, Jarot K, Lubis I, Nasution K, Suparmawo S, Sumardiati A, et al. Two years’ investigation of epidemic hepatitis E virus transmission in West Kalimantan (Borneo), Indonesia. Trans R Soc Trop Med Hyg. 1995 [cited 2023 Jul 20];89(3):262–5. Available from: https://pubmed.ncbi.nlm.nih.gov/7660427/ .

Wibawa ID, Muljono DH, Mulyanto, Suryadarma IG, Tsuda F, Takahashi M, et al. Prevalence of antibodies to hepatitis E virus among apparently healthy humans and pigs in Bali, Indonesia: identification of a pig infected with a genotype 4 hepatitis E virus. J Med Virol. 2004;73(1):38–44. Available from: https://pubmed.ncbi.nlm.nih.gov/15042646/ .

Goldsmith R, Yarbough PO, Reyes GR, Fry KE, Gabor KA, Kamel M, et al. Enzyme-linked immunosorbent assay for diagnosis of acute sporadic hepatitis E in Egyptian children. Lancet. 1992 [cited 2023 Jul 25];339(8789):328–31. Available from: https://pubmed.ncbi.nlm.nih.gov/1346411/ .

Bounlu K, Insisiengmay S, Vanthanouvong K, Saykham, Widjaja S, Iinuma K, et al. Acute jaundice in Vientiane, Lao People’s Democratic Republic. Clin Infect Dis. 1998;27(4):717–21. Available from: https://pubmed.ncbi.nlm.nih.gov/9798023/ .

Khounvisith V, Saysouligno S, Souvanlasy B, Billamay S, Mongkhoune S, Vongphachanh B, et al. Hepatitis B virus and other transfusion-transmissible infections in child blood recipients in Lao People’s Democratic Republic: a hospital-based study. Arch Dis Child. 2023;108(1):15–9. Available from: https://pubmed.ncbi.nlm.nih.gov/36344216/ .

Khounvisith V, Tritz S, Khenkha L, Phoutana V, Keosengthong A, Pommasichan S, et al. High circulation of hepatitis E virus in pigs and professionals exposed to pigs in Laos. Zoonoses Public Health. 2018;65(8):1020–6. Available from: https://pubmed.ncbi.nlm.nih.gov/30152201/ .

Tritz SE, Khounvisith V, Pommasichan S, Ninnasopha K, Keosengthong A, Phoutana V, et al. Evidence of increased hepatitis E virus exposure in Lao villagers with contact to ruminants. Zoonoses Public Health. 2018;65(6):690–701. Available from: https://pubmed.ncbi.nlm.nih.gov/29888475/ .

Bisayher S, Barennes H, Nicand E, Buisson Y. Seroprevalence and risk factors of hepatitis E among women of childbearing age in the Xieng Khouang province (Lao People’s Democratic Republic), a cross-sectional survey. Trans R Soc Trop Med Hyg. 2019;113(6):298–304. Available from: https://pubmed.ncbi.nlm.nih.gov/31034060/ .

Holt HR, Inthavong P, Khamlome B, Blaszak K, Keokamphe C, Somoulay V, et al. Endemicity of zoonotic diseases in pigs and humans in lowland and upland Lao PDR: identification of socio-cultural risk factors. PLoS Negl Trop Dis. 2016;10(4):e0003913. Available from: https://pubmed.ncbi.nlm.nih.gov/27070428/ .

Syhavong B, Rasachack B, Smythe L, Rolain JM, Roque-Afonso AM, Jenjaroen K, et al. The infective causes of hepatitis and jaundice amongst hospitalised patients in Vientiane, Laos. Trans R Soc Trop Med Hyg. 2010;104(7):475–83. Available from: https://pubmed.ncbi.nlm.nih.gov/20378138/ .

Chansamouth V, Thammasack S, Phetsouvanh R, Keoluangkot V, Moore CE, Blacksell SD, et al. The aetiologies and impact of fever in pregnant inpatients in Vientiane, Laos. PLoS Negl Trop Dis. 2016;10(4):e0004577.

Wong LP, Tay ST, Chua KH, Goh XT, Alias H, Zheng Z, et al. Serological evidence of Hepatitis E virus (HEV) infection among ruminant farmworkers: a retrospective study from Malaysia. Infect Drug Resist. 2022;15:5533–41. Available from: https://pubmed.ncbi.nlm.nih.gov/36164335/ .

Wong LP, Lee HY, Khor CS, Abdul-Jamil J, Alias H, Abu-Amin N, et al. The risk of transfusion-transmitted hepatitis E virus: evidence from seroprevalence screening of blood donations. Indian J Hematol Blood Transfus. 2022;38(1):145–52. Available from: https://pubmed.ncbi.nlm.nih.gov/33879981/ .

Wong LP, Alias H, Choy SH, Goh XT, Lee SC, Lim YAL, et al. The study of seroprevalence of hepatitis E virus and an investigation into the lifestyle behaviours of the aborigines in Malaysia. Zoonoses Public Health. 2020;67(3):263–70. Available from: https://pubmed.ncbi.nlm.nih.gov/31927794/ .

Ng KP, He J, Saw TL, Lyles CM. A seroprevalence study of viral hepatitis E infection in human immunodeficiency virus type 1 infected subjects in Malaysia. Med J Malaysia. 2000;55(1):58–64. Available from: https://pubmed.ncbi.nlm.nih.gov/11072492/ .

Hudu SA, Niazlin MT, Nordin SA, Harmal NS, Tan SS, Omar H, et al. Hepatitis E virus isolated from chronic hepatitis B patients in Malaysia: sequences analysis and genetic diversity suggest zoonotic origin. Alexandria J Med. 2018;54(4):487–94. Available from: https://www.tandfonline.com/doi/pdf/10.1016/j.ajme.2017.07.003 .

Anderson DA, Li F, Riddell M, Howard T, Seow HF, Torresi J, et al. ELISA for IgG-class antibody to hepatitis E virus based on a highly conserved, conformational epitope expressed in Escherichia coli. J Virol Methods. 1999 [cited 2023 Jul 25];81(1–2):131–42. Available from: https://pubmed.ncbi.nlm.nih.gov/10488771/ .

Seow HF, Mahomed NM, Mak JW, Riddell MA, Li F, Anderson DA. Seroprevalence of antibodies to hepatitis E virus in the normal blood donor population and two aboriginal communities in Malaysia. J Med Virol. 1999;59(2):164–8. Available from: https://pubmed.ncbi.nlm.nih.gov/10459151/ .

Saat Z, Sinniah M, Kin TL, Baharuddin R, Krishnasamy M. A four year review of acute viral hepatitis cases in the east coast of Peninsular Malaysia. Southeast Asian J Trop Med Public Health. 1999;30(1):106–9.

Li TC, Yamakawa Y, Suzuki K, Tatsumi M, Razak MA, Uchida T, et al. Expression and self-assembly of empty virus-like particles of hepatitis E virus. J Virol. 1997 [cited 2023 Jul 25];71(10):7207–13. Available from: https://pubmed.ncbi.nlm.nih.gov/9311793/ .

Uchida T, Aye TT, Ma X, Iida F, Shikata T, Ichikawa M, et al. An epidemic outbreak of hepatitis E in Yangon of Myanmar: antibody assay and animal transmission of the virus. Acta Pathol Jpn. 1993;43(3):94–8. Available from: https://pubmed.ncbi.nlm.nih.gov/8257479/ .

Nakai K, Win KM, Oo SS, Arakawa Y, Abe K. Molecular characteristic-based epidemiology of hepatitis B, C, and E viruses and GB virus C/hepatitis G virus in Myanmar. J Clin Microbiol. 2001;39(4):1536–9. Available from: https://pubmed.ncbi.nlm.nih.gov/11283083/ .

Chow WC, Ng HS, Lim GK, Oon CJ. Hepatitis E in Singapore–a seroprevalence study. Singapore Med J. 1996;37(6):579–81. Available from: https://pubmed.ncbi.nlm.nih.gov/9104052/ .



The authors would like to thank all researchers of the primary research included in this study.

This work was supported by Project Research Center for Epidemiology and Prevention of Viral Hepatitis and Hepatocellular Carcinoma, Hiroshima University led by Prof. Junko Tanaka (PI).

Author information

Authors and affiliations.

Department of Epidemiology, Infectious Disease Control and Prevention, Graduate School of Biomedical and Health Sciences, Hiroshima University, 1-2-3, Kasumi, Hiroshima, Minami, 734-8551, Japan

Ulugbek Khudayberdievich Mirzaev, Serge Ouoba, Ko Ko, Zayar Phyo, Chanroth Chhoung, Akuffo Golda Ataa, Aya Sugiyama, Tomoyuki Akita & Junko Tanaka

Department of Hepatology, Research Institute of Virology, Tashkent, Uzbekistan

Ulugbek Khudayberdievich Mirzaev

Unité de Recherche Clinique de Nanoro (URCN), Institut de Recherche en Sciences de La Santé (IRSS), Nanoro, Burkina Faso

Serge Ouoba



UM, TA, and JT conceptualized the study. UM and SO contributed to developing the study design and data acquisition. UM, CC, ZP, AG, SO, and JT analysed and interpreted the data. UM, KK, and AS drafted the manuscript. TA, AS, KK, SO, and JT contributed to the intellectual content of the manuscript. All authors read and approved the final manuscript. JT and TA shared the co-correspondence. 

Corresponding authors

Correspondence to Tomoyuki Akita or Junko Tanaka .

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary material 1, Supplementary material 2, Supplementary material 3.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Mirzaev, U.K., Ouoba, S., Ko, K. et al. Systematic review and meta-analysis of hepatitis E seroprevalence in Southeast Asia: a comprehensive assessment of epidemiological patterns. BMC Infect Dis 24 , 525 (2024). https://doi.org/10.1186/s12879-024-09349-2


Received : 30 October 2023

Accepted : 24 April 2024

Published : 24 May 2024

DOI : https://doi.org/10.1186/s12879-024-09349-2


  • Hepatitis E virus
  • Southeast Asia
  • Immunoglobulins
  • Systematic review
  • Meta-analysis
  • Epidemiologic patterns

BMC Infectious Diseases

ISSN: 1471-2334


  • Systematic Review
  • Open access
  • Published: 24 May 2024

Turnover intention and its associated factors among nurses in Ethiopia: a systematic review and meta-analysis

  • Eshetu Elfios 1 ,
  • Israel Asale 1 ,
  • Merid Merkine 1 ,
  • Temesgen Geta 1 ,
  • Kidist Ashager 1 ,
  • Getachew Nigussie 1 ,
  • Ayele Agena 1 ,
  • Bizuayehu Atinafu 1 ,
  • Eskindir Israel 2 &
  • Teketel Tesfaye 3  

BMC Health Services Research volume  24 , Article number:  662 ( 2024 ) Cite this article


Nurse turnover intention, representing the extent to which nurses express a desire to leave their current positions, is a critical global public health challenge. This issue significantly affects the healthcare workforce, contributing to disruptions in healthcare delivery and organizational stability. In Ethiopia, a country facing its own unique set of healthcare challenges, understanding and mitigating nursing turnover is of paramount importance. Hence, the objectives of this systematic review and meta-analysis were to determine the pooled proportion of turnover intention among nurses in Ethiopia and to identify the factors associated with it.

A comprehensive search was carried out for full-text studies written in English, using an electronic web-based search strategy across databases including PubMed, CINAHL, Cochrane Library, Embase, Google Scholar and Ethiopian University Repository online. The Joanna Briggs Institute (JBI) checklist was used to assess the quality of the studies. STATA version 17 software was used for statistical analyses. Meta-analysis was done using a random-effects method. Heterogeneity between the primary studies was assessed by the Cochran Q and I-square tests. Subgroup and sensitivity analyses were carried out to clarify the source of heterogeneity.

This systematic review and meta-analysis incorporated 8 articles, involving 3033 nurses in the analysis. The pooled proportion of turnover intention among nurses in Ethiopia was 53.35% (95% CI: 41.64, 65.05%), with significant heterogeneity between studies (I² = 97.9%, P = 0.001). Turnover intention among nurses was significantly associated with autonomous decision-making (OR: 0.28, CI: 0.14, 0.70) and promotion/development (OR: 0.67, CI: 0.46, 0.89).

Conclusion and recommendation

Our meta-analysis on turnover intention among Ethiopian nurses highlights a significant challenge, with a pooled proportion of 53.35%. Regional variations, such as the highest turnover intention in Addis Ababa and the lowest in Sidama, underscore the need for tailored interventions. The findings reveal a strong link between turnover intention and factors such as autonomous decision-making and promotion/development. Recommendations for stakeholders and concerned bodies involve formulating targeted retention strategies, addressing regional variations, collaborating on nurse welfare advocacy, prioritizing career advancement, and reviewing policies to improve nurse retention.


Turnover intention pertaining to employment, often referred to as the intention to leave, is characterized by an employee’s contemplation of voluntarily transitioning to a different job or company [ 1 ]. Nurse turnover intention, representing the extent to which nurses express a desire to leave their current positions, is a critical global public health challenge. This issue significantly affects the healthcare workforce, contributing to disruptions in healthcare delivery and organizational stability [ 2 ].

The global shortage of healthcare professionals, including nurses, is an ongoing challenge that significantly impacts the capacity of healthcare systems to provide quality services [ 3 ]. Nurses, as frontline healthcare providers, play a central role in patient care, making their retention crucial for maintaining the functionality and effectiveness of healthcare delivery. However, the phenomenon of turnover intention, reflecting a nurse’s contemplation of leaving their profession, poses a serious threat to workforce stability [ 4 ].

Studies conducted globally show high turnover rates among nurses in several regions, with notable figures reported in Alexandria (68%), China (63.88%), and Jordan (60.9%) [ 5 , 6 , 7 ]. In contrast, Israel has a remarkably low turnover rate of 9% [ 8 ], while Brazil reports 21.1% [ 9 ] and Saudi hospitals 26% [ 10 ]. These diverse turnover rates highlight the global nature of the nurse turnover phenomenon, indicating varying degrees of workforce mobility in different regions.

The magnitude and severity of turnover intention among nurses worldwide underscore the urgency of addressing this issue. High turnover rates not only disrupt healthcare services but also result in a loss of valuable skills and expertise within the nursing workforce. This, in turn, compromises the continuity and quality of patient care, with potential implications for patient outcomes and overall health service delivery [ 11 ]. Extensive research conducted worldwide has identified a range of factors contributing to turnover intention among nurses [ 11 , 12 , 13 , 14 , 15 , 16 , 17 ]. These factors encompass both individual and organizational aspects, such as high workload, inadequate support, limited career advancement opportunities, job satisfaction, conflict, payment or reward, burnout, and sense of belongingness to the work environment. The complex interplay of these factors makes addressing turnover intention a multifaceted challenge that requires targeted interventions.

In Ethiopia, a country facing its own unique set of healthcare challenges, understanding and mitigating nursing turnover are of paramount importance. The healthcare system in Ethiopia grapples with issues like resource constraints, infrastructural limitations, and disparities in healthcare access [ 18 ]. Consequently, the factors influencing nursing turnover in Ethiopia may differ from those in other regions. Previous studies conducted in the Ethiopian context have started to unravel some of these factors, emphasizing the need for a more comprehensive examination [ 18 , 19 ].

Although many cross-sectional studies have been conducted on turnover intention among nurses in Ethiopia, the results exhibit variations. The reported turnover intention rates range from a minimum of 30.6% to a maximum of 80.6%. In light of these disparities, this systematic review and meta-analysis was undertaken to ascertain the aggregated prevalence of turnover intention among nurses in Ethiopia. By systematically analyzing findings from various studies, we aimed to provide a nuanced understanding of the factors influencing turnover intention specific to the Ethiopian healthcare context. Therefore, this systematic review and meta-analysis aimed to answer the following research questions.

What is the pooled prevalence of turnover intention among nurses in Ethiopia?

What are the factors associated with turnover intention among nurses in Ethiopia?

The primary objective of this review was to assess the pooled proportion of turnover intention among nurses in Ethiopia. The secondary objective was to identify the factors associated with turnover intention among nurses in Ethiopia.

Study design and search strategy

A comprehensive systematic review and meta-analysis was conducted, examining observational studies on turnover intention among nurses in Ethiopia. The procedure for this systematic review and meta-analysis was developed in accordance with the Preferred Reporting Items for Systematic review and Meta-analysis Protocols (PRISMA-P) statement [ 20 ]. The PRISMA-2015 statement was used to report the findings [ 21 , 22 ]. This systematic review and meta-analysis was registered on PROSPERO with the registration number CRD42024499119.

We conducted a systematic and extensive search across multiple databases, including PubMed, CINAHL, Cochrane Library, Embase, Google Scholar and Ethiopian University Repository online, to identify studies reporting turnover intention among nurses in Ethiopia. We reviewed the database available at http://www.library.ucsf.edu and the Cochrane Library to ensure that the intended task had not been previously undertaken, preventing any duplication. Furthermore, we screened the reference lists to retrieve relevant articles. EndNote (version X8) software was used for downloading, organizing, reviewing, and citing articles. Additionally, a manual search for cross-references was performed to discover any relevant studies not captured through the initial database search. The search employed the following search terms: “prevalence”, “turnover intention”, “intention to leave”, “attrition”, “employee attrition”, “nursing staff turnover”, “Ethiopian nurses”, “nurses”, and “Ethiopia”. These terms were combined using Boolean operators (AND, OR) to conduct a thorough and systematic search across the specified databases.
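The combination of search terms with Boolean operators can be sketched as follows. This is a minimal illustration only: the grouping of the terms into concept blocks (outcome, population, setting) and the function name `build_query` are assumptions made for the example, since the article does not report the exact search string submitted to each database.

```python
def build_query(term_groups):
    """Join synonyms with OR inside each group, then join the groups with AND."""
    def group(terms):
        return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"
    return " AND ".join(group(terms) for terms in term_groups)

# The terms are those listed in the text; the grouping is a hypothetical choice.
term_groups = [
    ["prevalence"],
    ["turnover intention", "intention to leave", "attrition",
     "employee attrition", "nursing staff turnover"],
    ["nurses", "Ethiopian nurses"],
    ["Ethiopia"],
]

query = build_query(term_groups)
print(query)
```

Grouping synonyms with OR widens each concept, while AND between groups narrows the result to records that mention every concept. Individual databases may need their own field tags or syntax, which this sketch does not attempt to cover.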

Eligibility criteria

Inclusion criteria

The following inclusion criteria guided the selection of articles for this systematic review and meta-analysis.

Population: Nurses working in Ethiopia.

Study period: Studies conducted or published until 23 November 2023.

Study design: All observational study designs, such as cross-sectional, longitudinal, and cohort studies, were considered.

Setting: Only studies conducted in Ethiopia were included.

Outcome: Turnover intention.

Study: All studies, whether published or unpublished, in the form of journal articles, master’s theses, and dissertations, were included up to the final date of data analysis.

Language: Only studies in the English language were considered.

Exclusion criteria

Studies lacking full text and studies with a Newcastle–Ottawa Quality Assessment Scale (NOS) score of 6 or less were excluded. Studies that failed to provide information on turnover intention among nurses, or for which necessary details could not be obtained, were also excluded. Three authors (E.E., T.G., K.A.) independently assessed the eligibility of retrieved studies; input from two other authors (E.I. and M.M.) was sought to reach consensus on potential inclusion or exclusion.

Quality assessment and data extraction

Three authors (E.E., A.A., G.N.) independently conducted a critical appraisal of the included studies. The Joanna Briggs Institute (JBI) checklist for prevalence studies was used to assess the quality of the studies. Studies with a Newcastle–Ottawa Quality Assessment Scale (NOS) score of seven or more were considered acceptable [ 23 ]. The tool has nine parameters, each with yes, no, unclear, and not applicable options [ 24 ]. Two further reviewers (I.A., B.A.) were involved when necessary during the critical appraisal process. Accordingly, all studies were included in our review (Table 1). The questions used to evaluate the methodological quality of studies on turnover intention among nurses and its associated factors in Ethiopia were as follows:

Q1. Was the sample frame appropriate to address the target population?

Q2. Were study participants sampled appropriately?

Q3. Was the sample size adequate?

Q4. Were the study subjects and the setting described in detail?

Q5. Was the data analysis conducted with sufficient coverage of the identified sample?

Q6. Were valid methods used for the identification of the condition?

Q7. Was the condition measured in a standard, reliable way for all participants?

Q8. Was there appropriate statistical analysis?

Q9. Was the response rate adequate, and if not, was the low response rate managed appropriately?

Data were extracted and recorded in Microsoft Excel as guided by the Joanna Briggs Institute (JBI) data extraction form for observational studies. Three authors (E.E, M.G, T.T) independently conducted data extraction. Recorded data included the first author’s last name, publication year, study setting or country, region, study design, study period, sample size, response rate, population, type of management, proportion of turnover intention, and associated factors. Discrepancies in data extraction were resolved through discussion between extractors.

Data processing and analysis

Data analysis involved importing the extracted data into STATA 14 statistical software to estimate the pooled proportion of turnover intention among nurses. To evaluate potential publication bias and small-study effects, both funnel plots and Egger’s test were employed [ 25 , 26 ]. We used the I² statistic to quantify heterogeneity and explored potential sources of variability. Additionally, subgroup analyses were conducted to investigate the impact of specific study characteristics on the overall results. I² values of 0%, 25%, 50%, and 75% were interpreted as indicating no, low, medium, and high heterogeneity, respectively [ 27 ].
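To make the heterogeneity measure concrete, the sketch below computes Cochran's Q and the I² statistic from study-level proportions using inverse-variance weights. It is an illustration under stated assumptions, not the authors' STATA procedure: the proportions and sample sizes are hypothetical placeholders (not the values of the included studies), and the binomial variance p(1 − p)/n on the raw proportion scale is assumed.

```python
def cochran_q_and_i2(estimates, variances):
    """Return Cochran's Q, its degrees of freedom, and I-squared in percent."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I-squared is the share of total variation beyond chance, floored at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2

# Hypothetical study-level proportions and sample sizes, for illustration only.
props = [0.31, 0.52, 0.64, 0.80]
sizes = [250, 400, 320, 380]
variances = [p * (1 - p) / n for p, n in zip(props, sizes)]

q, df, i2 = cochran_q_and_i2(props, variances)
print(f"Q = {q:.1f} on {df} df, I^2 = {i2:.1f}%")
```

Read against the thresholds quoted above, an I² near 75% or higher would be labelled high heterogeneity.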

Funnel plots allowed us to visually inspect asymmetry in the distribution of study results, and Egger’s test allowed us to evaluate the presence of publication bias statistically. Furthermore, we conducted sensitivity analyses to assess the robustness of our findings to potential publication bias and other sources of bias.

A random-effects meta-analysis was performed to pool turnover intention among nurses; this method was chosen to account for the observed between-study variability [ 28 ]. Subgroup analyses were conducted to compare the pooled magnitude of turnover intention among nurses and associated factors across different regions. The pooled prevalence was presented visually in a forest plot with a 95% confidence interval.
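A random-effects pooled proportion can be sketched as below. The article does not name the between-study variance estimator, so the DerSimonian-Laird estimator used here is an assumption, as are the untransformed proportion scale and the hypothetical input data.

```python
import math

def random_effects_pool(estimates, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    Returns the pooled estimate, its 95% confidence interval, and tau-squared.
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]   # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2

# Hypothetical study-level proportions and sample sizes, for illustration only.
props = [0.31, 0.52, 0.64, 0.80]
sizes = [250, 400, 320, 380]
variances = [p * (1 - p) / n for p, n in zip(props, sizes)]

pooled, (ci_low, ci_high), tau2 = random_effects_pool(props, variances)
print(f"pooled = {pooled:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f}), tau^2 = {tau2:.4f}")
```

Adding tau² to each study's variance flattens the weights, so large studies dominate less than under a fixed-effect model and the confidence interval widens when studies disagree.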

Study selection

After the initial comprehensive search concerning turnover intention among nurses through Medline, Cochrane Library, Web of Science, Embase, AJOL, Google Scholar, and other sources, a total of 1343 articles were retrieved. Of these, 575 were removed as duplicates. Of the remaining 768 articles, 593 were removed after title and abstract screening. Following this, 44 articles that could not be retrieved were removed. Finally, from the remaining 131 articles, 8 articles with a total of 3033 nurses were included in the systematic review and meta-analysis (Fig.  1 ).

Fig. 1. PRISMA flow diagram of the selection process of studies on turnover intention among nurses in Ethiopia, 2024
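The counts reported for the selection process can be checked with simple arithmetic. The sketch below uses only the numbers stated in the text; the number of articles excluded at the full-text stage is not reported directly and is derived here as an implied value.

```python
# Counts as reported in the study selection paragraph.
identified = 1343
duplicates = 575
excluded_title_abstract = 593
not_retrieved = 44
included = 8

after_dedup = identified - duplicates                      # reported as 768
after_screening = after_dedup - excluded_title_abstract
assessed_full_text = after_screening - not_retrieved       # reported as 131
# Implied by the reported counts, not stated in the text.
excluded_full_text = assessed_full_text - included

print(after_dedup, after_screening, assessed_full_text, excluded_full_text)
```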

Study characteristics

All 8 included studies had a cross-sectional design. Of these, 2 were from the Tigray region, 2 were from Addis Ababa (the capital), 1 was from the South region, 1 from the Amhara region, 1 from the Sidama region, and 1 was multiregional and nationwide. The prevalence of turnover intention among nurses ranged from 30.6% to 80.6% (Table  2 ).

Pooled prevalence of turnover intention among nurses in Ethiopia

Our meta-analysis revealed a pooled turnover intention of 53.35% (95% CI: 41.64, 65.05%) among Ethiopian nurses, accompanied by substantial heterogeneity between studies (I² = 97.9%, P < 0.001), as depicted in Fig.  2 . Given the observed variability, we employed a random-effects model to analyze the data, which accounts for the significant heterogeneity across the included studies.

Fig. 2. Forest plot showing the pooled proportion of turnover intention among nurses in Ethiopia, 2024

Subgroup analysis of turnover intention among nurses in Ethiopia

To address the observed heterogeneity, we conducted a subgroup analysis based on regions. The results of the subgroup analysis highlighted considerable variations, with the highest level of turnover intention identified in Addis Ababa at 69.10% (95% CI: 46.47, 91.74%) and substantial heterogeneity (I² = 98.1%). Conversely, the Sidama region exhibited the lowest level of turnover intention among nurses at 30.6% (95% CI: 25.18, 36.02%), accompanied by considerable heterogeneity (I² = 100.0%) (Fig.  3 ).

Fig. 3. Subgroup analysis of systematic review and meta-analysis by region of turnover intention among nurses in Ethiopia, 2024

Publication bias of turnover intention among nurses in Ethiopia

The Egger’s test result ( p  = 0.64) was not statistically significant, indicating no evidence of publication bias in the meta-analysis (Table  3 ). Additionally, the symmetrical distribution of the included studies in the funnel plot (Fig.  4 ) is consistent with the absence of publication bias across studies.

Fig. 4. Funnel plot of systematic review and meta-analysis on turnover intention among nurses in Ethiopia, 2024
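For readers unfamiliar with Egger's test, a minimal sketch of the underlying regression follows. It regresses each study's standardized effect (estimate divided by its standard error) on its precision (one over the standard error) and examines the intercept. The function name and the input values are hypothetical, and the sketch stops at the t statistic; a p-value would come from comparing it with a t distribution on k − 2 degrees of freedom, which statistical packages report directly.

```python
import math

def egger_intercept(estimates, std_errors):
    """Ordinary least squares of (estimate / SE) on (1 / SE).

    Returns the intercept, its standard error, and the t statistic.
    An intercept far from zero suggests funnel-plot asymmetry.
    """
    x = [1.0 / s for s in std_errors]                   # precision
    y = [e / s for e, s in zip(estimates, std_errors)]  # standardized effect
    k = len(x)
    mean_x, mean_y = sum(x) / k, sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in residuals) / (k - 2)       # residual variance
    se_intercept = math.sqrt(s2 * (1.0 / k + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat

# Hypothetical proportions and standard errors, for illustration only.
props = [0.31, 0.52, 0.64, 0.80, 0.45]
ses = [0.029, 0.025, 0.027, 0.021, 0.040]

intercept, se_intercept, t_stat = egger_intercept(props, ses)
print(f"intercept = {intercept:.2f}, SE = {se_intercept:.2f}, t = {t_stat:.2f}")
```

With only a handful of studies the test has limited power, so a non-significant result is usually read alongside the funnel plot rather than on its own.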

Sensitivity analysis

A leave-one-out sensitivity analysis was used to evaluate the influence of individual studies on the pooled prevalence of turnover intention among Ethiopian nurses. Each study was excluded from the analysis one at a time. Excluding any single study did not lead to a statistically significant change in the overall pooled estimate. The findings are shown in Fig.  5 , which illustrates the stability and robustness of the pooled estimate when individual studies are removed.

Fig. 5. Sensitivity analysis of pooled prevalence for each study being removed at a time for systematic review and meta-analysis of turnover intention among nurses in Ethiopia
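The leave-one-out procedure can be sketched as a loop that drops one study, re-pools the rest, and records the result. The pooling function below assumes a DerSimonian-Laird random-effects estimator, and the study labels and data are hypothetical placeholders rather than the included studies.

```python
def dl_pool(estimates, variances):
    """Pooled estimate under an assumed DerSimonian-Laird random-effects model."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(estimates) - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    return sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)

def leave_one_out(labels, estimates, variances):
    """Re-pool the data once per study, each time omitting that study."""
    results = {}
    for i, label in enumerate(labels):
        rest_est = estimates[:i] + estimates[i + 1:]
        rest_var = variances[:i] + variances[i + 1:]
        results[label] = dl_pool(rest_est, rest_var)
    return results

# Hypothetical studies, for illustration only.
labels = ["Study A", "Study B", "Study C", "Study D"]
props = [0.31, 0.52, 0.64, 0.80]
sizes = [250, 400, 320, 380]
variances = [p * (1 - p) / n for p, n in zip(props, sizes)]

overall = dl_pool(props, variances)
loo = leave_one_out(labels, props, variances)
for label, value in loo.items():
    print(f"omit {label}: pooled = {value:.3f} (all studies: {overall:.3f})")
```

If every leave-one-out estimate stays within the confidence interval of the full pooled estimate, no single study is driving the result.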

Factors associated with turnover intention among nurses in Ethiopia

We examined the determinants of turnover intention among nurses in Ethiopia by meta-analysing eight relevant studies [ 6 , 29 , 30 , 31 , 32 , 33 , 34 , 35 ]. We identified a significant association between turnover intention and autonomous decision-making (OR: 0.28, CI: 0.14, 0.70) (Fig.  6 ) and promotion/development (OR: 0.67, CI: 0.46, 0.89) (Fig.  7 ). In both instances, the odds ratios suggest a negative association, signifying that increased levels of autonomous decision-making and promotion/development were linked to reduced odds of turnover intention.

Fig. 6. Forest plot of the association between autonomous decision-making and turnover intention among nurses in Ethiopia, 2024

Fig. 7. Forest plot of the association between promotion/development and turnover intention among nurses in Ethiopia, 2024

In our meta-analysis exploring turnover intention among nurses in Ethiopia, we found a pooled proportion of turnover intention of 53.35%. This proportion warrants comparison with turnover rates reported in other regions. It is lower than the rates reported in Alexandria (68%), China (63.88%), and Jordan (60.9%) [ 5 , 6 , 7 ]. This comparison highlights the multifaceted nature of turnover intention, which is influenced by diverse contextual, cultural, and organizational factors. Conversely, Ethiopia’s turnover rate among nurses contrasts with substantially lower figures reported in Israel (9%) [ 8 ], Brazil (21.1%) [ 9 ], and Saudi hospitals (26%) [ 10 ]. Challenges such as work overload, economic constraints, limited promotional opportunities, lack of recognition, and low job rewards are more prevalent among nurses in Ethiopia, contributing to higher turnover intention compared to their counterparts [ 7 , 29 , 36 ].

The highest turnover intention was observed in Addis Ababa, while the Sidama region displayed the lowest turnover intention among nurses. These differences highlight the complexity of turnover intention among Ethiopian nurses and show the importance of region-specific interventions that address local factors and improve nurse retention.

Our systematic review and meta-analysis in the Ethiopian nursing context revealed a significant inverse association between turnover intention and autonomous decision-making. The odds of turnover intention were approximately 72% lower among nurses with autonomous decision-making than among those without it (OR: 0.28). This finding is supported by similar studies conducted in South Africa, Tanzania, Kenya, and Turkey [ 37 , 38 , 39 , 40 ].

The significant association of turnover intention with promotion/development in our study underscores the crucial role of career advancement opportunities in alleviating turnover intention among nurses. Specifically, our analysis revealed that nurses with promotion/development opportunities had approximately 33% lower odds of turnover intention than those without such opportunities (OR: 0.67). These results emphasize the influence of organizational support in shaping the professional environment for nurses and provide insights for formulating evidence-based strategies to enhance workforce retention. This finding is in line with previous research conducted in Taiwan, the Philippines, and Italy [ 41 , 42 , 43 ].
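The percentage reductions quoted in the two paragraphs above follow directly from the odds ratios: the change in odds is (OR − 1) × 100. The short sketch below reproduces that arithmetic from the reported point estimates and confidence limits; the function name is introduced here only for illustration.

```python
def percent_change_in_odds(odds_ratio):
    """Convert an odds ratio to a percentage change in odds (negative = reduction)."""
    return (odds_ratio - 1.0) * 100.0

# Point estimates and confidence limits as reported in the text.
reported = {
    "autonomous decision-making": (0.28, 0.14, 0.70),
    "promotion/development": (0.67, 0.46, 0.89),
}

for factor, (or_point, or_low, or_high) in reported.items():
    point = percent_change_in_odds(or_point)
    low = percent_change_in_odds(or_low)
    high = percent_change_in_odds(or_high)
    print(f"{factor}: {point:.0f}% change in odds (interval {low:.0f}% to {high:.0f}%)")
```

Note that these are changes in odds, not in probability; when the outcome is as common as it is here, the two can differ noticeably.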

Our meta-analysis on turnover intention among Ethiopian nurses reveals a considerable challenge, with a pooled proportion of 53.35%. Regional variations highlight the necessity for region-specific strategies, with Addis Ababa displaying the highest turnover intention and the Sidama region the lowest. Significant inverse associations were found between turnover intention and both autonomous decision-making and promotion/development. These insights support the formulation of evidence-based strategies and policies to enhance nurse retention, contributing to the overall stability of the Ethiopian healthcare system.


Federal Ministry of Health (FMoH)

The FMoH should consider the regional variations in turnover intention and formulate targeted retention strategies. Investment in professional development opportunities and initiatives to enhance autonomy can be integral components of these strategies.

Ethiopian Nurses Association (ENA)

ENA plays a pivotal role in advocating for the welfare of nurses. The association is encouraged to collaborate with healthcare institutions to promote autonomy, create mentorship programs, and advocate for improved working conditions to mitigate turnover intention.

Healthcare institutions

Hospitals and healthcare facilities should prioritize the provision of career advancement opportunities and recognize the value of professional autonomy in retaining nursing staff. Tailored interventions based on regional variations should be considered.

Policy makers

Policymakers should review existing healthcare policies to identify areas for improvement in nurse retention. Policy changes that address challenges such as work overload, limited promotional opportunities, and economic constraints can positively impact turnover rates.

Future research initiatives

Further research exploring the specific factors contributing to turnover intention in different regions of Ethiopia is recommended. Understanding the nuanced challenges faced by nurses in various settings will inform the development of more targeted interventions.

Strength and limitations

Our systematic review and meta-analysis on nurse turnover intention in Ethiopia present several strengths. The comprehensive inclusion of diverse studies provides a holistic view of the issue, enhancing the generalizability of our findings. The use of a random-effects model accounts for potential heterogeneity, ensuring a more robust and reliable synthesis of data.

However, limitations should be acknowledged. The heterogeneity observed across studies, despite the use of a random-effects model, may impact the precision of the pooled estimate. These considerations should be taken into account when interpreting and applying the results of our analysis.

Data availability

The dataset used in this analysis will be available from the corresponding author upon reasonable request.


ENA: Ethiopian Nurses Association

FMoH: Federal Ministry of Health

JBI: Joanna Briggs Institute

PRISMA-P: Preferred Reporting Items for Systematic review and Meta-analysis Protocols

Kanchana L, Jayathilaka R. Factors impacting employee turnover intentions among professionals in Sri Lankan startups. PLoS ONE. 2023;18(2):e0281729.

Boateng AB, et al. Factors influencing turnover intention among nurses and midwives in Ghana. Nurs Res Pract. 2022;2022:4299702.

World Health Organization. WHO Guideline on Health Workforce Development, Attraction, Recruitment and Retention in Rural and Remote Areas. 2021. pp. 1–104.

Hayes LJ, et al. Nurse turnover: a literature review. Int J Nurs Stud. 2006;43(2):237–63.

Yang H, et al. Validation of work pressure and associated factors influencing hospital nurse turnover: a cross-sectional investigation in Shaanxi Province, China. BMC Health Serv Res. 2017;17:1–11.

Ayalew E, et al. Nurses’ intention to leave their job in sub-Saharan Africa: a systematic review and meta-analysis. Heliyon. 2021;7(6).

Al Momani M. Factors influencing public hospital nurses’ intentions to leave their current employment in Jordan. Int J Community Med Public Health. 2017;4(6):1847–53.

DeKeyser Ganz F, Toren O. Israeli nurse practice environment characteristics, retention, and job satisfaction. Isr J Health Policy Res. 2014;3(1):1–8.

de Oliveira DR, et al. Intention to leave profession, psychosocial environment and self-rated health among registered nurses from large hospitals in Brazil: a cross-sectional study. BMC Health Serv Res. 2017;17(1):21.

Dall’Ora C, et al. Association of 12 h shifts and nurses’ job satisfaction, burnout and intention to leave: findings from a cross-sectional study of 12 European countries. BMJ Open. 2015;5(9):e008331.

Lu H, Zhao Y, While A. Job satisfaction among hospital nurses: a literature review. Int J Nurs Stud. 2019;94:21–31.

Ramoo V, Abdullah KL, Piaw CY. The relationship between job satisfaction and intention to leave current employment among registered nurses in a teaching hospital. J Clin Nurs. 2013;22(21–22):3141–52.

Al Sabei SD, et al. Nursing work environment, turnover intention, Job Burnout, and Quality of Care: the moderating role of job satisfaction. J Nurs Scholarsh. 2020;52(1):95–104.

Wang H, Chen H, Chen J. Correlation study on payment satisfaction, psychological reward satisfaction and turnover intention of nurses. Chin Hosp Manag. 2018;38(03):64–6.

Loes CN, Tobin MB. Interpersonal conflict and organizational commitment among licensed practical nurses. Health Care Manag (Frederick). 2018;37(2):175–82.

Wei H, et al. The state of the science of nurse work environments in the United States: a systematic review. Int J Nurs Sci. 2018;5(3):287–300.

Nantsupawat A, et al. Effects of nurse work environment on job dissatisfaction, burnout, intention to leave. Int Nurs Rev. 2017;64(1):91–8.

Ayalew F, et al. Factors affecting turnover intention among nurses in Ethiopia. World Health Popul. 2015;16(2):62–74.

Debie A, Khatri RB, Assefa Y. Contributions and challenges of healthcare financing towards universal health coverage in Ethiopia: a narrative evidence synthesis. BMC Health Serv Res. 2022;22(1):866.

Moher D, et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Syst Reviews. 2015;4(1):1–9.

Moher D, et al. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Intern Med. 2009;151(4):264–9.

Moher D, et al.; PRISMA-P Group. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. 2015.

Joanna Briggs Institute. Checklist for Prevalence Studies [Internet]. 2016;7.

Sakonidou S, et al. Interventions to improve quantitative measures of parent satisfaction in neonatal care: a systematic review. BMJ Paediatr Open. 2020;4(1):e000613.

Egger M, Smith GD. Meta-analysis: potentials and promise. BMJ. 1997;315(7119):1371.

Tura G, Fantahun M, Worku A. The effect of health facility delivery on neonatal mortality: systematic review and meta-analysis. BMC Pregnancy Childbirth. 2013;13:18.

Lin L. Comparison of four heterogeneity measures for meta-analysis. J Eval Clin Pract. 2020;26(1):376–84.

McFarland LV. Meta-analysis of probiotics for the prevention of antibiotic associated diarrhea and the treatment of Clostridium difficile disease. Am J Gastroenterol. 2006;101(4):812–22.

Asegid A, Belachew T, Yimam E. Factors influencing job satisfaction and anticipated turnover among nurses in Sidama zone public health facilities, South Ethiopia. Nurs Res Pract. 2014;2014.

Wubetie A, Taye B, Girma B. Magnitude of turnover intention and associated factors among nurses working in emergency departments of governmental hospitals in Addis Ababa, Ethiopia: a cross-sectional institutional based study. BMC Nurs. 2020;19:97.

Getie GA, Betre ET, Hareri HA. Assessment of factors affecting turnover intention among nurses working at governmental health care institutions in east Gojjam, Amhara region, Ethiopia, 2013. Am J Nurs Sci. 2015;4(3):107–12.

Gebregziabher D, et al. The relationship between job satisfaction and turnover intention among nurses in Axum comprehensive and specialized hospital Tigray, Ethiopia. BMC Nurs. 2020;19(1):79.

Negarandeh R, et al. Magnitude of nurses’ intention to leave their jobs and its associated factors of nurses working in Tigray regional state, north Ethiopia: cross sectional study. 2020.

Nigussie Bolado G, et al. The magnitude of turnover intention and Associated factors among nurses working at Governmental Hospitals in Southern Ethiopia: a mixed-method study. Nursing: Research and Reviews; 2023. pp. 13–29.

Woldekiros AN, Getye E, Abdo ZA. Magnitude of job satisfaction and intention to leave their present job among nurses in selected federal hospitals in Addis Ababa, Ethiopia. PLoS ONE. 2022;17(6):e0269540.

Rhoades L, Eisenberger R. Perceived organizational support: a review of the literature. J Appl Psychol. 2002;87(4):698.

Lewis M. Causal factors that influence turnover intent in a manufacturing organisation. University of Pretoria (South Africa); 2008.

Kuria S, Alice O, Wanderi PM. Assessment of causes of labour turnover in three and five star-rated hotels in Kenya. Int J Bus Soc Sci. 2012;3(15).

Blaauw D, et al. Comparing the job satisfaction and intention to leave of different categories of health workers in Tanzania, Malawi, and South Africa. Global Health Action. 2013;6(1):19287.

Masum AKM, et al. Job satisfaction and intention to quit: an empirical analysis of nurses in Turkey. PeerJ. 2016;4:e1896.

Song L. A study of factors influencing turnover intention of King Power Group at Downtown Area in Bangkok, Thailand. International Review of Research in Emerging Markets & the Global Economy. 2016;2(3).

Karanikola MN, et al. Moral distress, autonomy and nurse-physician collaboration among intensive care unit nurses in Italy. J Nurs Manag. 2014;22(4):472–84.

Labrague LJ, McEnroe-Petitte DM, Tsaras K. Predictors and outcomes of nurse professional autonomy: a cross-sectional study. Int J Nurs Pract. 2019;25(1):e12711.

No funding was received.

Author information

Authors and affiliations.

School of Nursing, College of Health Science and Medicine, Wolaita Sodo University, Wolaita Sodo, Ethiopia

Eshetu Elfios, Israel Asale, Merid Merkine, Temesgen Geta, Kidist Ashager, Getachew Nigussie, Ayele Agena & Bizuayehu Atinafu

Department of Midwifery, College of Health Science and Medicine, Wolaita Sodo University, Wolaita Sodo, Ethiopia

Eskindir Israel

Department of Midwifery, College of Health Science and Medicine, Wachamo University, Hossana, Ethiopia

Teketel Tesfaye


E.E. conceptualized the study, designed the research, performed statistical analysis, and led the manuscript writing. I.A., T.G., and M.M. contributed to the study design and provided critical revisions. K.A., G.N., B.A., E.I., and T.T. participated in data extraction and quality assessment. M.M., T.G., K.A., and G.N. contributed to the literature review. I.A., A.A., and B.A. assisted in data interpretation. E.I. and T.T. provided critical revisions to the manuscript. All authors read and approved the final version.

Corresponding author

Correspondence to Eshetu Elfios.

Ethics declarations

Ethical approval.

Ethical approval and informed consent are not required, as this study is a systematic review and meta-analysis that only involved the use of previously published data.

Ethical guidelines

Not applicable.

Consent for publication

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Elfios, E., Asale, I., Merkine, M. et al. Turnover intention and its associated factors among nurses in Ethiopia: a systematic review and meta-analysis. BMC Health Serv Res 24, 662 (2024). https://doi.org/10.1186/s12913-024-11122-9

Received: 20 January 2024

Accepted: 20 May 2024

Published: 24 May 2024

DOI: https://doi.org/10.1186/s12913-024-11122-9

  • Turnover intention
  • Systematic review
  • Meta-analysis

BMC Health Services Research

ISSN: 1472-6963

  • Open access
  • Published: 24 May 2024

Rates of bronchopulmonary dysplasia in very low birth weight neonates: a systematic review and meta-analysis

  • Alvaro Moreira 1 ,
  • Michelle Noronha 1   na1 ,
  • Jooby Joy 2   na1 ,
  • Noah Bierwirth 1 ,
  • Aina Tarriela 1 ,
  • Aliha Naqvi 1 ,
  • Sarah Zoretic 3 ,
  • Maxwell Jones 1 ,
  • Ali Marotta 1 ,
  • Taylor Valadie 1 ,
  • Jonathan Brick 1 ,
  • Caitlyn Winter 1 ,
  • Melissa Porter 1 ,
  • Isabelle Decker 1 ,
  • Matteo Bruschettini 4 &
  • Sunil K. Ahuja 5 , 6 , 7 , 8 , 9 , 10  

Respiratory Research volume 25, Article number: 219 (2024)

Large-scale estimates of bronchopulmonary dysplasia (BPD) are warranted for adequate prevention and treatment. However, systematic approaches to ascertain rates of BPD are lacking.

To conduct a systematic review and meta-analysis to assess the prevalence of BPD in very low birth weight (≤ 1,500 g) or very low gestational age (< 32 weeks) neonates.

Data sources

A search of MEDLINE from January 1990 until September 2019 using search terms related to BPD and prevalence was performed.

Study selection

Randomized controlled trials and observational studies evaluating rates of BPD in very low birth weight or very low gestational age infants were eligible. Included studies defined BPD as positive pressure ventilation or oxygen requirement at 28 days (BPD28) or at 36 weeks postmenstrual age (BPD36).

Data extraction and synthesis

Two reviewers independently conducted all stages of the review. Random-effects meta-analysis was used to calculate the pooled prevalence. Subgroup analyses included gestational age group, birth weight group, setting, study period, continent, and gross domestic product. Sensitivity analyses were performed to reduce study heterogeneity.

Main outcomes and measures

Prevalence of BPD defined as BPD28, BPD36, and by subgroups.

A total of 105 articles or databases and 780,936 patients were included in this review. The pooled prevalence was 35% (95% CI, 28-42%) for BPD28 ( n  = 26 datasets, 132,247 neonates), and 21% (95% CI, 19-24%) for BPD36 ( n  = 70 studies, 672,769 neonates). In subgroup meta-analyses, birth weight category, gestational age category, and continent were strong drivers of the pooled prevalence of BPD.

Conclusions and relevance

This study provides a global estimation of BPD prevalence in very low birth weight/low gestation neonates.


Bronchopulmonary dysplasia (BPD), characterized as an arrest of lung growth and development, is an important cause of morbidity and mortality in very preterm newborns [ 1 ]. While interventions in neonatal care have led to survival of smaller and younger neonates, therapies for BPD are still limited [ 2 ]. Therefore, there is an urgent need for early prediction of BPD and implementation of strategies and therapies that can attenuate disease progression. To accomplish such endeavors, we must first ascertain large-scale estimates of BPD and its global impact over time. In doing so, the effect of interventions and progress towards reducing rates of BPD can be more readily measured. Valid and consistent estimates of the prevalence of BPD around the globe are largely lacking.

A previous study estimated global rates of BPD; however, the definition of BPD was not determined a priori and the estimation was reported as a set of ranges per country as opposed to a pooled rate [ 3 ]. Challenges to estimating comprehensive rates of BPD include the varying definitions (e.g., 28 day versus 36 week assessment [ 4 , 5 ]), as well as the heterogeneous inclusion criteria of preterm neonates in studies (e.g., gestational-based inclusion compared to birth weight-based parameters or a combination of both). To overcome these barriers, we sought to conduct a systematic review and meta-analysis that would: (i) estimate global trends in the prevalence of BPD, (ii) examine temporal changes in BPD rates, and (iii) stratify BPD rates according to definition, birth weight, gestational age, setting, continent, and gross domestic product (GDP).

We conducted a systematic review and meta-analysis according to recommendations from the Cochrane Handbook for Systematic Reviews of Interventions and adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) criteria [ 6 ]. A protocol of this review was not registered.

Search strategy

Two investigators (A.M. and M.N.) systematically searched MEDLINE from January 1990 to September 30th, 2019. Search terms included (bronchopulmonary dysplasia OR chronic lung disease) AND a list of each country. Results were filtered to children between birth and 1 month post-term and refined to remove review articles; no limits were placed on language. Furthermore, the references of included studies were reviewed to supplement our initial search. The full search strategy is presented in eMethods 1 in the Supplement. Lastly, we reviewed all the population-based articles from a systematic review by Siffel et al. [ 3 ], which examined global rates of BPD. To enhance the comprehensiveness of our investigation, we integrated publicly available national registries documenting outcomes related to BPD.

Two groups of investigators (group 1: A.T. and M.N.; group 2: A.N. and A.M.) independently reviewed the titles and abstracts of all citations to determine suitability for inclusion. This was followed by independent review of the full-text articles to confirm eligibility. A third author (S.Z.) resolved any disagreements. Studies were included if they were international or national level (e.g., population-based) studies reporting rates of BPD from 1990 to 2019. The search was initiated from 1990, as this marks the time when surfactant therapy became increasingly standard of care in neonatal centers [ 7 ]. The end date was chosen as 2019 to exclude publications using the newest definition for BPD [ 8 ]. We included data for all neonates at risk for BPD with confirmed diagnosis occurring in the hospital or prior to discharge. Studies with inclusion criteria of male and female neonates with a birth weight of less than or equal to 1,500 g or a gestational age of less than 32 weeks were included. Due to limited availability of granular patient-level data in the included studies, mortality rates for each study were collected. Case reports, editorials, and commentaries were excluded.

Data extraction

Two sets of authors (A.T. and M.N.; A.N. and A.M.) independently collected study details. Two authors (J.J. and S.Z.) independently verified the accuracy of collated information. Inconsistencies were discussed among a panel of at least four investigators. Study specifics included country, BPD definition, BPD rates, total number of neonates in the study, years of observation, inclusion criteria, and study design. Articles and standardized data collection sheets were maintained in Google Drive folders. GetData Graph Digitizer version was used to collect values from figures when mortality data was not described in the article text.

Risk of bias

The risk of bias was judged in a binary fashion (e.g., yes = 1 or no = 0). We assessed the risk of bias for observational studies according to the Newcastle-Ottawa Quality Assessment Scale in three dimensions: selection, comparability, and outcome. The score for observational studies ranged from 0 to 8, representing bias risk for each article. Studies were defined as having a high risk of bias if the total score was lower than five, a moderate risk of bias if the score was five or six, and a low risk of bias if the total was seven or eight. We assessed the risk of bias for controlled studies according to the Cochrane Risk of Bias Tool using seven dimensions: random sequence generation and allocation concealment (both selection bias), reporting bias, other bias, performance bias, detection bias, and attrition bias. The score for randomized controlled studies ranged from 0 to 7, representing bias risk for each article.
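As an illustration of the scoring described above, the following Python sketch sums binary item judgements and assigns a risk band. It assumes eight binary items for the Newcastle-Ottawa scale and uses the bands reported with the results (7 to 8 low, 5 to 6 moderate), treating anything below 5 as high risk. The function name and the example judgements are hypothetical.

```python
def classify_nos(item_scores):
    """Sum binary Newcastle-Ottawa judgements (1 = yes, 0 = no) and band the total.

    Assumed bands: 7-8 low risk, 5-6 moderate risk, below 5 high risk.
    """
    if len(item_scores) > 8:
        raise ValueError("at most eight items are expected")
    if any(score not in (0, 1) for score in item_scores):
        raise ValueError("each item must be scored 0 or 1")
    total = sum(item_scores)
    if total >= 7:
        band = "low"
    elif total >= 5:
        band = "moderate"
    else:
        band = "high"
    return total, band


# Hypothetical study that fails two of the eight items.
print(classify_nos([1, 1, 1, 0, 1, 1, 0, 1]))
```

Keeping the bands in one function makes it easy to check that every possible total from 0 to 8 falls into exactly one category.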

Definitions and outcomes

A priori, BPD was defined by two categories: (i) BPD28, supplemental oxygen or positive pressure ventilation at 28 days of life, and (ii) BPD36, supplemental oxygen or positive pressure ventilation at 36 weeks postmenstrual age. The pooled prevalence of BPD is presented as forest plots for BPD28 and BPD36. If a study stratified patient numbers by both definitions, we included it in both pooled rates. When articles overlapped in time period for a particular country, the article with more comprehensive data was selected for inclusion. Prespecified subgroup analyses included birth weight categories, gestational age, years, setting, continent, and gross domestic product (GDP). Specifically, gestational age was divided into extremely low gestational age (ELGA) (≤ 28 weeks) vs. very low gestational age (VLGA) (< 32 weeks), while study setting was stratified into international or national. Study years were binned into three decades: 1990–1999, 2000–2009, and 2010–2019. This approach was used to explore temporal changes in BPD. The year 1990 was used as the time of inception because the late 1980s and early 1990s are when clinical trials of surfactant demonstrated efficacy in the care of preterm neonates with respiratory distress syndrome. Birth weight was sorted into extremely low birth weight (ELBW) (≤ 1,000 g), very low birth weight (VLBW) (≤ 1,500 g), and subdivisions of these categories (e.g., 501–750 g, 751–1000 g, 1001–1250 g, and 1251–1500 g). To clarify, the subgroup analysis by birth weight was conducted by categorizing studies based on the specified birth weight ranges. Specifically, studies were included in a subgroup only if they reported data on all infants within the designated birth weight range, rather than an average birth weight for a cohort.
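The subgroup definitions above can be sketched as small helper functions. This is only an illustration under stated assumptions: the function names are hypothetical, each study is summarized by the upper limit of its inclusion criteria, gestational age is given in completed weeks, and a single representative year is used for the decade (the text does not say how studies spanning two decades were assigned).

```python
def birth_weight_category(upper_limit_g):
    """Map the upper birth weight limit of a study to ELBW or VLBW."""
    if upper_limit_g <= 1000:
        return "ELBW"
    if upper_limit_g <= 1500:
        return "VLBW"
    raise ValueError("outside the review's birth weight criteria")


def gestational_age_category(upper_limit_completed_weeks):
    """Map the upper gestational age limit (completed weeks) to ELGA or VLGA."""
    if upper_limit_completed_weeks <= 28:
        return "ELGA"
    if upper_limit_completed_weeks < 32:
        return "VLGA"
    raise ValueError("outside the review's gestational age criteria")


def study_decade(year):
    """Bin a year into one of the three decades used for temporal analysis."""
    if not 1990 <= year <= 2019:
        raise ValueError("outside the 1990-2019 study window")
    start = year - year % 10
    return f"{start}-{start + 9}"


print(birth_weight_category(1500), gestational_age_category(31), study_decade(2004))
```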

Statistical analysis

The primary outcome was expressed using direct proportions (PR) with a 95% confidence interval (CI) following Freeman-Tukey double arc-sine transformation of the raw data [ 9 ]. Expecting high heterogeneity, defined as an I² > 50%, all analyses used a DerSimonian–Laird estimate with a random-effects meta-analysis model. The presence of publication bias was evaluated qualitatively using funnel plots and quantitatively using Egger’s linear regression test. At least ten studies were needed to perform subgroup analyses. All statistical analyses were performed using R version 4.1.0.
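To make the pooling procedure concrete, the sketch below implements a Freeman-Tukey double arcsine transformation followed by a DerSimonian–Laird random-effects estimate in plain Python. It is a minimal illustration, not the authors' R code. The function names and the example counts are hypothetical, and the back-transformation uses the simple sin² approximation rather than the harmonic-mean inversion that statistical packages often apply, so results may differ slightly from published software.

```python
import math


def freeman_tukey(events, n):
    """Double arcsine transform of events/n and its approximate variance."""
    t = 0.5 * (math.asin(math.sqrt(events / (n + 1)))
               + math.asin(math.sqrt((events + 1) / (n + 1))))
    return t, 1.0 / (4 * n + 2)


def back_transform(t):
    """Approximate inverse of the transform (assumed simplification)."""
    t = min(max(t, 0.0), math.pi / 2)
    return math.sin(t) ** 2


def pool_prevalence(studies):
    """Pool (events, n) pairs with a DerSimonian-Laird random-effects model."""
    ys, vs = zip(*(freeman_tukey(x, n) for x, n in studies))
    w = [1.0 / v for v in vs]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, ys))  # Cochran's Q
    df = len(studies) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in vs]
    pooled = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "prevalence": back_transform(pooled),
        "ci_low": back_transform(pooled - 1.96 * se),
        "ci_high": back_transform(pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical datasets: (neonates with BPD, neonates at risk).
print(pool_prevalence([(120, 400), (45, 300), (210, 520)]))
```

The returned `i2` value is the heterogeneity statistic against which the 50% threshold would be checked.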

Identification of Eligible studies

Our search yielded 4582 records, of which 2318 were reviewed in full. After applying the eligibility criteria, a total of 42 were included in this review. We also identified three publicly available national datasets: Australian and New Zealand Neonatal Network, Canadian Neonatal Network, and Neonatal Research Network of Japan Database. Meta-analyses were performed on all studies and databases, hereafter referred to as datasets. In sum, a combination of 74 datasets comprised the analysis for BPD28 and BPD36 as well as their subgroup analyses. The flow diagram of selected articles is shown in Fig. 1.

Figure 1 PRISMA flowchart of literature identification and study selection.

Study characteristics

Table  1 provides detailed characteristics of the included articles. All the chosen articles were based on cohort investigations and on the two predetermined BPD definitions: BPD28 and BPD36. The most commonly used definition for BPD was BPD36. Thirty countries were represented in the studies, and the countries that produced the most data were Australia and New Zealand ( n  = 24/70 datasets, 34.3%). A total of 672,769 patients were included in this review. Twenty-six out of the 70 datasets (37.1%) in BPD36 were published from 2010 onwards.

Pooled and stratified prevalence of BPD

The pooled prevalence for BPD28 calculated from 27 datasets and 132,424 neonates was 35% (95% CI, 0.28–0.42) using random effects meta-analysis (Fig. 2). For BPD36 ( n  = 70 studies, 672,769 neonates), the pooled prevalence was 21% (95% CI, 0.19–0.24) (Fig. 3). Table  2 depicts the prevalence of BPD28 and BPD36 according to gestational age, birth weight, study period, continent, setting, and GDP (subgroup analysis).

Figure 2 Pooled prevalence for BPD28. Forest plot demonstrating pooled prevalence for BPD28 and 95% CI with a random-effects meta-analysis model.

Figure 3 Pooled prevalence for BPD36. Forest plot demonstrating pooled prevalence for BPD36 and 95% CI with a random-effects meta-analysis model.

Subgroup analysis for BPD28

When stratified by birth weight, the highest rates of BPD28 were found in infants with lower birth weights: <1000 g (ELBW). For instance, infants in the lowest birth weight stratum (< 500 g) had a BPD28 prevalence of 99% (95% CI, 0.97-1.00), while those in the second-lowest birth weight stratum (501–750 g) had a BPD28 prevalence of 87% (95% CI, 0.75–0.96). The BPD28 prevalence was lowest (16%; 95% CI, 0.11–0.22) in infants with the highest birth weights (1251–1500 g). The prevalence of BPD28 was higher in ELGA versus VLGA neonates (90% vs. 29%). The subgroup analysis of BPD28 by setting showed a higher rate in the national compared to multinational studies, as well as Oceania compared to other continents. Overall, no differences were observed in BPD28 prevalence when stratified by year or GDP.

Subgroup analysis for BPD36

The subgroup analysis for prevalence of BPD36 stratified by birth weight was very similar to the BPD28 analysis, in which an upward trend in the prevalence of BPD36 was associated with lower birth weights. For example, the highest prevalence of BPD36 was noted in neonates with a birth weight of less than 1000 g (ELBW). Further stratification of the ELBW neonates revealed BPD36 prevalence rates of 71% (95% CI, 0.51–0.87) and 60% (95% CI, 0.51–0.68) in neonates with birth weights of < 500 g and 501–750 g, respectively. Again, the lowest prevalence of BPD36, 10% (95% CI, 0.07–0.13), was seen in the highest (1251–1500 g) birth weight stratum.

Similar to the findings using the BPD28 definition, prevalence of BPD36 was higher in ELGA neonates than in VLGA neonates (43%, n = 358,636, versus 12%, n = 126,368). Prevalence of BPD36 was also higher in national studies. Lastly, BPD36 prevalence again differed when stratified by continent. The highest prevalence was seen in North America at 29% (95% CI, 0.25–0.33). Rates of BPD36 were similar across GDP strata and year.

Sensitivity analysis and mortality rate

We conducted sensitivity analysis on the prevalence of BPD28 and BPD36 to reduce heterogeneity, defined as an I² ≥ 50%. After keeping only 4 studies, the prevalence of BPD28 was 32% (95% CI, 0.31–0.32; I² = 0%, eResults 1). For BPD36, 10 studies remained after filtering for high heterogeneity. The resulting rate of BPD36 was 25% (95% CI, 0.25–0.26; I² = 49%, eResults 2). The table in eResults 3 shows the varying range of mortality rates for each of the studies (range of 0–23.9% with an average rate of 8.1%).

Risk of bias and publication bias

Forty-two studies were evaluated by the Newcastle-Ottawa Quality Assessment Scale and one study by the Cochrane Risk of Bias Tool. Thirty (74%) of the observational studies had a moderate bias (total score ranging from 5 to 6) ( eTable 1 ). The domain that had the most bias pertained to questions regarding follow-up outcomes. Nine studies (21%) had low risk of bias (total score between 7 and 8). The single randomized controlled trial had a risk of bias score of five out of seven. Publication bias was low for BPD28 and BPD36. Plots can be viewed in eFigures 1, 2 .
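For readers unfamiliar with the quantitative check behind the publication bias statement, the sketch below shows Egger's linear regression in plain Python: each study's standardized effect (effect divided by its standard error) is regressed on its precision (one over the standard error), and an intercept far from zero suggests funnel plot asymmetry. This is a simplified illustration with hypothetical names and inputs. It returns a t statistic rather than a p-value, which would need a t distribution with k − 2 degrees of freedom.

```python
import math


def egger_regression(effects, standard_errors):
    """Return (intercept, intercept_se, t_statistic) for Egger's test."""
    k = len(effects)
    if k != len(standard_errors) or k < 3:
        raise ValueError("need at least three studies with matching inputs")
    x = [1.0 / se for se in standard_errors]                 # precision
    y = [e / se for e, se in zip(effects, standard_errors)]  # standardized effect
    x_bar, y_bar = sum(x) / k, sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    intercept_se = math.sqrt(ssr / (k - 2) * (1.0 / k + x_bar ** 2 / sxx))
    t_stat = intercept / intercept_se if intercept_se > 0 else float("nan")
    return intercept, intercept_se, t_stat


# Hypothetical transformed prevalences and their standard errors.
print(egger_regression([0.52, 0.47, 0.60, 0.55, 0.49],
                       [0.02, 0.03, 0.08, 0.05, 0.04]))
```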

Bronchopulmonary dysplasia remains the most common morbidity of prematurity and carries a significant disease burden [ 10 ]. Throughout the published literature, BPD displays itself as a disease with significant heterogeneity [ 11 , 12 , 13 , 14 ]. This is found not only within different “types” of BPD but also within the definition itself, as published data define it as an oxygen requirement at 28 days, at 36 weeks, or by other combinations of factors [ 15 ]. Therefore, it is essential to have accurate information for prediction, analysis and treatment. We performed this systematic review and meta-analysis to determine large-scale rates of bronchopulmonary dysplasia, with a subgroup analysis according to two major definitions. To our knowledge this is the largest and most comprehensive study describing BPD prevalence to date.

Our study expands on the 2019 study by Siffel et al. [ 3 ] to provide a more complete review of available data. We discovered, reviewed and analyzed data over a 41-year period (versus 11 years), with inclusion of a higher number of studies across more regions. As an additional contrast, we defined BPD (oxygen at 28 days or 36 weeks) and manually extracted data for combined analysis. This allowed us to use pooled data to compare subgroups and pursue further statistical analyses. We were therefore able to provide a more accurate prevalence for each provided outcome, rather than reporting outcomes as a set of ranges from individual studies.

As anticipated, the foremost risk factor for developing BPD was found to be low birth weight, particularly with a weight below 750 g. This trend was evident across both individual subgroup analyses and combined evaluations. Additionally, our observations revealed discrepancies in BPD rates among different gestational age groups, notably between ELGA and VLGA infants. These findings align with existing literature that underscores an inverse association between BPD rates and gestational age/birthweight, further affirming the current understanding in the field [ 8 , 16 ].

We also compared BPD rates across three decades (1990–1999, 2000–2009 and 2010–2019), which showed no difference between the groups across the definitions of BPD. This is found throughout the literature and highlights the difficulty in preventing and treating this disease. Medical advancements in the care of preterm neonates have led to higher survival, especially in the most industrialized nations [ 17 , 18 ]. This coincides with the survival of more infants with BPD and accounts for much of the similarity of the prevalence across decades. While our study focuses on reporting BPD rates in decade cohorts, it is essential to acknowledge the limitations inherent in utilizing these broader definitions of BPD. We recognize that the clinical landscape of BPD management may have evolved over the past 30 years, potentially leading to improvements not fully captured by the BPD28 and BPD36 definitions. Our exclusion of studies using the newer BPD definition by Jensen et al. is described in the methods section and is reiterated here for clarity.

While the incidence of BPD exhibits considerable variation among different countries, current evidence indicates minimal disparities in its prevalence across major continents. Numerous studies have explored BPD incidence and associated risk factors in various regions spanning North America, Europe, Asia, and Australia, generally yielding comparable rates. For instance, research by Jain et al. found no significant divergence in BPD incidence among preterm infants across North America, Europe, and Australia [ 19 ]. In contrast, our investigation suggests notable differences in BPD rates among regions or continents, particularly with lower rates observed in Europe and South America. However, it’s noteworthy that South America’s data pool was limited to just 1–2 studies. These findings imply that the risk factors and underlying pathophysiology of BPD may not uniformly align across geographical regions, underscoring the imperative for further investigation to elucidate these distinctions. This prompts consideration as to whether disparities in clinical practices might potentially justify these findings.

The Neonatal Research Network (NRN) in the United States has compiled large retrospective analyses of care practice and patient outcomes among extremely premature infants. They have demonstrated that rates of antenatal steroid and surfactant administration have increased, while delivery room intubation has decreased [ 7 ]. However, the rates of bronchopulmonary dysplasia (BPD36) in their study ranged from 32 to 45%, which is notably higher than the 21% observed in this study. This difference could be attributed to the varying gestational ages included in the studies, as the NRN’s research comprised newborns between 22 and 28 weeks. In comparison, the Chinese Neonatal Network’s cohort of 8,148 preterm neonates had a BPD36 rate of 29.2%, which is also higher than our study’s result; this difference is most likely due to their inclusion of neonates of 31 weeks and younger, whereas our study included neonates of ≤ 32 weeks [ 20 ].

The prevalence of BPD varied depending on the study setting, with national cohorts demonstrating the highest rates for both definitions of BPD. These estimates may be more reliable, as they offer a broader representation across multiple institutions, reducing the impact of outliers and the unique management practices of individual hospitals on the results. Furthermore, many of these national studies employed inclusion criteria that targeted younger gestational ages, further enhancing their robustness. Despite the thought that GDP may have an impact on BPD rates, subgroup analyses based on quartiles of a nation’s GDP showed no differences. One possible explanation for this finding is that other factors beyond GDP, such as access to healthcare and neonatal resources, may play a more significant role.


Despite conducting an extensive data search employing multiple reviewers and diverse search methods, there remains a possibility that certain available studies may have been overlooked. Our findings reveal considerable heterogeneity across all examined outcomes, with many I² values approaching 1. Despite efforts to minimize this through meticulous data extraction and analysis, the persistence of heterogeneity underscores the importance of cautiously applying the results to specific disease populations. For example, Bonamy et al. reported low BPD rates because they classified the condition only in individuals with the severe form of the disease. In an attempt to mitigate the observed heterogeneity, we conducted a sensitivity analysis, which yielded rates comparable to those obtained in the initial analysis characterized by high heterogeneity.
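To make the heterogeneity statistic concrete, the following is a minimal sketch of how Cochran's Q and I² are commonly computed from study-level estimates and their within-study variances. It is an illustration only and not the analysis code used in this study; the function name and the example numbers are hypothetical.

```python
def cochran_q_and_i2(estimates, variances):
    """Return (Q, I2) for study estimates and their within-study variances.

    I2 is returned as a proportion between 0 and 1 (not a percentage).
    """
    # Inverse-variance weights: more precise studies count for more.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I2 = (Q - df) / Q, truncated at zero.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2


# Hypothetical example: three studies with very different prevalence estimates
# and small variances, which gives an I2 close to 1.
q, i2 = cochran_q_and_i2([0.10, 0.30, 0.50], [0.001, 0.001, 0.001])
print(f"Q = {q:.1f}, I2 = {i2:.3f}")
```

An I² near 1 indicates that almost all of the observed variation between studies reflects real differences rather than sampling error, which is why pooled rates in that situation should be applied to specific populations with caution.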

Another constraint stems from the limited granularity of the original datasets, owing to the diverse definitions of BPD and the myriad ways in which data can be presented. This limitation restricts our ability to conduct more sophisticated statistical analyses and may lead to unequal weighting of studies where data accessibility varies. Additionally, there is a notable disparity in the amount of data available for some regions, notably North America, Oceania, and Europe, compared to other global populations. It would have been ideal to gather data as comprehensive as that publicly available from Australia and New Zealand, Canada, and Japan. Moreover, handling mortality data was a significant challenge in our analysis. Some studies included only survivors in their denominator for BPD, while others encompassed all patients regardless of neonatal mortality, or reported mortality rates without adjusting their BPD rates accordingly. As a result, some observed BPD rates may have been exceptionally low, especially where mortality rates were high, and we were unable to restrict our analysis to survivors. Adapting our analysis to account for this disparity was not possible without access to patient-level data. To address this limitation, we included mortality rates in the supplementary materials. This allows for transparency regarding the impact of mortality on our findings and provides additional context for interpreting the results. While we hypothesized differences in pathophysiology as a possible cause for national differences, it is essential to acknowledge other potential factors that may influence BPD rates, such as variations in reporting practices, gestational age and birth weight distributions, and early mortality rates. These factors could contribute to the observed regional differences in BPD rates and warrant further investigation. Also, differences in the sophistication of medical treatment across regions impact survival and the eventual diagnosis of BPD, all of which affect overall outcomes and generalizability.


To conclude, this large systematic review and meta-analysis shows that despite advancements, the prevalence of bronchopulmonary dysplasia has remained consistent through decades and is a significant burden across populations. The data generated from this study could serve as baseline rates for future research and could help guide the development of bundled care strategies aimed at decreasing BPD rates [ 21 ]. Ultimately, a greater understanding of modifiable factors that contribute to BPD development is critical to improving outcomes and reducing the burden of this disease.

Data availability

All data generated or analyzed during this study are included in this published article and its supplementary information.


BPD: bronchopulmonary dysplasia

GDP: gross domestic product

ELGA: extremely low gestational age

VLGA: very low gestational age

ELBW: extremely low birth weight

VLBW: very low birth weight

CI: confidence interval

NRN: Neonatal Research Network

Jobe AH. The new bronchopulmonary dysplasia. Curr Opin Pediatr. 2011. https://doi.org/10.1097/MOP.0b013e3283423e6b .

Doyle LW, Carse E, Adams A-M, Ranganathan S, Opie G, Cheong JLY. Ventilation in extremely Preterm infants and respiratory function at 8 years. N Engl J Med. 2017;377(4):329–37. https://doi.org/10.1056/nejmoa1700827 .

Siffel C, Kistler KD, Lewis JFM, Sarda SP. Global prevalence of bronchopulmonary dysplasia among extremely preterm infants: a systematic literature review. J Matern Neonatal Med. 2021;34(11):1721–31. https://doi.org/10.1080/14767058.2019.1646240 .

Jobe AH, Bancalari E. NICHD / NHLBI / ORD workshop Summary. Am J Respir Crit Care Med. 2001;163:1723–9.

Higgins RD, Jobe AH, Koso-Thomas M, et al. Bronchopulmonary dysplasia: executive summary of a workshop. J Pediatr. 2018;197:300–8. https://doi.org/10.1016/j.jpeds.2018.01.043 .

Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Intern Med. 2009;151(4):264–9.

Stoll BJ, Hansen NI, Bell EF, et al. Trends in Care practices, Morbidity, and mortality of extremely Preterm neonates, 1993–2012. JAMA. 2015;314(10):1039–51. https://doi.org/10.1001/jama.2015.10244 .

Bancalari E, Jain D. Bronchopulmonary dysplasia: 50 years after the original description. Neonatology. 2019;115(4):384–91. https://doi.org/10.1159/000497422 .

Barendregt JJ, Doi SA, Lee YY, Norman RE, Vos T. Meta-analysis of prevalence. J Epidemiol Community Health. 2013;67(11):974–8.

Thébaud B, Goss KN, Laughon M, et al. Bronchopulmonary dysplasia. Nat Rev Dis Prim. 2019;5(1). https://doi.org/10.1038/s41572-019-0127-7 .

Gibbs K, Jensen EA, Alexiou S, Munson D, Zhang H. Ventilation strategies in severe bronchopulmonary dysplasia. Neoreviews. 2020;21(4):e226–37. https://doi.org/10.1542/NEO.21-4-E226 .

Cochrane database Syst Rev . 2021;10(10):CD001146. doi:10.1002/14651858.CD001146.pub6.

Wu KY, Jensen EA, White AM, et al. Characterization of Disease phenotype in very preterm infants with severe bronchopulmonary dysplasia. Am J Respir Crit Care Med. 2020;201(11):1398–406. https://doi.org/10.1164/RCCM.201907-1342OC .

Bamat NA, Zhang H, McKenna KJ, Morris H, Stoller JZ, Gibbs K. The clinical evaluation of severe bronchopulmonary dysplasia. Neoreviews. 2020;21(7):e442–53. https://doi.org/10.1542/NEO.21-7-E442 .

Ibrahim J, Bhandari V. The definition of bronchopulmonary dysplasia: an evolving dilemma. Pediatr Res. 2018;84(5):586–8. https://doi.org/10.1038/s41390-018-0167-9 .

Jensen EA, Schmidt B. Epidemiology of bronchopulmonary dysplasia. Birth Defects Res Part Clin Mol Teratol. 2014;100(3):145–57. https://doi.org/10.1002/BDRA.23235 .

Bell EF, Hintz SR, Hansen NI, et al. Mortality, In-Hospital morbidity, Care practices, and 2-Year outcomes for extremely Preterm infants in the US, 2013–2018. JAMA. 2022;327(3):248–63. https://doi.org/10.1001/jama.2021.23580 .

Twilhaar ES, Wade RM, De Kieviet JF, Van Goudoever JB, Van Elburg RM, Oosterlaan J. Cognitive outcomes of children born extremely or very Preterm since the 1990s and Associated Risk factors: a Meta-analysis and Meta-regression. JAMA Pediatr. 2018;172(4):361–7. https://doi.org/10.1001/JAMAPEDIATRICS.2017.5323 .

Jain D, Bancalari E. Bronchopulmonary dysplasia: clinical perspective. Birth Defects Res Clin Mol Teratol. 2014;100(3):134–44. https://doi.org/10.1002/bdra.23229 .

Cao Y, Jiang S, Sun J, et al. Assessment of neonatal Intensive Care Unit practices, Morbidity, and Mortality among very Preterm infants in China. JAMA Netw Open. 2021;4(8):e2118904. https://doi.org/10.1001/jamanetworkopen.2021.18904 . Published 2021 Aug 2.

Villosis MFB, Barseghyan K, Ambat MT, Rezaie KK, Braun D. Rates of Bronchopulmonary Dysplasia following implementation of a Novel Prevention Bundle. JAMA Netw Open. 2021;4(6):e2114140. https://doi.org/10.1001/jamanetworkopen.2021.14140 . Published 2021 Jun 1.


Not applicable.

AM reports a grant from the Parker B. Francis Foundation, National Institutes of Health (NIH) Eunice Kennedy Shriver National Institute of Child Health and Human Development K23HD101701 and a research grant from NIH National Heart, Lung, and Blood Institute 2R25-HL126140, outside the submitted work. All other authors declare no competing interests.

Author information

Michelle Noronha and Jooby Joy co-1st authors.

Authors and Affiliations

Department of Pediatrics, Division of Neonatology, University of Texas Health Science Center at San Antonio, San Antonio, TX, 78229-3900, USA

Alvaro Moreira, Michelle Noronha, Noah Bierwirth, Aina Tarriela, Aliha Naqvi, Maxwell Jones, Ali Marotta, Taylor Valadie, Jonathan Brick, Caitlyn Winter, Melissa Porter & Isabelle Decker

University of Texas Rio Grande Valley School of Medicine, Edinburg, TX, USA

Department of Pediatrics, University of Texas Southwestern, Dallas, TX, USA

Sarah Zoretic

Department of Pediatrics, Lund University, Lund, Sweden

Matteo Bruschettini

Veterans Administration Research Center for AIDS and HIV-1 Infection and Center for Personalized Medicine, South Texas Veterans Health Care System, San Antonio, TX, USA

Sunil K. Ahuja

Veterans Administration Center for Personalized Medicine, South Texas Veterans Health Care System, San Antonio, TX, USA

The Foundation for Advancing Veterans’ Health Research, South Texas Veterans Health Care System, San Antonio, TX, USA

Department of Microbiology, Immunology & Molecular Genetics, University of Texas Health Science Center at San Antonio, San Antonio, TX, USA

Department of Medicine, University of Texas Health Science Center at San Antonio, San Antonio, TX, USA

Department of Biochemistry and Structural Biology, University of Texas Health Science Center at San Antonio, San Antonio, TX, USA


AM collected data, assessed risk of bias, performed statistical analysis, and was a major contributor in writing the manuscript. MN collected data, assessed risk of bias and was a major contributor in writing the manuscript. JJ collected data and was a major contributor in writing the manuscript. NB was a major contributor in writing the manuscript. AT collected data and assessed risk of bias. AN collected data and assessed risk of bias. SZ verified all data collection. MJ collected data. AM collected data. TV reviewed and critiqued manuscript writing. JB reviewed and critiqued manuscript writing. CW was a major contributor in writing the manuscript and reviewed and critiqued manuscript writing. MP collected data and critiqued manuscript writing. ID reviewed and critiqued manuscript writing. MB verified risk of bias and reviewed and critiqued manuscript writing. SA oversaw the project and reviewed and critiqued manuscript writing. All authors approved the final version of the manuscript.

Corresponding author

Correspondence to Alvaro Moreira.

Ethics declarations

Ethics approval and consent to participate, consent for publication, role of sponsor.

The funders of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the manuscript.

Originality of content

All information and materials in the manuscript are original.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Materials 2–10

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Moreira, A., Noronha, M., Joy, J. et al. Rates of bronchopulmonary dysplasia in very low birth weight neonates: a systematic review and meta-analysis. Respir Res 25, 219 (2024). https://doi.org/10.1186/s12931-024-02850-x

Received: 21 December 2023

Accepted: 14 May 2024

Published: 24 May 2024

DOI: https://doi.org/10.1186/s12931-024-02850-x

  • Bronchopulmonary dysplasia
  • Chronic lung disease
  • Meta-analysis

Respiratory Research

ISSN: 1465-993X

Case Western Reserve University

Nursing Research News: May 2024

Each month, the Center for Research and Scholarship at the Frances Payne Bolton School of Nursing sends an internal research newsletter to faculty, staff, students and researchers. A recap is posted here.

Message from the Associate Dean for Research

Last summer, Celeste Alfes, assistant dean for academic affairs, and I sponsored a 2-day workshop/retreat for FPB faculty and postdoctoral researchers to enable the participants to focus on research and academic manuscripts they had begun or had been meaning to write, but had not found the time to start or complete during previous semesters.

A number of faculty, including postdocs and a VA Quality Scholar, participated and found the event last July to be helpful, informative, and productive.

Publication is key to advancing your career to a tenure-track position. We understand that it can be challenging to find the time and a quiet place to focus on transferring your knowledge and experience to the written word for publication.

This summer, we’ve split our workshop/retreat into two single-day events to choose from, though you are certainly welcome to attend both. We’ll provide you with a large, comfortable conference room in the Samson Pavilion at the Health Education Campus on Friday, June 28, and Friday, July 19, to join with your colleagues to conquer “writer’s block” and procrastination on that unfinished/unwritten manuscript you’ve wanted to get ready for submission to a journal.

  • On June 28, Celeste Alfes, who has published a number of journal articles and co-authored a book, will lead a discussion on two topics: “Publishing Your DNP Project,” and “Tackling Peer Reviewers’ Comments.”
  • On July 19, Daniela Solomon, research librarian at the Kelvin Smith Library, and also a published book author, will return to the HEC to address (1) Publishing in interdisciplinary journals: weighing impact factor and visibility in the journal selection process; and (2) Finding journals that report on American Association of Colleges of Nursing (AACN) essentials and competency-based education with simulation in nursing.

Daniela’s presentation will be followed by a special screening of a previously recorded talk, “How to Get Your Book Published: Tips, Tricks, and Best Practices in Publishing,” presented by Justin Race, director of acquisitions, and Tara Saunders, production editor, at Purdue University Press.

For the remainder of these 2 separate days in June and July, participants will be free to stay in the workshop conference room to write, or come and go as they please, but the room will be available to all of them from 8:30 a.m. until 5 p.m. on both days. Matt McManus will be on hand to facilitate. After the retreats, when you have a draft completed for submission to a nursing or healthcare publication, he can schedule a thorough edit of your paper.

These free writing workshop/retreats will be held on Fridays, June 28 and July 19, in Samson Pavilion room 139, from 8:30 a.m. to 5 p.m. both days.

If you would like to register for this event, simply contact Matt McManus at [email protected] and he’ll get back to you with more details.

Ronald Hickman, PhD, RN, ACNP-BC, FNAP, FAAN, Associate Dean for Research

Scholarship Awards & Grants

Nephrology Researcher Award

Christine Horvat Davey, assistant professor, was recently honored with the 2024 Nephrology Nurse Researcher Award by the American Nephrology Nurse Association.

Sigma Grant Recipients

Rayhanah R. Almutairi and Shemaine Martin, students at the Frances Payne Bolton School of Nursing, each received $1,000 research grants from the Alpha Mu Chapter of Sigma Theta Tau International (STTI) Honor Society of Nursing. Almutairi was awarded a grant for her proposal, “Depressive Symptoms, Sleep Quality, and Resourcefulness in Family Caregivers of People with Dementia,” and Martin’s grant is for “A Phenomenological Inquiry into Stigma and Sickle Cell Disease in the African American Community.”

Allocation of these STTI funds is based on the quality of the proposed research, the future promise of the applicant, and the applicant’s research budget. Applications from novice researchers who have received no other national research funds are encouraged.

NIH News & Updates

Fellowship Review and Application Revised

The National Institutes of Health (NIH) recently announced revisions to the NIH fellowship review and application process. Details of changes to be implemented for applications submitted for due dates on or after January 25, 2025, can be found on this webpage.

NIH will be hosting a webinar, “Updates to NIH Training Grant Applications,” on June 5, 2024. More information and registration details for this webinar can be found here.

RPPR Update

NIH will be updating its Research Performance Progress Report (RPPR) instructions to address the NIH Data Management and Sharing Policy. NIH plans to implement new questions about updates on the status of data sharing and repositories and unique identifiers for data that have been shared for RPPRs submitted on or after October 1, 2024.

Recent School of Nursing Publications

Aaron, S., DeSimio, S., & Zhang, A. Y. (2024). Behavioral interventions for managing lower urinary tract symptoms in men: A literature review from 2018 to 2024. Andrology, 13(3).

Adhiambo, H. F., Cook, P., Erlandson, K. M., Jankowski, C., Oliveira, V. H. F., Do, H., Khuu, V., Horvat Davey, C., & Webel, A. R. (2024). Qualitative Description of Exercise Perceptions and Experiences Among People With Human Immunodeficiency Virus in the High-Intensity Exercise to Attenuate Limitations and Train Habits Study. Journal of Cardiovascular Nursing. 

Gillani, B., Prince, D. M., Ray-Novak, M., Feerasta, G., Jones, D., Mintz, L. J., & Moore, S. E. (2024). Mapping the Dynamic Complexity of Sexual and Gender Minority Healthcare Disparities: A Systems Thinking Approach. Healthcare, 12(4), 424. 

Johnson, C. R., Barto, L., Worley, S., Rothstein, R., & Wenzell, M. L. (2024). Follow-up of Telehealth Parent Training for Sleep Disturbances in Young Children with Autism Spectrum Disorder. Sleep Medicine. Advance online publication.

Moore, S. E., Davey, C. H., Morgan, M., & Webel, A. (2024). Symptoms, Lifetime Duration of Estrogen Exposure, and Ovarian Reserve Among Women Living With HIV: A Cross-Sectional Observational Study. Journal of the Association of Nurses in AIDS Care, 35(3), 264-280.

Melnyk, B. M. and Click, E. R. (2024). Creating and sustaining wellness cultures for faculty, staff, and students to thrive. Higher Education Today, American Council on Education, May 13, 2024. 

Narendrula, A., Brinza, E., Horvat Davey, C., Longenecker, C. T., & Webel, A. R. (2024). The Relationship Between Objectively Measured Physical Activity and Subclinical Cardiovascular Disease: A Systematic Review. BMJ Open Sport & Exercise Medicine, 10(1).

Scahill, L., Lecavalier, L., Edwards, M. C., Wenzell, M. L., Barto, L., Mulligan, A., Williams, A., Ousley, O., Sinha, C., Taylor, C., Kim, S. Y., Johnson, L., Gillespie, S., & Johnson, C. R. (in press). Toward better outcome measurement for insomnia in children with autism spectrum disorder. Autism.

St Marie, B. J., & Bernhofer, E. I. (2024). Ethical considerations for nurse practitioners conducting research in populations with opioid use disorder. Journal of the American Association of Nurse Practitioners. Advance online publication.  

Vangone, I., Arrigoni, C., Magon, A., Conte, G., Russo, S., Belloni, S., Stievano, A., Alfes, C. M., & Caruso, R. (2024). The efficacy of high-fidelity simulation on knowledge and performance in undergraduate nursing students: An umbrella review of systematic reviews and meta-analysis. Nurse Education Today, 139. 

Wenzell, M. L., Burant, C., & Zauszniewski, J. A. (in press). Evaluating the Need for Resourcefulness Training for Parents of Young Children with Autism Spectrum Disorder. Pediatric Nursing.

  • Hippokratia
  • v.14(Suppl 1); 2010 Dec

Meta-analysis in medical research

The objectives of this paper are to provide an introduction to meta-analysis and to discuss the rationale for this type of research and other general considerations. Methods used to produce a rigorous meta-analysis are highlighted and some aspects of presentation and interpretation of meta-analysis are discussed.

Meta-analysis is a quantitative, formal, epidemiological study design used to systematically assess previous research studies to derive conclusions about that body of research. Outcomes from a meta-analysis may include a more precise estimate of the effect of treatment or risk factor for disease, or other outcomes, than any individual study contributing to the pooled analysis. The examination of variability or heterogeneity in study results is also a critical outcome. The benefits of meta-analysis include a consolidated and quantitative review of a large, and often complex, sometimes apparently conflicting, body of literature. The specification of the outcome and hypotheses that are tested is critical to the conduct of meta-analyses, as is a sensitive literature search. A failure to identify the majority of existing studies can lead to erroneous conclusions; however, there are methods of examining data to identify the potential for studies to be missing; for example, by the use of funnel plots. Rigorously conducted meta-analyses are useful tools in evidence-based medicine. The need to integrate findings from many studies ensures that meta-analytic research is desirable and the large body of research now generated makes the conduct of this research feasible.

Important medical questions are typically studied more than once, often by different research teams in different locations. In many instances, the results of these multiple small studies of an issue are diverse and conflicting, which makes clinical decision-making difficult. The need to arrive at decisions affecting clinical practice fostered the momentum toward "evidence-based medicine" [1–2]. Evidence-based medicine may be defined as the systematic, quantitative, preferentially experimental approach to obtaining and using medical information. Therefore, meta-analysis, a statistical procedure that integrates the results of several independent studies, plays a central role in evidence-based medicine. In fact, in the hierarchy of evidence (Figure 1), where clinical evidence is ranked according to its freedom from the various biases that beset medical research, meta-analyses are at the top. In contrast, animal research, laboratory studies, case series and case reports have little clinical value as proof and are therefore at the bottom.

Figure 1. Hierarchy of evidence.

Meta-analysis did not begin to appear regularly in the medical literature until the late 1970s, but since then a plethora of meta-analyses have emerged and their number has grown exponentially over time (Figure 2) [3]. Moreover, it has been shown that meta-analyses are the most frequently cited form of clinical research [4]. The merits and perils of the somewhat mysterious procedure of meta-analysis, however, continue to be debated in the medical community [5–8]. The objectives of this paper are to introduce meta-analysis and to discuss the rationale for this type of research and other general considerations.

Figure 2. Growth in the number of published meta-analyses over time.

Meta-Analysis and Systematic Review

Glass first defined meta-analysis in the social science literature as "The statistical analysis of a large collection of analysis results from individual studies for the purpose of integrating the findings" [9]. Meta-analysis is a quantitative, formal, epidemiological study design used to systematically assess the results of previous research to derive conclusions about that body of research. Typically, but not necessarily, the study is based on randomized, controlled clinical trials. Outcomes from a meta-analysis may include a more precise estimate of the effect of treatment or risk factor for disease, or other outcomes, than any individual study contributing to the pooled analysis. Identifying sources of variation in responses (that is, examining the heterogeneity of a group of studies) and the generalizability of responses can lead to more effective treatments or modifications of management. Examination of heterogeneity is perhaps the most important task in meta-analysis. The Cochrane collaboration has been a long-standing, rigorous, and innovative leader in developing methods in the field [10]. Major contributions include the development of protocols that provide structure for literature search methods, and new and extended analytic and diagnostic methods for evaluating the output of meta-analyses. Use of the methods outlined in the Cochrane handbook should provide a consistent approach to the conduct of meta-analysis. Moreover, a useful guide to improve reporting of systematic reviews and meta-analyses is the PRISMA (Preferred Reporting Items for Systematic reviews and Meta-analyses) statement that replaced the QUOROM (QUality Of Reporting of Meta-analyses) statement [11–13].

Meta-analyses are a subset of systematic reviews. A systematic review attempts to collate empirical evidence that fits prespecified eligibility criteria to answer a specific research question. The key characteristics of a systematic review are a clearly stated set of objectives with predefined eligibility criteria for studies; an explicit, reproducible methodology; a systematic search that attempts to identify all studies that meet the eligibility criteria; an assessment of the validity of the findings of the included studies (e.g., through the assessment of risk of bias); and a systematic presentation and synthesis of the attributes and findings from the studies used. Systematic methods are used to minimize bias, thus providing more reliable findings from which conclusions can be drawn and decisions made than traditional review methods [14, 15]. Systematic reviews need not contain a meta-analysis; there are times when it is not appropriate or possible. However, many systematic reviews contain meta-analyses [16].

The inclusion of observational medical studies in meta-analyses led to considerable debate over the validity of meta-analytical approaches, as there was necessarily a concern that the observational studies were likely to be subject to unidentified sources of confounding and risk modification [17]. Pooling such findings may not lead to more certain outcomes. Moreover, an empirical study showed that in meta-analyses where both randomized and non-randomized studies were included, non-randomized studies tended to show larger treatment effects [18].

Meta-analyses are conducted to assess the strength of evidence present on a disease and treatment. One aim is to determine whether an effect exists; another aim is to determine whether the effect is positive or negative and, ideally, to obtain a single summary estimate of the effect. The results of a meta-analysis can improve precision of estimates of effect, answer questions not posed by the individual studies, settle controversies arising from apparently conflicting studies, and generate new hypotheses. In particular, the examination of heterogeneity is vital to the development of new hypotheses.

Individual or Aggregated Data

The majority of meta-analyses are based on a series of studies to produce a point estimate of an effect and measures of the precision of that estimate. However, methods have been developed for meta-analyses to be conducted on data obtained from original trials 19 , 20 . This approach may be considered the "gold standard" in meta-analysis because it offers advantages over analyses using aggregated data, including a greater ability to validate the quality of data and to conduct appropriate statistical analysis. Further, it is easier to explore differences in effect across subgroups within the study population than with aggregated data. The use of standardized individual-level information may help to avoid the problems encountered in meta-analyses of prognostic factors 21 , 22 . It is the best way to obtain a more global picture of the natural history and predictors of risk for major outcomes, such as in scleroderma 23 – 26 . This approach relies on cooperation between researchers who conducted the relevant studies. Researchers who are aware of the potential to contribute or conduct these studies will provide and obtain additional benefits by careful maintenance of original databases and making these available for future studies.

Literature Search

A sound meta-analysis is characterized by a thorough and disciplined literature search. A clear definition of hypotheses to be investigated provides the framework for such an investigation. According to the PRISMA statement, an explicit statement of questions being addressed with reference to participants, interventions, comparisons, outcomes and study design (PICOS) should be provided 11 , 12 . It is important to obtain all relevant studies, because loss of studies can lead to bias in the study. Typically, published papers and abstracts are identified by a computerized literature search of electronic databases that can include PubMed ( www.ncbi.nlm.nih.gov./entrez/query.fcgi ), ScienceDirect ( www.sciencedirect.com ), Scirus ( www.scirus.com/srsapp ), ISI Web of Knowledge ( http://www.isiwebofknowledge.com ), Google Scholar ( http://scholar.google.com ) and CENTRAL (Cochrane Central Register of Controlled Trials, http://www.mrw.interscience.wiley.com/cochrane/cochrane_clcentral_articles_fs.htm ). The PRISMA statement recommends that a full electronic search strategy for at least one major database be presented 12 . Database searches should be augmented with hand searches of library resources for relevant papers, books, abstracts, and conference proceedings. Crosschecking of references, citations in review papers, and communication with scientists who have been working in the relevant field are important methods used to provide a comprehensive search. Communication with pharmaceutical companies manufacturing and distributing test products can be appropriate for studies examining the use of pharmaceutical interventions.

It is not feasible to find absolutely every relevant study on a subject. Some or even many studies may not be published, and those that are might not be indexed in computer-searchable databases. Useful sources for unpublished trials are the clinical trials registers, such as the National Library of Medicine's ClinicalTrials.gov Website. The reviews should attempt to be sensitive; that is, find as many studies as possible, to minimize bias and be efficient. It may be appropriate to frame a hypothesis that considers the time over which a study is conducted or to target a particular subpopulation. The decision whether to include unpublished studies is difficult. Although language of publication can provide a difficulty, it is important to overcome this difficulty, provided that the populations studied are relevant to the hypothesis being tested.

Inclusion or Exclusion Criteria and Potential for Bias

Studies are chosen for meta-analysis based on inclusion criteria. If there is more than one hypothesis to be tested, separate selection criteria should be defined for each hypothesis. Inclusion criteria are ideally defined at the stage of initial development of the study protocol. The rationale for the criteria for study selection used should be clearly stated.

One important potential source of bias in meta-analysis is the loss of trials and subjects. Ideally, all randomized subjects in all studies satisfy all of the trial selection criteria, comply with all the trial procedures, and provide complete data. Under these conditions, an "intention-to-treat" analysis is straightforward to implement; that is, statistical analysis is conducted on all subjects that are enrolled in a study rather than only those that complete all stages of the study considered desirable. Some empirical studies have shown that certain methodological characteristics, such as poor concealment of treatment allocation or no blinding in studies, exaggerate treatment effects 27 . Therefore, it is important to critically appraise the quality of studies in order to assess the risk of bias.

The study design, including details of the method of randomization of subjects to treatment groups, criteria for eligibility in the study, blinding, method of assessing the outcome, and handling of protocol deviations are important features defining study quality. When studies are excluded from a meta-analysis, reasons for exclusion should be provided for each excluded study. Usually, more than one assessor decides independently which studies to include or exclude, together with a well-defined checklist and a procedure that is followed when the assessors disagree. Two people familiar with the study topic perform the quality assessment for each study, independently. This is followed by a consensus meeting to discuss the studies excluded or included. Practically, the blinding of reviewers from details of a study such as authorship and journal source is difficult.

Before assessing study quality, a quality assessment protocol and data forms should be developed. The goal of this process is to reduce the risk of bias in the estimate of effect. Quality scores that summarize multiple components into a single number exist but are misleading and unhelpful 28 . Rather, investigators should use individual components of quality assessment and describe trials that do not meet the specified quality standards and probably assess the effect on the overall results by excluding them, as part of the sensitivity analyses.

Further, not all studies are completed, because of protocol failure, treatment failure, or other factors. Nonetheless, missing subjects and studies can provide important evidence. It is desirable to obtain data from all relevant randomized trials, so that the most appropriate analysis can be undertaken. Previous studies have discussed the significance of missing trials to the interpretation of intervention studies in medicine 29 , 30 . Journal editors and reviewers need to be aware of the existing bias toward publishing positive findings and ensure that papers reporting negative or even failed trials are published, as long as these meet the quality guidelines for publication.

There are occasions when authors of the selected papers have chosen different outcome criteria for their main analysis. In practice, it may be necessary to revise the inclusion criteria for a meta-analysis after reviewing all of the studies found through the search strategy. Variation in studies reflects the type of study design used, type and application of experimental and control therapies, whether or not the study was published, and, if published, subjected to peer review, and the definition used for the outcome of interest. There are no standardized criteria for inclusion of studies in meta-analysis. Universal criteria are not appropriate, however, because meta-analysis can be applied to a broad spectrum of topics. Published data in journal papers should also be cross-checked with conference papers to avoid repetition in presented data.

Clearly, unpublished studies are not found by searching the literature. It is possible that published studies are systemically different from unpublished studies; for example, positive trial findings may be more likely to be published. Therefore, a meta-analysis based on literature search results alone may lead to publication bias.

Efforts to minimize this potential bias include working from the references in published studies, searching computerized databases of unpublished material, and investigating other sources of information including conference proceedings, graduate dissertations and clinical trial registers.

Statistical analysis

The most common measures of effect used for dichotomous data are the risk ratio (also called relative risk) and the odds ratio. The dominant method used for continuous data is standardized mean difference (SMD) estimation. Methods used in meta-analysis for post hoc analysis of findings are relatively specific to meta-analysis and include heterogeneity analysis, sensitivity analysis, and evaluation of publication bias.
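As an illustration of these effect measures, the following minimal sketch computes a risk ratio, an odds ratio and a standardized mean difference from summary data. The function names and the example counts are hypothetical, the standard errors use the usual large-sample formulas on the log scale, and the SMD shown is the simple pooled-standard-deviation version without small-sample correction.

```python
import math

def risk_ratio(a, b, c, d):
    """Risk ratio and standard error of its logarithm for a 2x2 table.

    a, b: events / non-events in the treatment group
    c, d: events / non-events in the control group
    """
    rr = (a / (a + b)) / (c / (c + d))
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return rr, se_log_rr

def odds_ratio(a, b, c, d):
    """Odds ratio and standard error of its logarithm for the same table."""
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return or_, se_log_or

def standardized_mean_difference(m1, s1, n1, m2, s2, n2):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Hypothetical trial: 10/100 events on treatment, 20/100 events on control.
rr, se_rr = risk_ratio(10, 90, 20, 80)
or_, se_or = odds_ratio(10, 90, 20, 80)
# Hypothetical continuous outcome: means 10 and 8, SD 4, 20 subjects per arm.
smd = standardized_mean_difference(10.0, 4.0, 20, 8.0, 4.0, 20)
```

Ratio measures are usually pooled on the log scale, which is why the standard errors are returned for the logarithm of each estimate.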

All methods used should allow for the weighting of studies. The concept of weighting reflects the value of the evidence of any particular study. Usually, studies are weighted according to the inverse of their variance 31 . It is important to recognize that smaller studies, therefore, usually contribute less to the estimates of overall effect. However, well-conducted studies with tight control of measurement variation and sources of confounding contribute more to estimates of overall effect than a study of identical size less well conducted.
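The inverse-variance weighting described above can be sketched as follows. The function name and the two example studies are hypothetical; the sketch assumes each study supplies an effect estimate and its variance on a common scale.

```python
import math

def inverse_variance_pool(effects, variances):
    """Pool study effects using weights equal to 1 / variance."""
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    return pooled, se, weights

# Hypothetical example: a small imprecise study and a larger precise one.
effects = [0.2, 0.5]
variances = [0.04, 0.01]
pooled, se, weights = inverse_variance_pool(effects, variances)
# The precise study carries four times the weight of the imprecise one,
# so the pooled estimate sits closer to 0.5 than to 0.2.
```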

One of the foremost decisions to be made when conducting a meta-analysis is whether to use a fixed-effects or a random-effects model. A fixed-effects model is based on the assumption that the sole source of variation in observed outcomes is that occurring within the study; that is, the effect expected from each study is the same. Consequently, it is assumed that the models are homogeneous; there are no differences in the underlying study population, no differences in subject selection criteria, and treatments are applied the same way 32 . Fixed-effect methods used for dichotomous data most often include the Mantel-Haenszel method 33 and the Peto method 34 (only for odds ratios).

Random-effects models have an underlying assumption that a distribution of effects exists, resulting in heterogeneity among study results; the variance of this distribution is known as τ². As software has improved, random-effects models that require greater computing power have become more frequently conducted. This is desirable because the strong assumption that the effect of interest is the same in all studies is frequently untenable. Moreover, the fixed-effects model is not appropriate when statistical heterogeneity (τ²) is present in the results of studies in the meta-analysis. In the random-effects model, studies are weighted with the inverse of the sum of their variance and the heterogeneity parameter. Therefore, it is usually a more conservative approach with wider confidence intervals than the fixed-effects model, where the studies are weighted only with the inverse of their variance. The most commonly used random-effects method is the DerSimonian and Laird method 35 . Furthermore, it is suggested that the fixed-effects and random-effects models be compared, as this process can yield insights into the data 36 .
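A minimal sketch of the DerSimonian and Laird moment estimator may help to show how τ² enters the weights. The function name and the three example studies are hypothetical, and the code assumes effect estimates with known within-study variances.

```python
import math

def dersimonian_laird(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird estimate of tau^2."""
    k = len(effects)
    if k < 2:
        raise ValueError("at least two studies are needed")
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return {
        "tau2": tau2,
        "pooled": pooled,
        "se": math.sqrt(1.0 / sum(w_star)),
        "fixed": fixed,
        "fixed_se": math.sqrt(1.0 / sum_w),
    }

# Hypothetical studies with equal precision but clearly different effects.
result = dersimonian_laird([0.1, 0.3, 0.8], [0.01, 0.01, 0.01])
```

In this example the fixed-effects and random-effects point estimates coincide because the studies are equally precise, but the random-effects standard error is considerably larger, which illustrates the wider confidence intervals mentioned above.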


Arguably, the greatest benefit of conducting meta-analysis is to examine sources of heterogeneity, if present, among studies. If heterogeneity is present, the summary measure must be interpreted with caution 37 . When heterogeneity is present, one should question whether and how to generalize the results. Understanding sources of heterogeneity will lead to more effective targeting of prevention and treatment strategies and will result in new research topics being identified. Part of the strategy in conducting a meta-analysis is to identify factors that may be significant determinants of subpopulation analysis or covariates that may be appropriate to explore in all studies.

To understand the nature of variability in studies, it is important to distinguish between different sources of heterogeneity. Variability in the participants, interventions, and outcomes studied has been described as clinical diversity, and variability in study design and risk of bias has been described as methodological diversity 10 . Variability in the intervention effects being evaluated among the different studies is known as statistical heterogeneity and is a consequence of clinical or methodological diversity, or both, among the studies. Statistical heterogeneity manifests itself in the observed intervention effects varying by more than the differences expected among studies that would be attributable to random error alone. Usually, in the literature, statistical heterogeneity is simply referred to as heterogeneity.

Clinical variation will cause heterogeneity if the intervention effect is modified by the factors that vary across studies; most obviously, the specific interventions or participant characteristics that are often reflected in different levels of risk in the control group when the outcome is dichotomous. In other words, the true intervention effect will differ for different studies. Differences between studies in terms of methods used, such as use of blinding or differences between studies in the definition or measurement of outcomes, may lead to differences in observed effects. Significant statistical heterogeneity arising from differences in methods used or differences in outcome assessments suggests that the studies are not all estimating the same effect, but does not necessarily suggest that the true intervention effect varies. In particular, heterogeneity associated solely with methodological diversity indicates that studies suffer from different degrees of bias. Empirical evidence suggests that some aspects of design can affect the result of clinical trials, although this may not always be the case.

The scope of a meta-analysis will largely determine the extent to which studies included in a review are diverse. Meta-analysis should be conducted when a group of studies is sufficiently homogeneous in terms of subjects involved, interventions, and outcomes to provide a meaningful summary. However, it is often appropriate to take a broader perspective in a meta-analysis than in a single clinical trial. Combining studies that differ substantially in design and other factors can yield a meaningless summary result, but the evaluation of reasons for the heterogeneity among studies can be very insightful. It may be argued that these studies are of intrinsic interest on their own, even though it is not appropriate to produce a single summary estimate of effect.

Variation among k trials is usually assessed using Cochran's Q statistic, a chi-squared (χ²) test of heterogeneity with k-1 degrees of freedom. This test has relatively poor power to detect heterogeneity among small numbers of trials; consequently, an α-level of 0.10 is used to test hypotheses 38 , 39 .

Heterogeneity of results among trials is better quantified using the inconsistency index I², which describes the percentage of total variation across studies 40 . Uncertainty intervals for I² (dependent on Q and k) are calculated using the method described by Higgins and Thompson 41 . Negative values of I² are set equal to zero; consequently, I² lies between 0 and 100%. A value >75% may be considered substantial heterogeneity 41 . This statistic is less influenced by the number of trials compared with other methods used to estimate the heterogeneity and provides a logical and readily interpretable metric, but it still can be unstable when only a few studies are combined 42 .
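The two statistics can be computed directly from study effects and variances. The sketch below uses hypothetical function names and example data; it computes Q around the inverse-variance pooled estimate and derives I² as (Q - df) / Q, truncated at zero. The p-value of Q would require a chi-squared distribution function and is left out.

```python
def cochran_q(effects, variances):
    """Cochran's Q: weighted squared deviations from the pooled estimate."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))

def i_squared(q, k):
    """Inconsistency index in percent, with negative values set to zero."""
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

# Hypothetical heterogeneous set of three studies.
q_het = cochran_q([0.1, 0.3, 0.8], [0.01, 0.01, 0.01])
i2_het = i_squared(q_het, 3)

# Hypothetical homogeneous set: identical effects give Q = 0 and I^2 = 0.
q_hom = cochran_q([0.4, 0.4, 0.4], [0.01, 0.02, 0.03])
i2_hom = i_squared(q_hom, 3)
```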

Given that there are several potential sources of heterogeneity in the data, several steps should be considered in the investigation of the causes. Although random-effects models are appropriate, it may still be very desirable to examine the data to identify sources of heterogeneity and to take steps to produce models that have a lower level of heterogeneity, if appropriate. Further, if the studies examined are highly heterogeneous, it may not be appropriate to present an overall summary estimate, even when random-effects models are used. As Petitti notes 43 , statistical analysis alone will not make contradictory studies agree; critically, however, one should use common sense in decision-making. Despite heterogeneity in responses, if all studies had a positive point direction and the pooled confidence interval did not include zero, it would not be logical to conclude that there was not a positive effect, provided that sufficient studies and subject numbers were present. The appropriateness of the point estimate of the effect is much more in question.

Two of the ways to investigate the reasons for heterogeneity are subgroup analysis and meta-regression. The subgroup analysis approach, a variation on those described above, groups categories of subjects (e.g., by age, sex) to compare effect sizes. The meta-regression approach uses regression analysis to determine the influence of selected variables (the independent variables) on the effect size (the dependent variable). In a meta-regression, studies are regarded as if they were individual patients, but their effects are properly weighted to account for their different variances 44 .
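To make the weighting in meta-regression concrete, here is a minimal sketch of a one-covariate weighted least-squares fit. It is a simplification: the function name and data are hypothetical, weights are the inverse within-study variances only (a fixed-effect meta-regression), and no between-study variance is estimated.

```python
import math

def weighted_meta_regression(covariate, effects, variances):
    """Weighted least squares of effect size on one study-level covariate."""
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, covariate)) / sum_w
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, covariate))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, covariate, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # With known variances, the slope variance is 1 / weighted Sxx.
    se_slope = math.sqrt(1.0 / sxx)
    return intercept, slope, se_slope

# Hypothetical studies: mean age of participants versus effect size.
mean_age = [50.0, 60.0, 70.0]
effects = [0.1, 0.2, 0.3]
variances = [0.01, 0.02, 0.01]
intercept, slope, se_slope = weighted_meta_regression(mean_age, effects, variances)
```

Here each study contributes a single data point, so a covariate such as mean age describes the study as a whole rather than individual patients.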

Sensitivity analyses have also been used to examine the effects of studies identified as being aberrant concerning conduct or result, or being highly influential in the analysis. Recently, another method has been proposed that reduces the weight of studies that are outliers in meta-analyses 45 . All of these methods for examining heterogeneity have merit, and the variety of methods available reflects the importance of this activity.
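One simple form of sensitivity analysis is to repeat the pooling while omitting each study in turn. The sketch below assumes inverse-variance pooling and uses hypothetical names and data; it is not the outlier down-weighting method cited above.

```python
def pool(effects, variances):
    """Inverse-variance pooled estimate."""
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)

def leave_one_out(effects, variances):
    """Pooled estimate recomputed with each study omitted in turn."""
    results = []
    for i in range(len(effects)):
        kept_effects = effects[:i] + effects[i + 1:]
        kept_variances = variances[:i] + variances[i + 1:]
        results.append((i, pool(kept_effects, kept_variances)))
    return results

# Hypothetical data in which the third study is far from the other two.
effects = [0.1, 0.3, 0.8]
variances = [0.01, 0.01, 0.01]
overall = pool(effects, variances)
influence = leave_one_out(effects, variances)
# Omitting the third study moves the pooled estimate from 0.4 to 0.2,
# flagging it as highly influential.
```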

Presentation of results

A useful graph, presented in the PRISMA statement 11 , is the four-phase flow diagram ( Figure 3 ).

[Figure 3: four-phase PRISMA flow diagram]

This flow-diagram depicts the flow of information through the different phases of a systematic review or meta-analysis. It maps out the number of records identified, included and excluded, and the reasons for exclusions. The results of meta-analyses are often presented in a forest plot, where each study is shown with its effect size and the corresponding 95% confidence interval ( Figure 4 ).

[Figure 4: forest plot with pooled effect (left panel) and cumulative meta-analysis (right panel)]

The pooled effect and its 95% confidence interval are shown at the bottom, on the line labelled "Overall". In the right panel of Figure 4 , the cumulative meta-analysis is graphically displayed, where data are entered successively, typically in the order of their chronological appearance 46 , 47 . Such cumulative meta-analysis can retrospectively identify the point in time when a treatment effect first reached conventional levels of significance. Cumulative meta-analysis is a compelling way to examine trends in the evolution of the summary-effect size, and to assess the impact of a specific study on the overall conclusions 46 . The figure shows that many studies were performed long after cumulative meta-analysis would have shown a significant beneficial effect of antibiotic prophylaxis in colon surgery.
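The idea of cumulative meta-analysis can be sketched in a few lines. The example below assumes inverse-variance pooling and a normal-approximation 95% confidence interval; the function name, years, effects and variances are hypothetical and are not taken from Figure 4.

```python
import math

def cumulative_meta_analysis(studies):
    """Re-pool after adding each study in chronological order.

    studies: list of (year, effect, variance) tuples.
    Returns a list of (year, pooled, ci_low, ci_high).
    """
    rows = []
    sum_w = 0.0
    sum_wy = 0.0
    for year, effect, variance in sorted(studies, key=lambda s: s[0]):
        weight = 1.0 / variance
        sum_w += weight
        sum_wy += weight * effect
        pooled = sum_wy / sum_w
        se = math.sqrt(1.0 / sum_w)
        rows.append((year, pooled, pooled - 1.96 * se, pooled + 1.96 * se))
    return rows

def first_significant_year(rows):
    """First year at which the 95% interval excludes zero, or None."""
    for year, _, low, high in rows:
        if low > 0 or high < 0:
            return year
    return None

# Hypothetical studies, deliberately listed out of chronological order.
studies = [(1990, 0.2, 0.04), (1985, 0.5, 0.09), (1995, 0.4, 0.01)]
rows = cumulative_meta_analysis(studies)
year_significant = first_significant_year(rows)
```

In this made-up example the interval first excludes zero when the third study is added, which is the kind of turning point a cumulative plot is meant to reveal.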

Biases in meta-analysis

Although the intent of a meta-analysis is to find and assess all studies meeting the inclusion criteria, it is not always possible to obtain these. A critical concern is the papers that may have been missed. There is good reason to be concerned about this potential loss because studies with significant, positive results (positive studies) are more likely to be published and, in the case of interventions with a commercial value, to be promoted, than studies with non-significant or "negative" results (negative studies). Studies that produce a positive result, especially large studies, are more likely to have been published and, conversely, there has been a reluctance to publish small studies that have non-significant results. Further, publication bias is not solely the responsibility of editorial policy as there is reluctance among researchers to publish results that were either uninteresting or are not randomized 48 . There are, however, problems with simply including all studies that have failed to meet peer-review standards. All methods of retrospectively dealing with bias in studies are imperfect.

It is important to examine the results of each meta-analysis for evidence of publication bias. An estimation of likely size of the publication bias in the review and an approach to dealing with the bias is inherent to the conduct of many meta-analyses. Several methods have been developed to provide an assessment of publication bias; the most commonly used is the funnel plot. The funnel plot provides a graphical evaluation of the potential for bias and was developed by Light and Pillemer 49 and discussed in detail by Egger and colleagues 50 , 51 . A funnel plot is a scatterplot of treatment effect against a measure of study size. If publication bias is not present, the plot is expected to have a symmetric inverted funnel shape, as shown in Figure 5A .

[Figure 5: funnel plots; (A) symmetric plot with no evidence of publication bias, (B) asymmetric plot with evidence of publication bias]

When there is no publication bias, larger studies (i.e., those with a lower standard error) tend to cluster closely around the point estimate. As studies become less precise, such as in smaller trials (i.e., those with a higher standard error), the results of the studies can be expected to be more variable and are scattered to both sides of the more precise larger studies. Figure 5A shows that the smaller, less precise studies are, indeed, scattered to both sides of the point estimate of effect and that these seem to be symmetrical, forming an inverted funnel and showing no evidence of publication bias. In contrast to Figure 5A , Figure 5B shows evidence of publication bias. There is evidence of the possibility that studies using smaller numbers of subjects and showing a decrease in effect size (lower odds ratio) were not published.

Asymmetry of funnel plots is not solely attributable to publication bias, but may also result from clinical heterogeneity among studies. Sources of clinical heterogeneity include differences in control or exposure of subjects to confounders or effect modifiers, or methodological heterogeneity between studies; for example, a failure to conceal treatment allocation. There are several statistical tests for detecting funnel plot asymmetry, for example, Egger's linear regression test 50 and Begg's rank correlation test 52 , but these do not have considerable power and are rarely used. However, the funnel plot is not without problems. If high precision studies really are different from low precision studies with respect to effect size (e.g., due to different populations examined), a funnel plot may give a wrong impression of publication bias 53 . The appearance of the funnel plot can change quite dramatically depending on the scale on the y-axis, whether it is the inverse square error or the trial size 54 .
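The regression underlying Egger's test can be sketched as an ordinary least-squares fit of the standardized effect (effect divided by its standard error) on precision (the inverse of the standard error); an intercept far from zero suggests funnel plot asymmetry. The function name and data below are hypothetical, and the sketch stops at the coefficients: a formal test would also need the standard error of the intercept and a t distribution.

```python
def egger_regression(effects, std_errors):
    """Intercept and slope of standardized effect regressed on precision."""
    precision = [1.0 / se for se in std_errors]
    standardized = [y / se for y, se in zip(effects, std_errors)]
    n = len(effects)
    x_bar = sum(precision) / n
    y_bar = sum(standardized) / n
    sxx = sum((x - x_bar) ** 2 for x in precision)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(precision, standardized))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

std_errors = [0.1, 0.2, 0.3, 0.4]

# Hypothetical symmetric case: the same effect regardless of precision.
sym_intercept, sym_slope = egger_regression([0.5, 0.5, 0.5, 0.5], std_errors)

# Hypothetical asymmetric case: less precise studies report larger effects.
asym_intercept, asym_slope = egger_regression([0.3, 0.4, 0.5, 0.6], std_errors)
```

In the symmetric case the intercept is zero and the slope recovers the common effect; in the asymmetric case the intercept moves away from zero.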

Other types of biases in meta-analysis include the time lag bias, selective reporting bias and the language bias. The time lag bias arises from the published studies, when those with striking results are published earlier than those with non-significant findings 55 . Moreover, it has been shown that positive studies with high early accrual of patients are published sooner than negative trials with low early accrual 56 . However, missing studies, either due to publication bias or time-lag bias may increasingly be identified from trials registries.

The selective reporting bias exists when published articles have incomplete or inadequate reporting. Empirical studies have shown that this bias is widespread and of considerable importance when published studies were compared with their study protocols 29 , 30 . Furthermore, recent evidence suggests that selective reporting might be an issue in safety outcomes and the reporting of harms in clinical trials is still suboptimal 57 . Therefore, it might not be possible to use quantitative objective evidence for harms in performing meta-analyses and making therapeutic decisions.

Excluding clinical trials reported in languages other than English from meta-analyses may introduce language bias and reduce the precision of combined estimates of treatment effects. Trials with statistically significant results have been shown to be more likely to be published in English 58 . In contrast, a later, more extensive investigation showed that trials published in languages other than English tend to be of lower quality and produce more favourable treatment effects than trials published in English. It concluded that excluding non-English language trials has generally only modest effects on summary treatment effect estimates, but the effect is difficult to predict for individual meta-analyses 59 .

Evolution of meta-analyses

The classical meta-analysis compares two treatments, while network meta-analysis (or multiple treatment meta-analysis) can provide estimates of treatment efficacy of multiple treatment regimens through indirect comparisons, even when direct comparisons are unavailable 60 . An example of a network analysis would be the following. An initial trial compares drug A to drug B. A different trial studying the same patient population compares drug B to drug C. Assume that drug A is found to be superior to drug B in the first trial. Assume drug B is found to be equivalent to drug C in a second trial. Network analysis then allows one to potentially say statistically that drug A is also superior to drug C for this particular patient population. (Since drug A is better than drug B, and drug B is equivalent to drug C, then drug A is also better than drug C even though it was not directly tested against drug C.)
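The A versus C reasoning can be made quantitative with the simplest form of adjusted indirect comparison, in which the two relative effects are added on the log scale and their variances are summed. This is a sketch under stated assumptions: the function name and numbers are hypothetical, effects are log odds ratios where negative values favour the first-named drug, and the two trials are assumed to be independent and to share a comparable population.

```python
import math

def indirect_comparison(d_ab, se_ab, d_bc, se_bc):
    """Indirect A-vs-C effect from A-vs-B and B-vs-C via common comparator B."""
    d_ac = d_ab + d_bc
    se_ac = math.sqrt(se_ab ** 2 + se_bc ** 2)
    ci = (d_ac - 1.96 * se_ac, d_ac + 1.96 * se_ac)
    return d_ac, se_ac, ci

# Hypothetical trials: A is superior to B, and B is equivalent to C.
d_ac, se_ac, ci = indirect_comparison(d_ab=-0.5, se_ab=0.2, d_bc=0.0, se_bc=0.15)
```

The indirect estimate matches the A versus B effect, but its standard error is larger than that of either trial, so an indirect conclusion is less certain than a direct comparison of the same size would be.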

Meta-analysis can also be used to summarize the performance of diagnostic and prognostic tests. However, studies that evaluate the accuracy of tests have a unique design requiring different criteria to appropriately assess the quality of studies and the potential for bias. Additionally, each study reports a pair of related summary statistics (for example, sensitivity and specificity) rather than a single statistic (such as a risk ratio) and hence requires different statistical methods to pool the results of the studies 61 . Various techniques to summarize results from diagnostic and prognostic test results have been proposed 62 – 64 . Furthermore, there are many methodologies for advanced meta-analysis that have been developed to address specific concerns, such as multivariate meta-analysis 65 – 67 , and special types of meta-analysis in genetics 68 but will not be discussed here.

Meta-analysis is no longer a novelty in medicine. Numerous meta-analyses have been conducted for the same medical topic by different researchers. Recently, there has been a trend to combine the results of different meta-analyses, known as a meta-epidemiological study, to assess the risk of bias 69 , 70 .


The traditional basis of medical practice has been changed by the use of randomized, blinded, multicenter clinical trials and meta-analysis, leading to the widely used term "evidence-based medicine". A leader in initiating this change has been the Cochrane Collaboration, which has produced guidelines for conducting systematic reviews and meta-analyses 10 . More recently, the PRISMA statement, a helpful resource to improve reporting of systematic reviews and meta-analyses, has been released 11 . Moreover, standards by which to conduct and report meta-analyses of observational studies have been published to improve the quality of reporting 71 .

Meta-analysis of randomized clinical trials is not an infallible tool, however, and several examples exist of meta-analyses which were later contradicted by single large randomized controlled trials, and of meta-analyses addressing the same issue which have reached opposite conclusions 72 . A recent example was the controversy between a meta-analysis of 42 studies 73 and the subsequent publication of the large-scale trial (RECORD trial) that did not support the cardiovascular risk of rosiglitazone 74 . However, the reason for this controversy was explained by the numerous methodological flaws found both in the meta-analysis and the large clinical trial 75 .

No single study, whether meta-analytic or not, will provide the definitive understanding of responses to treatment, diagnostic tests, or risk factors influencing disease. Despite this limitation, meta-analytic approaches have demonstrable benefits in addressing the limitations of study size, can include diverse populations, provide the opportunity to evaluate new hypotheses, and are more valuable than any single study contributing to the analysis. The conduct of the studies is critical to the value of a meta-analysis and the methods used need to be as rigorous as any other study conducted.


  1. 11 Research Proposal Examples to Make a Great Paper

    research proposal meta analysis

  2. What is a Meta-Analysis? The benefits and challenges

    research proposal meta analysis

  3. A guide to prospective meta-analysis

    research proposal meta analysis

  4. (PDF) Conducting a meta-analysis for your student dissertation

    research proposal meta analysis

  5. Infographics

    research proposal meta analysis

  6. Meta-Analysis Methodology for Basic Research: A Practical Guide

    research proposal meta analysis


  1. Research Proposal

  2. 7-6 How to do a systematic review or a meta-analysis with HubMeta: Outlier Analysis

  3. Overview of a Research Proposal

  4. SCP-001: Meta Ike Proposal "The Solution"

  5. 1-4 How to do a systematic review or a meta-analysis with HubMeta: Understanding HubMeta's Dashboard

  6. Meta-Essentials for Meta Analysis


  1. How to conduct a meta-analysis in eight steps: a practical guide

    2.1 Step 1: defining the research question. The first step in conducting a meta-analysis, as with any other empirical study, is the definition of the research question. Most importantly, the research question determines the realm of constructs to be considered or the type of interventions whose effects shall be analyzed.

  2. Meta-Analytic Methodology for Basic Research: A Practical Guide

    Meta-analysis refers to the statistical analysis of the data from independent primary studies focused on the same question, which aims to generate a quantitative estimate of the studied phenomenon, for example, the effectiveness of the intervention (Gopalakrishnan and Ganeshkumar, 2013). In clinical research, systematic reviews and meta ...

  3. Ten simple rules for carrying out and writing meta-analyses

    Meta-analysis is a powerful tool to cumulate and summarize the knowledge in a research field . ... Morton SC, Olkin I, Williamson GD, et al. (2000) Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA 283: 2008-2012.

  4. A step by step guide for conducting a systematic review and meta

    To do the meta-analysis, we can use free software, such as RevMan or R package meta . In this example, we will use the R package meta. The tutorial of meta package can be accessed through "General Package for Meta-Analysis" tutorial pdf . The R codes and its guidance for meta-analysis done can be found in Additional file 5: File S3.

  5. PDF How to conduct a meta-analysis in eight steps: a practical guide

    2 Eight steps in conducting a meta‑analysis. 2.1 Step 1: defining the research question. The first step in conducting a meta-analysis, as with any other empirical study, is the definition of the research question. Most importantly, the research question determines the realm of constructs to be considered or the type of interventions whose effects ...

  6. Methodological Guidance Paper: High-Quality Meta-Analysis in a

    The term meta-analysis was first used by Gene Glass (1976) in his presidential address at the AERA (American Educational Research Association) annual meeting, though Pearson (1904) used methods to combine results from studies on the relationship between enteric fever and mortality in 1904. The 1980s was a period of rapid development of statistical methods (Cooper & Hedges, 2009) leading to the ...

  7. How to write a systematic review or meta-analysis protocol

    The protocol should explicitly outline the roles and responsibilities of any funder(s) in study design, data analysis and interpretation, manuscript writing and dissemination of results. CONCLUSION. A protocol is an important document that specifies the research plan for a systematic review or meta-analysis.

  8. How to Conduct a Meta-Analysis for Research

    Define your research question and objective. Clearly define the research objective of your meta-analysis. Doing this will help you narrow down your search and establish inclusion and exclusion criteria for selecting studies. 2. Conduct a comprehensive literature search. Thoroughly search electronic databases, such as PubMed, Google Scholar, or ...

  9. A guide to prospective meta-analysis

    In a prospective meta-analysis (PMA), study selection criteria, hypotheses, and analyses are specified before the results of the studies related to the PMA research question are known, reducing many of the problems associated with a traditional (retrospective) meta-analysis. PMAs have many advantages: they can help reduce research waste and bias, and they are adaptive, efficient, and ...

  10. A Guide to Conducting a Meta-Analysis

    Abstract. Meta-analysis is widely accepted as the preferred method to synthesize research findings in various disciplines. This paper provides an introduction to when and how to conduct a meta-analysis. Several practical questions, such as advantages of meta-analysis over conventional narrative review and the number of studies required for a ...

  11. Chapter 10: Analysing data and undertaking meta-analyses

    Such findings may generate proposals for further investigations and future research. ... Langan D, Salanti G. Methods to estimate the between-study variance and its uncertainty in meta-analysis. Research Synthesis Methods 2016; 7: 55-79. Whitehead A, Jones NMB. A meta-analysis of clinical trials involving different classifications of response ...

  12. Meta-Analysis

    Definition. "A meta-analysis is a formal, epidemiological, quantitative study design that uses statistical methods to generalise the findings of the selected independent studies. Meta-analysis and systematic review are the two most authentic strategies in research. When researchers start looking for the best available evidence concerning ...

  13. Systematic Reviews and Meta-Analysis: A Guide for Beginners

    Meta-analysis is a statistical tool that provides pooled estimates of effect from the data extracted from individual studies in the systematic review. The graphical output of meta-analysis is a forest plot which provides information on individual studies and the pooled effect. Systematic reviews of literature can be undertaken for all types of ...

  14. PDF Systematic review and meta-analysis proposal

    review and meta-analysis as a syndrome that require tenderness on pressure (tenderpoints) in at least 11 of 18 specified sites and the presence of widespread pain for diagnosis. Widespread pain is defined as axial pain, left- and right-sided pain, and upper and lower segment pain [9]. RCTs that reported data for

  15. Meta‐analysis and traditional systematic literature reviews—What, why

    Meta-analysis is a research method for systematically combining and synthesizing findings from multiple quantitative studies in a research domain. Despite its importance, most literature evaluating meta-analyses are based on data analysis and statistical discussions. This paper takes a holistic view, comparing meta-analyses to traditional ...

  16. A step by step guide for conducting a systematic review and meta

    Detailed steps for conducting any systematic review and meta-analysis. We searched the methods reported in published SR/MA in tropical medicine and other healthcare fields besides the published guidelines like Cochrane guidelines (Higgins, 2011) to collect the best low-bias method for each step of SR/MA conduction steps. Furthermore, we used guidelines that we apply in studies for all SR ...

  17. Meta-analysis

    Graphical summary of a meta-analysis of over 1,000 cases of diffuse intrinsic pontine glioma and other pediatric gliomas, in which information about the mutations involved as well as generic outcomes were distilled from the underlying primary literature. Meta-analysis is the statistical combination of the results of multiple studies addressing a similar research question.

  18. Systematic Review and Meta‐Analysis: a Primer

    INTRODUCTION. Sackett et al. [1,2] defined evidence‐based practice as "the integration of best research evidence with clinical expertise and patient values". The "best evidence" can be gathered by reading randomized controlled trials (RCTs), systematic reviews, and meta‐analyses [2]. It should be noted that the "best evidence" (e.g. concerning clinical prognosis, or patient ...

  19. Protocol for a systematic review and meta-analysis of research on the

    Background: Existing evidence on the association between exposure to bullying and sleep is limited and inconclusive. The aims of this planned systematic review and meta-analysis are therefore (1) to determine whether exposure to workplace bullying is related to changes in sleep function and (2) to establish mediating and moderating factors that govern the relationship between bullying and sleep ...

  20. Proposal to Conduct a Meta-Analysis of

    Halversen (1994), Rosenthal (1995) and Cooper (2010) was used to guide the format of the final report's major subject headings and content. In sum, then, by following these procedures a meta-analysis was conducted that produced estimates of mean correlations, confidence intervals and tests of significance;

  21. Systematic review and meta-analysis of hepatitis E seroprevalence in

    To commence this systematic review and meta-analysis, we adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines and used the PRISMA assessment checklist [Supplementary Table 1]. The study included pertinent research conducted within the population of Southeast Asian countries, as outlined by the United Nations, and perform a meta-analysis on the ...

  22. Validating Self-Assessment Measures for Quality of Center-Based

    A three-level meta-analysis (k = 13, ES = 45) revealed a positive association between self-assessment ratings and ratings with validated measures of ECE quality (r = .38), indicating a moderate convergent validity. Studies with lower methodological quality and published "peer reviewed" studies reported somewhat higher correlations between ...

  23. Turnover intention and its associated factors among nurses in Ethiopia

    Meta-analysis was done using a random-effects method. Heterogeneity between the primary studies was assessed by Cochran Q and I-square tests. Subgroup and sensitivity analyses were carried out to clarify the source of heterogeneity. Result. This systematic review and meta-analysis incorporated 8 articles, involving 3033 nurses in the analysis.

  24. A brief introduction of meta‐analyses in clinical practice and research

    When conducted properly, a meta‐analysis of medical studies is considered as decisive evidence because it occupies a top level in the hierarchy of evidence. An understanding of the principles, performance, advantages and weaknesses of meta‐analyses is important. Therefore, we aim to provide a basic understanding of meta‐analyses for ...

  25. Rates of bronchopulmonary dysplasia in very low birth weight neonates

    Importance: Large-scale estimates of bronchopulmonary dysplasia (BPD) are warranted for adequate prevention and treatment. However, systematic approaches to ascertain rates of BPD are lacking. Objective: To conduct a systematic review and meta-analysis to assess the prevalence of BPD in very low birth weight (≤ 1,500 g) or very low gestational age (< 32 weeks) neonates. Data sources: A search ...

  26. Nursing Research News: May 2024

    News & Events. Nursing Research News: May 2024. May 31, 2024. Each month, the Center for Research and Scholarship at the Frances Payne Bolton School of Nursing sends an internal research newsletter to faculty, staff, students and researchers. A recap is posted here.

  27. Meta-analysis in medical research

    Meta-analysis did not begin to appear regularly in the medical literature until the late 1970s but since then a plethora of meta-analyses have emerged and the growth is exponential over time (Figure 2) [3]. Moreover, it has been shown that meta-analyses are the most frequently cited form of clinical research [4]. The merits and perils of the somewhat mysterious procedure of meta-analysis, however ...

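
Several of the entries above mention pooling with a random-effects method and assessing heterogeneity with Cochran's Q and the I-square statistic. As a rough illustration only, the sketch below shows one common way to compute these quantities, using the DerSimonian-Laird estimator of the between-study variance. The function name `random_effects_meta` and the example numbers are hypothetical and are not taken from any of the listed papers; a real analysis would normally rely on established software such as RevMan or the R package meta.

```python
import math


def random_effects_meta(effects, variances):
    """Pool study effects with a DerSimonian-Laird random-effects model.

    effects   -- per-study effect estimates (e.g. log odds ratios)
    variances -- per-study sampling variances (all > 0)
    Returns a dict with Cochran's Q, I-square (%), tau-squared,
    the pooled effect and its standard error.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching inputs")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # DerSimonian-Laird estimate of between-study variance, floored at 0.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I-square: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau-squared to each study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    se = math.sqrt(1.0 / sum_w_re)

    return {"Q": q, "I2": i2, "tau2": tau2, "pooled": pooled, "se": se}


if __name__ == "__main__":
    # Hypothetical data for three studies, for illustration only.
    result = random_effects_meta([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
    low = result["pooled"] - 1.96 * result["se"]
    high = result["pooled"] + 1.96 * result["se"]
    print(result)
    print(f"approximate 95% CI: ({low:.3f}, {high:.3f})")
```

The DerSimonian-Laird estimator is used here because it has a simple closed form; other estimators of the between-study variance exist and may behave better when the number of studies is small.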