Variables in Research – Definition, Types and Examples

Variables in Research

Definition:

In research, variables refer to characteristics or attributes that can be measured, manipulated, or controlled. They are the factors that researchers observe or manipulate to understand the relationship between them and the outcomes of interest.

Types of Variables in Research

Types of Variables in Research are as follows:

Independent Variable

This is the variable that is manipulated by the researcher. It is also known as the predictor variable, as it is used to predict changes in the dependent variable. Examples of independent variables include age, gender, dosage, and treatment type.

Dependent Variable

This is the variable that is measured or observed to determine the effects of the independent variable. It is also known as the outcome variable, as it is the variable that is affected by the independent variable. Examples of dependent variables include blood pressure, test scores, and reaction time.

Confounding Variable

This is a variable that can affect the relationship between the independent variable and the dependent variable. It is a variable that is not being studied but could impact the results of the study. For example, in a study on the effects of a new drug on a disease, a confounding variable could be the patient’s age, as older patients may have more severe symptoms.

Mediating Variable

This is a variable that explains the relationship between the independent variable and the dependent variable. It is a variable that comes in between the independent and dependent variables and is affected by the independent variable, which then affects the dependent variable. For example, in a study on the relationship between exercise and weight loss, the mediating variable could be metabolism, as exercise can increase metabolism, which can then lead to weight loss.

Moderator Variable

This is a variable that affects the strength or direction of the relationship between the independent variable and the dependent variable. It is a variable that influences the effect of the independent variable on the dependent variable. For example, in a study on the effects of caffeine on cognitive performance, the moderator variable could be age, as older adults may be more sensitive to the effects of caffeine than younger adults.

Control Variable

This is a variable that is held constant or controlled by the researcher to ensure that it does not affect the relationship between the independent variable and the dependent variable. Control variables are important to ensure that any observed effects are due to the independent variable and not to other factors. For example, in a study on the effects of a new teaching method on student performance, the control variables could include class size, teacher experience, and student demographics.

Continuous Variable

This is a variable that can take on any value within a certain range. Continuous variables can be measured on a scale and are often used in statistical analyses. Examples of continuous variables include height, weight, and temperature.

Categorical Variable

This is a variable that can take on a limited number of values or categories. Categorical variables can be nominal or ordinal. Nominal variables have no inherent order, while ordinal variables have a natural order. Examples of categorical variables include gender, race, and educational level.

Discrete Variable

This is a variable that can only take on specific values. Discrete variables are often used in counting or frequency analyses. Examples of discrete variables include the number of siblings a person has, the number of times a person exercises in a week, and the number of students in a classroom.

Dummy Variable

This is a variable that takes on only two values, typically 0 and 1, and is used to represent categorical variables in statistical analyses. Dummy variables are often used when a categorical variable cannot be used directly in an analysis. For example, in a study on the effects of gender on income, a dummy variable could be created, with 0 representing female and 1 representing male.
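As a brief illustration (the data, column names, and the use of the pandas library are assumptions made for this sketch, not part of the example above), dummy coding might look like this in Python:

```python
import pandas as pd

# Hypothetical data: gender (categorical) and annual income
df = pd.DataFrame({"gender": ["female", "male", "female", "male"],
                   "income": [52000, 48000, 61000, 45000]})

# Manual dummy coding: 0 = female, 1 = male, matching the example above
df["gender_male"] = (df["gender"] == "male").astype(int)

# Equivalent automatic dummy coding; drop_first=True keeps one reference category
dummies = pd.get_dummies(df["gender"], prefix="gender", drop_first=True)

print(df)
print(dummies)
```

Either version of the dummy-coded column can then enter a regression or other statistical model in place of the original text labels.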

Extraneous Variable

This is a variable that is not of interest in the study but can still affect the outcome. Extraneous variables can lead to erroneous conclusions and can be controlled through random assignment or statistical techniques.

Latent Variable

This is a variable that cannot be directly observed or measured, but is inferred from other variables. Latent variables are often used in psychological or social research to represent constructs such as personality traits, attitudes, or beliefs.

Moderator-mediator Variable

This is a variable that acts both as a moderator and a mediator: it can both moderate and mediate the relationship between the independent and dependent variables. Moderator-mediator variables are often used in complex statistical analyses.

Variables Analysis Methods

There are different methods to analyze variables in research, including:

  • Descriptive statistics: This involves analyzing and summarizing data using measures such as mean, median, mode, range, standard deviation, and frequency distribution. Descriptive statistics are useful for understanding the basic characteristics of a data set.
  • Inferential statistics : This involves making inferences about a population based on sample data. Inferential statistics use techniques such as hypothesis testing, confidence intervals, and regression analysis to draw conclusions from data.
  • Correlation analysis: This involves examining the relationship between two or more variables. Correlation analysis can determine the strength and direction of the relationship between variables, and can be used to make predictions about future outcomes.
  • Regression analysis: This involves examining the relationship between an independent variable and a dependent variable. Regression analysis can be used to predict the value of the dependent variable based on the value of the independent variable, and can also determine the significance of the relationship between the two variables.
  • Factor analysis: This involves identifying patterns and relationships among a large number of variables. Factor analysis can be used to reduce the complexity of a data set and identify underlying factors or dimensions.
  • Cluster analysis: This involves grouping data into clusters based on similarities between variables. Cluster analysis can be used to identify patterns or segments within a data set, and can be useful for market segmentation or customer profiling.
  • Multivariate analysis : This involves analyzing multiple variables simultaneously. Multivariate analysis can be used to understand complex relationships between variables, and can be useful in fields such as social science, finance, and marketing.
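As a minimal sketch of the first of these methods, descriptive statistics, consider the following Python example (the data set, variable names, and the use of pandas are assumptions made purely for illustration):

```python
import pandas as pd

# Invented data: age (continuous) and education level (categorical) for ten people
df = pd.DataFrame({
    "age": [23, 35, 31, 40, 28, 35, 52, 44, 35, 29],
    "education": ["college", "high school", "college", "graduate",
                  "college", "high school", "graduate", "college",
                  "college", "high school"],
})

# Descriptive statistics for the continuous variable
print("mean:", df["age"].mean())
print("median:", df["age"].median())
print("mode:", df["age"].mode().tolist())
print("standard deviation:", round(df["age"].std(), 2))
print("range:", df["age"].max() - df["age"].min())

# Frequency distribution for the categorical variable
print(df["education"].value_counts())
```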

Examples of Variables

  • Age : This is a continuous variable that represents the age of an individual in years.
  • Gender : This is a categorical variable that represents the biological sex of an individual and can take on values such as male and female.
  • Education level: This is a categorical variable that represents the level of education completed by an individual and can take on values such as high school, college, and graduate school.
  • Income : This is a continuous variable that represents the amount of money earned by an individual in a year.
  • Weight : This is a continuous variable that represents the weight of an individual in kilograms or pounds.
  • Ethnicity : This is a categorical variable that represents the ethnic background of an individual and can take on values such as Hispanic, African American, and Asian.
  • Time spent on social media : This is a continuous variable that represents the amount of time an individual spends on social media in minutes or hours per day.
  • Marital status: This is a categorical variable that represents the marital status of an individual and can take on values such as married, divorced, and single.
  • Blood pressure : This is a continuous variable that represents the force of blood against the walls of arteries in millimeters of mercury.
  • Job satisfaction : This is a continuous variable that represents an individual’s level of satisfaction with their job and can be measured using a Likert scale.

Applications of Variables

Variables are used in many different applications across various fields. Here are some examples:

  • Scientific research: Variables are used in scientific research to understand the relationships between different factors and to make predictions about future outcomes. For example, scientists may study the effects of different variables on plant growth or the impact of environmental factors on animal behavior.
  • Business and marketing: Variables are used in business and marketing to understand customer behavior and to make decisions about product development and marketing strategies. For example, businesses may study variables such as consumer preferences, spending habits, and market trends to identify opportunities for growth.
  • Healthcare : Variables are used in healthcare to monitor patient health and to make treatment decisions. For example, doctors may use variables such as blood pressure, heart rate, and cholesterol levels to diagnose and treat cardiovascular disease.
  • Education : Variables are used in education to measure student performance and to evaluate the effectiveness of teaching strategies. For example, teachers may use variables such as test scores, attendance, and class participation to assess student learning.
  • Social sciences : Variables are used in social sciences to study human behavior and to understand the factors that influence social interactions. For example, sociologists may study variables such as income, education level, and family structure to examine patterns of social inequality.

Purpose of Variables

Variables serve several purposes in research, including:

  • To provide a way of measuring and quantifying concepts: Variables help researchers measure and quantify abstract concepts such as attitudes, behaviors, and perceptions. By assigning numerical values to these concepts, researchers can analyze and compare data to draw meaningful conclusions.
  • To help explain relationships between different factors: Variables help researchers identify and explain relationships between different factors. By analyzing how changes in one variable affect another variable, researchers can gain insight into the complex interplay between different factors.
  • To make predictions about future outcomes : Variables help researchers make predictions about future outcomes based on past observations. By analyzing patterns and relationships between different variables, researchers can make informed predictions about how different factors may affect future outcomes.
  • To test hypotheses: Variables help researchers test hypotheses and theories. By collecting and analyzing data on different variables, researchers can test whether their predictions are accurate and whether their hypotheses are supported by the evidence.

Characteristics of Variables

Characteristics of Variables are as follows:

  • Measurement : Variables can be measured using different scales, such as nominal, ordinal, interval, or ratio scales. The scale used to measure a variable can affect the type of statistical analysis that can be applied.
  • Range : Variables have a range of values that they can take on. The range can be finite, such as the number of students in a class, or infinite, such as the range of possible values for a continuous variable like temperature.
  • Variability : Variables can have different levels of variability, which refers to the degree to which the values of the variable differ from each other. Highly variable variables have a wide range of values, while low variability variables have values that are more similar to each other.
  • Validity and reliability : Variables should be both valid and reliable to ensure accurate and consistent measurement. Validity refers to the extent to which a variable measures what it is intended to measure, while reliability refers to the consistency of the measurement over time.
  • Directionality: Some variables have directionality, meaning that the relationship between the variables is not symmetrical. For example, in a study of the relationship between smoking and lung cancer, smoking is the independent variable and lung cancer is the dependent variable.

Advantages of Variables

Here are some of the advantages of using variables in research:

  • Control : Variables allow researchers to control the effects of external factors that could influence the outcome of the study. By manipulating and controlling variables, researchers can isolate the effects of specific factors and measure their impact on the outcome.
  • Replicability : Variables make it possible for other researchers to replicate the study and test its findings. By defining and measuring variables consistently, other researchers can conduct similar studies to validate the original findings.
  • Accuracy : Variables make it possible to measure phenomena accurately and objectively. By defining and measuring variables precisely, researchers can reduce bias and increase the accuracy of their findings.
  • Generalizability : Variables allow researchers to generalize their findings to larger populations. By selecting variables that are representative of the population, researchers can draw conclusions that are applicable to a broader range of individuals.
  • Clarity : Variables help researchers to communicate their findings more clearly and effectively. By defining and categorizing variables, researchers can organize and present their findings in a way that is easily understandable to others.

Disadvantages of Variables

Here are some of the main disadvantages of using variables in research:

  • Simplification : Variables may oversimplify the complexity of real-world phenomena. By breaking down a phenomenon into variables, researchers may lose important information and context, which can affect the accuracy and generalizability of their findings.
  • Measurement error : Variables rely on accurate and precise measurement, and measurement error can affect the reliability and validity of research findings. The use of subjective or poorly defined variables can also introduce measurement error into the study.
  • Confounding variables : Confounding variables are factors that are not measured but that affect the relationship between the variables of interest. If confounding variables are not accounted for, they can distort or obscure the relationship between the variables of interest.
  • Limited scope: Variables are defined by the researcher, and the scope of the study is therefore limited by the researcher’s choice of variables. This can lead to a narrow focus that overlooks important aspects of the phenomenon being studied.
  • Ethical concerns: The selection and measurement of variables may raise ethical concerns, especially in studies involving human subjects. For example, using variables that are related to sensitive topics, such as race or sexuality, may raise concerns about privacy and discrimination.

Organizing Your Social Sciences Research Paper

Independent and Dependent Variables

Definitions

Dependent Variable: The variable that depends on other factors that are measured. These variables are expected to change as a result of an experimental manipulation of the independent variable or variables. It is the presumed effect.

Independent Variable: The variable that is stable and unaffected by the other variables you are trying to measure. It refers to the condition of an experiment that is systematically manipulated by the investigator. It is the presumed cause.

Cramer, Duncan and Dennis Howitt. The SAGE Dictionary of Statistics. London: SAGE, 2004; Penslar, Robin Levin and Joan P. Porter. Institutional Review Board Guidebook: Introduction. Washington, DC: United States Department of Health and Human Services, 2010; "What are Dependent and Independent Variables?" Graphic Tutorial.

Identifying Dependent and Independent Variables

Don't feel bad if you are confused about what is the dependent variable and what is the independent variable in social and behavioral sciences research. However, it's important that you learn the difference because framing a study using these variables is a common approach to organizing the elements of a social sciences research study in order to discover relevant and meaningful results. Specifically, it is important for these two reasons:

  • You need to understand and be able to evaluate their application in other people's research.
  • You need to apply them correctly in your own research.

A variable in research simply refers to a person, place, thing, or phenomenon that you are trying to measure in some way. The best way to understand the difference between a dependent and independent variable is that the meaning of each is implied by what the words tell us about the variable you are using. You can do this with a simple exercise from the website, Graphic Tutorial. Take the sentence, "The [independent variable] causes a change in [dependent variable] and it is not possible that [dependent variable] could cause a change in [independent variable]." Insert the names of variables you are using in the sentence in the way that makes the most sense. This will help you identify each type of variable. If you're still not sure, consult with your professor before you begin to write.

Fan, Shihe. "Independent Variable." In Encyclopedia of Research Design. Neil J. Salkind, editor. (Thousand Oaks, CA: SAGE, 2010), pp. 592-594; "What are Dependent and Independent Variables?" Graphic Tutorial; Salkind, Neil J. "Dependent Variable." In Encyclopedia of Research Design, Neil J. Salkind, editor. (Thousand Oaks, CA: SAGE, 2010), pp. 348-349.

Structure and Writing Style

The process of examining a research problem in the social and behavioral sciences is often framed around methods of analysis that compare, contrast, correlate, average, or integrate relationships between or among variables. Techniques include associations, sampling, random selection, and blind selection. Designation of the dependent and independent variable involves unpacking the research problem in a way that identifies a general cause and effect and classifying these variables as either independent or dependent.

The variables should be outlined in the introduction of your paper and explained in more detail in the methods section. There are no rules about the structure and style for writing about independent or dependent variables but, as with any academic writing, clarity and succinctness are most important.

After you have described the research problem and its significance in relation to prior research, explain why you have chosen to examine the problem using a method of analysis that investigates the relationships between or among independent and dependent variables. State what it is about the research problem that lends itself to this type of analysis. For example, if you are investigating the relationship between corporate environmental sustainability efforts [the independent variable] and dependent variables associated with measuring employee satisfaction at work using a survey instrument, you would first identify each variable and then provide background information about the variables. What is meant by "environmental sustainability"? Are you looking at a particular company [e.g., General Motors] or are you investigating an industry [e.g., the meat packing industry]? Why is employee satisfaction in the workplace important? How does a company make its employees aware of sustainability efforts and why would a company even care that its employees know about these efforts?

Identify each variable for the reader and define each. In the introduction, this information can be presented in a paragraph or two when you describe how you are going to study the research problem. In the methods section, you build on the literature review of prior studies about the research problem to describe in detail background about each variable, breaking each down for measurement and analysis. For example, what activities do you examine that reflect a company's commitment to environmental sustainability? Levels of employee satisfaction can be measured by a survey that asks about things like volunteerism or a desire to stay at the company for a long time.

The structure and writing style of describing the variables and their application to analyzing the research problem should be stated and unpacked in such a way that the reader obtains a clear understanding of the relationships between the variables and why they are important. This is also important so that the study can be replicated in the future using the same variables but applied in a different way.

Fan, Shihe. "Independent Variable." In Encyclopedia of Research Design. Neil J. Salkind, editor. (Thousand Oaks, CA: SAGE, 2010), pp. 592-594; "What are Dependent and Independent Variables?" Graphic Tutorial; "Case Example for Independent and Dependent Variables." ORI Curriculum Examples. U.S. Department of Health and Human Services, Office of Research Integrity; Salkind, Neil J. "Dependent Variable." In Encyclopedia of Research Design, Neil J. Salkind, editor. (Thousand Oaks, CA: SAGE, 2010), pp. 348-349; "Independent Variables and Dependent Variables." Karl L. Wuensch, Department of Psychology, East Carolina University [posted email exchange]; "Variables." Elements of Research. Dr. Camille Nebeker, San Diego State University.

Roles of Independent and Dependent Variables in Research

Morten Pedersen

Explore the essential roles of independent and dependent variables in research. This guide delves into their definitions, significance in experiments, and their critical relationship. Learn how these variables are the foundation of research design, influencing hypothesis testing, theory development, and statistical analysis, empowering researchers to understand and predict outcomes of research studies.

Introduction

At the very base of scientific inquiry and research design, variables act as the fundamental building blocks, guiding the rhythm and direction of research. This is particularly true in human behavior research, where the quest to understand the complexities of human actions and reactions hinges on the meticulous manipulation and observation of these variables. At the heart of this endeavor lie two different types of variables, namely independent and dependent variables, whose roles and interplay are critical in scientific discovery.

Understanding the distinction between independent and dependent variables is not merely an academic exercise; it is essential for anyone venturing into the field of research. This article aims to demystify these concepts, offering clarity on their definitions, roles, and the nuances of their relationship in the study of human behavior, and in science generally. We will cover hypothesis testing and theory development, illuminating how these variables serve as the cornerstone of experimental design and statistical analysis.

The significance of grasping the difference between independent and dependent variables extends beyond the confines of academia. It empowers researchers to design robust studies, enables critical evaluation of research findings, and fosters an appreciation for the complexity of human behavior research. As we delve into this exploration, our objective is clear: to equip readers with a deep understanding of these fundamental concepts, enhancing their ability to contribute to the ever-evolving field of human behavior research.

Chapter 1: The Role of Independent Variables in Human Behavior Research

In the realm of human behavior research, independent variables are the keystones around which studies are designed and hypotheses are tested. Independent variables are the factors or conditions that researchers manipulate or observe to examine their effects on dependent variables, which typically reflect aspects of human behavior or psychological phenomena. Understanding the role of independent variables is crucial for designing robust research methodologies, ensuring the reliability and validity of findings.

Defining Independent Variables

Independent variables are those variables that are changed or controlled in a scientific experiment to test their effects on dependent variables. In studies focusing on human behavior, these can range from psychological interventions (e.g., cognitive-behavioral therapy) and environmental adjustments (e.g., noise levels, lighting, smells) to societal factors (e.g., social media use). For example, in an experiment investigating the impact of sleep on cognitive performance, the amount of sleep participants receive is the independent variable.

Selection and Manipulation

Selecting an independent variable requires careful consideration of the research question and the theoretical framework guiding the study. Researchers must ensure that their chosen variable can be effectively and consistently manipulated or measured and is ethically and practically feasible, particularly when dealing with human subjects.

Manipulating an independent variable involves creating different conditions (e.g., treatment vs. control groups) to observe how changes in the variable affect outcomes. For instance, researchers studying the effect of educational interventions on learning outcomes might vary the type of instructional material (digital vs. traditional) to assess differences in student performance.
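To make this concrete, here is a minimal sketch of how such a two-condition comparison is often analyzed (the scores, group sizes, and the choice of SciPy's Welch t test are illustrative assumptions, not a prescribed analysis):

```python
from scipy import stats

# Hypothetical test scores under two instructional conditions
digital     = [78, 82, 75, 90, 85, 79, 88, 81]
traditional = [72, 74, 70, 80, 76, 73, 79, 75]

# Independent variable: instructional material (digital vs. traditional)
# Dependent variable: student performance (test score)
t_stat, p_value = stats.ttest_ind(digital, traditional, equal_var=False)
print(f"Welch's t = {t_stat:.2f}, p = {p_value:.3f}")
```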

Challenges in Human Behavior Research

Manipulating independent variables in human behavior research presents unique challenges. Ethical considerations are paramount, as interventions must not harm participants. For example, studies involving vulnerable populations or sensitive topics require rigorous ethical oversight to ensure that the manipulation of independent variables does not result in adverse effects.

Practical limitations also come into play, such as controlling for extraneous variables that could influence the outcomes. In the aforementioned example of sleep and cognitive performance, factors like caffeine consumption or stress levels could confound the results. Researchers employ various methodological strategies, such as random assignment and controlled environments, to mitigate these influences.

Chapter 2: Dependent Variables: Measuring Human Behavior

The dependent variable in human behavior research acts as a mirror, reflecting the outcomes or effects resulting from variations in the independent variable. It is the aspect of human experience or behavior that researchers aim to understand, predict, or change through their studies. This section explores how dependent variables are measured, the significance of their accurate measurement, and the inherent challenges in capturing the complexities of human behavior.

Defining Dependent Variables

Dependent variables are the responses or outcomes that researchers measure in an experiment, expecting them to vary as a direct result of changes in the independent variable. In the context of human behavior research, dependent variables could include measures of emotional well-being, cognitive performance, social interactions, or any other aspect of human behavior influenced by the experimental manipulation. For instance, in a study examining the effect of exercise on stress levels, stress level would be the dependent variable, measured through various psychological assessments or physiological markers.

Measurement Methods and Tools

Measuring dependent variables in human behavior research involves a diverse array of methodologies, ranging from self-reported questionnaires and interviews to physiological measurements and behavioral observations. The choice of measurement tool depends on the nature of the dependent variable and the objectives of the study.

  • Self-reported Measures: Often used for assessing psychological states or subjective experiences, such as anxiety, satisfaction, or mood. These measures rely on participants’ introspection and honesty, posing challenges in terms of accuracy and bias.
  • Behavioral Observations: Involve the direct observation and recording of participants’ behavior in natural or controlled settings. This method is used for behaviors that can be externally observed and quantified, such as social interactions or task performance.
  • Physiological Measurements: Include the use of technology to measure physical responses that indicate psychological states, such as heart rate, cortisol levels, or brain activity. These measures can provide objective data about the physiological aspects of human behavior.

Reliability and Validity

The reliability and validity of the measurement of dependent variables are critical to the integrity of human behavior research.

  • Reliability refers to the consistency of a measure; a reliable tool yields similar results under consistent conditions.
  • Validity pertains to the accuracy of the measure; a valid tool accurately reflects the concept it aims to measure.

Ensuring reliability and validity often involves the use of established measurement instruments with proven track records, pilot testing new instruments, and applying rigorous statistical analyses to evaluate measurement properties.
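For example, the internal consistency of a multi-item questionnaire is often summarized with Cronbach's alpha. The sketch below implements the standard formula directly; the questionnaire items and responses are made up, and using NumPy here is simply one convenient choice:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) matrix of scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                               # number of items
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical responses of five participants to a four-item satisfaction scale (1-5)
responses = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
])
print(f"Cronbach's alpha = {cronbach_alpha(responses):.2f}")
```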

Challenges in Measuring Human Behavior

Measuring human behavior presents challenges due to its complexity and the influence of multiple, often interrelated, variables. Researchers must contend with issues such as participant bias, environmental influences, and the subjective nature of many psychological constructs. Additionally, the dynamic nature of human behavior means that it can change over time, necessitating careful consideration of when and how measurements are taken.

Chapter 3: Relationship between Independent and Dependent Variables

Understanding the relationship between independent and dependent variables is at the core of research in human behavior. This relationship is what researchers aim to elucidate, whether they seek to explain, predict, or influence human actions and psychological states. This section explores the nature of this relationship, the means by which it is analyzed, and common misconceptions that may arise.

The Nature of the Relationship

The relationship between independent and dependent variables can take various forms: it may be direct or indirect, linear or nonlinear, and it may be moderated or mediated by other variables. At its most basic, this relationship is often conceptualized as cause and effect: the independent variable (the cause) influences the dependent variable (the effect). For instance, increased physical activity (independent variable) may lead to decreased stress levels (dependent variable).

Analyzing the Relationship

Statistical analyses play a pivotal role in examining the relationship between independent and dependent variables. Techniques vary depending on the nature of the variables and the research design, ranging from simple correlation and regression analyses for quantifying the strength and form of relationships, to complex multivariate analyses for exploring relationships among multiple variables simultaneously.

  • Correlation Analysis : Used to determine the degree to which two variables are related. However, it’s crucial to note that correlation does not imply causation.
  • Regression Analysis : Goes a step further by not only assessing the strength of the relationship but also predicting the value of the dependent variable based on the independent variable.
  • Experimental Design : Provides a more robust framework for inferring causality, where manipulation of the independent variable and control of confounding factors allow researchers to directly observe the impact on the dependent variable.
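As a hedged sketch of the first two techniques, using the physical-activity and stress example from earlier (all numbers are invented, and SciPy is assumed purely for illustration):

```python
from scipy import stats

# Invented data: weekly hours of physical activity (independent variable)
# and self-reported stress score, 0-40 (dependent variable)
activity = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
stress   = [32, 30, 29, 27, 26, 22, 21, 19, 18, 15]

# Correlation analysis: strength and direction of the association
r, p = stats.pearsonr(activity, stress)
print(f"Pearson r = {r:.2f}, p = {p:.4f}")

# Regression analysis: predict stress from activity
fit = stats.linregress(activity, stress)
predicted = fit.intercept + fit.slope * 6.5   # predicted stress at 6.5 hours/week
print(f"predicted stress at 6.5 h/week: {predicted:.1f}")

# Neither result by itself establishes causality; only a controlled experimental
# design (manipulation plus control of confounds) supports causal inference.
```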

Causality vs. Correlation

A fundamental consideration in human behavior research is the distinction between causality and correlation. Causality implies that changes in the independent variable cause changes in the dependent variable. Correlation, on the other hand, indicates that two variables are related but does not establish a cause-effect relationship. Confounding variables may influence both, creating the appearance of a direct relationship where none exists. Understanding this distinction is crucial for accurate interpretation of research findings.

Common Misinterpretations

The complexity of human behavior and the myriad factors that influence it often lead to challenges in interpreting the relationship between independent and dependent variables. Researchers must be wary of:

  • Overestimating the strength of causal relationships based on correlational data.
  • Ignoring potential confounding variables that may influence the observed relationship.
  • Assuming the directionality of the relationship without adequate evidence.

This exploration highlights the importance of understanding independent and dependent variables in human behavior research. Independent variables act as the initiating factors in experiments, influencing the observed behaviors, while dependent variables reflect the results of these influences, providing insights into human emotions and actions. 

Ethical and practical challenges arise, especially in experiments involving human participants, necessitating careful consideration to respect participants’ well-being. The measurement of these variables is critical for testing theories and validating hypotheses, with their relationship offering potential insights into causality and correlation within human behavior. 

Rigorous statistical analysis and cautious interpretation of findings are essential to avoid misconceptions. Overall, the study of these variables is fundamental to advancing human behavior research, guiding researchers towards deeper understanding and potential interventions to improve the human condition.

Variables in Research | Types, Definition & Examples

Introduction

Variables are fundamental components of research that allow for the measurement and analysis of data. They can be defined as characteristics or properties that can take on different values. In research design, understanding the types of variables and their roles is crucial for developing hypotheses, designing methods, and interpreting results.

This article outlines the types of variables in research, including their definitions and examples, to provide a clear understanding of their use and significance in research studies. By categorizing variables into distinct groups based on their roles in research, their types of data, and their relationships with other variables, researchers can more effectively structure their studies and achieve more accurate conclusions.

A variable represents any characteristic, number, or quantity that can be measured or quantified. The term encompasses anything that can vary or change, ranging from simple concepts like age and height to more complex ones like satisfaction levels or economic status. Variables are essential in research as they are the foundational elements that researchers manipulate, measure, or control to gain insights into relationships, causes, and effects within their studies. They enable the framing of research questions, the formulation of hypotheses, and the interpretation of results.

Variables can be categorized based on their role in the study (such as independent and dependent variables), the type of data they represent (quantitative or categorical), and their relationship to other variables (like confounding or control variables). Understanding what constitutes a variable and the various variable types available is a critical step in designing robust and meaningful research.

Variables are crucial components in research, serving as the foundation for data collection, analysis, and interpretation. They are attributes or characteristics that can vary among subjects or over time, and understanding their types is essential for any study. Variables can be broadly classified into five main types, each with its distinct characteristics and roles within research.

This classification helps researchers in designing their studies, choosing appropriate measurement techniques, and analyzing their results accurately. The five types of variables include independent variables, dependent variables, categorical variables, continuous variables, and confounding variables. These categories not only facilitate a clearer understanding of the data but also guide the formulation of hypotheses and research methodologies.

Independent variables

Independent variables are foundational to the structure of research, serving as the factors or conditions that researchers manipulate or vary to observe their effects on dependent variables. These variables are considered "independent" because their variation does not depend on other variables within the study. Instead, they are the cause or stimulus that directly influences the outcomes being measured. For example, in an experiment to assess the effectiveness of a new teaching method on student performance, the teaching method applied (traditional vs. innovative) would be the independent variable.

The selection of an independent variable is a critical step in research design, as it directly correlates with the study's objective to determine causality or association. Researchers must clearly define and control these variables to ensure that observed changes in the dependent variable can be attributed to variations in the independent variable, thereby affirming the reliability of the results. In experimental research, the independent variable is what differentiates the control group from the experimental group, thereby setting the stage for meaningful comparison and analysis.

Dependent variables

Dependent variables are the outcomes or effects that researchers aim to explore and understand in their studies. These variables are called "dependent" because their values depend on the changes or variations of the independent variables.

Essentially, they are the responses or results that are measured to assess the impact of the independent variable's manipulation. For instance, in a study investigating the effect of exercise on weight loss, the amount of weight lost would be considered the dependent variable, as it depends on the exercise regimen (the independent variable).

The identification and measurement of the dependent variable are crucial for testing the hypothesis and drawing conclusions from the research. This allows researchers to quantify the effect of the independent variable, providing evidence for causal relationships or associations. In experimental settings, the dependent variable is what is being tested and measured across different groups or conditions, enabling researchers to assess the efficacy or impact of the independent variable's variation.

To ensure accuracy and reliability, the dependent variable must be defined clearly and measured consistently across all participants or observations. This consistency helps in reducing measurement errors and increases the validity of the research findings. By carefully analyzing the dependent variables, researchers can derive meaningful insights from their studies, contributing to the broader knowledge in their field.

Categorical variables

Categorical variables, also known as qualitative variables, represent types or categories that are used to group observations. These variables divide data into distinct groups or categories that lack a numerical value but hold significant meaning in research. Examples of categorical variables include gender (male, female, other), type of vehicle (car, truck, motorcycle), or marital status (single, married, divorced). These categories help researchers organize data into groups for comparison and analysis.

Categorical variables can be further classified into two subtypes: nominal and ordinal. Nominal variables are categories without any inherent order or ranking among them, such as blood type or ethnicity. Ordinal variables, on the other hand, imply a sort of ranking or order among the categories, like levels of satisfaction (high, medium, low) or education level (high school, bachelor's, master's, doctorate).

Understanding and identifying categorical variables is crucial in research as it influences the choice of statistical analysis methods. Since these variables represent categories without numerical significance, researchers employ specific statistical tests designed for a nominal or ordinal variable to draw meaningful conclusions. Properly classifying and analyzing categorical variables allow for the exploration of relationships between different groups within the study, shedding light on patterns and trends that might not be evident with numerical data alone.
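One such test is the chi-square test of independence for two nominal variables. A minimal sketch follows; the counts, variable names, and the use of pandas and SciPy are assumptions made only for illustration:

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Invented observations: residence type and vehicle type for twelve people
# (with counts this small the result is purely illustrative)
df = pd.DataFrame({
    "residence": ["city"] * 6 + ["rural"] * 6,
    "vehicle":   ["car", "car", "car", "motorcycle", "car", "truck",
                  "truck", "truck", "car", "truck", "motorcycle", "truck"],
})

# Cross-tabulate the two categorical variables
table = pd.crosstab(df["residence"], df["vehicle"])
print(table)

# Chi-square test of independence between the two nominal variables
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```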

Continuous variables

Continuous variables are quantitative variables that can take an infinite number of values within a given range. These variables are measured along a continuum and can represent very precise measurements. Examples of continuous variables include height, weight, temperature, and time. Because they can assume any value within a range, continuous variables allow for detailed analysis and a high degree of accuracy in research findings.

The ability to measure continuous variables at very fine scales makes them invaluable for many types of research, particularly in the natural and social sciences. For instance, in a study examining the effect of temperature on plant growth, temperature would be considered a continuous variable since it can vary across a wide spectrum and be measured to several decimal places.

When dealing with continuous variables, researchers often use methods incorporating a particular statistical test to accommodate a wide range of data points and the potential for infinite divisibility. This includes various forms of regression analysis, correlation, and other techniques suited for modeling and analyzing nuanced relationships between variables. The precision of continuous variables enhances the researcher's ability to detect patterns, trends, and causal relationships within the data, contributing to more robust and detailed conclusions.

Confounding variables

Confounding variables are those that can cause a false association between the independent and dependent variables, potentially leading to incorrect conclusions about the relationship being studied. These are extraneous variables that were not considered in the study design but can influence both the supposed cause and effect, creating a misleading correlation.

Identifying and controlling for a confounding variable is crucial in research to ensure the validity of the findings. This can be achieved through various methods, including randomization, stratification, and statistical control. Randomization helps to evenly distribute confounding variables across study groups, reducing their potential impact. Stratification involves analyzing the data within strata or layers that share common characteristics of the confounder. Statistical control allows researchers to adjust for the effects of confounders in the analysis phase.

Properly addressing confounding variables strengthens the credibility of research outcomes by clarifying the direct relationship between the dependent and independent variables, thus providing more accurate and reliable results.
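The simulation below is a minimal, entirely synthetic sketch of the statistical-control idea (the variable names, effect sizes, and the use of NumPy and statsmodels are assumptions): age confounds the exercise and blood-pressure relationship, so the unadjusted regression overstates the exercise effect, while adding age to the model recovers an estimate close to the true value of -1.0.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500

# Synthetic data: older people exercise less and have higher blood pressure,
# so age confounds the exercise -> blood pressure relationship
age = rng.uniform(20, 70, n)
exercise = 10 - 0.1 * age + rng.normal(0, 1, n)
blood_pressure = 100 + 0.5 * age - 1.0 * exercise + rng.normal(0, 5, n)
df = pd.DataFrame({"age": age, "exercise": exercise, "bp": blood_pressure})

# Unadjusted model: the exercise coefficient is distorted by the confounder
print(smf.ols("bp ~ exercise", data=df).fit().params["exercise"])

# Adjusted model: statistically controlling for age recovers roughly -1.0
print(smf.ols("bp ~ exercise + age", data=df).fit().params["exercise"])
```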

Beyond the primary categories of variables commonly discussed in research methodology, there exists a diverse range of other variables that play significant roles in the design and analysis of studies. Below is an overview of some of these variables, highlighting their definitions and roles within research studies:

  • Discrete variables: A discrete variable is a quantitative variable that can take on only specific, separate values, such as the number of children in a family or the number of cars in a parking lot.
  • Categorical variables: A categorical variable categorizes subjects or items into groups that do not have a natural numerical order. Categorical data includes nominal variables, like country of origin, and ordinal variables, such as education level.
  • Predictor variables: Often used in statistical models, a predictor variable is used to forecast or predict the outcomes of other variables, not necessarily with a causal implication.
  • Outcome variables: These variables represent the results or outcomes that researchers aim to explain or predict through their studies. An outcome variable is central to understanding the effects of predictor variables.
  • Latent variables: Not directly observable, latent variables are inferred from other, directly measured variables. Examples include psychological constructs like intelligence or socioeconomic status.
  • Composite variables: Created by combining multiple variables, composite variables can measure a concept more reliably or simplify the analysis. An example would be a composite happiness index derived from several survey questions (a brief sketch follows this list).
  • Preceding variables: These variables come before other variables in time or sequence, potentially influencing subsequent outcomes. A preceding variable is crucial in longitudinal studies to determine causality or sequences of events.
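A minimal sketch of the composite-variable idea mentioned in the list above (the survey items, scale, and the z-score-then-average construction are illustrative assumptions; other weighting schemes are possible):

```python
import pandas as pd

# Made-up responses to three survey questions related to happiness (1-7 scale)
df = pd.DataFrame({
    "life_satisfaction": [6, 4, 7, 3, 5],
    "positive_affect":   [5, 4, 6, 2, 5],
    "meaning_in_life":   [6, 3, 7, 4, 4],
})

# Standardize each item so items with different spreads contribute equally,
# then average across items to form a single composite happiness index
z_scores = (df - df.mean()) / df.std(ddof=1)
df["happiness_index"] = z_scores.mean(axis=1)
print(df)
```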

Fundamentals of Research Data and Variables: The Devil Is in the Details

Vetter, Thomas R. MD, MPH

From the Department of Surgery and Perioperative Care, Dell Medical School at the University of Texas at Austin, Austin, Texas.

Designing, conducting, analyzing, reporting, and interpreting the findings of a research study require an understanding of the types and characteristics of data and variables. Descriptive statistics are typically used simply to calculate, describe, and summarize the collected research data in a logical, meaningful, and efficient way. Inferential statistics allow researchers to make a valid estimate of the association between an intervention and the treatment effect in a specific population, based upon their randomly collected, representative sample data. Categorical data can be either dichotomous or polytomous. Dichotomous data have only 2 categories, and thus are considered binary. Polytomous data have more than 2 categories. Unlike dichotomous and polytomous data, ordinal data are rank ordered, typically based on a numerical scale that is comprised of a small set of discrete classes or integers. Continuous data are measured on a continuum and can have any numeric value over this continuous range. Continuous data can be meaningfully divided into smaller and smaller or finer and finer increments, depending upon the precision of the measurement instrument. Interval data are a form of continuous data in which equal intervals represent equal differences in the property being measured. Ratio data are another form of continuous data, which have the same properties as interval data, plus a true definition of an absolute zero point, and the ratios of the values on the measurement scale make sense. The normal (Gaussian) distribution (“bell-shaped curve”) is one of the most common statistical distributions. Many applied inferential statistical tests are predicated on the assumption that the analyzed data follow a normal distribution. The histogram and the Q–Q plot are 2 graphical methods to assess if a set of data have a normal distribution (display “normality”). The Shapiro-Wilk test and the Kolmogorov-Smirnov test are 2 well-known and historically widely applied quantitative methods to assess for data normality. Parametric statistical tests make certain assumptions about the characteristics and/or parameters of the underlying population distribution upon which the test is based, whereas nonparametric tests make fewer or less rigorous assumptions. If the normality test concludes that the study data deviate significantly from a Gaussian distribution, rather than applying a less robust nonparametric test, the problem can potentially be remedied by judiciously and openly: (1) performing a data transformation of all the data values; or (2) eliminating any obvious data outlier(s).

Der Teufel steckt im Detail [The devil is in the details]

Friedrich Wilhelm Nietzsche (1844–1900)

Designing, conducting, analyzing, reporting, and interpreting the findings of a research study require an understanding of the types and characteristics of data and variables. This basic statistical tutorial discusses the following fundamental concepts about research data and variables:

  • Population parameter versus sample variable;
  • Types of research variables;
  • Descriptive statistics versus inferential statistics;
  • Primary data versus secondary data and analyses;
  • Measurement scales and types of data;
  • Normal versus non-normal data distribution;
  • Assessing for normality of data;
  • Parametric versus nonparametric statistical tests; and
  • Data transformation to achieve normality.

POPULATION PARAMETER VERSUS SAMPLE VARIABLE

In conducting a research study, one ideally would obtain the pertinent data from all the members of the specific, targeted population, which defines the population parameter. However, this is seldom feasible, unless the entire targeted population is relatively small, and all its members are easily and readily accessible. 1–4

Pertinent data instead are typically collected on a random, representative subset or sample chosen from the members of the overall specific population, which defines the sample variable. The unknown population parameter, representing the characteristic or association of interest, is then estimated from this chosen study sample, with a varying degree of accuracy or precision. One essentially extrapolates from this sample to make conclusions about the population. 1–4
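
To illustrate this distinction, the following minimal Python sketch (using NumPy and a simulated, entirely hypothetical population of blood pressure values) shows a sample statistic being used to estimate an unknown population parameter.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical "population": systolic blood pressure (mm Hg) of 100,000 people.
population = rng.normal(loc=120, scale=15, size=100_000)
population_mean = population.mean()   # the (normally unknown) population parameter

# In practice, data are collected only on a random, representative sample.
sample = rng.choice(population, size=200, replace=False)
sample_mean = sample.mean()           # the sample statistic used to estimate the parameter

print(f"Population mean (parameter): {population_mean:.1f}")
print(f"Sample mean (estimate):      {sample_mean:.1f}")
```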

TYPES OF RESEARCH VARIABLES

When undertaking research, there are 4 basic types of variables to consider and define: 1 , 5–7

  • Independent variable: A variable that is believed to be the cause of some of the observed effect or association and one that is directly manipulated by the researcher during the study or experiment.
  • Dependent variable: A variable that is believed to be directly affected by changes in an independent variable and one that is directly measured by the researcher during the study or experiment.
  • Predictor variable: A variable that is believed to predict another variable and one that is identified, determined, and/or controlled by the researcher during the study or experiment (essentially synonymous with an independent variable).
  • Outcome variable: A variable that is believed to change as a result of a change in a predictor variable and one that is directly measured by the researcher during the study or experiment (essentially synonymous with a dependent variable).

DESCRIPTIVE STATISTICS VERSUS INFERENTIAL STATISTICS

Descriptive statistics are specific methods used simply to calculate, describe, and summarize the collected research data in a logical, meaningful, and efficient way. Descriptive statistics are reported numerically in the text and tables or in graphical forms. 1 , 8 Descriptive statistics will be the topic of the next basic tutorial in this series.

Researchers often pose a hypothesis (“if this is done, then this occurs” or “if this occurs, then this happens”) and seek to describe and to compare the quantitative or qualitative characteristics of 2 or more populations: 1 with and 1 without a specific intervention, or before and after the intervention in the same group. Purely descriptive statistics alone do not allow a conclusion to be made about association or effect and thus cannot answer a research hypothesis. 1 , 8

Inferential statistics involves using available data about a sample variable to make a valid inference (estimate) about its corresponding, underlying, but unknown population parameter. Inferential statistics also allow researchers to make a valid estimate of the association between an intervention and the treatment effect (causal-effect) in a specific population, based upon their randomly collected, representative sample data. 1 , 3 , 8

For example, Castro-Alves et al 9 recently reported on their prospective, randomized, placebo-controlled, double-blinded trial, in which the perioperative administration of duloxetine improved postoperative quality of recovery after abdominal hysterectomy. Based upon their study sample, these researchers made the valid inference that duloxetine appears to be an effective medication to improve postoperative quality of recovery in all similar patients undergoing abdominal hysterectomy. 9

PRIMARY DATA VERSUS SECONDARY DATA AND ANALYSIS

Frequently, there is confusion about the terms primary data and primary data analysis versus secondary data and secondary data analysis. 10

Primary data are intentionally and originally collected for the purposes of a specific research study, and it is a priori planned primary data analysis. 10–12 Primary data are usually collected prospectively but can be collected retrospectively. 12 Valid and reliable primary clinical data collection tends to be time consuming, labor intensive, and costly, especially if undertaken on a large scale and/or at multiple, independent, care-delivery locations or sites. 13

For example, a large-scale randomized study is being undertaken in 40 centers in 5 countries over 3 years to determine whether a stronger association (and thus more likely causality) exists between relatively deep anesthesia, as guided by the bispectral index, and increased postoperative mortality. 14

Likewise, the General Anesthesia compared to Spinal anesthesia study is an ongoing prospective, randomized, controlled, multisite trial designed to assess the influence of general anesthesia on neurodevelopment at 5 years of age. 15

Secondary data are initially collected for other purposes, and these existing data are subsequently used for a research study and its secondary data analyses. 11 , 16 Examples include the myriad of bedside clinical data recorded for routine patient care and administrative claims data utilized for billing and third-party payer purposes. 13

Such hospital administrative data (health care claims data) represent an important alternative data source that can be used to answer a broad range of research questions, including perioperative and critical care medicine, which would be difficult to study with a prospective randomized controlled trial. 17 , 18

Secondary clinical data can also be gathered and coalesced into a large-scale research data repository or warehouse, which is intentionally created for quality assurance, performance improvement, health services, or clinical outcomes research purposes. Data on study-specific variables are then extracted (“abstracted”) from one of these already existing secondary data sources. 16 , 19 , 20 An example is the National Anesthesia Clinical Outcomes Registry, developed by the Anesthesia Quality Institute of the American Society of Anesthesiologists. 21

Despite the resources needed for their creation, maintenance, and extraction, secondary data are typically less time consuming, labor intensive, and costly than primary data, especially if needed on a large scale (eg, health services and outcomes research questions in perioperative and critical care medicine). 22 However, the possible study variables are limited to those that already exist. 16 , 20 , 22 Furthermore, the validity of the findings of the research study can be adversely affected by a poorly constructed or executed secondary data collection or extraction process (“garbage in—garbage out”). 16 , 22

The term “secondary analysis of existing data” is generally preferred to the traditional term “secondary data analysis” because the former avoids the need to decide whether the data used in an analysis are primary or secondary. 10 An example is the predefined secondary analysis of existing data, prospectively collected in the Vascular Events in Non-Cardiac Surgery Patients Cohort Evaluation study, which assessed the association between preoperative heart rate and myocardial injury after noncardiac surgery. 23

MEASUREMENT SCALES AND TYPES OF DATA

Categorical Data

Some demographic and clinical characteristics can be parsed into and described using separate, discrete categories. The key distinction is the lack of rank order to these discrete categories. Categorical data can also be called nominal data (from the Latin word, nomen, for “name”), implying that there is no ordering to the categories, but rather simply names. Categorical data can be either dichotomous (2 categories) or polytomous (more than 2 categories). 1 , 5 , 24 , 25

Dichotomous data have only 2 categories, and thus are considered binary (yes or no; positive or negative). 1 , 5 , 24 , 25 Many clinical outcomes (eg, postoperative nausea/vomiting, myocardial infarction, stroke, sepsis, and mortality) can be recorded and reported as dichotomous data.

Polytomous data have more than 2 categories. Examples of such data include sex (man, woman, or transgender), race/ethnicity (American Indian or Alaska Native, Asian, black or African American, Hispanic or Latino, Native Hawaiian or other Pacific Islander, and white or Caucasian), body habitus (ectomorph, mesomorph, or endomorph), hair color (black, brown, blond, or red), blood type (A, B, AB, or O), and diet (carnivore, omnivore, vegetarian, or vegan). 1 , 5 , 6 , 24 , 25

Ordinal Data

Unlike nominal or categorical data, ordinal data follow a logical order. Ordinal data are rank ordered, typically based on a numerical scale that is comprised of a small set of discrete classes or integers. 1 , 5 , 24 , 25 A key characteristic is that the response categories have a rank order, but the intervals between the values cannot be presumed to be equal. 26 The numeric Likert scale (1 = strongly disagree to 5 = strongly agree), which is commonly used to measure respondent attitude, generates ordinal data. 26 Other examples of ordinal data include socioeconomic status (low, medium, or high), highest educational level completed (elementary, middle school, high school, college, or postcollege graduate), the American Society of Anesthesiologists Physical Status Score (I, II, III, IV, or V), and the 11-point numerical rating scale (0–10) for pain intensity.

Discrete Data

Categorical data that are counts or integers (eg, the number of episodes of intraoperative bradycardia or hypotension experienced by a patient) are typically called discrete data. Discrete data may be more appropriately analyzed using different statistical methods than ordinal data 27 ; however, in practice, the same methods are often used for these 2 variable types. In general, “discrete data variables” refer to those which can only take on certain specific values and are thus distinguished from continuous data, which are discussed next.

Continuous (Interval or Ratio) Data

Continuous data are measured on a continuum and can have or occupy any numeric value over this continuous range. Continuous data can be meaningfully divided into smaller and smaller or finer and finer increments, depending upon the sensitivity or precision of the measurement instrument. 25

Interval data are a form of continuous data in which equal intervals represent equal differences in the property being measured. 1 , 5 , 6 , 28 For example, the 1° difference between a temperature of 37° and 36° is the same 1° difference as between a temperature of 36° and 35°. However, when using the Fahrenheit or Celsius scale, a temperature of 100° is not twice as hot as 50° because a temperature of 0° on either scale does not mean “no heat” (but this would be true for Kelvin temperature). 28 This leads us naturally to a definition of ratio data.

Ratio data are another form of continuous data, which have the same properties as interval data, plus a true definition of an absolute zero point, and the ratios of the values on the measurement scale must make sense. 1 , 5 , 6 , 28 Age, height, weight, heart rate, and blood pressure are also ratio data. For example, a weight of 4 g is twice the weight of 2 g. 28 The visual analog scale (VAS) pain intensity tool generates ratio data. 29 A VAS score of 0 represents no pain, and a VAS score of 60 actually represents twice as much pain as a VAS score of 30.
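
As a rough illustration of these measurement scales, the following sketch (assuming the pandas library; the variable names and values are hypothetical) represents dichotomous and polytomous nominal data as unordered categoricals, ordinal data as an ordered categorical, and continuous (ratio) data as a numeric column.

```python
import pandas as pd

df = pd.DataFrame({
    # Dichotomous (nominal): only 2 categories, no inherent order.
    "postop_nausea": pd.Categorical(["yes", "no", "no", "yes"]),
    # Polytomous (nominal): more than 2 categories, still no order.
    "blood_type":    pd.Categorical(["A", "O", "AB", "B"]),
    # Ordinal: rank-ordered categories; intervals between levels are not assumed equal.
    "asa_status":    pd.Categorical(["I", "III", "II", "II"],
                                    categories=["I", "II", "III", "IV", "V"],
                                    ordered=True),
    # Continuous (ratio): measured on a continuum with a true zero point.
    "weight_kg":     [68.4, 82.1, 59.9, 74.3],
})

print(df.dtypes)
print(df["asa_status"].min())   # ordering is meaningful only for the ordinal column
```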

NORMAL VERSUS NON-NORMAL DATA DISTRIBUTION

A statistical distribution is a graph of the possible specific values or the intervals of values of a variable (on the x-axis) and how often the observed values occur (on the y-axis). There are multiple types of data distribution, including the normal (Gaussian) distribution, binomial distribution, and Poisson distribution. 30 , 31 The so-inclined reader is referred to a more in-depth discussion of the various types or patterns of data distribution. 32

[Figure 1. The normal (Gaussian) distribution (“bell-shaped curve”).]

The normal (Gaussian) distribution (the “bell-shaped curve”) ( Figure 1 ) is one of the most common statistical distributions. 33 , 34 Many applied inferential statistical tests are predicated on the assumption that the analyzed data follow a normal distribution. Therefore, the normal distribution is also one of the most relevant to basic inferential statistics. 30 , 31 , 33–35

METHODS FOR ASSESSING DATA NORMALITY

The histogram and the Q–Q plot are 2 graphical methods to visually assess if a set of data have a normal distribution (display “normality”). The Shapiro-Wilk test and Kolmogorov-Smirnov test are 2 well-known and historically widely applied quantitative methods to assess for data normality. 36 Graphical methods and quantitative testing can complement one another; therefore, it is preferable that data normality be assessed both visually and with a statistical test. 30 , 37 , 38 However, if one is uncertain about how to correctly interpret the more subjective histogram or the Q–Q plot, it is better to rely instead on a numerical test statistic. 37 See the study by Kuhn et al 39 for an example.

[Figure 2. Histograms of (A) normally distributed data and (B) non-normally distributed data.]

The histogram or frequency distribution of the study data can be used to graphically assess for normality. If the study data are normally distributed, the histogram or frequency distribution of these data will fall within the shape of a bell curve ( Figure 2A ), whereas if the study data are not normally distributed, the histogram or frequency distribution of these data will fall outside the shape of a bell curve ( Figure 2B ). 35 When applicable, authors state in their manuscript their use of a histogram to assess the normality of their primary outcome data, but they do not reproduce this graph. See the study by Blitz et al 40 for an example.

[Figure 3. Q–Q plots of (A) normally distributed data and (B) non-normally distributed data.]

One can also use the output of a quantile–quantile or Q–Q plot to graphically assess if a set of data plausibly came from a normal distribution. The Q–Q plot is a scatterplot of the quantiles of a theoretical normal data set (on x-axis) and the quantiles of the actual sample data set (on y-axis). If the data are normally distributed, the data points on the Q–Q plot will be closely aligned with the 45° reference diagonal line ( Figure 3A ). If the individual data points stray from the reference diagonal line in an obviously nonlinear fashion, the data are not normally distributed ( Figure 3B ). When applicable, authors state in their manuscript their use of a Q–Q plot to assess the normality of their primary outcome data, but they do not reproduce this graph. See the study by Jæger et al 41 for an example.
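
A minimal sketch of these 2 graphical checks, assuming Matplotlib and SciPy and using simulated data purely for illustration, is shown below; scipy.stats.probplot draws the Q–Q plot of the sample against theoretical normal quantiles.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(seed=42)
data = rng.normal(loc=50, scale=10, size=200)   # simulated, approximately normal study data

fig, (ax_hist, ax_qq) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: roughly bell-shaped if the data are approximately normal.
ax_hist.hist(data, bins=20, edgecolor="black")
ax_hist.set_title("Histogram")

# Q-Q plot: theoretical normal quantiles (x-axis) vs ordered sample values (y-axis);
# points close to the reference line suggest normality.
stats.probplot(data, dist="norm", plot=ax_qq)
ax_qq.set_title("Normal Q-Q plot")

plt.tight_layout()
plt.show()
```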

Shapiro-Wilk Test and Kolmogorov-Smirnov Test

Both the Shapiro-Wilk and the Kolmogorov-Smirnov tests compare the scores in the study sample with a normally distributed set of scores with the same mean and SD; their null hypothesis is that the sample distribution is normal. Therefore, if the test is significant ( P < .05), the sample data distribution is non-normal. 30 , 36 When applicable, authors should state in their manuscript which test was used to assess the normality of their primary outcome data and report its corresponding P value.

The Shapiro-Wilk test is more appropriate for small sample sizes (N ≤ 50), but it can also be validly applied with large sample sizes. The Shapiro-Wilk test provides greater power than the Kolmogorov-Smirnov test (even with its Lilliefors correction). For these reasons, the Shapiro-Wilk test has been recommended as the numerical means for assessing data normality. 30 , 36 , 37
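
The following sketch, assuming SciPy and simulated right-skewed data, illustrates both tests; note that the plain Kolmogorov-Smirnov test shown here estimates the mean and SD from the same sample, which is why the Lilliefors correction is mentioned in the comments.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=42)
data = rng.lognormal(mean=0.0, sigma=0.8, size=60)   # simulated, right-skewed study data

# Shapiro-Wilk test: null hypothesis = the sample was drawn from a normal distribution.
w_stat, p_shapiro = stats.shapiro(data)
print(f"Shapiro-Wilk:       W = {w_stat:.3f}, P = {p_shapiro:.4f}")

# Kolmogorov-Smirnov test against a normal distribution with the sample mean and SD.
# (Estimating these parameters from the same sample makes the plain K-S test too
# lenient; the Lilliefors correction, e.g., in statsmodels, addresses this.)
ks_stat, p_ks = stats.kstest(data, "norm", args=(data.mean(), data.std(ddof=1)))
print(f"Kolmogorov-Smirnov: D = {ks_stat:.3f}, P = {p_ks:.4f}")

# On either test, P < .05 suggests the data deviate from a normal distribution.
```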

PARAMETRIC VERSUS NONPARAMETRIC STATISTICAL TESTS

The details and appropriate use of the wide array of available inferential statistical tests will be the topics of several future tutorials in this current series.

These statistical tests are commonly classified as parametric versus nonparametric. This distinction is generally predicated on the number and rigor of the assumptions (requirements) regarding the underlying study population. 42 Parametric statistical tests make certain assumptions about the characteristics and/or parameters of the underlying population distribution upon which the test is based, whereas nonparametric tests make fewer or less rigorous assumptions. 42

Specifically, parametric statistical tests assume that the data have been sampled from a specific probability distribution (a normal distribution); nonparametric statistical tests make no such distribution assumption. 43 , 44 In general, parametric tests are more powerful (“robust”) than nonparametric tests, and so if possible, a parametric test should be applied. 43
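
For illustration, the sketch below (assuming SciPy, with simulated group data) applies a parametric independent-samples t test and a common nonparametric counterpart, the Mann-Whitney U test, to the same 2 groups.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)

# Simulated recovery scores for a control and a treatment group.
control   = rng.normal(loc=60, scale=12, size=40)
treatment = rng.normal(loc=68, scale=12, size=40)

# Parametric: independent-samples t test (assumes normally distributed data).
t_stat, p_t = stats.ttest_ind(control, treatment)
print(f"t test:         t = {t_stat:.2f}, P = {p_t:.4f}")

# Nonparametric: Mann-Whitney U test (no normality assumption, generally less powerful).
u_stat, p_u = stats.mannwhitneyu(control, treatment, alternative="two-sided")
print(f"Mann-Whitney U: U = {u_stat:.1f}, P = {p_u:.4f}")
```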

DATA TRANSFORMATION TO ACHIEVE NORMALITY

Researchers may find that their available study data are not normally distributed, ostensibly calling into question the validity of using a more robust parametric statistical test.

While the results of the above tests of normality are typically reported (including in Anesthesia & Analgesia ), they are not a panacea. With small sample sizes, these normality tests do not have much power to detect a non-Gaussian distribution. With large sample sizes, minor deviations from the Gaussian “ideal” might be deemed “statistically significant” by a normality test; however, the commonly applied parametric t test and analysis of variance are then fairly tolerant of a violation of the normality assumption. 36 , 45 The decision to apply a parametric test versus a nonparametric test is thus sometimes a difficult one, requiring thought and perspective, and should not be simply automated. 36

If the normality test concludes that study data deviate significantly from a Gaussian distribution, rather than applying a less robust nonparametric test, the problem can potentially be remedied by judiciously and openly: (1) performing a data transformation of all the data values; or (2) eliminating any obvious data outlier(s). 36 , 46 Most commonly, logarithmic, square root, or reciprocal data transformation are applied to achieve data normality. 47 See the studies by Law et al 48 and Maquoi et al 49 for examples.
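
A minimal sketch of this approach, assuming NumPy and SciPy and using simulated right-skewed data, applies the commonly used transformations and then re-checks normality with the Shapiro-Wilk test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=3)
skewed = rng.lognormal(mean=1.0, sigma=0.9, size=80)   # simulated, right-skewed data

_, p_raw = stats.shapiro(skewed)
print(f"Raw data:             Shapiro-Wilk P = {p_raw:.4f}")

# Transformations commonly applied (openly, and to every data value) to achieve normality:
log_transformed        = np.log(skewed)    # logarithmic
sqrt_transformed       = np.sqrt(skewed)   # square root
reciprocal_transformed = 1.0 / skewed      # reciprocal

_, p_log = stats.shapiro(log_transformed)
print(f"Log-transformed data: Shapiro-Wilk P = {p_log:.4f}")
```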

CONCLUSIONS

A basic understanding of data and variables is required to design, conduct, analyze, report, and interpret, as well as to understand and apply, the findings of a research study. The assumption of study data demonstrating a normal (Gaussian) distribution, and the corresponding choice of a parametric versus nonparametric statistical test, can be a complex and vexing issue. As will be discussed in detail in future tutorials, the type and characteristics of study data and variables essentially determine the appropriate descriptive statistics and inferential statistical tests to apply.

DISCLOSURES

Name: Thomas R. Vetter, MD, MPH.

Contribution: This author wrote and revised the manuscript.

This manuscript was handled by: Jean-Francois Pittet, MD.



Elements of Research

The purpose of all research is to describe and explain variance in the world. Variance is simply the difference; that is, variation that occurs naturally in the world or change that we create as a result of a manipulation. Variables are names that are given to the variance we wish to explain.

A variable is either a result of some force or is itself the force that causes a change in another variable. In experiments, these are called dependent and independent variables, respectively. When a researcher gives an active drug to one group of people and a placebo, or inactive drug, to another group of people, the independent variable is the drug treatment. Each person's response to the active drug or placebo is called the dependent variable. This could be many things depending upon what the drug is for, such as high blood pressure or muscle pain. Therefore, in experiments, a researcher manipulates an independent variable to determine if it causes a change in the dependent variable.

As we learned earlier, in a descriptive study variables are not manipulated. They are observed as they naturally occur, and then associations between variables are studied. In a way, all the variables in descriptive studies are dependent variables because they are studied in relation to all the other variables that exist in the setting where the research is taking place. However, in descriptive studies, variables are not discussed using the terms "independent" or "dependent." Instead, the names of the variables are used when discussing the study. For example, there is more diabetes in people of Native American heritage than in people who come from Eastern Europe. In a descriptive study, the researcher would examine how diabetes (a variable) is related to a person's genetic heritage (another variable).

Variables are important to understand because they are the basic units of the information studied and interpreted in research studies. Researchers carefully analyze and interpret the value(s) of each variable to make sense of how things relate to each other in a descriptive study or what has happened in an experiment.

Frequently asked questions

Why are independent and dependent variables important?

Determining cause and effect is one of the most important parts of scientific research. It’s essential to know which is the cause – the independent variable – and which is the effect – the dependent variable.

Frequently asked questions: Methodology

Attrition refers to participants leaving a study. It always happens to some extent—for example, in randomized controlled trials for medical research.

Differential attrition occurs when attrition or dropout rates differ systematically between the intervention and the control group . As a result, the characteristics of the participants who drop out differ from the characteristics of those who stay in the study. Because of this, study results may be biased .

Action research is conducted in order to solve a particular issue immediately, while case studies are often conducted over a longer period of time and focus more on observing and analyzing a particular ongoing phenomenon.

Action research is focused on solving a problem or informing individual and community-based knowledge in a way that impacts teaching, learning, and other related processes. It is less focused on contributing theoretical input, instead producing actionable input.

Action research is particularly popular with educators as a form of systematic inquiry because it prioritizes reflection and bridges the gap between theory and practice. Educators are able to simultaneously investigate an issue as they solve it, and the method is very iterative and flexible.

A cycle of inquiry is another name for action research . It is usually visualized in a spiral shape following a series of steps, such as “planning → acting → observing → reflecting.”

To make quantitative observations , you need to use instruments that are capable of measuring the quantity you want to observe. For example, you might use a ruler to measure the length of an object or a thermometer to measure its temperature.

Criterion validity and construct validity are both types of measurement validity . In other words, they both show you how accurately a method measures something.

While construct validity is the degree to which a test or other measurement method measures what it claims to measure, criterion validity is the degree to which a test can predictively (in the future) or concurrently (in the present) measure something.

Construct validity is often considered the overarching type of measurement validity . You need to have face validity , content validity , and criterion validity in order to achieve construct validity.

Convergent validity and discriminant validity are both subtypes of construct validity . Together, they help you evaluate whether a test measures the concept it was designed to measure.

  • Convergent validity indicates whether a test that is designed to measure a particular construct correlates with other tests that assess the same or similar construct.
  • Discriminant validity indicates whether two tests that should not be highly related to each other are indeed not related. This type of validity is also called divergent validity .

You need to assess both in order to demonstrate construct validity. Neither one alone is sufficient for establishing construct validity.


Content validity shows you how accurately a test or other measurement method taps into the various aspects of the specific construct you are researching.

In other words, it helps you answer the question: “does the test measure all aspects of the construct I want to measure?” If it does, then the test has high content validity.

The higher the content validity, the more accurate the measurement of the construct.

If the test fails to include parts of the construct, or irrelevant parts are included, the validity of the instrument is threatened, which brings your results into question.

Face validity and content validity are similar in that they both evaluate how suitable the content of a test is. The difference is that face validity is subjective, and assesses content at surface level.

When a test has strong face validity, anyone would agree that the test’s questions appear to measure what they are intended to measure.

For example, looking at a 4th grade math test consisting of problems in which students have to add and multiply, most people would agree that it has strong face validity (i.e., it looks like a math test).

On the other hand, content validity evaluates how well a test represents all the aspects of a topic. Assessing content validity is more systematic and relies on expert evaluation of each question, analyzing whether each one covers the aspects that the test was designed to cover.

A 4th grade math test would have high content validity if it covered all the skills taught in that grade. Experts (in this case, math teachers) would have to evaluate the content validity by comparing the test to the learning objectives.

Snowball sampling is a non-probability sampling method . Unlike probability sampling (which involves some form of random selection ), the initial individuals selected to be studied are the ones who recruit new participants.

Because not every member of the target population has an equal chance of being recruited into the sample, selection in snowball sampling is non-random.

Snowball sampling is a non-probability sampling method , where there is not an equal chance for every member of the population to be included in the sample .

This means that you cannot use inferential statistics and make generalizations —often the goal of quantitative research . As such, a snowball sample is not representative of the target population and is usually a better fit for qualitative research .

Snowball sampling relies on the use of referrals. Here, the researcher recruits one or more initial participants, who then recruit the next ones.

Participants share similar characteristics and/or know each other. Because of this, not every member of the population has an equal chance of being included in the sample, giving rise to sampling bias .

Snowball sampling is best used in the following cases:

  • If there is no sampling frame available (e.g., people with a rare disease)
  • If the population of interest is hard to access or locate (e.g., people experiencing homelessness)
  • If the research focuses on a sensitive topic (e.g., extramarital affairs)

The reproducibility and replicability of a study can be ensured by writing a transparent, detailed method section and using clear, unambiguous language.

Reproducibility and replicability are related terms.

  • Reproducing research entails reanalyzing the existing data in the same manner.
  • Replicating (or repeating ) the research entails reconducting the entire analysis, including the collection of new data . 
  • A successful reproduction shows that the data analyses were conducted in a fair and honest manner.
  • A successful replication shows that the reliability of the results is high.

Stratified sampling and quota sampling both involve dividing the population into subgroups and selecting units from each subgroup. The purpose in both cases is to select a representative sample and/or to allow comparisons between subgroups.

The main difference is that in stratified sampling, you draw a random sample from each subgroup ( probability sampling ). In quota sampling you select a predetermined number or proportion of units, in a non-random manner ( non-probability sampling ).
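
As a rough illustration of the stratified (probability) case, the sketch below assumes pandas and a hypothetical sampling frame with a stratum column; quota sampling would instead fill a preset quota for each subgroup in a non-random manner.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=7)

# Hypothetical sampling frame with a stratum (subgroup) label for each person.
frame = pd.DataFrame({
    "person_id": range(1000),
    "stratum":   rng.choice(["urban", "suburban", "rural"], size=1000, p=[0.5, 0.3, 0.2]),
})

# Stratified sampling: a *random* 10% draw from every subgroup (probability sampling).
# Quota sampling would instead recruit non-randomly (e.g., whoever is most
# conveniently available) until each subgroup's quota is met.
stratified_sample = frame.groupby("stratum").sample(frac=0.10, random_state=7)

print(stratified_sample["stratum"].value_counts())
```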

Purposive and convenience sampling are both sampling methods that are typically used in qualitative data collection.

A convenience sample is drawn from a source that is conveniently accessible to the researcher. Convenience sampling does not distinguish characteristics among the participants. On the other hand, purposive sampling focuses on selecting participants possessing characteristics associated with the research study.

The findings of studies based on either convenience or purposive sampling can only be generalized to the (sub)population from which the sample is drawn, and not to the entire population.

Random sampling or probability sampling is based on random selection. This means that each unit has an equal chance (i.e., equal probability) of being included in the sample.

On the other hand, convenience sampling involves stopping people at random, which means that not everyone has an equal chance of being selected depending on the place, time, or day you are collecting your data.

Convenience sampling and quota sampling are both non-probability sampling methods. They both use non-random criteria like availability, geographical proximity, or expert knowledge to recruit study participants.

However, in convenience sampling, you continue to sample units or cases until you reach the required sample size.

In quota sampling, you first need to divide your population of interest into subgroups (strata) and estimate their proportions (quota) in the population. Then you can start your data collection, using convenience sampling to recruit participants, until the proportions in each subgroup coincide with the estimated proportions in the population.

A sampling frame is a list of every member in the entire population . It is important that the sampling frame is as complete as possible, so that your sample accurately reflects your population.

Stratified and cluster sampling may look similar, but bear in mind that groups created in cluster sampling are heterogeneous , so the individual characteristics in the cluster vary. In contrast, groups created in stratified sampling are homogeneous , as units share characteristics.

Relatedly, in cluster sampling you randomly select entire groups and include all units of each group in your sample. However, in stratified sampling, you select some units of all groups and include them in your sample. In this way, both methods can ensure that your sample is representative of the target population .

A systematic review is secondary research because it uses existing research. You don’t collect new data yourself.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment .

An observational study is a great choice for you if your research question is based purely on observations. If there are ethical, logistical, or practical concerns that prevent you from conducting a traditional experiment , an observational study may be a good choice. In an observational study, there is no interference or manipulation of the research subjects, as well as no control or treatment groups .

It’s often best to ask a variety of people to review your measurements. You can ask experts, such as other researchers, or laypeople, such as potential participants, to judge the face validity of tests.

While experts have a deep understanding of research methods , the people you’re studying can provide you with valuable insights you may have missed otherwise.

Face validity is important because it’s a simple first step to measuring the overall validity of a test or technique. It’s a relatively intuitive, quick, and easy way to start checking whether a new measure seems useful at first glance.

Good face validity means that anyone who reviews your measure says that it seems to be measuring what it’s supposed to. With poor face validity, someone reviewing your measure may be left confused about what you’re measuring and why you’re using this method.

Face validity is about whether a test appears to measure what it’s supposed to measure. This type of validity is concerned with whether a measure seems relevant and appropriate for what it’s assessing only on the surface.

Statistical analyses are often applied to test validity with data from your measures. You test convergent validity and discriminant validity with correlations to see if results from your test are positively or negatively related to those of other established tests.

You can also use regression analyses to assess whether your measure is actually predictive of outcomes that you expect it to predict theoretically. A regression analysis that supports your expectations strengthens your claim of construct validity .

When designing or evaluating a measure, construct validity helps you ensure you’re actually measuring the construct you’re interested in. If you don’t have construct validity, you may inadvertently measure unrelated or distinct constructs and lose precision in your research.

Construct validity is often considered the overarching type of measurement validity, because it covers all of the other types. You need to have face validity, content validity, and criterion validity to achieve construct validity.

Construct validity is about how well a test measures the concept it was designed to evaluate. It's one of four types of measurement validity, which also include content validity, face validity, and criterion validity.

There are two subtypes of construct validity.

  • Convergent validity : The extent to which your measure corresponds to measures of related constructs
  • Discriminant validity : The extent to which your measure is unrelated or negatively related to measures of distinct constructs

Naturalistic observation is a valuable tool because of its flexibility, external validity , and suitability for topics that can’t be studied in a lab setting.

The downsides of naturalistic observation include its lack of scientific control , ethical considerations , and potential for bias from observers and subjects.

Naturalistic observation is a qualitative research method where you record the behaviors of your research subjects in real world settings. You avoid interfering or influencing anything in a naturalistic observation.

You can think of naturalistic observation as “people watching” with a purpose.

A dependent variable is what changes as a result of the independent variable manipulation in experiments . It’s what you’re interested in measuring, and it “depends” on your independent variable.

In statistics, dependent variables are also called:

  • Response variables (they respond to a change in another variable)
  • Outcome variables (they represent the outcome you want to measure)
  • Left-hand-side variables (they appear on the left-hand side of a regression equation)

An independent variable is the variable you manipulate, control, or vary in an experimental study to explore its effects. It’s called “independent” because it’s not influenced by any other variables in the study.

Independent variables are also called:

  • Explanatory variables (they explain an event or outcome)
  • Predictor variables (they can be used to predict the value of a dependent variable)
  • Right-hand-side variables (they appear on the right-hand side of a regression equation).
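
Putting the two previous answers together, here is a minimal sketch (NumPy only; the variables and numbers are hypothetical) in which the independent variable sits on the right-hand side and the dependent variable on the left-hand side of a fitted regression equation.

```python
import numpy as np

rng = np.random.default_rng(seed=5)

# Independent (predictor) variable: drug dose in mg, varied by the researcher.
dose_mg = rng.uniform(0, 100, size=50)

# Dependent (outcome/response) variable: reduction in blood pressure, measured.
bp_reduction = 2.0 + 0.15 * dose_mg + rng.normal(scale=2.0, size=50)

# Fit  bp_reduction = intercept + slope * dose_mg
# (dependent variable on the left-hand side, independent on the right-hand side).
slope, intercept = np.polyfit(dose_mg, bp_reduction, deg=1)
print(f"bp_reduction ≈ {intercept:.2f} + {slope:.3f} * dose_mg")
```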

As a rule of thumb, questions related to thoughts, beliefs, and feelings work well in focus groups. Take your time formulating strong questions, paying special attention to phrasing. Be careful to avoid leading questions , which can bias your responses.

Overall, your focus group questions should be:

  • Open-ended and flexible
  • Impossible to answer with “yes” or “no” (questions that start with “why” or “how” are often best)
  • Unambiguous, getting straight to the point while still stimulating discussion
  • Unbiased and neutral

A structured interview is a data collection method that relies on asking questions in a set order to collect data on a topic. They are often quantitative in nature. Structured interviews are best used when: 

  • You already have a very clear understanding of your topic. Perhaps significant research has already been conducted, or you have done some prior research yourself, but you already possess a baseline for designing strong structured questions.
  • You are constrained in terms of time or resources and need to analyze your data quickly and efficiently.
  • Your research question depends on strong parity between participants, with environmental conditions held constant.

More flexible interview options include semi-structured interviews , unstructured interviews , and focus groups .

Social desirability bias is the tendency for interview participants to give responses that will be viewed favorably by the interviewer or other participants. It occurs in all types of interviews and surveys , but is most common in semi-structured interviews , unstructured interviews , and focus groups .

Social desirability bias can be mitigated by ensuring participants feel at ease and comfortable sharing their views. Make sure to pay attention to your own body language and any physical or verbal cues, such as nodding or widening your eyes.

This type of bias can also occur in observations if the participants know they’re being observed. They might alter their behavior accordingly.

The interviewer effect is a type of bias that emerges when a characteristic of an interviewer (race, age, gender identity, etc.) influences the responses given by the interviewee.

There is a risk of an interviewer effect in all types of interviews , but it can be mitigated by writing really high-quality interview questions.

A semi-structured interview is a blend of structured and unstructured types of interviews. Semi-structured interviews are best used when:

  • You have prior interview experience. Spontaneous questions are deceptively challenging, and it’s easy to accidentally ask a leading question or make a participant uncomfortable.
  • Your research question is exploratory in nature. Participant answers can guide future research questions and help you develop a more robust knowledge base for future research.

An unstructured interview is the most flexible type of interview, but it is not always the best fit for your research topic.

Unstructured interviews are best used when:

  • You are an experienced interviewer and have a very strong background in your research topic, since it is challenging to ask spontaneous, colloquial questions.
  • Your research question is exploratory in nature. While you may have developed hypotheses, you are open to discovering new or shifting viewpoints through the interview process.
  • You are seeking descriptive data, and are ready to ask questions that will deepen and contextualize your initial thoughts and hypotheses.
  • Your research depends on forming connections with your participants and making them feel comfortable revealing deeper emotions, lived experiences, or thoughts.

The four most common types of interviews are:

  • Structured interviews : The questions are predetermined in both topic and order. 
  • Semi-structured interviews : A few questions are predetermined, but other questions aren’t planned.
  • Unstructured interviews : None of the questions are predetermined.
  • Focus group interviews : The questions are presented to a group instead of one individual.

Deductive reasoning is commonly used in scientific research, and it’s especially associated with quantitative research .

In research, you might have come across something called the hypothetico-deductive method . It’s the scientific method of testing hypotheses to check whether your predictions are substantiated by real-world data.

Deductive reasoning is a logical approach where you progress from general ideas to specific conclusions. It’s often contrasted with inductive reasoning , where you start with specific observations and form general conclusions.

Deductive reasoning is also called deductive logic.

There are many different types of inductive reasoning that people use formally or informally.

Here are a few common types:

  • Inductive generalization : You use observations about a sample to come to a conclusion about the population it came from.
  • Statistical generalization: You use specific numbers about samples to make statements about populations.
  • Causal reasoning: You make cause-and-effect links between different things.
  • Sign reasoning: You make a conclusion about a correlational relationship between different things.
  • Analogical reasoning: You make a conclusion about something based on its similarities to something else.

Inductive reasoning is a bottom-up approach, while deductive reasoning is top-down.

Inductive reasoning takes you from the specific to the general, while in deductive reasoning, you make inferences by going from general premises to specific conclusions.

In inductive research , you start by making observations or gathering data. Then, you take a broad scan of your data and search for patterns. Finally, you make general conclusions that you might incorporate into theories.

Inductive reasoning is a method of drawing conclusions by going from the specific to the general. It’s usually contrasted with deductive reasoning, where you proceed from general information to specific conclusions.

Inductive reasoning is also called inductive logic or bottom-up reasoning.

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess — it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).

Triangulation can help:

  • Reduce research bias that comes from using a single method, theory, or investigator
  • Enhance validity by approaching the same topic with different tools
  • Establish credibility by giving you a complete picture of the research problem

But triangulation can also pose problems:

  • It’s time-consuming and labor-intensive, often involving an interdisciplinary team.
  • Your results may be inconsistent or even contradictory.

There are four main types of triangulation :

  • Data triangulation : Using data from different times, spaces, and people
  • Investigator triangulation : Involving multiple researchers in collecting or analyzing data
  • Theory triangulation : Using varying theoretical perspectives in your research
  • Methodological triangulation : Using different methodologies to approach the same topic

Many academic fields use peer review , largely to determine whether a manuscript is suitable for publication. Peer review enhances the credibility of the published manuscript.

However, peer review is also common in non-academic settings. The United Nations, the European Union, and many individual nations use peer review to evaluate grant applications. It is also widely used in medical and health-related fields as a teaching or quality-of-care measure. 

Peer assessment is often used in the classroom as a pedagogical tool. Both receiving feedback and providing it are thought to enhance the learning process, helping students think critically and collaboratively.

Peer review can stop obviously problematic, falsified, or otherwise untrustworthy research from being published. It also represents an excellent opportunity to get feedback from renowned experts in your field. It acts as a first defense, helping you ensure your argument is clear and that there are no gaps, vague terms, or unanswered questions for readers who weren’t involved in the research process.

Peer-reviewed articles are considered a highly credible source due to the stringent process they go through before publication.

In general, the peer review process follows these steps:

  • First, the author submits the manuscript to the editor.
  • The editor then either rejects the manuscript and sends it back to the author, or sends it onward to the selected peer reviewer(s).
  • Next, the peer review process occurs. The reviewer provides feedback, addressing any major or minor issues with the manuscript, and gives their advice regarding what edits should be made.
  • Lastly, the edited manuscript is sent back to the author. They input the edits and resubmit it to the editor for publication.

Exploratory research is often used when the issue you’re studying is new or when the data collection process is challenging for some reason.

You can use exploratory research if you have a general idea or a specific question that you want to study but there is no preexisting knowledge or paradigm with which to study it.

Exploratory research is a methodology approach that explores research questions that have not previously been studied in depth. It is often used when the issue you’re studying is new, or the data collection process is challenging in some way.

Explanatory research is used to investigate how or why a phenomenon occurs. Therefore, this type of research is often one of the first stages in the research process , serving as a jumping-off point for future research.

Exploratory research aims to explore the main aspects of an under-researched problem, while explanatory research aims to explain the causes and consequences of a well-defined problem.

Explanatory research is a research method used to investigate how or why something occurs when only a small amount of information is available pertaining to that topic. It can help you increase your understanding of a given topic.

Clean data are valid, accurate, complete, consistent, unique, and uniform. Dirty data include inconsistencies and errors.

Dirty data can come from any part of the research process, including poor research design , inappropriate measurement materials, or flawed data entry.

Data cleaning takes place between data collection and data analyses. But you can use some methods even before collecting data.

For clean data, you should start by designing measures that collect valid data. Data validation at the time of data entry or collection helps you minimize the amount of data cleaning you’ll need to do.

After data collection, you can use data standardization and data transformation to clean your data. You’ll also deal with any missing values, outliers, and duplicate values.

Every dataset requires different techniques to clean dirty data , but you need to address these issues in a systematic way. You focus on finding and resolving data points that don’t agree or fit with the rest of your dataset.

These data might be missing values, outliers, duplicate values, incorrectly formatted, or irrelevant. You’ll start with screening and diagnosing your data. Then, you’ll often standardize and accept or remove data to make your dataset consistent and valid.
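
A minimal pandas sketch of this screening step is shown below; the dataset and the plausible-range threshold are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical "dirty" dataset.
df = pd.DataFrame({
    "participant": [1, 2, 2, 3, 4, 5],
    "weight_kg":   [70.2, 81.5, 81.5, np.nan, 650.0, 64.8],
})

df = df.drop_duplicates()                   # remove duplicate rows
df = df.dropna(subset=["weight_kg"])        # handle (here: drop) missing values
df = df[df["weight_kg"].between(30, 300)]   # screen out values outside a plausible range

print(df)
```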

Data cleaning is necessary for valid and appropriate analyses. Dirty data contain inconsistencies or errors , but cleaning your data helps you minimize or resolve these.

Without data cleaning, you could end up with a Type I or II error in your conclusion. These types of erroneous conclusions can be practically significant with important consequences, because they lead to misplaced investments or missed opportunities.

Data cleaning involves spotting and resolving potential data inconsistencies or errors to improve your data quality. An error is any value (e.g., recorded weight) that doesn’t reflect the true value (e.g., actual weight) of something that’s being measured.

In this process, you review, analyze, detect, modify, or remove “dirty” data to make your dataset “clean.” Data cleaning is also called data cleansing or data scrubbing.

Research misconduct means making up or falsifying data, manipulating data analyses, or misrepresenting results in research reports. It’s a form of academic fraud.

These actions are committed intentionally and can have serious consequences; research misconduct is not a simple mistake or a point of disagreement but a serious ethical failure.

Anonymity means you don’t know who the participants are, while confidentiality means you know who they are but remove identifying information from your research report. Both are important ethical considerations .

You can only guarantee anonymity by not collecting any personally identifying information—for example, names, phone numbers, email addresses, IP addresses, physical characteristics, photos, or videos.

You can keep data confidential by using aggregate information in your research report, so that you only refer to groups of participants rather than individuals.

Research ethics matter for scientific integrity, human rights and dignity, and collaboration between science and society. These principles make sure that participation in studies is voluntary, informed, and safe.

Ethical considerations in research are a set of principles that guide your research designs and practices. These principles include voluntary participation, informed consent, anonymity, confidentiality, potential for harm, and results communication.

Scientists and researchers must always adhere to a certain code of conduct when collecting data from others .

These considerations protect the rights of research participants, enhance research validity , and maintain scientific integrity.

In multistage sampling , you can use probability or non-probability sampling methods .

For a probability sample, you have to conduct probability sampling at every stage.

You can mix it up by using simple random sampling , systematic sampling , or stratified sampling to select units at different stages, depending on what is applicable and relevant to your study.

Multistage sampling can simplify data collection when you have large, geographically spread samples, and you can obtain a probability sample without a complete sampling frame.

But multistage sampling may not lead to a representative sample, and larger samples are needed for multistage samples to achieve the statistical properties of simple random samples .

These are four of the most common mixed methods designs :

  • Convergent parallel: Quantitative and qualitative data are collected at the same time and analyzed separately. After both analyses are complete, compare your results to draw overall conclusions. 
  • Embedded: Quantitative and qualitative data are collected at the same time, but within a larger quantitative or qualitative design. One type of data is secondary to the other.
  • Explanatory sequential: Quantitative data is collected and analyzed first, followed by qualitative data. You can use this design if you think your qualitative data will explain and contextualize your quantitative findings.
  • Exploratory sequential: Qualitative data is collected and analyzed first, followed by quantitative data. You can use this design if you think the quantitative data will confirm or validate your qualitative findings.

Triangulation in research means using multiple datasets, methods, theories and/or investigators to address a research question. It’s a research strategy that can help you enhance the validity and credibility of your findings.

Triangulation is mainly used in qualitative research , but it’s also commonly applied in quantitative research . Mixed methods research always uses triangulation.

In multistage sampling , or multistage cluster sampling, you draw a sample from a population using smaller and smaller groups at each stage.

This method is often used to collect data from a large, geographically spread group of people in national surveys, for example. You take advantage of hierarchical groupings (e.g., from state to city to neighborhood) to create a sample that’s less expensive and time-consuming to collect data from.

No, the steepness or slope of the line isn’t related to the correlation coefficient value. The correlation coefficient only tells you how closely your data fit on a line, so two datasets with the same correlation coefficient can have very different slopes.

To find the slope of the line, you’ll need to perform a regression analysis .
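
The short NumPy sketch below (simulated data, for illustration only) shows two datasets with the same correlation coefficient but very different regression slopes: rescaling one variable changes the slope without changing r.

```python
import numpy as np

rng = np.random.default_rng(seed=11)
x  = rng.normal(size=100)
y1 = 2.0 * x + rng.normal(scale=0.5, size=100)
y2 = 0.1 * y1   # rescaling y changes the regression slope but not the correlation

r1, r2 = np.corrcoef(x, y1)[0, 1], np.corrcoef(x, y2)[0, 1]
slope1, slope2 = np.polyfit(x, y1, 1)[0], np.polyfit(x, y2, 1)[0]

print(f"Dataset 1: r = {r1:.3f}, slope = {slope1:.2f}")
print(f"Dataset 2: r = {r2:.3f}, slope = {slope2:.2f}")   # same r, much smaller slope
```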

Correlation coefficients always range between -1 and 1.

The sign of the coefficient tells you the direction of the relationship: a positive value means the variables change together in the same direction, while a negative value means they change together in opposite directions.

The absolute value of a number is equal to the number without its sign. The absolute value of a correlation coefficient tells you the magnitude of the correlation: the greater the absolute value, the stronger the correlation.

These are the assumptions your data must meet if you want to use Pearson’s r :

  • Both variables are on an interval or ratio level of measurement
  • Data from both variables follow normal distributions
  • Your data have no outliers
  • Your data is from a random or representative sample
  • You expect a linear relationship between the two variables

Quantitative research designs can be divided into two main categories:

  • Correlational and descriptive designs are used to investigate characteristics, averages, trends, and associations between variables.
  • Experimental and quasi-experimental designs are used to test causal relationships .

Qualitative research designs tend to be more flexible. Common types of qualitative design include case study , ethnography , and grounded theory designs.

A well-planned research design helps ensure that your methods match your research aims, that you collect high-quality data, and that you use the right kind of analysis to answer your questions, utilizing credible sources . This allows you to draw valid , trustworthy conclusions.

The priorities of a research design can vary depending on the field, but you usually have to specify:

  • Your research questions and/or hypotheses
  • Your overall approach (e.g., qualitative or quantitative )
  • The type of design you’re using (e.g., a survey , experiment , or case study )
  • Your sampling methods or criteria for selecting subjects
  • Your data collection methods (e.g., questionnaires , observations)
  • Your data collection procedures (e.g., operationalization , timing and data management)
  • Your data analysis methods (e.g., statistical tests  or thematic analysis )

A research design is a strategy for answering your   research question . It defines your overall approach and determines how you will collect and analyze data.

Questionnaires can be self-administered or researcher-administered.

Self-administered questionnaires can be delivered online or in paper-and-pen formats, in person or through mail. All questions are standardized so that all respondents receive the same questions with identical wording.

Researcher-administered questionnaires are interviews that take place by phone, in-person, or online between researchers and respondents. You can gain deeper insights by clarifying questions for respondents or asking follow-up questions.

You can organize the questions logically, with a clear progression from simple to complex, or randomly between respondents. A logical flow helps respondents process the questionnaire more easily and quickly, but it may lead to bias. Randomization can minimize the bias from order effects.

Closed-ended, or restricted-choice, questions offer respondents a fixed set of choices to select from. These questions are easier to answer quickly.

Open-ended or long-form questions allow respondents to answer in their own words. Because there are no restrictions on their choices, respondents can answer in ways that researchers may not have otherwise considered.

A questionnaire is a data collection tool or instrument, while a survey is an overarching research method that involves collecting and analyzing data from people using questionnaires.

The third variable and directionality problems are two main reasons why correlation isn’t causation .

The third variable problem means that a confounding variable affects both variables to make them seem causally related when they are not.

The directionality problem is when two variables correlate and might actually have a causal relationship, but it’s impossible to conclude which variable causes changes in the other.

Correlation describes an association between variables : when one variable changes, so does the other. A correlation is a statistical indicator of the relationship between variables.

Causation means that changes in one variable bring about changes in the other (i.e., there is a cause-and-effect relationship between variables). The two variables are correlated with each other, and there’s also a causal link between them.

While causation and correlation can exist simultaneously, correlation does not imply causation. In other words, correlation is simply a relationship where A relates to B—but A doesn’t necessarily cause B to happen (or vice versa). Mistaking correlation for causation is a common error and can lead to the false cause fallacy .

Controlled experiments establish causality, whereas correlational studies only show associations between variables.

  • In an experimental design , you manipulate an independent variable and measure its effect on a dependent variable. Other variables are controlled so they can’t impact the results.
  • In a correlational design , you measure variables without manipulating any of them. You can test whether your variables change together, but you can’t be sure that one variable caused a change in another.

In general, correlational research is high in external validity while experimental research is high in internal validity .

A correlation is usually tested for two variables at a time, but you can test correlations between three or more variables.

A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.

Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions . The Pearson product-moment correlation coefficient (Pearson’s r ) is commonly used to assess a linear relationship between two quantitative variables.

A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. It’s a non-experimental type of quantitative research .

A correlation reflects the strength and/or direction of the association between two or more variables.

  • A positive correlation means that both variables change in the same direction.
  • A negative correlation means that the variables change in opposite directions.
  • A zero correlation means there’s no relationship between the variables.

Random error  is almost always present in scientific studies, even in highly controlled settings. While you can’t eradicate it completely, you can reduce random error by taking repeated measurements, using a large sample, and controlling extraneous variables .

You can avoid systematic error through careful design of your sampling , data collection , and analysis procedures. For example, use triangulation to measure your variables using multiple methods; regularly calibrate instruments or procedures; use random sampling and random assignment ; and apply masking (blinding) where possible.

Systematic error is generally a bigger problem in research.

With random error, multiple measurements will tend to cluster around the true value. When you’re collecting data from a large sample , the errors in different directions will cancel each other out.

Systematic errors are much more problematic because they can skew your data away from the true value. This can lead you to false conclusions ( Type I and II errors ) about the relationship between the variables you’re studying.

Random and systematic error are two types of measurement error.

Random error is a chance difference between the observed and true values of something (e.g., a researcher misreading a weighing scale records an incorrect measurement).

Systematic error is a consistent or proportional difference between the observed and true values of something (e.g., a miscalibrated scale consistently records weights as higher than they actually are).

On graphs, the explanatory variable is conventionally placed on the x-axis, while the response variable is placed on the y-axis.

  • If you have quantitative variables , use a scatterplot or a line graph.
  • If your response variable is categorical, use a bar graph.
  • If your explanatory variable is categorical, use a bar graph.

The term “ explanatory variable ” is sometimes preferred over “ independent variable ” because, in real world contexts, independent variables are often influenced by other variables. This means they aren’t totally independent.

Multiple independent variables may also be correlated with each other, so “explanatory variables” is a more appropriate term.

The difference between explanatory and response variables is simple:

  • An explanatory variable is the expected cause, and it explains the results.
  • A response variable is the expected effect, and it responds to other variables.

In a controlled experiment , all extraneous variables are held constant so that they can’t influence the results. Controlled experiments require:

  • A control group that receives a standard treatment, a fake treatment, or no treatment.
  • Random assignment of participants to ensure the groups are equivalent.

Depending on your study topic, there are various other methods of controlling variables .

There are 4 main types of extraneous variables :

  • Demand characteristics : environmental cues that encourage participants to conform to researchers’ expectations.
  • Experimenter effects : unintentional actions by researchers that influence study outcomes.
  • Situational variables : environmental variables that alter participants’ behaviors.
  • Participant variables : any characteristic or aspect of a participant’s background that could affect study results.

An extraneous variable is any variable that you’re not investigating that can potentially affect the dependent variable of your research study.

A confounding variable is a type of extraneous variable that not only affects the dependent variable, but is also related to the independent variable.

In a factorial design, multiple independent variables are tested.

If you test two variables, each level of one independent variable is combined with each level of the other independent variable to create different conditions.

Within-subjects designs have many potential threats to internal validity , but they are also very statistically powerful .

Advantages:

  • Only requires small samples
  • Statistically powerful
  • Removes the effects of individual differences on the outcomes

Disadvantages:

  • Internal validity threats reduce the likelihood of establishing a direct relationship between variables
  • Time-related effects, such as growth, can influence the outcomes
  • Carryover effects mean that the specific order of different treatments affects the outcomes

While a between-subjects design has fewer threats to internal validity , it also requires more participants for high statistical power than a within-subjects design .

Advantages:

  • Prevents carryover effects of learning and fatigue.
  • Shorter study duration.

Disadvantages:

  • Needs larger samples for high power.
  • Uses more resources to recruit participants, administer sessions, cover costs, etc.
  • Individual differences may be an alternative explanation for results.

Yes. Between-subjects and within-subjects designs can be combined in a single study when you have two or more independent variables (a factorial design). In a mixed factorial design, one variable is altered between subjects and another is altered within subjects.

In a between-subjects design , every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design , each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.

Random assignment is used in experiments with a between-groups or independent measures design. In this research design, there’s usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable.

In general, you should always use random assignment in this type of experimental design when it is ethically possible and makes sense for your study topic.

To implement random assignment , assign a unique number to every member of your study’s sample .

Then, you can use a random number generator or a lottery method to randomly assign each number to a control or experimental group. You can also do so manually, by flipping a coin or rolling a die to randomly assign participants to groups.
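
For illustration, here is a minimal Python sketch of one way to do this, shuffling the numbered sample and splitting it in half (the participant numbers are hypothetical):

```python
import random

# Hypothetical sample: 20 participants, each already given a unique number
participants = list(range(1, 21))

random.shuffle(participants)                 # randomize the order
half = len(participants) // 2
control_group = participants[:half]          # first half -> control group
experimental_group = participants[half:]     # second half -> experimental group

print("Control:     ", sorted(control_group))
print("Experimental:", sorted(experimental_group))
```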

Random selection, or random sampling , is a way of selecting members of a population for your study’s sample.

In contrast, random assignment is a way of sorting the sample into control and experimental groups.

Random sampling enhances the external validity or generalizability of your results, while random assignment improves the internal validity of your study.

In experimental research, random assignment is a way of placing participants from your sample into different groups using randomization. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.

“Controlling for a variable” means measuring extraneous variables and accounting for them statistically to remove their effects on other variables.

Researchers often model control variable data along with independent and dependent variable data in regression analyses and ANCOVAs . That way, you can isolate the control variable’s effects from the relationship between the variables of interest.
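
As a rough illustration of statistical control, the following sketch fits an ordinary least squares regression that includes a control variable alongside the independent variable. It assumes the third-party statsmodels package and uses simulated data; the variable names are made up:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 200

# Simulated data: "age" is a control variable that also influences the outcome
age = rng.normal(40, 10, n)                     # control variable
treatment = rng.normal(0, 1, n)                 # independent variable of interest
outcome = 2.0 * treatment + 0.5 * age + rng.normal(0, 1, n)   # dependent variable

# Include the control variable as a predictor to isolate the treatment effect
X = sm.add_constant(np.column_stack([treatment, age]))
results = sm.OLS(outcome, X).fit()
print(results.params)   # intercept, treatment coefficient (~2.0), age coefficient (~0.5)
```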

Control variables help you establish a correlational or causal relationship between variables by enhancing internal validity .

If you don’t control relevant extraneous variables , they may influence the outcomes of your study, and you may not be able to demonstrate that your results are really an effect of your independent variable .

A control variable is any variable that’s held constant in a research study. It’s not a variable of interest in the study, but it’s controlled because it could influence the outcomes.

Including mediators and moderators in your research helps you go beyond studying a simple relationship between two variables for a fuller picture of the real world. They are important to consider when studying complex correlational or causal relationships.

Mediators are part of the causal pathway of an effect, and they tell you how or why an effect takes place. Moderators usually help you judge the external validity of your study by identifying the limitations of when the relationship between variables holds.

If something is a mediating variable :

  • It’s caused by the independent variable .
  • It influences the dependent variable.
  • When it’s taken into account, the statistical correlation between the independent and dependent variables is weaker than when it isn’t considered.

A confounder is a third variable that affects variables of interest and makes them seem related when they are not. In contrast, a mediator is the mechanism of a relationship between two variables: it explains the process by which they are related.

A mediator variable explains the process through which two variables are related, while a moderator variable affects the strength and direction of that relationship.

There are three key steps in systematic sampling :

  • Define and list your population , ensuring that it is not arranged in a cyclical or periodic order.
  • Decide on your sample size and calculate your interval, k , by dividing your population size by your target sample size.
  • Choose every k th member of the population as your sample.

Systematic sampling is a probability sampling method where researchers select members of the population at a regular interval – for example, by selecting every 15th person on a list of the population. If the population is in a random order, this can imitate the benefits of simple random sampling .
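
A minimal sketch of the three steps in Python (with a hypothetical population list) might look like this:

```python
import random

# Step 1: hypothetical population list of 1,000 members, assumed not to be in a cyclical order
population = [f"person_{i}" for i in range(1000)]

# Step 2: decide on a sample size and calculate the sampling interval k
sample_size = 100
k = len(population) // sample_size

# Step 3: pick a random starting point within the first interval, then take every k-th member
start = random.randrange(k)
sample = population[start::k]

print(len(sample), sample[:3])
```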

Yes, you can create a stratified sample using multiple characteristics, but you must ensure that every participant in your study belongs to one and only one subgroup. In this case, you multiply the numbers of subgroups for each characteristic to get the total number of groups.

For example, if you were stratifying by location with three subgroups (urban, rural, or suburban) and marital status with five subgroups (single, divorced, widowed, married, or partnered), you would have 3 x 5 = 15 subgroups.

You should use stratified sampling when your sample can be divided into mutually exclusive and exhaustive subgroups that you believe will take on different mean values for the variable that you’re studying.

Using stratified sampling will allow you to obtain more precise (with lower variance ) statistical estimates of whatever you are trying to measure.

For example, say you want to investigate how income differs based on educational attainment, but you know that this relationship can vary based on race. Using stratified sampling, you can ensure you obtain a large enough sample from each racial group, allowing you to draw more precise conclusions.

In stratified sampling , researchers divide subjects into subgroups called strata based on characteristics that they share (e.g., race, gender, educational attainment).

Once divided, each subgroup is randomly sampled using another probability sampling method.
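
For example, a small Python sketch of proportionate stratified sampling (hypothetical records, with simple random sampling inside each stratum) could look like this:

```python
import random
from collections import defaultdict

# Hypothetical population records, each with a stratifying characteristic
population = [
    {"id": i, "location": random.choice(["urban", "rural", "suburban"])}
    for i in range(1000)
]

# Divide subjects into strata based on the shared characteristic
strata = defaultdict(list)
for person in population:
    strata[person["location"]].append(person)

# Draw a simple random sample of 10% within each stratum (proportionate allocation)
fraction = 0.10
sample = []
for group in strata.values():
    sample.extend(random.sample(group, max(1, int(len(group) * fraction))))

print(len(sample))
```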

Cluster sampling is more time- and cost-efficient than other probability sampling methods , particularly when it comes to large samples spread across a wide geographical area.

However, it provides less statistical certainty than other methods, such as simple random sampling , because it is difficult to ensure that your clusters properly represent the population as a whole.

There are three types of cluster sampling : single-stage, double-stage and multi-stage clustering. In all three types, you first divide the population into clusters, then randomly select clusters for use in your sample.

  • In single-stage sampling , you collect data from every unit within the selected clusters.
  • In double-stage sampling , you select a random sample of units from within the clusters.
  • In multi-stage sampling , you repeat the procedure of randomly sampling elements from within the clusters until you have reached a manageable sample.

Cluster sampling is a probability sampling method in which you divide a population into clusters, such as districts or schools, and then randomly select some of these clusters as your sample.

The clusters should ideally each be mini-representations of the population as a whole.
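
A minimal sketch of single-stage cluster sampling in Python (hypothetical schools and students) might look like this:

```python
import random

# Hypothetical population organized into clusters: 50 schools of 30 students each
clusters = {f"school_{s}": [f"student_{s}_{i}" for i in range(30)] for s in range(50)}

# Single-stage cluster sampling: randomly select clusters, then include every unit in them
selected_schools = random.sample(list(clusters), 5)
sample = [student for school in selected_schools for student in clusters[school]]

print(selected_schools)
print(len(sample))   # 5 schools x 30 students = 150 sampled students
```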

If properly implemented, simple random sampling is usually the best sampling method for ensuring both internal and external validity . However, it can sometimes be impractical and expensive to implement, depending on the size of the population to be studied.

If you have a list of every member of the population and the ability to reach whichever members are selected, you can use simple random sampling.

The American Community Survey  is an example of simple random sampling . In order to collect detailed data on the population of the US, Census Bureau officials randomly select 3.5 million households per year and use a variety of methods to convince them to fill out the survey.

Simple random sampling is a type of probability sampling in which the researcher randomly selects a subset of participants from a population . Each member of the population has an equal chance of being selected. Data is then collected from as large a percentage as possible of this random subset.
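
A minimal sketch in Python (assuming you already have a complete sampling frame as a list; the names are hypothetical) might look like this:

```python
import random

# Hypothetical sampling frame: a list of every member of the population
population = [f"household_{i}" for i in range(10_000)]

# Each member has an equal chance of being selected
sample = random.sample(population, 500)

print(len(sample), sample[:3])
```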

Quasi-experimental design is most useful in situations where it would be unethical or impractical to run a true experiment .

Quasi-experiments have lower internal validity than true experiments, but they often have higher external validity  as they can use real-world interventions instead of artificial laboratory settings.

A quasi-experiment is a type of research design that attempts to establish a cause-and-effect relationship. The main difference with a true experiment is that the groups are not randomly assigned.

Blinding is important to reduce research bias (e.g., observer bias , demand characteristics ) and ensure a study’s internal validity .

If participants know whether they are in a control or treatment group , they may adjust their behavior in ways that affect the outcome that researchers are trying to measure. If the people administering the treatment are aware of group assignment, they may treat participants differently and thus directly or indirectly influence the final results.

  • In a single-blind study , only the participants are blinded.
  • In a double-blind study , both participants and experimenters are blinded.
  • In a triple-blind study , the assignment is hidden not only from participants and experimenters, but also from the researchers analyzing the data.

Blinding means hiding who is assigned to the treatment group and who is assigned to the control group in an experiment .

A true experiment (a.k.a. a controlled experiment) always includes at least one control group that doesn’t receive the experimental treatment.

However, some experiments use a within-subjects design to test treatments without a control group. In these designs, you usually compare one group’s outcomes before and after a treatment (instead of comparing outcomes between different groups).

For strong internal validity , it’s usually best to include a control group if possible. Without a control group, it’s harder to be certain that the outcome was caused by the experimental treatment and not by other variables.

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.

Individual Likert-type questions are generally considered ordinal data , because the items have clear rank order, but don’t have an even distribution.

Overall Likert scale scores are sometimes treated as interval data. These scores are considered to have directionality and even spacing between them.

The type of data determines what statistical tests you should use to analyze your data.

A Likert scale is a rating scale that quantitatively assesses opinions, attitudes, or behaviors. It is made up of 4 or more questions that measure a single attitude or trait when response scores are combined.

To use a Likert scale in a survey , you present participants with Likert-type questions or statements, and a continuum of items, usually with 5 or 7 possible responses, to capture their degree of agreement.
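
For illustration, a small Python sketch of how individual Likert item responses are often combined into an overall scale score (the items and responses are hypothetical):

```python
# Hypothetical responses from one participant to a 5-item Likert scale
# (1 = strongly disagree ... 5 = strongly agree)
responses = {"q1": 4, "q2": 5, "q3": 3, "q4": 4, "q5": 5}

# Individual items are ordinal; the combined scale score is often treated as interval
total_score = sum(responses.values())
mean_score = total_score / len(responses)

print(total_score, round(mean_score, 2))
```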

In scientific research, concepts are the abstract ideas or phenomena that are being studied (e.g., educational achievement). Variables are properties or characteristics of the concept (e.g., performance at school), while indicators are ways of measuring or quantifying variables (e.g., yearly grade reports).

The process of turning abstract concepts into measurable variables and indicators is called operationalization .

There are various approaches to qualitative data analysis , but they all share five steps in common:

  • Prepare and organize your data.
  • Review and explore your data.
  • Develop a data coding system.
  • Assign codes to the data.
  • Identify recurring themes.

The specifics of each step depend on the focus of the analysis. Some common approaches include textual analysis , thematic analysis , and discourse analysis .

There are five common approaches to qualitative research :

  • Grounded theory involves collecting data in order to develop new theories.
  • Ethnography involves immersing yourself in a group or organization to understand its culture.
  • Narrative research involves interpreting stories to understand how people make sense of their experiences and perceptions.
  • Phenomenological research involves investigating phenomena through people’s lived experiences.
  • Action research links theory and practice in several cycles to drive innovative changes.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

Operationalization means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioral avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data , it’s important to consider how you will operationalize the variables that you want to measure.

When conducting research, collecting original data has significant advantages:

  • You can tailor data collection to your specific research aims (e.g. understanding the needs of your consumers or user testing your website)
  • You can control and standardize the process for high reliability and validity (e.g. choosing appropriate measurements and sampling methods )

However, there are also some drawbacks: data collection can be time-consuming, labor-intensive and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

There are several methods you can use to decrease the impact of confounding variables on your research: restriction, matching, statistical control and randomization.

In restriction , you restrict your sample by only including certain subjects that have the same values of potential confounding variables.

In matching , you match each of the subjects in your treatment group with a counterpart in the comparison group. The matched subjects have the same values on any potential confounding variables, and only differ in the independent variable .

In statistical control , you include potential confounders as variables in your regression .

In randomization , you randomly assign the treatment (or independent variable) in your study to a sufficiently large number of subjects, which allows you to control for all potential confounding variables.

A confounding variable is closely related to both the independent and dependent variables in a study. An independent variable represents the supposed cause , while the dependent variable is the supposed effect . A confounding variable is a third variable that influences both the independent and dependent variables.

Failing to account for confounding variables can cause you to wrongly estimate the relationship between your independent and dependent variables.

To ensure the internal validity of your research, you must consider the impact of confounding variables. If you fail to account for them, you might over- or underestimate the causal relationship between your independent and dependent variables , or even find a causal relationship where none exists.

Yes, but including more than one of either type requires multiple research questions .

For example, if you are interested in the effect of a diet on health, you can use multiple measures of health: blood sugar, blood pressure, weight, pulse, and many more. Each of these is its own dependent variable with its own research question.

You could also choose to look at the effect of exercise levels as well as diet, or even the additional effect of the two combined. Each of these is a separate independent variable .

To ensure the internal validity of an experiment , you should only change one independent variable at a time.

No. The value of a dependent variable depends on an independent variable, so a variable cannot be both independent and dependent at the same time. It must be either the cause or the effect, not both!

You want to find out how blood sugar levels are affected by drinking diet soda and regular soda, so you conduct an experiment .

  • The type of soda – diet or regular – is the independent variable .
  • The level of blood sugar that you measure is the dependent variable – it changes depending on the type of soda.

In non-probability sampling , the sample is selected based on non-random criteria, and not every member of the population has a chance of being included.

Common non-probability sampling methods include convenience sampling , voluntary response sampling, purposive sampling , snowball sampling, and quota sampling .

Probability sampling means that every member of the target population has a known chance of being included in the sample.

Probability sampling methods include simple random sampling , systematic sampling , stratified sampling , and cluster sampling .

Using careful research design and sampling procedures can help you avoid sampling bias . Oversampling can be used to correct undercoverage bias .

Some common types of sampling bias include self-selection bias , nonresponse bias , undercoverage bias , survivorship bias , pre-screening or advertising bias, and healthy user bias.

Sampling bias is a threat to external validity – it limits the generalizability of your findings to a broader group of people.

A sampling error is the difference between a population parameter and a sample statistic .

A statistic refers to measures about the sample , while a parameter refers to measures about the population .

Populations are used when a research question requires data from every member of the population. This is usually only feasible when the population is small and easily accessible.

Samples are used to make inferences about populations . Samples are easier to collect data from because they are practical, cost-effective, convenient, and manageable.

There are seven threats to external validity : selection bias , history, experimenter effect, Hawthorne effect , testing effect, aptitude-treatment interaction, and situation effect.

The two types of external validity are population validity (whether you can generalize to other groups of people) and ecological validity (whether you can generalize to other situations and settings).

The external validity of a study is the extent to which you can generalize your findings to different groups of people, situations, and measures.

Cross-sectional studies cannot establish a cause-and-effect relationship or analyze behavior over a period of time. To investigate cause and effect, you need to do a longitudinal study or an experimental study .

Cross-sectional studies are less expensive and time-consuming than many other types of study. They can provide useful insights into a population’s characteristics and identify correlations for further research.

Sometimes only cross-sectional data is available for analysis; other times your research question may only require a cross-sectional study to answer it.

Longitudinal studies can last anywhere from weeks to decades, although they tend to be at least a year long.

The 1970 British Cohort Study , which has collected data on the lives of 17,000 Brits since their births in 1970, is one well-known example of a longitudinal study .

Longitudinal studies are better to establish the correct sequence of events, identify changes over time, and provide insight into cause-and-effect relationships, but they also tend to be more expensive and time-consuming than other types of studies.

Longitudinal studies and cross-sectional studies are two different types of research design . In a cross-sectional study you collect data from a population at a specific point in time; in a longitudinal study you repeatedly collect data from the same sample over an extended period of time.

  • Longitudinal study: repeated observations of the same group, following changes in participants over time.
  • Cross-sectional study: observations at a single point in time, observing a “cross-section” of the population to provide a snapshot of society at a given point.

There are eight threats to internal validity : history, maturation, instrumentation, testing, selection bias , regression to the mean, social interaction and attrition .

Internal validity is the extent to which you can be confident that a cause-and-effect relationship established in a study cannot be explained by other factors.

In mixed methods research , you use both qualitative and quantitative data collection and analysis methods to answer your research question .

The research methods you use depend on the type of data you need to answer your research question .

  • If you want to measure something or test a hypothesis , use quantitative methods . If you want to explore ideas, thoughts and meanings, use qualitative methods .
  • If you want to analyze a large amount of readily-available data, use secondary data. If you want data specific to your purposes with control over how it is generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables , use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

A confounding variable , also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design , it’s important to identify potential confounding variables and plan how you will reduce their impact.

Discrete and continuous variables are two types of quantitative variables :

  • Discrete variables represent counts (e.g. the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g. water volume or weight).

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results .

You can think of independent and dependent variables in terms of cause and effect: an independent variable is the variable you think is the cause , while a dependent variable is the effect .

In an experiment, you manipulate the independent variable and measure the outcome in the dependent variable. For example, in an experiment about the effect of nutrients on crop growth:

  • The  independent variable  is the amount of nutrients added to the crop field.
  • The  dependent variable is the biomass of the crops at harvest time.

Defining your variables, and deciding how you will manipulate and measure them, is an important part of experimental design .

Experimental design means planning a set of procedures to investigate a relationship between variables . To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels

Experimental design is essential to the internal and external validity of your experiment.

Internal validity is the degree of confidence that the causal relationship you are testing is not influenced by other factors or variables .

External validity is the extent to which your results can be generalized to other contexts.

The validity of your experiment depends on your experimental design .

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the  consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity   refers to the  accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.

A sample is a subset of individuals from a larger population . Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses . Qualitative methods allow you to explore concepts and experiences in more detail.

Methodology refers to the overarching strategy and rationale of your research project . It involves studying the methods used in your field and the theories or principles behind them, in order to develop an approach that matches your objectives.

Methods are the specific tools and procedures you use to collect and analyze data (for example, experiments, surveys , and statistical tests ).

In shorter scientific papers, where the aim is to report the findings of a specific study, you might simply describe what you did in a methods section .

In a longer or more complex research project, such as a thesis or dissertation , you will probably include a methodology section , where you explain your approach to answering the research questions and cite relevant sources to support your choice of methods.



What are Variables and Why are They Important in Research?

In research, variables are crucial components that help to define and measure the concepts and phenomena under investigation. Variables are defined as any characteristic or attribute that can vary or change in some way. They can be measured, manipulated, or controlled to investigate the relationship between different factors and their impact on the research outcomes. In this essay, I will discuss the importance of variables in research, highlighting their role in defining research questions, designing studies, analyzing data, and drawing conclusions.

Defining Research Questions

Variables play a critical role in defining research questions. Research questions are formulated based on the variables that are under investigation. These questions guide the entire research process, including the selection of research methods, data collection procedures, and data analysis techniques. Variables help researchers to identify the key concepts and phenomena that they wish to investigate, and to formulate research questions that are specific, measurable, and relevant to the research objectives.

For example, in a study on the relationship between exercise and stress, the variables would be exercise and stress. The research question might be: “What is the relationship between the frequency of exercise and the level of perceived stress among young adults?”

Designing Studies

Variables also play a crucial role in the design of research studies. The selection of variables determines the type of research design that will be used, as well as the methods and procedures for collecting and analyzing data. Variables can be independent, dependent, or moderator variables, depending on their role in the research design.

Independent variables are the variables that are manipulated or controlled by the researcher. They are used to determine the effect of a particular factor on the dependent variable. Dependent variables are the variables that are measured or observed to determine the impact of the independent variable. Moderator variables are the variables that influence the relationship between the independent and dependent variables.

For example, in a study on the effect of caffeine on athletic performance, the independent variable would be caffeine, and the dependent variable would be athletic performance. The moderator variables could include factors such as age, gender, and fitness level.

Analyzing Data

Variables are also essential in the analysis of research data. Statistical methods are used to analyze the data and determine the relationships between the variables. The type of statistical analysis that is used depends on the nature of the variables, their level of measurement, and the research design.

For example, if the variables are categorical or nominal, chi-square tests or contingency tables can be used to determine the relationships between them. If the variables are continuous, correlation analysis or regression analysis can be used to determine the strength and direction of the relationship between them.
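
As a rough illustration, the following Python sketch (SciPy assumed, made-up data) runs a chi-square test on a contingency table of two categorical variables, and a correlation plus simple regression on two continuous variables:

```python
import numpy as np
from scipy import stats

# Categorical variables: a hypothetical 2x2 contingency table (e.g., group x preference)
table = np.array([[30, 20],
                  [25, 35]])
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")

# Continuous variables: correlation and simple linear regression (made-up values)
hours_exercise = np.array([1, 3, 2, 5, 4, 6, 2, 3])
stress_score = np.array([8, 6, 7, 3, 4, 2, 7, 5])
r, p = stats.pearsonr(hours_exercise, stress_score)
fit = stats.linregress(hours_exercise, stress_score)
print(f"r = {r:.2f}, slope = {fit.slope:.2f}")
```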

Drawing Conclusions

Finally, variables are crucial in drawing conclusions from research studies. The results of the study are based on the relationship between the variables, and the conclusions drawn depend on the validity and reliability of the research methods and the accuracy of the statistical analysis. Variables help to establish the cause-and-effect relationships between different factors and to make predictions about the outcomes of future events.

For example, in a study on the effect of smoking on lung cancer, the independent variable would be smoking, and the dependent variable would be lung cancer. The conclusion would be that smoking is a risk factor for lung cancer, based on the strength and direction of the relationship between the variables.

In conclusion, variables play a crucial role in research across different fields and disciplines. They help to define research questions, design studies, analyze data, and draw conclusions. By understanding the importance of variables in research, researchers can design studies that are relevant, accurate, and reliable, and can provide valuable insights into the phenomena under investigation. Therefore, it is essential to consider variables carefully when designing, conducting, and interpreting research studies.

What Is Research, and Why Do People Do It?


James Hiebert, Jinfa Cai, Stephen Hwang, Anne K. Morris & Charles Hohensee

Part of the book series: Research in Mathematics Education (RME)

Abstract

Every day people do research as they gather information to learn about something of interest. In the scientific world, however, research means something different than simply gathering information. Scientific research is characterized by its careful planning and observing, by its relentless efforts to understand and explain, and by its commitment to learn from everyone else seriously engaged in research. We call this kind of research scientific inquiry and define it as “formulating, testing, and revising hypotheses.” By “hypotheses” we do not mean the hypotheses you encounter in statistics courses. We mean predictions about what you expect to find and rationales for why you made these predictions. Throughout this and the remaining chapters we make clear that the process of scientific inquiry applies to all kinds of research studies and data, both qualitative and quantitative.


Part I. What Is Research?

Have you ever studied something carefully because you wanted to know more about it? Maybe you wanted to know more about your grandmother’s life when she was younger so you asked her to tell you stories from her childhood, or maybe you wanted to know more about a fertilizer you were about to use in your garden so you read the ingredients on the package and looked them up online. According to the dictionary definition, you were doing research.

Recall your high school assignments asking you to “research” a topic. The assignment likely included consulting a variety of sources that discussed the topic, perhaps including some “original” sources. Often, the teacher referred to your product as a “research paper.”

Were you conducting research when you interviewed your grandmother or wrote high school papers reviewing a particular topic? Our view is that you were engaged in part of the research process, but only a small part. In this book, we reserve the word “research” for what it means in the scientific world, that is, for scientific research or, more pointedly, for scientific inquiry .

Exercise 1.1

Before you read any further, write a definition of what you think scientific inquiry is. Keep it short: two to three sentences. You will periodically update this definition as you read this chapter and the remainder of the book.

This book is about scientific inquiry—what it is and how to do it. For starters, scientific inquiry is a process, a particular way of finding out about something that involves a number of phases. Each phase of the process constitutes one aspect of scientific inquiry. You are doing scientific inquiry as you engage in each phase, but you have not done scientific inquiry until you complete the full process. Each phase is necessary but not sufficient.

In this chapter, we set the stage by defining scientific inquiry—describing what it is and what it is not—and by discussing what it is good for and why people do it. The remaining chapters build directly on the ideas presented in this chapter.

A first thing to know is that scientific inquiry is not all or nothing. “Scientificness” is a continuum. Inquiries can be more scientific or less scientific. What makes an inquiry more scientific? You might be surprised there is no universally agreed upon answer to this question. None of the descriptors we know of are sufficient by themselves to define scientific inquiry. But all of them give you a way of thinking about some aspects of the process of scientific inquiry. Each one gives you different insights.


Exercise 1.2

As you read about each descriptor below, think about what would make an inquiry more or less scientific. If you think a descriptor is important, use it to revise your definition of scientific inquiry.

Creating an Image of Scientific Inquiry

We will present three descriptors of scientific inquiry. Each provides a different perspective and emphasizes a different aspect of scientific inquiry. We will draw on all three descriptors to compose our definition of scientific inquiry.

Descriptor 1. Experience Carefully Planned in Advance

Sir Ronald Fisher, often called the father of modern statistical design, once referred to research as “experience carefully planned in advance” (1935, p. 8). He said that humans are always learning from experience, from interacting with the world around them. Usually, this learning is haphazard rather than the result of a deliberate process carried out over an extended period of time. Research, Fisher said, was learning from experience, but experience carefully planned in advance.

This phrase can be fully appreciated by looking at each word. The fact that scientific inquiry is based on experience means that it is based on interacting with the world. These interactions could be thought of as the stuff of scientific inquiry. In addition, it is not just any experience that counts. The experience must be carefully planned . The interactions with the world must be conducted with an explicit, describable purpose, and steps must be taken to make the intended learning as likely as possible. This planning is an integral part of scientific inquiry; it is not just a preparation phase. It is one of the things that distinguishes scientific inquiry from many everyday learning experiences. Finally, these steps must be taken beforehand and the purpose of the inquiry must be articulated in advance of the experience. Clearly, scientific inquiry does not happen by accident, by just stumbling into something. Stumbling into something unexpected and interesting can happen while engaged in scientific inquiry, but learning does not depend on it and serendipity does not make the inquiry scientific.

Descriptor 2. Observing Something and Trying to Explain Why It Is the Way It Is

When we were writing this chapter and googled “scientific inquiry,” the first entry was: “Scientific inquiry refers to the diverse ways in which scientists study the natural world and propose explanations based on the evidence derived from their work.” The emphasis is on studying, or observing, and then explaining . This descriptor takes the image of scientific inquiry beyond carefully planned experience and includes explaining what was experienced.

According to the Merriam-Webster dictionary, “explain” means “(a) to make known, (b) to make plain or understandable, (c) to give the reason or cause of, and (d) to show the logical development or relations of” (Merriam-Webster, n.d. ). We will use all these definitions. Taken together, they suggest that to explain an observation means to understand it by finding reasons (or causes) for why it is as it is. In this sense of scientific inquiry, the following are synonyms: explaining why, understanding why, and reasoning about causes and effects. Our image of scientific inquiry now includes planning, observing, and explaining why.


We need to add a final note about this descriptor. We have phrased it in a way that suggests “observing something” means you are observing something in real time—observing the way things are or the way things are changing. This is often true. But, observing could mean observing data that already have been collected, maybe by someone else making the original observations (e.g., secondary analysis of NAEP data or analysis of existing video recordings of classroom instruction). We will address secondary analyses more fully in Chap. 4 . For now, what is important is that the process requires explaining why the data look like they do.

We must note that for us, the term “data” is not limited to numerical or quantitative data such as test scores. Data can also take many nonquantitative forms, including written survey responses, interview transcripts, journal entries, video recordings of students, teachers, and classrooms, text messages, and so forth.


Exercise 1.3

What are the implications of the statement that just “observing” is not enough to count as scientific inquiry? Does this mean that a detailed description of a phenomenon is not scientific inquiry?

Find sources that define research in education that differ with our position, that say description alone, without explanation, counts as scientific research. Identify the precise points where the opinions differ. What are the best arguments for each of the positions? Which do you prefer? Why?

Descriptor 3. Updating Everyone’s Thinking in Response to More and Better Information

This descriptor focuses on a third aspect of scientific inquiry: updating and advancing the field’s understanding of phenomena that are investigated. This descriptor foregrounds a powerful characteristic of scientific inquiry: the reliability (or trustworthiness) of what is learned and the ultimate inevitability of this learning to advance human understanding of phenomena. Humans might choose not to learn from scientific inquiry, but history suggests that scientific inquiry always has the potential to advance understanding and that, eventually, humans take advantage of these new understandings.

Before exploring these bold claims a bit further, note that this descriptor uses “information” in the same way the previous two descriptors used “experience” and “observations.” These are the stuff of scientific inquiry and we will use them often, sometimes interchangeably. Frequently, we will use the term “data” to stand for all these terms.

An overriding goal of scientific inquiry is for everyone to learn from what one scientist does. Much of this book is about the methods you need to use so others have faith in what you report and can learn the same things you learned. This aspect of scientific inquiry has many implications.

One implication is that scientific inquiry is not a private practice. It is a public practice available for others to see and learn from. Notice how different this is from everyday learning. When you happen to learn something from your everyday experience, often only you gain from the experience. The fact that research is a public practice means it is also a social one. It is best conducted by interacting with others along the way: soliciting feedback at each phase, taking opportunities to present work-in-progress, and benefitting from the advice of others.

A second implication is that you, as the researcher, must be committed to sharing what you are doing and what you are learning in an open and transparent way. This allows all phases of your work to be scrutinized and critiqued. This is what gives your work credibility. The reliability or trustworthiness of your findings depends on your colleagues recognizing that you have used all appropriate methods to maximize the chances that your claims are justified by the data.

A third implication of viewing scientific inquiry as a collective enterprise is the reverse of the second—you must be committed to receiving comments from others. You must treat your colleagues as fair and honest critics even though it might sometimes feel otherwise. You must appreciate their job, which is to remain skeptical while scrutinizing what you have done in considerable detail. To provide the best help to you, they must remain skeptical about your conclusions (when, for example, the data are difficult for them to interpret) until you offer a convincing logical argument based on the information you share. A rather harsh but good-to-remember statement of the role of your friendly critics was voiced by Karl Popper, a well-known twentieth century philosopher of science: “. . . if you are interested in the problem which I tried to solve by my tentative assertion, you may help me by criticizing it as severely as you can” (Popper, 1968, p. 27).

A final implication of this third descriptor is that, as someone engaged in scientific inquiry, you have no choice but to update your thinking when the data support a different conclusion. This applies to your own data as well as to those of others. When data clearly point to a specific claim, even one that is quite different than you expected, you must reconsider your position. If the outcome is replicated multiple times, you need to adjust your thinking accordingly. Scientific inquiry does not let you pick and choose which data to believe; it mandates that everyone update their thinking when the data warrant an update.

Doing Scientific Inquiry

We define scientific inquiry in an operational sense—what does it mean to do scientific inquiry? What kind of process would satisfy all three descriptors: carefully planning an experience in advance; observing and trying to explain what you see; and, contributing to updating everyone’s thinking about an important phenomenon?

We define scientific inquiry as formulating , testing , and revising hypotheses about phenomena of interest.

Of course, we are not the only ones who define it in this way. The definition for the scientific method posted by the editors of Britannica is: “a researcher develops a hypothesis, tests it through various means, and then modifies the hypothesis on the basis of the outcome of the tests and experiments” (Britannica, n.d. ).


Notice how defining scientific inquiry this way satisfies each of the descriptors. “Carefully planning an experience in advance” is exactly what happens when formulating a hypothesis about a phenomenon of interest and thinking about how to test it. “ Observing a phenomenon” occurs when testing a hypothesis, and “ explaining ” what is found is required when revising a hypothesis based on the data. Finally, “updating everyone’s thinking” comes from comparing publicly the original with the revised hypothesis.

Doing scientific inquiry, as we have defined it, underscores the value of accumulating knowledge rather than generating random bits of knowledge. Formulating, testing, and revising hypotheses is an ongoing process, with each revised hypothesis begging for another test, whether by the same researcher or by new researchers. The editors of Britannica signaled this cyclic process by adding the following phrase to their definition of the scientific method: “The modified hypothesis is then retested, further modified, and tested again.” Scientific inquiry creates a process that encourages each study to build on the studies that have gone before. Through collective engagement in this process of building study on top of study, the scientific community works together to update its thinking.

Before exploring more fully the meaning of “formulating, testing, and revising hypotheses,” we need to acknowledge that this is not the only way researchers define research. Some researchers prefer a less formal definition, one that includes more serendipity, less planning, less explanation. You might have come across more open definitions such as “research is finding out about something.” We prefer the tighter hypothesis formulation, testing, and revision definition because we believe it provides a single, coherent map for conducting research that addresses many of the thorny problems educational researchers encounter. We believe it is the most useful orientation toward research and the most helpful to learn as a beginning researcher.

A final clarification of our definition is that it applies equally to qualitative and quantitative research. This is a familiar distinction in education that has generated much discussion. You might think our definition favors quantitative methods over qualitative methods because the language of hypothesis formulation and testing is often associated with quantitative methods. In fact, we do not favor one method over another. In Chap. 4, we will illustrate how our definition fits research using a range of quantitative and qualitative methods.

Exercise 1.4

Look for ways to extend what the field knows in an area that has already received attention from other researchers. Specifically, you can search for a program of research carried out by more experienced researchers that has some revised hypotheses that remain untested. Identify a revised hypothesis that you might like to test.

Unpacking the Terms Formulating, Testing, and Revising Hypotheses

To get a full sense of the definition of scientific inquiry we will use throughout this book, it is helpful to spend a little time with each of the key terms.

We first want to make clear that we use the term “hypothesis” as it is defined in most dictionaries and as it is used in many scientific fields rather than as it is usually defined in educational statistics courses. By “hypothesis,” we do not mean a null hypothesis that is accepted or rejected by statistical analysis. Rather, we use “hypothesis” in the sense conveyed by the following definitions: “An idea or explanation for something that is based on known facts but has not yet been proved” (Cambridge University Press, n.d.), and “An unproved theory, proposition, or supposition, tentatively accepted to explain certain facts and to provide a basis for further investigation or argument” (Agnes & Guralnik, 2008).

We distinguish two parts to “hypotheses.” Hypotheses consist of predictions and rationales. Predictions are statements about what you expect to find when you inquire about something. Rationales are explanations for why you made the predictions you did, why you believe your predictions are correct. So, for us “formulating hypotheses” means making explicit predictions and developing rationales for the predictions.

“Testing hypotheses” means making observations that allow you to assess in what ways your predictions were correct and in what ways they were incorrect. In education research, it is rarely useful to think of your predictions as either right or wrong. Because of the complexity of most issues you will investigate, most predictions will be right in some ways and wrong in others.

By studying the observations you make (data you collect) to test your hypotheses, you can revise your hypotheses to better align with the observations. This means revising your predictions plus revising your rationales to justify your adjusted predictions. Even though you might not run another test, formulating revised hypotheses is an essential part of conducting a research study. Comparing your original and revised hypotheses informs everyone of what you learned by conducting your study. In addition, a revised hypothesis sets the stage for you or someone else to extend your study and accumulate more knowledge of the phenomenon.

We should note that not everyone makes a clear distinction between predictions and rationales as two aspects of hypotheses. In fact, common, non-scientific uses of the word “hypothesis” may limit it to only a prediction or only an explanation (or rationale). We choose to explicitly include both prediction and rationale in our definition of hypothesis, not because we assert this should be the universal definition, but because we want to foreground the importance of both parts acting in concert. Using “hypothesis” to represent both prediction and rationale could hide the two aspects, but we make them explicit because they provide different kinds of information. It is usually easier to make predictions than to develop rationales because predictions can be guesses, hunches, or gut feelings about which you have little confidence. Developing a compelling rationale requires careful thought, reading what other researchers have found, and talking with your colleagues. Often, while you are developing your rationale you will find good reasons to change your predictions. Developing good rationales is the engine that drives scientific inquiry. Rationales are essentially descriptions of how much you know about the phenomenon you are studying. Throughout this guide, we will elaborate on how developing good rationales drives scientific inquiry. For now, we simply note that it can sharpen your predictions and help you to interpret your data as you test your hypotheses.

[Image: predictions and rationales as the two parts of a hypothesis, and the different kinds of information each provides.]

Hypotheses in education research take a variety of forms or types. This is because there are a variety of phenomena that can be investigated. Investigating educational phenomena is sometimes best done using qualitative methods, sometimes using quantitative methods, and most often using mixed methods (e.g., Hay, 2016; Weis et al., 2019a; Weisner, 2005). This means that, given our definition, hypotheses are equally applicable to qualitative and quantitative investigations.

Hypotheses take different forms when they are used to investigate different kinds of phenomena. Two very different activities in education research could be labeled conducting experiments and conducting descriptive studies. In an experiment, a hypothesis makes a prediction about anticipated changes, say the changes that occur when a treatment or intervention is applied. For example, you might investigate how students’ thinking changes during a particular kind of instruction.

A second type of hypothesis, relevant for descriptive research, makes a prediction about what you will find when you investigate and describe the nature of a situation. The goal is to understand a situation as it exists rather than to understand a change from one situation to another. In this case, your prediction is what you expect to observe. Your rationale is the set of reasons for making this prediction; it is your current explanation for why the situation will look like it does.

You will probably read, if you have not already, that some researchers say you do not need a prediction to conduct a descriptive study. We will discuss this point of view in Chap. 2. For now, we simply claim that scientific inquiry, as we have defined it, applies to all kinds of research studies. Descriptive studies, like other studies, do not merely benefit from formulating, testing, and revising hypotheses; they require it.

One reason we define research as formulating, testing, and revising hypotheses is that if you think of research in this way you are less likely to go wrong. It is a useful guide for the entire process, as we will describe in detail in the chapters ahead. For example, as you build the rationale for your predictions, you are constructing the theoretical framework for your study (Chap. 3). As you work out the methods you will use to test your hypothesis, every decision you make will be based on asking, “Will this help me formulate or test or revise my hypothesis?” (Chap. 4). As you interpret the results of testing your predictions, you will compare them to what you predicted and examine the differences, focusing on how you must revise your hypotheses (Chap. 5). By anchoring the process to formulating, testing, and revising hypotheses, you will make smart decisions that yield a coherent and well-designed study.

Exercise 1.5

Compare the concept of formulating, testing, and revising hypotheses with the descriptions of scientific inquiry contained in Scientific Research in Education (NRC, 2002). How are they similar or different?

Exercise 1.6

Provide an example to illustrate and emphasize the differences between everyday learning/thinking and scientific inquiry.

Learning from Doing Scientific Inquiry

We noted earlier that a measure of what you have learned by conducting a research study is found in the differences between your original hypothesis and your revised hypothesis based on the data you collected to test your hypothesis. We will elaborate this statement in later chapters, but we preview our argument here.

Even before collecting data, scientific inquiry requires cycles of making a prediction, developing a rationale, refining your predictions, reading and studying more to strengthen your rationale, refining your predictions again, and so forth. And, even if you have run through several such cycles, you still will likely find that when you test your prediction you will be partly right and partly wrong. The results will support some parts of your predictions but not others, or the results will “kind of” support your predictions. A critical part of scientific inquiry is making sense of your results by interpreting them against your predictions. Carefully describing what aspects of your data supported your predictions, what aspects did not, and what data fell outside of any predictions is not an easy task, but you cannot learn from your study without doing this analysis.

[Image: the cycle of making a prediction, developing a rationale, and refining both through repeated study before testing.]

Analyzing the matches and mismatches between your predictions and your data allows you to formulate different rationales that would have accounted for more of the data. The best revised rationale is the one that accounts for the most data. Once you have revised your rationales, you can think about the predictions they best justify or explain. It is by comparing your original rationales to your new rationales that you can sort out what you learned from your study.

Suppose your study was an experiment. Maybe you were investigating the effects of a new instructional intervention on students’ learning. Your original rationale was your explanation for why the intervention would change the learning outcomes in a particular way. Your revised rationale explained why the changes that you observed occurred like they did and why your revised predictions are better. Maybe your original rationale focused on the potential of the activities if they were implemented in ideal ways and your revised rationale included the factors that are likely to affect how teachers implement them. By comparing the before and after rationales, you are describing what you learned—what you can explain now that you could not before. Another way of saying this is that you are describing how much more you understand now than before you conducted your study.

Revised predictions based on carefully planned and collected data usually exhibit some of the following features compared with the originals: more precision, more completeness, and broader scope. Revised rationales have more explanatory power and become more complete, more aligned with the new predictions, sharper, and overall more convincing.

Part II. Why Do Educators Do Research?

Doing scientific inquiry is a lot of work. Each phase of the process takes time, and you will often cycle back to improve earlier phases as you engage in later phases. Because of the significant effort required, you should make sure your study is worth it. So, from the beginning, you should think about the purpose of your study. Why do you want to do it? And, because research is a social practice, you should also think about whether the results of your study are likely to be important and significant to the education community.

If you are doing research in the way we have described—as scientific inquiry—then one purpose of your study is to understand, not just to describe or evaluate or report. As we noted earlier, when you formulate hypotheses, you are developing rationales that explain why things might be like they are. In our view, trying to understand and explain is what separates research from other kinds of activities, like evaluating or describing.

One reason understanding is so important is that it allows researchers to see how or why something works like it does. When you see how something works, you are better able to predict how it might work in other contexts, under other conditions. And, because conditions, or contextual factors, matter a lot in education, gaining insights into applying your findings to other contexts increases the contributions of your work and its importance to the broader education community.

Consequently, the purposes of research studies in education often include the more specific aim of identifying and understanding the conditions under which the phenomena being studied work like the observations suggest. A classic example of this kind of study in mathematics education was reported by William Brownell and Harold Moser in 1949. They were trying to establish which method of subtracting whole numbers could be taught most effectively—the regrouping method or the equal additions method. However, they realized that effectiveness might depend on the conditions under which the methods were taught—“meaningfully” versus “mechanically.” So, they designed a study that crossed the two instructional approaches with the two different methods (regrouping and equal additions). Among other results, they found that these conditions did matter. The regrouping method was more effective under the meaningful condition than the mechanical condition, but the same was not true for the equal additions algorithm.
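To make the idea of a crossed design concrete, here is a minimal sketch in Python of a hypothetical 2 x 2 study in the spirit of Brownell and Moser’s: two methods crossed with two instructional conditions, analyzed for an interaction. The data, group means, sample sizes, and library choices are illustrative assumptions only, not the original study or its analysis.

```python
# Sketch of a 2x2 crossed (factorial) design with made-up data.
# This is NOT Brownell and Moser's (1949) data or analysis; it only
# illustrates how a method-by-condition interaction can be examined.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(0)
rows = []
for method in ["regrouping", "equal_additions"]:
    for approach in ["meaningful", "mechanical"]:
        # Hypothetical means: regrouping benefits from meaningful
        # instruction more than equal additions does (an interaction).
        boost = 10 if (method == "regrouping" and approach == "meaningful") else 0
        for score in rng.normal(loc=70 + boost, scale=8, size=30):
            rows.append({"method": method, "approach": approach, "score": score})

df = pd.DataFrame(rows)

# Two-way ANOVA: main effects of method and approach plus their interaction.
model = smf.ols("score ~ C(method) * C(approach)", data=df).fit()
print(anova_lm(model, typ=2))  # the C(method):C(approach) row tests the interaction
```

Under this framing, a significant interaction term corresponds to the kind of finding Brownell and Moser reported: a method pays off under one condition but not under the other.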

What do education researchers want to understand? In our view, the ultimate goal of education is to offer all students the best possible learning opportunities. So, we believe the ultimate purpose of scientific inquiry in education is to develop understanding that supports the improvement of learning opportunities for all students. We say “ultimate” because there are lots of issues that must be understood to improve learning opportunities for all students. Hypotheses about many aspects of education are connected, ultimately, to students’ learning. For example, formulating and testing a hypothesis that preservice teachers need to engage in particular kinds of activities in their coursework in order to teach particular topics well is, ultimately, connected to improving students’ learning opportunities. So is hypothesizing that school districts often devote relatively few resources to instructional leadership training or hypothesizing that positioning mathematics as a tool students can use to combat social injustice can help students see the relevance of mathematics to their lives.

We do not exclude the importance of research on educational issues more removed from improving students’ learning opportunities, but we do think the argument for their importance will be more difficult to make. If there is no way to imagine a connection between your hypothesis and improving learning opportunities for students, even a distant connection, we recommend you reconsider whether it is an important hypothesis within the education community.

Notice that we said the ultimate goal of education is to offer all students the best possible learning opportunities. For too long, educators have been satisfied with a goal of offering rich learning opportunities for lots of students, sometimes even for just the majority of students, but not necessarily for all students. Evaluations of success often are based on outcomes that show high averages. In other words, if many students have learned something, or even a smaller number have learned a lot, educators may have been satisfied. The problem is that there is usually a pattern in the groups of students who receive lower quality opportunities—students of color and students who live in poor areas, urban and rural. This is not acceptable. Consequently, we emphasize the premise that the purpose of education research is to offer rich learning opportunities to all students.

One way to make sure you will be able to convince others of the importance of your study is to consider investigating some aspect of teachers’ shared instructional problems. Historically, researchers in education have set their own research agendas, regardless of the problems teachers are facing in schools. It is increasingly recognized that teachers have had trouble applying to their own classrooms what researchers find. To address this problem, a researcher could partner with a teacher—better yet, a small group of teachers—and talk with them about instructional problems they all share. These discussions can create a rich pool of problems researchers can consider. If researchers pursued one of these problems (preferably alongside teachers), the connection to improving learning opportunities for all students could be direct and immediate. “Grounding a research question in instructional problems that are experienced across multiple teachers’ classrooms helps to ensure that the answer to the question will be of sufficient scope to be relevant and significant beyond the local context” (Cai et al., 2019b, p. 115).

As a beginning researcher, determining the relevance and importance of a research problem is especially challenging. We recommend talking with advisors, other experienced researchers, and peers to test the educational importance of possible research problems and topics of study. You will also learn much more about the issue of research importance when you read Chap. 5 .

Exercise 1.7

Identify a problem in education that is closely connected to improving learning opportunities and a problem that has a less close connection. For each problem, write a brief argument (like a logical sequence of if-then statements) that connects the problem to all students’ learning opportunities.

Part III. Conducting Research as a Practice of Failing Productively

Scientific inquiry involves formulating hypotheses about phenomena that are not fully understood—by you or anyone else. Even if you are able to inform your hypotheses with lots of knowledge that has already been accumulated, you are likely to find that your prediction is not entirely accurate. This is normal. Remember, scientific inquiry is a process of constantly updating your thinking. More and better information means revising your thinking, again, and again, and again. Because you never fully understand a complicated phenomenon and your hypotheses never produce completely accurate predictions, it is easy to believe you are somehow failing.

The trick is to fail upward, to fail to predict accurately in ways that inform your next hypothesis so you can make a better prediction. Some of the best-known researchers in education have been open and honest about the many times their predictions were wrong and, based on the results of their studies and those of others, they continuously updated their thinking and changed their hypotheses.

A striking example of publicly revising (actually reversing) hypotheses due to incorrect predictions is found in the work of Lee J. Cronbach, one of the most distinguished educational psychologists of the twentieth century. In 1957, Cronbach delivered his presidential address to the American Psychological Association. Titling it “The Two Disciplines of Scientific Psychology,” Cronbach proposed a rapprochement between two research approaches—correlational studies that focused on individual differences and experimental studies that focused on instructional treatments controlling for individual differences. (We will examine different research approaches in Chap. 4.) If these approaches could be brought together, reasoned Cronbach (1957), researchers could find interactions between individual characteristics and treatments (aptitude-treatment interactions or ATIs), fitting the best treatments to different individuals.

In 1975, after years of research by many researchers looking for ATIs, Cronbach acknowledged that the evidence for simple, useful ATIs had not been found. Even when trying to find interactions between a few variables that could provide instructional guidance, the analysis, said Cronbach, creates “a hall of mirrors that extends to infinity, tormenting even the boldest investigators and defeating even ambitious designs” (Cronbach, 1975, p. 119).

As he was reflecting back on his work, Cronbach (1986) recommended moving away from documenting instructional effects through statistical inference (an approach he had championed for much of his career) and toward approaches that probe the reasons for these effects, approaches that provide a “full account of events in a time, place, and context” (Cronbach, 1986, p. 104). This is a remarkable change in hypotheses, a change based on data and made fully transparent. Cronbach understood the value of failing productively.

Closer to home, in a less dramatic example, one of us began a line of scientific inquiry into how to prepare elementary preservice teachers to teach early algebra. Teaching early algebra meant engaging elementary students in early forms of algebraic reasoning. Such reasoning should help them transition from arithmetic to algebra. To begin this line of inquiry, a set of activities for preservice teachers was developed. Even though the activities were based on well-supported hypotheses, they largely failed to engage preservice teachers as predicted because of unanticipated challenges the preservice teachers faced. To capitalize on this failure, follow-up studies were conducted, first to better understand elementary preservice teachers’ challenges with preparing to teach early algebra, and then to better support preservice teachers in navigating these challenges. In this example, the initial failure was a necessary step in the researchers’ scientific inquiry and furthered the researchers’ understanding of this issue.

We present another example of failing productively in Chap. 2 . That example emerges from recounting the history of a well-known research program in mathematics education.

Making mistakes is an inherent part of doing scientific research. Conducting a study is rarely a smooth path from beginning to end. We recommend that you keep the following things in mind as you begin a career of conducting research in education.

First, do not get discouraged when you make mistakes; do not fall into the trap of feeling like you are not capable of doing research because you make too many errors.

Second, learn from your mistakes. Do not ignore your mistakes or treat them as errors that you simply need to forget and move past. Mistakes are rich sites for learning—in research just as in other fields of study.

Third, by reflecting on your mistakes, you can learn to make better mistakes, mistakes that inform you about a productive next step. You will not be able to eliminate your mistakes, but you can set a goal of making better and better mistakes.

Exercise 1.8

How does scientific inquiry differ from everyday learning in giving you the tools to fail upward? You may find helpful perspectives on this question in other resources on science and scientific inquiry (e.g., Failure: Why Science is So Successful by Firestein, 2015).

Exercise 1.9

Use what you have learned in this chapter to write a new definition of scientific inquiry. Compare this definition with the one you wrote before reading this chapter. If you are reading this book as part of a course, compare your definition with your colleagues’ definitions. Develop a consensus definition with everyone in the course.

Part IV. Preview of Chap. 2

Now that you have a good idea of what research is, at least of what we believe research is, the next step is to think about how to actually begin doing research. This means how to begin formulating, testing, and revising hypotheses. As for all phases of scientific inquiry, there are lots of things to think about. Because it is critical to start well, we devote Chap. 2 to getting started with formulating hypotheses.

Agnes, M., & Guralnik, D. B. (Eds.). (2008). Hypothesis. In Webster’s new world college dictionary (4th ed.). Wiley.


Britannica. (n.d.). Scientific method. In Encyclopaedia Britannica. Retrieved July 15, 2022, from https://www.britannica.com/science/scientific-method

Brownell, W. A., & Moser, H. E. (1949). Meaningful vs. mechanical learning: A study in grade III subtraction. Duke University Press.

Cai, J., Morris, A., Hohensee, C., Hwang, S., Robison, V., Cirillo, M., Kramer, S. L., & Hiebert, J. (2019b). Posing significant research questions. Journal for Research in Mathematics Education, 50 (2), 114–120. https://doi.org/10.5951/jresematheduc.50.2.0114


Cambridge University Press. (n.d.). Hypothesis. In Cambridge dictionary. Retrieved July 15, 2022, from https://dictionary.cambridge.org/us/dictionary/english/hypothesis

Cronbach, L. J. (1957). The two disciplines of scientific psychology. American Psychologist, 12, 671–684.

Cronbach, L. J. (1975). Beyond the two disciplines of scientific psychology. American Psychologist, 30 , 116–127.

Cronbach, L. J. (1986). Social inquiry by and for earthlings. In D. W. Fiske & R. A. Shweder (Eds.), Metatheory in social science: Pluralisms and subjectivities (pp. 83–107). University of Chicago Press.

Hay, C. M. (Ed.). (2016). Methods that matter: Integrating mixed methods for more effective social science research . University of Chicago Press.

Merriam-Webster. (n.d.). Explain. In Merriam-Webster.com dictionary . Retrieved July 15, 2022, from https://www.merriam-webster.com/dictionary/explain

National Research Council. (2002). Scientific research in education . National Academy Press.

Weis, L., Eisenhart, M., Duncan, G. J., Albro, E., Bueschel, A. C., Cobb, P., Eccles, J., Mendenhall, R., Moss, P., Penuel, W., Ream, R. K., Rumbaut, R. G., Sloane, F., Weisner, T. S., & Wilson, J. (2019a). Mixed methods for studies that address broad and enduring issues in education research. Teachers College Record, 121 , 100307.

Weisner, T. S. (Ed.). (2005). Discovering successful pathways in children’s development: Mixed methods in the study of childhood and family life . University of Chicago Press.


Author information

Authors and Affiliations

School of Education, University of Delaware, Newark, DE, USA

James Hiebert, Anne K Morris & Charles Hohensee

Department of Mathematical Sciences, University of Delaware, Newark, DE, USA

Jinfa Cai & Stephen Hwang


Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.


Copyright information

© 2023 The Author(s)

About this chapter

Hiebert, J., Cai, J., Hwang, S., Morris, A.K., Hohensee, C. (2023). What Is Research, and Why Do People Do It?. In: Doing Research: A New Researcher’s Guide. Research in Mathematics Education. Springer, Cham. https://doi.org/10.1007/978-3-031-19078-0_1


DOI: https://doi.org/10.1007/978-3-031-19078-0_1

Published: 03 December 2022

Publisher Name: Springer, Cham

Print ISBN: 978-3-031-19077-3

Online ISBN: 978-3-031-19078-0



Writing the title and abstract for a research paper: Being concise, precise, and meticulous is the key

Milind S. Tullu

Department of Pediatrics, Seth G.S. Medical College and KEM Hospital, Parel, Mumbai, Maharashtra, India

This article deals with formulating a suitable title and an appropriate abstract for an original research paper. The “title” and the “abstract” are the “initial impressions” of a research article, and hence they need to be drafted correctly, accurately, carefully, and meticulously. Often both of these are drafted after the full manuscript is ready. Most readers read only the title and the abstract of a research paper and very few will go on to read the full paper. The title and the abstract are the most important parts of a research paper and should be pleasant to read. The “title” should be descriptive, direct, accurate, appropriate, interesting, concise, precise, unique, and should not be misleading. The “abstract” needs to be simple, specific, clear, unbiased, honest, concise, precise, stand-alone, complete, scholarly, (preferably) structured, and should not be misrepresentative. The abstract should be consistent with the main text of the paper, especially after a revision is made to the paper and should include the key message prominently. It is very important to include the most important words and terms (the “keywords”) in the title and the abstract for appropriate indexing purpose and for retrieval from the search engines and scientific databases. Such keywords should be listed after the abstract. One must adhere to the instructions laid down by the target journal with regard to the style and number of words permitted for the title and the abstract.

Introduction

This article deals with drafting a suitable “title” and an appropriate “abstract” for an original research paper. Because the “title” and the “abstract” are the “initial impressions” or the “face” of a research article, they need to be drafted correctly, accurately, carefully, and meticulously, and drafting them takes time and energy.[ 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 ] Often, these are drafted after the complete manuscript draft is ready.[ 2 , 3 , 4 , 5 , 9 , 10 , 11 ] Most readers will read only the title and the abstract of a published research paper, and very few “interested ones” (especially, if the paper is of use to them) will go on to read the full paper.[ 1 , 2 ] One must remember to adhere to the instructions laid down by the “target journal” (the journal for which the author is writing) regarding the style and number of words permitted for the title and the abstract.[ 2 , 4 , 5 , 7 , 8 , 9 , 12 ] Both the title and the abstract are the most important parts of a research paper – for editors (to decide whether to process the paper for further review), for reviewers (to get an initial impression of the paper), and for the readers (as these may be the only parts of the paper available freely and hence, read widely).[ 4 , 8 , 12 ] It may be worthwhile for the novice author to browse through titles and abstracts of several prominent journals (and their target journal as well) to learn more about the wording and styles of the titles and abstracts, as well as the aims and scope of the particular journal.[ 5 , 7 , 9 , 13 ]

The details of the title are discussed under the subheadings of importance, types, drafting, and checklist.

Importance of the title

When a reader browses through the table of contents of a journal issue (hard copy or on website), the title is the “ first detail” or “face” of the paper that is read.[ 2 , 3 , 4 , 5 , 6 , 13 ] Hence, it needs to be simple, direct, accurate, appropriate, specific, functional, interesting, attractive/appealing, concise/brief, precise/focused, unambiguous, memorable, captivating, informative (enough to encourage the reader to read further), unique, catchy, and it should not be misleading.[ 1 , 2 , 3 , 4 , 5 , 6 , 9 , 12 ] It should have “just enough details” to arouse the interest and curiosity of the reader so that the reader then goes ahead with studying the abstract and then (if still interested) the full paper.[ 1 , 2 , 4 , 13 ] Journal websites, electronic databases, and search engines use the words in the title and abstract (the “keywords”) to retrieve a particular paper during a search; hence, the importance of these words in accessing the paper by the readers has been emphasized.[ 3 , 4 , 5 , 6 , 12 , 14 ] Such important words (or keywords) should be arranged in appropriate order of importance as per the context of the paper and should be placed at the beginning of the title (rather than the later part of the title, as some search engines like Google may just display only the first six to seven words of the title).[ 3 , 5 , 12 ] Whimsical, amusing, or clever titles, though initially appealing, may be missed or misread by the busy reader and very short titles may miss the essential scientific words (the “keywords”) used by the indexing agencies to catch and categorize the paper.[ 1 , 3 , 4 , 9 ] Also, amusing or hilarious titles may be taken less seriously by the readers and may be cited less often.[ 4 , 15 ] An excessively long or complicated title may put off the readers.[ 3 , 9 ] It may be a good idea to draft the title after the main body of the text and the abstract are drafted.[ 2 , 3 , 4 , 5 ]

Types of titles

Titles can be descriptive, declarative, or interrogative. They can also be classified as nominal, compound, or full-sentence titles.

Descriptive or neutral title

This has the essential elements of the research theme, that is, the patients/subjects, design, interventions, comparisons/control, and outcome, but does not reveal the main result or the conclusion.[ 3 , 4 , 12 , 16 ] Such a title allows the reader to interpret the findings of the research paper in an impartial manner and with an open mind.[ 3 ] These titles also give complete information about the contents of the article, have several keywords (thus increasing the visibility of the article in search engines), and have increased chances of being read and (then) being cited as well.[ 4 ] Hence, such descriptive titles giving a glimpse of the paper are generally preferred.[ 4 , 16 ]

Declarative title

This title states the main finding of the study in the title itself; it reduces the curiosity of the reader, may point toward a bias on the part of the author, and hence is best avoided.[ 3 , 4 , 12 , 16 ]

Interrogative title

This is the one which has a query or the research question in the title.[ 3 , 4 , 16 ] Though a query in the title has the ability to sensationalize the topic, and has more downloads (but fewer citations), it can be distracting to the reader and is again best avoided for a research article (but can, at times, be used for a review article).[ 3 , 6 , 16 , 17 ]

From a sentence construct point of view, titles may be nominal (capturing only the main theme of the study), compound (with subtitles to provide additional relevant information such as context, design, location/country, temporal aspect, sample size, importance, and a provocative or literary touch; for example, see the title of this review), or full-sentence titles (which are longer and indicate an added degree of certainty of the results).[ 4 , 6 , 9 , 16 ] Any of these constructs may be used depending on the type of article, the key message, and the author's preference or judgement.[ 4 ]

Drafting a suitable title

A stepwise process can be followed to draft the appropriate title. The author should describe the paper in about three sentences, avoiding the results and ensuring that these sentences contain important scientific words/keywords that describe the main contents and subject of the paper.[ 1 , 4 , 6 , 12 ] Then the author should join the sentences to form a single sentence, shorten the length (by removing redundant words or adjectives or phrases), and finally edit the title (thus drafted) to make it more accurate, concise (about 10–15 words), and precise.[ 1 , 3 , 4 , 5 , 9 ] Some journals require that the study design be included in the title, and this may be placed (using a colon) after the primary title.[ 2 , 3 , 4 , 14 ] The title should try to incorporate the Patients, Interventions, Comparisons and Outcome (PICO).[ 3 ] The place of the study may be included in the title (if absolutely necessary), that is, if the patient characteristics (such as study population, socioeconomic conditions, or cultural practices) are expected to vary as per the country (or the place of the study) and have a bearing on the possible outcomes.[ 3 , 6 ] Lengthy titles can be boring and appear unfocused, whereas very short titles may not be representative of the contents of the article; hence, optimum length is required to ensure that the title explains the main theme and content of the manuscript.[ 4 , 5 , 9 ] Abbreviations (except the standard or commonly interpreted ones such as HIV, AIDS, DNA, RNA, CDC, FDA, ECG, and EEG) or acronyms should be avoided in the title, as a reader not familiar with them may skip such an article and nonstandard abbreviations may create problems in indexing the article.[ 3 , 4 , 5 , 6 , 9 , 12 ] Also, too much technical jargon or too many chemical formulas in the title may confuse the readers and the article may be skipped by them.[ 4 , 9 ] Numerical values of various parameters (stating study period or sample size) should also be avoided in the titles (unless deemed extremely essential).[ 4 ] It may be worthwhile to take an opinion from an impartial colleague before finalizing the title.[ 4 , 5 , 6 ] Thus, multiple factors (which are, at times, a bit conflicting or contrasting) need to be considered while formulating a title, and hence this should not be done in a hurry.[ 4 , 6 ] Many journals ask the authors to draft a “short title” or “running head” or “running title” for printing in the header or footer of the printed paper.[ 3 , 12 ] This is an abridged version of the main title of up to 40–50 characters, may have standard abbreviations, and helps the reader to navigate through the paper.[ 3 , 12 , 14 ]

Checklist for a good title

Table 1 gives a checklist/useful tips for drafting a good title for a research paper.[ 1 , 2 , 3 , 4 , 5 , 6 , 12 ] Table 2 presents some of the titles used by the author of this article in his earlier research papers, and the appropriateness of the titles has been commented upon. As an individual exercise, the reader may try to improve upon the titles (further) after reading the corresponding abstract and full paper.

Checklist/useful tips for drafting a good title for a research paper

The title needs to be simple and direct
It should be interesting and informative
It should be specific, accurate, and functional (with essential scientific “keywords” for indexing)
It should be concise, precise, and should include the main theme of the paper
It should not be misleading or misrepresentative
It should not be too long or too short (or cryptic)
It should avoid whimsical or amusing words
It should avoid nonstandard abbreviations and unnecessary acronyms (or technical jargon)
Title should be SPICED, that is, it should include Setting, Population, Intervention, Condition, End-point, and Design
Place of the study and sample size should be mentioned only if it adds to the scientific value of the title
Important terms/keywords should be placed in the beginning of the title
Descriptive titles are preferred to declarative or interrogative titles
Authors should adhere to the word count and other instructions as specified by the target journal

Some titles used by the author of this article in his earlier publications, with a remark/comment on their appropriateness

Title: Comparison of Pediatric Risk of Mortality III, Pediatric Index of Mortality 2, and Pediatric Index of Mortality 3 Scores in Predicting Mortality in a Pediatric Intensive Care Unit
Comment: Long title (28 words) capturing the main theme; site of study is mentioned

Title: A Prospective Antibacterial Utilization Study in Pediatric Intensive Care Unit of a Tertiary Referral Center
Comment: Optimum number of words capturing the main theme; site of study is mentioned

Title: Study of Ventilator-Associated Pneumonia in a Pediatric Intensive Care Unit
Comment: The words “study of” can be deleted

Title: Clinical Profile, Co-Morbidities & Health Related Quality of Life in Pediatric Patients with Allergic Rhinitis & Asthma
Comment: Optimum number of words; population and intervention mentioned

Title: Benzathine Penicillin Prophylaxis in Children with Rheumatic Fever (RF)/Rheumatic Heart Disease (RHD): A Study of Compliance
Comment: Subtitle used to convey the main focus of the paper. It may be preferable to use the important word “compliance” in the beginning of the title rather than at the end. Abbreviations RF and RHD can be deleted as corresponding full forms have already been mentioned in the title itself

Title: Performance of PRISM (Pediatric Risk of Mortality) Score and PIM (Pediatric Index of Mortality) Score in a Tertiary Care Pediatric ICU
Comment: Abbreviations used. “ICU” may be allowed as it is a commonly used abbreviation. Abbreviations PRISM and PIM can be deleted as corresponding full forms are already used in the title itself

Title: Awareness of Health Care Workers Regarding Prophylaxis for Prevention of Transmission of Blood-Borne Viral Infections in Occupational Exposures
Comment: Slightly long title (18 words); theme well-captured

Title: Isolated Infective Endocarditis of the Pulmonary Valve: An Autopsy Analysis of Nine Cases
Comment: Subtitle used to convey additional details like “autopsy” (i.e., postmortem analysis) and “nine” (i.e., number of cases)

Title: Atresia of the Common Pulmonary Vein - A Rare Congenital Anomaly
Comment: Subtitle used to convey importance of the paper/rarity of the condition

Title: Psychological Consequences in Pediatric Intensive Care Unit Survivors: The Neglected Outcome
Comment: Subtitle used to convey importance of the paper and to make the title more interesting

Title: Rheumatic Fever and Rheumatic Heart Disease: Clinical Profile of 550 patients in India
Comment: Number of cases (550) emphasized because it is a large series; country (India) is mentioned in the title - will the clinical profile of patients with rheumatic fever and rheumatic heart disease vary from country to country? May be yes, as the clinical features depend on the socioeconomic and cultural background

Title: Neurological Manifestations of HIV Infection
Comment: Short title; abbreviation “HIV” may be allowed as it is a commonly used abbreviation

Title: Krabbe Disease - Clinical Profile
Comment: Very short title (only four words) - may miss out on the essential keywords required for indexing

Title: Experience of Pediatric Tetanus Cases from Mumbai
Comment: City mentioned (Mumbai) in the title - one needs to think whether it is required in the title

The Abstract

The details of the abstract are discussed under the subheadings of importance, types, drafting, and checklist.

Importance of the abstract

The abstract is a summary or synopsis of the full research paper and also needs to have characteristics similar to those of the title. It needs to be simple, direct, specific, functional, clear, unbiased, honest, concise, precise, self-sufficient, complete, comprehensive, scholarly, balanced, and should not be misleading.[ 1 , 2 , 3 , 7 , 8 , 9 , 10 , 11 , 13 , 17 ] Writing an abstract is to extract and summarize (AB – absolutely, STR – straightforward, ACT – actual data presentation and interpretation).[ 17 ] The title and the abstract are the only sections of the research paper that are often freely available to the readers on the journal websites, search engines, and in many abstracting agencies/databases, whereas the full paper may attract a payment per view or a fee for downloading the pdf copy.[ 1 , 2 , 3 , 7 , 8 , 10 , 11 , 13 , 14 ] The abstract is an independent and stand-alone (that is, well understood without reading the full paper) section of the manuscript and is used by the editor to decide the fate of the article and to choose appropriate reviewers.[ 2 , 7 , 10 , 12 , 13 ] Even the reviewers are initially supplied only with the title and the abstract before they agree to review the full manuscript.[ 7 , 13 ] This is the second most commonly read part of the manuscript, and therefore it should reflect the contents of the main text of the paper accurately and thus act as a “real trailer” of the full article.[ 2 , 7 , 11 ] The readers will go through the full paper only if they find the abstract interesting and relevant to their practice; else they may skip the paper if the abstract is unimpressive.[ 7 , 8 , 9 , 10 , 13 ] The abstract needs to highlight the selling point of the manuscript and succeed in luring the reader to read the complete paper.[ 3 , 7 ] The title and the abstract should be constructed using keywords (key terms/important words) from all the sections of the main text.[ 12 ] Abstracts are also used for submitting research papers to a conference for consideration for presentation (as oral paper or poster).[ 9 , 13 , 17 ] Grammatical and typographic errors reflect poorly on the quality of the abstract, may indicate carelessness/casual attitude on the part of the author, and hence should be avoided at all times.[ 9 ]

Types of abstracts

The abstracts can be structured or unstructured. They can also be classified as descriptive or informative abstracts.

Structured and unstructured abstracts

Structured abstracts are followed by most journals, are more informative, and include specific subheadings/subsections under which the abstract needs to be composed.[ 1 , 7 , 8 , 9 , 10 , 11 , 13 , 17 , 18 ] These subheadings usually include context/background, objectives, design, setting, participants, interventions, main outcome measures, results, and conclusions.[ 1 ] Some journals stick to the standard IMRAD format for the structure of the abstracts, and the subheadings would include Introduction/Background, Methods, Results, And (instead of Discussion) the Conclusion/s.[ 1 , 2 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 17 , 18 ] Structured abstracts are more elaborate, informative, easy to read, recall, and peer-review, and hence are preferred; however, they consume more space and can have the same limitations as an unstructured abstract.[ 7 , 9 , 18 ] The structured abstracts are (possibly) better understood by the reviewers and readers. Anyway, the choice of the type of the abstract and the subheadings of a structured abstract depends on the particular journal style and is not left to the author’s wish.[ 7 , 10 , 12 ] Separate subheadings may be necessary for reporting meta-analysis, educational research, quality improvement work, review, or case study.[ 1 ] Clinical trial abstracts need to include the essential items mentioned in the CONSORT (Consolidated Standards Of Reporting Trials) guidelines.[ 7 , 9 , 14 , 19 ] Similar guidelines exist for various other types of studies, including observational studies and for studies of diagnostic accuracy.[ 20 , 21 ] A useful resource for the above guidelines is available at www.equator-network.org (Enhancing the QUAlity and Transparency Of health Research). Unstructured (or non-structured) abstracts are free-flowing, do not have predefined subheadings, and are commonly used for papers that (usually) do not describe original research.[ 1 , 7 , 9 , 10 ]

The four-point structured abstract: This has the following elements which need to be properly balanced with regard to the content/matter under each subheading:[ 9 ]

Background and/or Objectives: This states why the work was undertaken and is usually written in just a couple of sentences.[ 3 , 7 , 8 , 9 , 10 , 12 , 13 ] The hypothesis/study question and the major objectives are also stated under this subheading.[ 3 , 7 , 8 , 9 , 10 , 12 , 13 ]

Methods: This subsection is the longest, states what was done, and gives essential details of the study design, setting, participants, blinding, sample size, sampling method, intervention/s, duration and follow-up, research instruments, main outcome measures, parameters evaluated, and how the outcomes were assessed or analyzed.[ 3 , 7 , 8 , 9 , 10 , 12 , 13 , 14 , 17 ]

Results/Observations/Findings: This subheading states what was found, is longer, is difficult to draft, and needs to mention important details including the number of study participants, results of analysis (of primary and secondary objectives), and include actual data (numbers, mean, median, standard deviation, “P” values, 95% confidence intervals, effect sizes, relative risks, odds ratio, etc.).[ 3 , 7 , 8 , 9 , 10 , 12 , 13 , 14 , 17 ]

Conclusions: The take-home message (the “so what” of the paper) and other significant/important findings should be stated here, considering the interpretation of the research question/hypothesis and results put together (without overinterpreting the findings) and may also include the author's views on the implications of the study.[ 3 , 7 , 8 , 9 , 10 , 12 , 13 , 14 , 17 ]

The eight-point structured abstract: This has the following eight subheadings – Objectives, Study Design, Study Setting, Participants/Patients, Methods/Intervention, Outcome Measures, Results, and Conclusions.[ 3 , 9 , 18 ] The instructions to authors given by the particular journal state whether they use the four- or eight-point abstract or variants thereof.[ 3 , 14 ]

Descriptive and Informative abstracts

Descriptive abstracts are short (75–150 words), only portray what the paper contains without providing any further details (the reader has to read the full paper to learn about its contents), and are rarely used for original research papers.[ 7 , 10 ] These are used for case reports, reviews, opinions, and so on.[ 7 , 10 ] Informative abstracts (which may be structured or unstructured as described above) give a complete, detailed summary of the article contents and truly reflect the actual research done.[ 7 , 10 ]

Drafting a suitable abstract

It is important to religiously stick to the instructions to authors (format, word limit, font size/style, and subheadings) provided by the journal for which the abstract and the paper are being written.[ 7 , 8 , 9 , 10 , 13 ] Most journals allow 200–300 words for formulating the abstract and it is wise to restrict oneself to this word limit.[ 1 , 2 , 3 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 22 ] Though some authors prefer to draft the abstract initially, followed by the main text of the paper, it is recommended to draft the abstract at the end to maintain accuracy and conformity with the main text of the paper (thus maintaining an easy linkage/alignment with the title, on one hand, and the introduction section of the main text, on the other hand).[ 2 , 7 , 9 , 10 , 11 ] The authors should check the subheadings (of the structured abstract) permitted by the target journal, use phrases rather than sentences to draft the content of the abstract, and avoid passive voice.[ 1 , 7 , 9 , 12 ] Next, the authors need to get rid of redundant words and edit the abstract (extensively) to the correct word count permitted (every word in the abstract “counts”!).[ 7 , 8 , 9 , 10 , 13 ] It is important to ensure that the key message, focus, and novelty of the paper are not compromised; the rationale of the study and the basis of the conclusions are clear; and that the abstract is consistent with the main text of the paper.[ 1 , 2 , 3 , 7 , 9 , 11 , 12 , 13 , 14 , 17 , 22 ] This is especially important while submitting a revision of the paper (modified after addressing the reviewer's comments), as the changes made in the main (revised) text of the paper need to be reflected in the (revised) abstract as well.[ 2 , 10 , 12 , 14 , 22 ] Abbreviations should be avoided in an abstract, unless they are conventionally accepted or standard; references, tables, or figures should not be cited in the abstract.[ 7 , 9 , 10 , 11 , 13 ] It may be worthwhile not to rush with the abstract and to get an opinion from an impartial colleague on the content of the abstract and, if possible, the full paper (an “informal” peer-review).[ 1 , 7 , 8 , 9 , 11 , 17 ] Appropriate “Keywords” (three to ten words or phrases) should follow the abstract and should be preferably chosen from the Medical Subject Headings (MeSH) list of the U.S. National Library of Medicine ( https://meshb.nlm.nih.gov/search ) and are used for indexing purposes.[ 2 , 3 , 11 , 12 ] These keywords need to be different from the words in the main title (the title words are automatically used for indexing the article) and can be variants of the terms/phrases used in the title, or words from the abstract and the main text.[ 3 , 12 ] The ICMJE (International Committee of Medical Journal Editors; http://www.icmje.org/ ) also recommends publishing the clinical trial registration number at the end of the abstract.[ 7 , 14 ]
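Because the word limit and keyword count are mechanical constraints, a draft can be checked automatically before submission. The following is a minimal sketch only; the 250-word limit and the three-to-ten keyword range are assumptions taken from the general guidance above, and the target journal’s instructions to authors always take precedence.

```python
# Minimal pre-submission check of a draft abstract against an assumed word
# limit and keyword count. The defaults below mirror the general guidance in
# this article and are not any specific journal's requirements.
def check_abstract(abstract: str, keywords: list[str],
                   word_limit: int = 250,
                   keyword_range: tuple[int, int] = (3, 10)) -> list[str]:
    warnings = []
    n_words = len(abstract.split())
    if n_words > word_limit:
        warnings.append(f"Abstract has {n_words} words; the limit is {word_limit}.")
    low, high = keyword_range
    if not low <= len(keywords) <= high:
        warnings.append(f"{len(keywords)} keywords listed; {low}-{high} are suggested.")
    return warnings

# Example usage with a placeholder draft (replace with the real abstract text).
draft = "Background: ... Methods: ... Results: ... Conclusions: ..."
print(check_abstract(draft, ["pediatrics", "abstract", "medical writing"]))
```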

Checklist for a good abstract

Table 3 gives a checklist/useful tips for formulating a good abstract for a research paper.[ 1 , 2 , 3 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 17 , 22 ]

Checklist/useful tips for formulating a good abstract for a research paper

The abstract should have simple language and phrases (rather than sentences)
It should be informative, cohesive, and adhering to the structure (subheadings) provided by the target journal. Structured abstracts are preferred over unstructured abstracts
It should be independent and stand-alone/complete
It should be concise, interesting, unbiased, honest, balanced, and precise
It should not be misleading or misrepresentative; it should be consistent with the main text of the paper (especially after a revision is made)
It should utilize the full word capacity allowed by the journal so that most of the actual scientific facts of the main paper are represented in the abstract
It should include the key message prominently
It should adhere to the style and the word count specified by the target journal (usually about 250 words)
It should avoid nonstandard abbreviations and (if possible) avoid a passive voice
Authors should list appropriate “keywords” below the abstract (keywords are used for indexing purpose)

Concluding Remarks

This review article has given a detailed account of the importance and types of titles and abstracts. It has also attempted to give useful hints for drafting an appropriate title and a complete abstract for a research paper. It is hoped that this review will help the authors in their career in medical writing.

Financial support and sponsorship

Conflicts of interest

There are no conflicts of interest.

Acknowledgement

The author thanks Dr. Hemant Deshmukh - Dean, Seth G.S. Medical College & KEM Hospital, for granting permission to publish this manuscript.


  11. Fundamentals of Research Data and Variables: The Devil Is in ...

    TYPES OF RESEARCH VARIABLES. When undertaking research, there are 4 basic types of variables to consider and define: 1, 5-7. Independent variable: A variable that is believed to be the cause of some of the observed effect or association and one that is directly manipulated by the researcher during the study or experiment.

  12. Elements of Research : Variables

    Variables are important to understand because they are the basic units of the information studied and interpreted in research studies. Researchers carefully analyze and interpret the value (s) of each variable to make sense of how things relate to each other in a descriptive study or what has happened in an experiment. previous next.

  13. Research Variables: Types, Uses and Definition of Terms

    The purpose of research is to describe and explain variance in the world, that is, variance that. occurs naturally in the world or chang e that we create due to manipulation. Variables are ...

  14. Why are independent and dependent variables important?

    A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable. In your research design, it's important to identify potential confounding variables and plan how you will reduce their impact.

  15. Considerations for writing and including demographic variables in

    In others, they serve as important control variables to allow for a clearer view of the relationship between other predictor variables and the dependent variable of interest. 4 Using demographic variables can enable pharmacy educators to ask and explore critical questions under investigation in a more specific manner. 5

  16. PDF Why You Need to Use Statistics in Your Research

    The word 'statistics' is possibly the descendant of the word 'statist'. By 1837, statistics had moved into many areas beyond government. Statistics, used in the plural, were (and are) defined as numerical facts (data) collected and classified in systematic ways. In current use, statistics is the area of study that aims to collect and ...

  17. What are Variables and Why are They Important in Research ...

    Variables also play a crucial role in the design of research studies. The selection of variables determines the type of research design that will be used, as well as the methods and procedures for collecting and analyzing data. Variables can be independent, dependent, or moderator variables, depending on their role in the research design.

  18. Variable selection

    1. INTRODUCTION. Statistical models are useful tools applied in many research fields dealing with empirical data. They connect an outcome variable to one or several so‐called independent variables (IVs; a list of abbreviations can be found in the Supporting Information Table S1) and quantify the strength of association between IVs and outcome variable.

  19. The selection, use, and reporting of control variables in international

    1. Introduction. Control variables (CVs) constitute a central element of the research design of any empirical study. Confounding variables are likely to covary with the hypothesized focal independent variables thus limiting both the elucidation of causal inference as well as the explanatory power of the model (Pehazur & Schmelkin, 1991; Stone-Romero, 2009).

  20. Why are variables important in a research study?

    Variables play a critical role in the psychological research process. By systematically varying some variables and measuring the effects on other variables, researchers can determine if changes to one thing result in changes in something else.

  21. Variables, Hypotheses and Stages of Research 1

    In our earlier mentioned example of study of value awareness of students, researcher has classified. variables like Area, SES, Gender and Value awareness as shown in Table 1 to 4. But besides the ...

  22. What Is Research, and Why Do People Do It?

    Abstractspiepr Abs1. Every day people do research as they gather information to learn about something of interest. In the scientific world, however, research means something different than simply gathering information. Scientific research is characterized by its careful planning and observing, by its relentless efforts to understand and explain ...

  23. Writing the title and abstract for a research paper: Being concise

    Introduction. This article deals with drafting a suitable "title" and an appropriate "abstract" for an original research paper. Because the "title" and the "abstract" are the "initial impressions" or the "face" of a research article, they need to be drafted correctly, accurately, carefully, meticulously, and consume time and energy.[1,2,3,4,5,6,7,8,9,10] Often, these ...