Experimental Manipulation (Animal Behavior)

Experimental manipulation refers to the intentional alteration of an independent variable by a researcher to observe its effects on a dependent variable in a controlled setting. This technique is crucial for establishing causal relationships and understanding how specific factors influence behavior, communication, and social dynamics among animals.


5 Must Know Facts For Your Next Test

  • In studies involving visual signals, researchers might manipulate the color or intensity of signals to understand how these changes affect animal communication and responses.
  • Manipulating physical traits associated with sexual dimorphism can help scientists observe differences in behavior or mating success between the sexes in various species.
  • In dominance hierarchies, experimental manipulation might involve altering social structures to see how changes impact aggression or submission behaviors among individuals.
  • Behavioral plasticity can be examined by manipulating environmental conditions, allowing researchers to observe how animals adapt their behaviors based on specific stimuli or stressors.
  • Through experimental manipulation, scientists can provide insights into the mechanisms behind evolutionary adaptations, allowing for predictions about future behavioral changes.

Review Questions

  • Experimental manipulation allows researchers to systematically change aspects of visual signals, such as color or size, and then measure animal responses. By observing how alterations in these signals affect communication, researchers can identify which traits are most effective in conveying information. This helps clarify the role of visual signals in social interactions and mating behaviors among different species.
  • By experimentally manipulating traits that differ between sexes, such as size or coloration, researchers can assess how these differences influence behaviors like mate choice or competition. For example, if male coloration is altered, scientists can observe changes in female preferences during mating rituals. This experimental approach sheds light on the evolutionary pressures that shape sexual dimorphism and its implications for reproductive strategies.
  • Experimental manipulation is crucial for studying behavioral plasticity because it allows researchers to create specific environmental scenarios and observe how different species adapt their behaviors in response. For instance, altering food availability or social structures can reveal insights into survival strategies and adaptability. This evaluation highlights how manipulative experiments contribute to our understanding of evolutionary biology, informing conservation efforts and predicting future behavioral trends in changing environments.

Related terms

Independent Variable : The factor that is changed or manipulated in an experiment to test its effects on the dependent variable.

Dependent Variable : The outcome or response that is measured in an experiment to assess the impact of the independent variable.

Controlled Experiment : An experimental setup where all variables except the independent variable are kept constant to ensure that any observed effects can be attributed solely to the manipulation.

" Experimental manipulation " also found in:

© 2024 fiveable inc. all rights reserved., ap® and sat® are trademarks registered by the college board, which is not affiliated with, and does not endorse this website..


What Is a Controlled Experiment? | Definitions & Examples

Published on April 19, 2021 by Pritha Bhandari. Revised on June 22, 2023.

In experiments , researchers manipulate independent variables to test their effects on dependent variables. In a controlled experiment , all variables other than the independent variable are controlled or held constant so they don’t influence the dependent variable.

Controlling variables can involve:

  • holding variables at a constant or restricted level (e.g., keeping room temperature fixed).
  • measuring variables to statistically control for them in your analyses.
  • balancing variables across your experiment through randomization (e.g., using a random order of tasks, as sketched below).
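
As one concrete illustration of the randomization approach above, here is a minimal Python sketch that gives each participant an independently shuffled task order. The task names are hypothetical, chosen to match the advertising example used later in this article:

```python
import random

# Hypothetical task names for one experimental session.
TASKS = ["ad_rating", "price_estimate", "brand_recall"]

def randomized_order(tasks, rng=None):
    """Return a shuffled copy of the task list for one participant."""
    rng = rng or random.Random()
    order = list(tasks)   # copy so the master list is untouched
    rng.shuffle(order)    # balances order effects across participants
    return order

# Each participant receives an independent random order.
for participant_id in (1, 2, 3):
    print(participant_id, randomized_order(TASKS))
```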

Table of contents

  • Why does control matter in experiments?
  • Methods of control
  • Problems with controlled experiments
  • Frequently asked questions about controlled experiments

Why does control matter in experiments?

Control in experiments is critical for internal validity, which allows you to establish a cause-and-effect relationship between variables. Strong validity also helps you avoid research biases, particularly ones related to issues with generalizability (like sampling bias and selection bias).

Suppose you're studying whether the color used in advertising affects how much people are willing to pay for fast food:

  • Your independent variable is the color used in advertising.
  • Your dependent variable is the price that participants are willing to pay for a standard fast food meal.

Extraneous variables are factors that you’re not interested in studying, but that can still influence the dependent variable. For strong internal validity, you need to remove their effects from your experiment.

In this example, extraneous variables include:

  • the design and description of the meal,
  • the study environment (e.g., temperature or lighting),
  • each participant’s frequency of buying fast food,
  • each participant’s familiarity with the specific fast food brand,
  • each participant’s socioeconomic status.


Methods of control

You can control some variables by standardizing your data collection procedures. All participants should be tested in the same environment with identical materials. Only the independent variable (e.g., ad color) should be systematically changed between groups.

Other extraneous variables can be controlled through your sampling procedures . Ideally, you’ll select a sample that’s representative of your target population by using relevant inclusion and exclusion criteria (e.g., including participants from a specific income bracket, and not including participants with color blindness).

By measuring extraneous participant variables (e.g., age or gender) that may affect your experimental results, you can also include them in later analyses.
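
A minimal sketch of how these sampling and measurement controls might look in code, using hypothetical participant records and the inclusion/exclusion criteria from the ad-color example; all field names are illustrative:

```python
# Hypothetical participant records; all field names are illustrative.
participants = [
    {"id": 1, "income_bracket": "middle", "color_blind": False, "age": 34},
    {"id": 2, "income_bracket": "high",   "color_blind": True,  "age": 52},
    {"id": 3, "income_bracket": "middle", "color_blind": False, "age": 27},
]

def eligible(p):
    # Inclusion criterion: participants from the target income bracket.
    # Exclusion criterion: color blindness (the manipulation is ad color).
    return p["income_bracket"] == "middle" and not p["color_blind"]

sample = [p for p in participants if eligible(p)]

# Record measured extraneous variables (e.g., age) so they can be
# statistically controlled for in later analyses.
covariates = {p["id"]: {"age": p["age"]} for p in sample}
print([p["id"] for p in sample], covariates)
```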

After gathering your participants, you’ll need to place them into groups to test different independent variable treatments. The types of groups and method of assigning participants to groups will help you implement control in your experiment.

Control groups

Controlled experiments require control groups . Control groups allow you to test a comparable treatment, no treatment, or a fake treatment (e.g., a placebo to control for a placebo effect ), and compare the outcome with your experimental treatment.

You can assess whether it’s your treatment specifically that caused the outcomes, or whether time or any other treatment might have resulted in the same effects.

To test the effect of colors in advertising, each participant is placed in one of two groups:

  • A control group that’s presented with red advertisements for a fast food meal.
  • An experimental group that’s presented with green advertisements for the same fast food meal.

Random assignment

To avoid systematic differences and selection bias between the participants in your control and treatment groups, you should use random assignment .

This helps ensure that any extraneous participant variables are evenly distributed, allowing for a valid comparison between groups .

Random assignment is a hallmark of a “true experiment”—it differentiates true experiments from quasi-experiments .
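
A minimal sketch of simple random assignment, assuming a hypothetical list of participant IDs; the fixed seed is there only so the assignment can be audited and reproduced:

```python
import random

def assign_to_groups(participant_ids, seed=42):
    """Randomly split participants into control and treatment groups."""
    rng = random.Random(seed)        # seeded so assignment is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "treatment": shuffled[half:]}

groups = assign_to_groups(range(1, 21))
print(groups)
```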

Masking (blinding)

Masking in experiments means hiding condition assignment from participants or researchers—or, in a double-blind study , from both. It’s often used in clinical studies that test new treatments or drugs and is critical for avoiding several types of research bias .

Sometimes, researchers may unintentionally encourage participants to behave in ways that support their hypotheses , leading to observer bias . In other cases, cues in the study environment may signal the goal of the experiment to participants and influence their responses. These are called demand characteristics . If participants behave a particular way due to awareness of being observed (called a Hawthorne effect ), your results could be invalidated.

Using masking means that participants don’t know whether they’re in the control group or the experimental group. This helps you control biases from participants or researchers that could influence your study results.

You use an online survey form to present the advertisements to participants, and you leave the room while each participant completes the survey on the computer so that you can’t tell which condition each participant was in.
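
One way a masked workflow might be arranged in code — a sketch under the assumption that condition labels are replaced with opaque codes and that the code-to-condition key is held apart from the data until analyses are finalized:

```python
import random

# Opaque condition codes; the code-to-condition key is kept by someone
# who is neither collecting nor analyzing the data.
CONDITION_KEY = {"A": "red_ad", "B": "green_ad"}   # hypothetical conditions

def masked_assignments(participant_ids, seed=7):
    """Assign each participant an opaque condition code."""
    rng = random.Random(seed)
    return {pid: rng.choice(sorted(CONDITION_KEY)) for pid in participant_ids}

assignments = masked_assignments([101, 102, 103, 104])
print(assignments)   # analysts see only codes like 'A'/'B' during the study
```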

Problems with controlled experiments

Although controlled experiments are the strongest way to test causal relationships, they also involve some challenges.

Difficult to control all variables

Especially in research with human participants, it’s impossible to hold all extraneous variables constant, because every individual has different experiences that may influence their perception, attitudes, or behaviors.

But measuring or restricting extraneous variables allows you to limit their influence or statistically control for them in your study.

Risk of low external validity

Controlled experiments have disadvantages when it comes to external validity —the extent to which your results can be generalized to broad populations and settings.

The more controlled your experiment is, the less it resembles real-world contexts. That makes it harder to apply your findings outside of a controlled setting.

There’s always a tradeoff between internal and external validity . It’s important to consider your research aims when deciding whether to prioritize control or generalizability in your experiment.


Frequently asked questions about controlled experiments

In a controlled experiment, all extraneous variables are held constant so that they can’t influence the results. Controlled experiments require:

  • A control group that receives a standard treatment, a fake treatment, or no treatment.
  • Random assignment of participants to ensure the groups are equivalent.

Depending on your study topic, there are various other methods of controlling variables .

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.

Experimental design means planning a set of procedures to investigate a relationship between variables . To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels

Experimental design is essential to the internal and external validity of your experiment.



Construct Validation of Experimental Manipulations in Social Psychology: Current Practices and Recommendations for the Future

Abstract

Experimental manipulations in social psychology must exhibit construct validity by influencing their intended psychological constructs. Yet how do experimenters in social psychology attempt to establish the construct validity of their manipulations? Following a preregistered plan, we coded 348 experimental manipulations from the 2017 issues of the Journal of Personality and Social Psychology . Representing a reliance upon ‘on the fly’ experimentation, the vast majority of these manipulations were created ad hoc for a given study and not previously validated prior to implementation. A minority of manipulations had their construct validity evaluated by pilot testing prior to implementation or via a manipulation check. Of the manipulation checks administered, most were face-valid, single item self-reports and only a few met criteria for ‘true’ validation. In aggregate, roughly two-fifths of manipulations relied solely on face validity. To the extent that they are representative of the field, these results suggest that best practices for validating manipulations are not commonplace — a potential contributor to replicability issues. These issues can be remedied by validating manipulations prior to implementation, using validated manipulation checks, standardizing manipulation protocols, estimating the size and duration of manipulations’ effects, and estimating each manipulation’s effects on multiple constructs within the target nomological network.

Introduction

Social psychology emphasizes the power of the situation ( Lewin, 1939 ). To examine the causal effects of situational variables, social psychological studies often employ experimental manipulations of such factors and examine their impact on human thoughts, feelings, and behaviors ( Campbell, 1957 ; Cook & Campbell, 1979 ). However, experimental manipulations are only as useful as the extent to which they exhibit construct validity (i.e., that they meaningfully affect the psychological processes that they are intended to affect; Brewer, 2000 ; Garner, Hake, & Eriksen, 1956 ; Wilson, Aronson, & Carlsmith, 2010 ). Yet few recent studies have systematically documented the approaches that social psychological experiments use to estimate and establish the construct validity of their manipulations. Towards addressing this limitation in our understanding, we meta-analyzed the frequency with which various manipulation validation practices were adopted (or not adopted) by a representative sample of studies from what is widely perceived as the flagship publication for experimental social psychology: the Journal of Personality and Social Psychology ( JPSP ).

Validity in Experimental Manipulations of Psychological Processes

Experimental social psychologists often focus on ‘internal validity’ and ‘external validity’ ( Haslam & McGarty, 2004 ). Internal validity is present when experimenters (I) eliminate extraneous variables that might incidentally influence the outcome-of-interest and (II) maximize features of the experimental manipulation that ensure a precise, causal conduit from manipulation to outcome ( Brewer, 2000 ). Experimenters establish internal validity via practices such as removing sources of experimenter bias and demand characteristics and by cultivating ‘experimental realism’, which maximize the chances that the manipulation is the source of experimental effects and not some unwanted artifact of design ( Cook & Campbell, 1979 ; Wilson et al., 2010 ). Other efforts are directed toward maximizing ‘external validity’, ensuring that the experiment captures effects that exist in the ‘real world’ and that findings of the experiment are able to generalize to other settings, populations, time periods, and cultures ( Highhouse, 2009 ; c.f. Berkowitz & Donnerstein, 1982 ; Mook, 1983 ). Integral to both internal and external validity is a concept most often invoked in the context of clinical assessments and personality questionnaires — construct validity .

Psychological Constructs and the Nomological Network

Psychological scientists often seek to measure and manipulate psychological constructs — so called because they are psychological entities constructed by people rather than objective realities ( Cronbach & Meehl, 1955 ). Such constructs are considered latent as they are readily imperceptible, as compared to their associated manifestations that are designed to capture (e.g., psychological questionnaires) or influence (e.g., experimental manipulations) them. Latent constructs exist in a nomological (i.e., lawful) network, which is a prescribed array of relationships (or lack thereof) with other constructs ( Cronbach & Meehl, 1955 ). In a nomological network, constructs exist in varying degrees of proximity to one another, with closer proximities reflecting stronger patterns of association. Each construct has its own idiographic network, including construct-specific arrays of associated constructs and construct-specific patterns of associations with those constructs. The constellations of constructs within each nomological network are articulated by psychological theory ( Gray, 2017 ). Nomological networks, when distilled accurately from strong theory, are the basis of construct validity ( Messick, 1995 ).

Construct Validity of Psychological Measures

Construct validity is a methodological and philosophical property that largely reflects how accurately a given manifestation of a study has mapped onto a construct’s latent nomological network ( Borsboom, Mellenbergh, & van Heerden, 2004 ; Embretson, 1983 ; Strauss & Smith, 2009 ). Conventionally, construct validity has been largely invoked in the context of psychological measurement, assessment, and tests. In this context, construct validity is present when a manifest psychological measure (I) accurately quantifies its intended latent psychological construct, (II) shares theoretically-appropriate associations with other latent variables in that construct’s nomological network, and (III) does not capture confounding extraneous latent constructs ( Cronbach & Meehl, 1955 ; Messick, 1995 ; Figure 1 ). According to modern standards in psychology, construct validity is not a property of a given measure or the scores derived from it, but instead such validity pertains to the uses and interpretations of the scores that are derived from the measure ( AERA, APA, & NCME, 2014 ).

Figure 1. Schematic depiction of a hypothetical nomological network surrounding the construct of ‘rejection’. Plus signs depict positive associations and minus signs depict negative associations. Greater numbers of plus signs and thicker arrows depict stronger associations and effects.

As depicted in the above schematic, a measure of a given construct (e.g., a scale that measures feelings of rejection), should exhibit a pattern of associations with theoretically-linked variables (e.g., positive correlations with pain and shame, negative correlation with happiness) and null associations with variables outside of the nomological network (e.g., awe).

Estimating the Construct Validity of Psychological Measures

The process of testing the construct validity of measures is well defined (for an overview see Flake, Pek, & Hehman, 2017 ). First, investigators should conduct a comprehensive literature review to define the properties of the construct, prominent theories of the construct, and its associated nomological network ( Simms, 2008 ). This substantive portion of construct validation and research design more broadly is perhaps the most crucial (and oft neglected) aspect. Rigorous theoretical work prior to measure construction is needed to ensure that the manifestation of the measure accurately captures the full range of the construct, distinguishes it from related constructs, and includes measures of other constructs to test the construct’s nomological network ( Benson, 1998 ; Loevinger, 1957 ; Zumbo & Chan, 2014 ).

Second, researchers apply their theoretical understanding to design the content of the measure to capture the breadth and depth of the construct (i.e., content validity; Haynes, Richard, & Kubany, 1995), often in consultation with experts outside the study team. Third, this preliminary measure is administered and empirical analyses (e.g., item response theory, exploratory and confirmatory factor analyses) are used on the resulting data to (A) ensure that the measure’s data structure exhibits the expected form, to (B) select content with good empirical qualities, and to (C) ensure the measure is invariant across groups it should be invariant across ( Clark & Watson, 2019 ). Fourth, a refined version of the measure is administered alongside other measures to ensure that it (A) positively corresponds to measures of the same or similar constructs (i.e., convergent validity), it (B) negatively or weakly corresponds to measures of different or dissimilar constructs (i.e., discriminant validity), it (C) is linked to theoretically-appropriate real-world outcomes (i.e., criterion validity), and that it (D) differs across groups that it should differ across ( Smith, 2005 ). Measures that meet these stringent psychometric criteria can be said to exhibit construct validity (i.e., they measure the construct they are intended to measure and do not capture problematically large amounts of unintended constructs). Yet how do these concepts and practices translate to experimental manipulations of psychological processes?
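
Before turning to that question, a minimal sketch of the fourth step above — checking convergent and discriminant correlations for a new measure. The scores and measure names are illustrative stand-ins, not data from any study:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative scores: a new rejection scale and two comparison measures.
rejection = [4, 5, 2, 3, 5, 1, 4]
shame     = [3, 5, 2, 2, 4, 1, 4]   # expected: strong positive (convergent)
awe       = [2, 4, 3, 5, 1, 3, 2]   # expected: weak (discriminant)

print("convergent:",   round(pearson_r(rejection, shame), 2))
print("discriminant:", round(pearson_r(rejection, awe), 2))
```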

Construct Validity of Psychological Manipulations

Construct validity is not confined to psychometrics and is a crucial element in experimental psychology ( Cook & Campbell, 1979 ). Translated to an experimental setting, construct validity is present when a manifest psychological manipulation (I) accurately and causally affects its intended latent psychological construct in the intended direction, (II) exerts theoretically-appropriate effects upon other latent variables in that construct’s nomological network, and (III) does not affect or weakly affects confounding extraneous latent constructs ( Campbell, 1957 ; Shadish, Cook, & Campbell, 2002 ). This desired pattern of effects is illustrated in a phenomenon we deem the nomological shockwave .

The nomological shockwave.

In a nomological shockwave, a psychological manipulation (e.g., a social rejection manipulation; Chester, DeWall, & Pond, 2016 ) exerts its initial and strongest causal effects on the target latent construct in the intended direction (e.g., greatly increased feelings of rejection; Figure 2 ). This change in the target construct then ripples out through that construct’s latent nomological network — causally affecting related constructs in ways that reflect the degree and strength of their latent associations with the target construct. More specifically, the shockwave exerts stronger effects upon constructs that are closer to the manipulation’s point of impact (e.g., moderately increased pain). Conversely, the shockwave’s effects get progressively weaker as the theoretical distance from the target construct increases (e.g., modestly increased shame, modestly reduced happiness). The shockwave will not reach constructs that lie beyond the target construct’s nomological network (e.g., no effect on awe). Back in the manifest domain, these latent shockwave effects are then captured by the manipulation check and the various discriminant validity checks that are causally affected by the latent nomological shockwave.

Figure 2. Schematic depiction of a hypothetical nomological shockwave elicited by a construct valid social rejection manipulation. Plus signs depict positive effects and minus signs depict negative effects. Greater numbers of plus signs and thicker arrows depict stronger associations and effects.

Internal versus construct validity.

Construct validity differs from another type of validity that is critical for experimental manipulations — internal validity. Internal validity reflects the extent to which the intended aspects of the manifest experimental manipulation — and not some artifact(s) of the research methodology — exerted a causal effect on an outcome ( Campbell, 1957 ; Shadish et al., 2002 ; Wilson et al., 2010 ). Threats to internal validity include unintended differences between the participants in the experimental conditions, participant attrition and fatigue over the course of the experiment, environmental and experimenter effects that undermine the manipulation, measures that are not valid or reliable, and participant awareness (of the experiment’s hypotheses, of deceptive elements of the study, or that they are being studied; Shadish et al., 2002 ; Wilson et al., 2010 ). Each of these issues can elicit spurious effects that are not due to the intended aspects of the experimental manipulation.

Although construct validity requires that the causal chain of events from manipulation to outcome effect was intact (i.e., that the manipulation possessed internal validity), its focus is on the ability of the manipulation to impact the intended constructs in the intended manner ( Shadish et al., 2002 ). In other words, internal validity ensures that the manipulation’s effect was causal and construct validity ensures that the manipulation’s effect was accurate. Threats to a manipulation’s construct validity are ‘instrumental incidentals’ — confounding aspects of the manipulation that elicited the intended effect on the targeted constructs but were not the aspects of the manipulation that were intended to elicit that effect ( Campbell, 1969 ). For instance, imagine that an experimental condition (e.g., writing an essay that recalls an experience of rejection) was compared to an inappropriate control condition (e.g., writing an essay that tells a story of a brave and adorable otter). This manipulation design would cause an intended increase in rejection, but this effect would be due to both the intended aspect of the manipulation (i.e., the rejection-related content of the essay) and unintended, confounding aspects as well (e.g., positive attitudes towards brave and adorable otters, ease of writing about a fictional character). Another threat to construct validity is a lack of specificity, in which a manipulation exerts a similarly-sized impact on a broad array of constructs instead of isolating the target construct (e.g., a rejection manipulation that also increases sadness and anger to the same extent as it does feelings of rejection). A construct valid experimental manipulation will exert its intended, targeted effects on the intended, specific constructs only through theoretically-appropriate aspects of the manipulation ( Reichardt, 2006 ).

Whereas internal validity can be established prior to testing the construct validity of a manipulation, construct validity first requires that a manipulation exhibit internal validity. Indeed, if an experimental artifact caused by some other aspect of the experiment (e.g., participant selection bias caused by a lack of random assignment) was the actual and unintended source of an observed experimental effect, then it is impossible to claim that the manipulation is what affected the target construct ( Cook & Campbell, 1979 ). This is akin to how psychological questionnaires can have internal consistency among their items without exhibiting construct validity, even though construct validity requires the presence of internal consistency. The process through which measures are validated can be instructive for determining how to establish the construct validity of experimental manipulations.

Current Construct Validity Practices for Psychological Manipulations

A survey of the literature on experimental manipulation in social psychology revealed three primary approaches to establishing that a given manipulation has construct validity. These approaches do not map neatly onto the process through which psychological measures are validated, an issue we return to in the Discussion.

Employ previously validated manipulations.

The simplest means to establish the validity of a manipulation is to replicate one that has already been validated in previous research. Many experimental paradigms are frequently re-used in other investigations and modified for other purposes. For instance, the seminal article that introduced the Cyberball social rejection paradigm has been cited over 1,900 times ( Williams, Cheung, & Choi, 2000 ). However, the value of employing previously-used manipulations is predicated on the extent to which they were adequately validated in such pre-existing work. Previously-used manipulations, whether they have been validated or not, are often modified prior to implementation (e.g., the identities of the Cyberball partners are varied; Gonsalkorale & Williams, 2007) or are conceptually-replicated by implementing the manipulation through an entirely different paradigm (e.g., being left out of an online chatroom instead of a ball-tossing game; Donate et al., 2017 ). These conceptual replications are an important means of establishing the manipulated construct’s ability to exert its effects irrespective of the manifest characteristics of the manipulation. However, conceptual replication alone cannot establish construct validity.

Pilot validity studies.

Whether a manipulation is newly created or acquired from a prior publication, authors often ‘pilot test’ it prior to implementation in hypothesis testing. This practice entails conducting at least one separate ‘pilot study’ of the manipulation outside of the context of the full study procedure ( Ellsworth & Gonzalez, 2003 ). Such pilot studies are used to examine various aspects of the manipulation, from its feasibility to participant comprehension of the instructions to various forms of validity. Of particular interest to the present research, pilot validity studies (a subset of the broader ‘pilot study’ category) estimate the manipulation’s effect on the target construct (i.e., they pilot test the manipulation’s construct validity). In this way, pilot validity studies are a hybrid of experimental pilot studies and the ‘validation studies’ used by clinical and personality psychologists who examine the psychometric properties of new measures using the steps we previously outlined.

Pilot validity testing of a new manipulation is an essential step to ensure that the manipulation has the intended effect on a target manipulation check and to rule out confounding processes ( Wilson et al., 2010 ). Pilot validity testing can also estimate the magnitude and duration of the intended effect. If the effect is so small or transient that it is nearly impossible to detect or if the effect is so strong or long-lasting that it produces ceiling effects or excessive distress among your participants, then the manipulation can be altered to address these issues and re-piloted. If deception is used, suspicion probes can be included in a pilot study to estimate whether the deception was perceived by your participants ( Blackhart, Brown, Clark, Pierce, & Shell, 2012 ). Even if the manipulation has been acquired from previous work, pilot validity testing is a crucial way to ensure that you have accurately recreated the protocol and replicated the validity of the manipulation ( Ellsworth & Gonzalez, 2003 ). As all of these factors have an immense impact on whether a given manipulation will affect its target construct, pilot validity studies are an important means of ensuring the construct validity of a manipulation.

Manipulation checks.

A diverse array of measurements fall under the umbrella term of ‘manipulation check’. The over-arching theme of such measures is to ensure that a given manipulation had its intended effect ( Hauser, Ellsworth, & Gonzalez, 2018 ). We adopt a narrower definition to conform to the topic of construct validity — manipulation checks are measures of the construct that the manipulation is intended to affect. This definition excludes attention checks, comprehension checks, and other forms of instructional manipulation checks ( Oppenheimer, Meyvis, & Davidenko, 2009 ), as they do not explicitly quantify the target construct. These instructional manipulation checks are useful tools, especially because they can identify construct irrelevant variance that is caused by the manipulation. However, our present focus on construct validity entails that we apply the label of ‘manipulation check’ to measures of a manipulation’s target construct. We refer to measures of different constructs that are used to ensure that a given manipulation did not exert similarly robust effects onto other, non-target constructs as ‘discriminant validity checks’. Such discriminant validity checks are specific to each investigation and should include constructs theoretically related to the target construct so that the manipulation’s specificity and nomological shockwave can be estimated.

Many articles have debated the utility and validity of manipulation checks, with some scholars arguing for their exclusion ( Fayant, Sigall, Lemonnier, Retsin, & Alexopoulos, 2017 ; Sigall & Mills, 1998 ). Indeed, manipulation checks can have unintended consequences (e.g., drawing participants’ attention to deceptive elements of the experiment, interrupting naturally unfolding psychological processes). Minimally intrusive validation assessments are thus preferable to overt self-report scales ( Hauser et al., 2018 ). Although many such challenges remain with the use of manipulation checks, they are a necessary source of construct validity data that an empirical science cannot forego. Without manipulation checks, the validity of experimental manipulations would be asserted by weaker forms of validity (e.g., face validity), which provide deeply flawed footing when used as the sole basis for construct validity ( Grand, Ryan, Schmitt, & Hmurovic, 2010 ). In an ideal world, such manipulation checks would be validated according to best psychometric practices (see Flake et al., 2017 ). Without validated manipulation checks, it is uncertain what construct the given check is capturing. As such, an apparently ‘successful’ manipulation check could be an artifact of another construct entirely.

The Present Research

The present research pursued a central, descriptive aim related to construct validation practices for experimental manipulations in social psychology: to document the frequency with which manipulations were (I) acquired from previous research or newly created, (II) paired with a pilot validity study, and/or (III) paired with a manipulation check. It was impractical to estimate whether each manipulation that was acquired from previous research was adequately validated by that prior work, so we gave authors the benefit of the doubt and assumed that the research that they cited alongside their manipulations presented sufficient evidence of the manipulation’s construct validity. Based on findings from the present research, it is likely that many of these cited papers did not report sufficient evidence for the manipulation’s construct validity. Therefore, this is a relatively liberal criterion that probably overestimates the extent to which manipulations have been truly validated.

We focused on social psychology given its heavy reliance upon experimental manipulations, our membership in this field, and this field’s ongoing reckoning with replication issues that may result, in part, from experimental practices. We hope that other experimentally-focused fields such as cognitive and developmental psychology, economics, management, marketing, and neuroscience may glean insights into their own manipulation validation practices and standards from this investigation. Further, clinical and counseling psychologists might learn approaches to improving the construct validity of clinical trials, which are similar to experiments in many ways.

In addition to these descriptive analyses, we also empirically examined several important qualities of pilot validity studies and manipulation checks. There is only a sparse literature on these topics and we aimed to fill this gap in our understanding. Given the widespread evidence for publication bias in the field of psychology ( Head, Holman, Lanfear, Kahn, & Jennions, 2015 ), our primary goal in these analyses was to estimate the extent to which pilot and manipulation check effects are impacted by such biases. First, we tested the evidentiary value of these effects via p -curve analyses in order to estimate the extent to which pilot validity studies and manipulation checks capture ‘true’ underlying effects and are not merely the result of publication bias and questionable research practices ( Simonsohn, Nelson, & Simmons, 2014 ). Second, p -curve analyses estimated the statistical power of the reported pilot validity and check effects, allowing us to examine long-standing claims that pilot validity studies in social psychology are underpowered ( Albers & Lakens, 2018 ; Kraemer, Mintz, Noda, Tinklenberg, & Yesavage, 2006). Third, we employed conventional meta-analyses to estimate the average size and heterogeneity of pilot validity study and manipulation check effects, useful information for future power analyses. Fourth, these meta-analyses also estimated the presence of publication bias to establish the extent to which pilot validity studies and manipulation checks are selectively reported based on the favorability of their results.

Finally, we returned to our descriptive approach to examine the presence of suspicion probes in the literature. Given the crucial role of suspicion probes in much of social psychological experiments ( Blackhart et al., 2012 ; Nichols & Edlund, 2015 ), we examined whether manipulations were associated with a suspicion probe and whether suspicious participants were retained or excluded from analyses.

Open Science Statement

This project was intended to capture an exploratory snapshot of the literature and therefore no hypotheses were advanced a priori . The preregistration plan for the present research is publicly available online (original plan: https://osf.io/rtbwj ; amendment: https://osf.io/zvg3a ), as is the disclosure table of all included studies and their associated codes ( https://osf.io/je9xu/files/ ).

Literature Search Strategy

We conducted our literature search within a journal that is often reputed to be the flagship journal of experimental social psychology, JPSP . We limited our literature search to a single year of publication (as in Flake et al., 2017 ), selecting the year 2017 because it was recent enough to reflect current practices in the field. Our preregistration plan stated that we would examine volume 113 of JPSP , limiting our coding procedures to the two experimentally focused sections: Attitudes and Social Cognition ( ASC ) and Interpersonal Relations and Group Processes ( IRGP ). We excluded the Personality Processes and Individual Differences ( PPID ) section of JPSP due to its focus on measurement and not manipulation. However, we deviated from our preregistration plan by also including volume 112 in our analysis in order to increase our sample size and therefore our confidence in our findings.

Inclusion Criteria

We sought to first identify every experimental manipulation within the articles that fell within our literature search. In our initial preregistration plan, we defined experimental manipulations as “any systematic alteration of a study’s procedure meant to change a specific psychological construct.” However, this definition did not always provide clear guidance in many instances in which a systematically-altered aspect of a given study might or might not constitute an experimental manipulation. The ambiguity around many of these early decisions caused us to rapidly deem it impossible to implement this definition in any rigorous or objective manner. Instead, we revised our preregistration plan to follow two simple heuristics. First, we decided that a study aspect would be deemed an experimental manipulation if it was described by the authors as a ‘manipulation’. This approach lifted the burden of determining whether a given aspect of a study was a ‘true’ manipulation from the coders and instead allowed a given article’s authors, their peer reviewers, and editor to determine whether something could be accurately described as an experimental manipulation. Second, if participants were ‘randomly assigned’ to different treatments or conditions, this aspect of the study procedure would be considered an experimental manipulation, as random assignment is the core aspect of experimental manipulation ( Wilson et al., 2010 ). We deviated from our preregistration plans by deciding to exclude studies from our analyses that were not presented as part of the main sequence of hypothesis-testing studies in each paper (e.g., pilot studies). This deviation was motivated by the realization that pilot validity studies were often provided as the very sources of purported validity evidence we sought to identify for each paper’s main experiments, and therefore should be examined separately.

Coding Strategy

We coded every experimental manipulation for several criteria that either provided descriptive detail or spoke to the evidence put forward for the construct validity of the manipulation.

Coding process.

All manipulations were coded independently by the first and last author, who each possess considerable expertise and training in experimental social psychology, research methodology, and construct validation. The first and last authors met frequently throughout the coding process to identify coding discrepancies. Such discrepancies were reviewed by both authors until both authors agreed upon one coding outcome (as in Flake et al., 2017 ). Prior to such discrepancy reviews and meetings, the authors each created 459 codes of the nine key coded variables of our meta-analysis (e.g., whether a given study included a manipulation, how many manipulations were included in each study, whether a manipulation was paired with a manipulation check) from the first 11 articles in our literature review. In an exploratory fashion, we examined the inter-rater agreement in these initial codes (459 codes per rater × 2 raters = 918 codes; 102 codes per coded variable), which were uncontaminated because the authors had yet to meet and conduct a discrepancy review. These initial codes exhibited substantial inter-rater agreement across all coded variables, κ = .89. Inter-rater agreement estimates for each of the uncontaminated coded variables are presented below.
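
For readers unfamiliar with the agreement statistic reported throughout this section, here is a minimal sketch of unweighted Cohen’s kappa for two raters. The codes below are illustrative only, not the study’s data:

```python
def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters' categorical codes."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

# Illustrative codes only (not the study's data).
a = ["yes", "yes", "no", "no", "yes", "no"]
b = ["yes", "no",  "no", "no", "yes", "no"]
print(round(cohens_kappa(a, b), 2))  # -> 0.67
```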

Condition number and type.

Each manipulation was coded for the number of conditions it contained, κ = .94, and whether it was administered in a between- or within-participants fashion, κ = .92. Deviating from our preregistration plan, we also coded whether each of the between-participants manipulations was described as randomly assigning participants to each condition of the manipulation, κ = .63.

Use in prior research.

We coded each manipulation for whether the manipulation was paired with a citation that indicated the manipulation was acquired from previously published research, κ = .84. If this was not the case, we assumed that the manipulation was uniquely created for the given study. Manipulations that were acquired from prior publications were then coded for whether or not the authors stated that the manipulations were modified from the referenced version of the manipulation, κ = .75. Crucially, we did not code for or select manipulations based on whether that manipulation had been previously validated by the cited work. We refrained from doing so for two reasons. First, because each cited manipulation could have required a laborious search through a trail of citations in order to find evidence of validation. Second, because simply citing a paper in which the manipulation was previously used is likely an implicit argument that the manipulation has been validated by that work.

As a deviation from our preregistration plans, we also coded each manipulation for whether the manipulation’s construct validity was pilot tested. More specifically, we coded whether each manipulation was paired with any pilot validity studies that empirically tested the effect of the manipulation on the intended construct (i.e., tested the manipulation’s construct validity), κ = .91.

Each manipulation was coded for whether a manipulation check was employed, κ = .88. If such a check was employed, we coded the form of the manipulation check (e.g., self-report measure) and whether it was validated in previously published research or was created uniquely for the given study and not validated. We did not rely on authors to make this determination (i.e., we did not deem a measure a manipulation check simply because the authors of an article referred to it as such, and we did not exclude a measure from consideration as a manipulation check simply because the authors did not refer to it as a manipulation check). Instead, we defined a manipulation check as any measure of the construct that the given manipulation was intended to influence ( Hauser et al., 2018 ; Lench, Taylor, & Bench, 2014 ) and included any measure that met this criterion. This process therefore excluded instructional manipulation checks and other measures that authors deemed ‘manipulation checks’, but did not actually assess the construct that the manipulation was designed to alter (as in Lench et al., 2014 ). For each manipulation check we identified, we then coded the form that it took (e.g., self-report questionnaire) and the number of measurements that comprised it (e.g., the number of items in the questionnaire).

Suspicion probes.

We also coded for whether investigators assessed for participant suspicion of their manipulation, κ = .92. If such a suspicion probe was used, we coded the form that it took and whether participants who were deemed ‘suspicious’ were excluded from analyses, κ = .92.

Results

Volumes 112 and 113 of the ASC and IRGP sections of JPSP contained 58 articles. Four of these articles were excluded as they were meta-analyses or non-empirical, leaving 54 articles that summarized 355 independent studies. Of these studies, 244 (68.73%) presented at least one experimental manipulation for a total of 348 experimental manipulations acquired from 49 articles.

Manipulations Per Study

The majority of studies that contained experimental manipulations reported one (66.80%) or two (25.00%) manipulations, though there was considerable variability in the number of manipulations per study: M = 1.43, SD = 0.68, mode = 1, range = 1 – 4.

Conditions Per Manipulation

The majority of manipulations contained two (82.18%) or three (12.64%) conditions, though we observed wide variation in the number of conditions per manipulation: M = 2.30, SD = 0.98, mode = 2, range = 2 – 13.

Between- Versus Within-Participants Designs

The overwhelming majority of manipulations were conducted in a between-participants manner (94.54%), as opposed to a within-participants (5.46%) approach. Variability in the number of conditions was observed in both within- and between-participants manipulations. These frequencies are depicted in Figure 3 , an alluvial plot created with SankeyMATIC: https://github.com/nowthis/sankeymatic . Alluvial plots visually mimic the flow of rivers into an alluvial fan of smaller tributaries. These flowing figures depict how frequency distributions fall from left to right into a hierarchy of categories. In each plot, a full distribution originates on the left-hand side that then ‘flows’ to the right into different categories whose width is based on the proportion assigned to that initial category. These streams then flow into even more specific sub-categories based on their proportions in an additional category.

Figure 3. Alluvial plot of condition frequencies by condition type.

Manipulation Validation Practices

Of the manipulations, only a modest majority of 202 (58.04%) were accompanied by at least one of the following sources of purported validity evidence: a citation indicating that the manipulation was used in prior research, a pilot validity study, and/or a manipulation check (see Table 1 and Figure 4 for a breakdown of these statistics). Pilot validity study analyses were not preregistered and were therefore exploratory.

Figure 4. Alluvial plot depicting distributions of the types of purported validity evidence reported for each manipulation.

Table 1. Frequencies and percentages (in parentheses) of the number of manipulations that were presented alongside each type of purported validity evidence (i.e., a citation indicating published research that the manipulation had been acquired from, a pilot validity study, and/or a manipulation check measure).

|            | No Citation, Not Piloted | No Citation, Piloted | With Citation, Not Piloted | With Citation, Piloted |
|------------|--------------------------|----------------------|----------------------------|------------------------|
| No Check   | 146 (41.96%)             | 35 (10.06%)          | 36 (10.34%)                | 4 (1.15%)              |
| With Check | 63 (18.10%)              | 37 (10.63%)          | 26 (7.47%)                 | 1 (0.29%)              |

Citations from previous publications.

Of all manipulations, 67 (19.25%) were paired with a citation that indicated the manipulation was used in previously published research. Of these cited manipulations, 16 (23.88%) were described as being modified in some way from their original version. The majority of the remaining 51 cited manipulations were not described in a way in which it was clear whether they had been modified from the original citation or not. Therefore, the number of modified manipulations provided here may be an underestimate of their presence in the larger literature.

Manipulation checks.

Across all manipulations, 127 (36.49%) were accompanied by a manipulation check measure. These 127 manipulation checks took the form of self-report questionnaires ( n = 105; 82.68%), coded behavior ( n = 3; 2.36%), behavioral task performance ( n = 9; 7.09%), or an unspecified format ( n = 10; 7.87%; Figure 5 ). Of the 105 self-report manipulation check questionnaires, 68 (64.76%) consisted of a single item and the rest included a range of items: M = 1.68, SD = 1.27, range = 1 – 10 ( Figure 5 ).

Figure 5. Alluvial plot depicting distributions of the types of manipulation check measures reported for each manipulation and numbers of self-report items.

Suspicion Probes

Of all manipulations, only 31 (8.90%) were accompanied by a suspicion probe. Probing procedures were invariably described in vague terms (e.g., ‘a funnel interview’) and no experimenter scripts or sample materials were provided that gave any further detail. Of these probed manipulations, only five (16.10%) from two articles reported that they excluded ‘suspicious’ participants from analyses. The exact criteria that determined whether a participant was ‘suspicious’ were not provided in any of these cases, nor was the impact of excluding these participants estimated.

Exploratory Analyses

Random assignment.

We found that 205 (62.31%) of between-participants manipulations declared that participants were randomly assigned to conditions. No articles described the method they used to randomly assign participants.

Pilot validity study meta-analyses.

Pilot validity studies were reported as purported validity evidence for 77 (22.13%) of all manipulations. However, the majority of these studies either did not report inferential statistics, described the results too vaguely to identify the target effect, or were drawn from overlapping samples of participants. Often, the results of pilot validity studies were summarized in a qualitative fashion without accompanying inferential statistics or methodological details (e.g., “Pilot testing suggested that the effect … tended to be large”; Gill & Cerce, 2017 , p. 364). Based on the 15 pilot validity study effects that we could extract, p -curve analyses revealed that pilot validity studies exhibited remarkable evidentiary value and were statistically powered at 99% ( Figure 6 ).

Figure 6. Results of the p-curve analysis on pilot validity study effects.
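
The full p-curve method ( Simonsohn, Nelson, & Simmons, 2014 ) uses continuous tests of the distribution of significant p-values. As a rough intuition-builder only, here is a sketch of the simpler binomial variant, which asks whether significant p-values cluster below .025 more than a flat (null-effect) p-curve would predict; the p-values below are illustrative, not effects coded in this study:

```python
from math import comb

def pcurve_binomial(p_values):
    """Crude binomial p-curve: do significant p-values skew toward zero?"""
    sig = [p for p in p_values if p < 0.05]
    n = len(sig)
    k_low = sum(p < 0.025 for p in sig)   # lower half of the significant range
    # One-tailed binomial test against a flat p-curve: under a null effect,
    # p-values below .05 are uniform, so P(p < .025) = .5.
    p_binom = sum(comb(n, k) * 0.5 ** n for k in range(k_low, n + 1))
    return n, k_low, p_binom

# Illustrative p-values only.
print(pcurve_binomial([0.001, 0.003, 0.012, 0.020, 0.041, 0.049]))
```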

Exploratory random-effects meta-analyses on 14 of the Fisher’s Z -transformed pilot validity effects (one effect could not be translated into an effect size estimate) revealed an overall medium-to-large effect size, r = .46 [95% CI = .34, .59], SE = 0.06, Z = 7.28, p < .001, with significant underlying inter-study heterogeneity, Q (13) = 136.70, p < .001. The average sample size of these studies was N = 186.47, which explains the high statistical power we observed for such relatively strong effects. Little evidence was found for publication bias in pilot validity studies (see Supplemental Document 1 ).

Manipulation check meta-analyses.

Of the 127 manipulations with manipulation checks, six did not report the results of the manipulation check and 14 others reported incomplete inferential statistics (e.g., a range of p-values, no test statistics) such that it was difficult to verify the veracity of their claims. From these manipulation checks, 82 independent manipulation check effects were extracted and submitted to exploratory p -curve analyses, which revealed that manipulation checks exhibited remarkable evidentiary value and were statistically powered at 99% ( Figure 7 ).

Figure 7. Results of the p-curve analysis of manipulation check effects.

Exploratory random-effects meta-analyses on these Fisher’s Z -transformed manipulation check effects revealed an overall medium-to-large effect size, r = .55 [ 95% CI = .48, .62], SE = 0.03, Z = 16.31, p < .001, with significant underlying inter-study heterogeneity, Q (81) = 2,167.90, p < .001. The average sample size of these studies was N = 304.79, which explains the high statistical power we observed for such relatively strong effects. No evidence was found for publication bias (see Supplemental Document 1 ).
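
A minimal sketch of the kind of random-effects aggregation described above, using Fisher’s Z-transformed correlations and the DerSimonian-Laird estimator. The input effects are illustrative stand-ins, not the effects coded in this study:

```python
import math

def fisher_z(r):
    """Fisher's Z transform; its sampling variance is 1 / (n - 3)."""
    return math.atanh(r)

def random_effects_meta(rs, ns):
    """DerSimonian-Laird random-effects meta-analysis of correlations."""
    zs = [fisher_z(r) for r in rs]
    vs = [1.0 / (n - 3) for n in ns]              # within-study variances
    w = [1.0 / v for v in vs]                     # fixed-effect weights
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(zs) - 1)) / c)      # between-study variance
    w_re = [1.0 / (v + tau2) for v in vs]         # random-effects weights
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"r": math.tanh(z_re), "Q": q, "tau2": tau2, "se": se}

# Illustrative (r, n) pairs, one per study.
print(random_effects_meta([0.30, 0.50, 0.55], [120, 200, 240]))
```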

Internal consistency of manipulation checks.

Among the 37 manipulation checks that took the form of multiple item self-report scales, exact Cronbach’s alphas were provided for 18 (48.65%) of them and these estimates by-and-large exhibited sufficient internal consistency: M = .83, SD = .12, range = .49 – .98.
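
As a reference point for the internal consistency estimates above, a minimal sketch of Cronbach’s alpha computed from raw item scores; the data are illustrative:

```python
def variance(xs):
    """Sample variance (n - 1 denominator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(items):
    """items: one list of scores per item, all over the same respondents."""
    k = len(items)
    n_resp = len(items[0])
    total_scores = [sum(item[j] for item in items) for j in range(n_resp)]
    item_var_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(total_scores))

# Illustrative 3-item manipulation check, five respondents.
items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 3],
]
print(round(cronbach_alpha(items), 2))  # -> 0.89
```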

Validity of manipulation checks.

Crucially, only eight of all of the manipulation checks (6.30%) were accompanied by a citation indicating that the check was acquired from previous research. After reading the cited validity evidence for each case, only six (4.72%) manipulation checks actually met the criteria for established validation, taking the forms of the Need Threat Scale (NTS; Williams, 2009 ) and the Positive Affect Negative Affect Schedule (PANAS; Watson, Clark, & Tellegen, 1988 ).

Construct-valid measures in psychology are able to accurately capture the target construct and not extraneous variables (Borsboom et al., 2004; Cronbach & Meehl, 1955; Embretson, 1983; Strauss & Smith, 2009). Such construct validity is not limited to psychometrics but applies equally to experimental manipulations of psychological processes. Indeed, construct-valid manipulations must affect their intended construct in the intended way, and not exert their effect via confounding variables (Cook & Campbell, 1979). To better understand the current practices through which experimental social psychologists provide evidence that their manipulations possess construct validity, we examined published articles from the field's flagship journal: JPSP.

Chief among our findings was that approximately 42% of experimental manipulations were paired with no evidence of their underlying construct validity beyond face validity: no citations, no pilot validity testing, and no manipulation checks. Indeed, the most common approach in our review was to present no construct validity evidence whatsoever. To the extent that this estimate generalizes across the field, it suggests that social psychology's experimental foundations rest upon largely unknown ground rather than empirical adamant. In what follows, we highlight other key findings from each domain of our meta-analysis and provide recommendations for future practice, in the hope of improving the state of experimental psychological science.

Prevalence and Complexity of Experimental Manipulations

At first glance, we find that experimental manipulation is alive and well in social psychology. A little more than two-thirds of the studies we reviewed had at least one experimental manipulation. Suggesting a preference for simplicity, over 90% of studies with manipulations employed only one or two manipulations, and a similar proportion of manipulations contained only two or three conditions. This prevalence of relatively simple experimental designs is promising, as exceedingly complex designs (e.g., a 2 × 3 × 2 factorial design) undermine statistical power and inflate Type I and Type II error rates (Smith, Levine, Lachlan, & Fediuk, 2002).

Over 90% of manipulations were conducted in a between-participants manner, demonstrating a neglect of within-participants experimental designs. Within-participants designs maximize statistical power relative to between-participants designs (Aberson, 2019). As such, the over-reliance we observed on between-participants designs may undermine the overall power of findings from experimental social psychology. However, many manipulations may simply be impossible to present in a repeated-measures fashion without undermining their internal validity.

Random Assignment and the Lack of Detail in Descriptions of Manipulations

Of the between-participants manipulations, a considerable number (approximately two-fifths) failed to mention whether participants were randomly assigned to their experimental conditions. Given that random assignment is a necessary condition for a true experimental manipulation (Cook & Campbell, 1979; Wilson et al., 2010), every report of experimental results should state explicitly what assignment procedure was used to place participants in their given condition. Furthermore, none of the manipulations that did mention random assignment described precisely what procedure was used to randomize the assignment process. Without this information, it is impossible to know whether condition assignment was truly randomized or whether the randomization procedure introduced a systematic bias of some kind. Relatedly, we could not determine whether or how within-participants manipulations randomized the order of conditions across participants. Future research would benefit from examining the prevalence of these practices and their impact on the construct validity of within-participants manipulations.
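As one concrete way to meet this reporting standard, an assignment procedure can be implemented with a seeded random number generator and then described in a single sentence (generator, seed, and method). The sketch below is a hypothetical example of simple randomization, not a procedure drawn from any of the reviewed articles; the condition labels and seed are invented.

```python
# A fully reportable random-assignment procedure: simple randomization with a
# seeded generator, so the exact assignment sequence can be reproduced and
# audited. Condition labels and the seed are illustrative.
import numpy as np

def assign_conditions(n_participants: int, conditions: list[str], seed: int):
    # Report the generator (NumPy PCG64), the seed, and the method (simple
    # randomization) in the manuscript so the assignment is reproducible.
    rng = np.random.default_rng(seed)
    return rng.choice(conditions, size=n_participants).tolist()

assignments = assign_conditions(8, ["exclusion", "inclusion"], seed=20240115)
print(assignments)  # e.g., ['inclusion', 'exclusion', ...]
```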

This lack of information about random assignment reflected a much more general lack of basic information that authors provided about their manipulations. It was often the case that manuscripts did not even mention the validity information we sought. Pilot validity studies and manipulation checks were frequently described in a cursory fashion, absent necessary methodological detail and inferential statistics. More transparency is needed in order to evaluate each manipulation's validity and to enable researchers to replicate the procedure in their own labs. Towards this end, we have created a checklist of information that we hope peer reviewers will apply to new research in order to ensure that each manipulation, manipulation check, and pilot validity study is described in sufficient detail (Appendix A). We further encourage experimenters to use this checklist to adequately detail these important aspects of their experimental methodology.

Previously Used vs. ‘On The Fly’ Manipulations

Approximately 80% of manipulations were not acquired from previous research and were instead created ad hoc for a given study. This suggests that researchers rely heavily upon 'on the fly' manipulation (term adapted from Flake et al., 2017), in which ad hoc manipulations are routinely created from scratch to fit the parameters of a given study. The prevalence of this 'on the fly' manipulation is almost twice that of 'on the fly' measurement in social and personality psychology (~46%; Flake et al., 2017). This prevalence rate may be inflated by a tendency for authors simply to omit citations for manipulations that have, in fact, been implemented in prior publications. We encourage experimenters to cite publications that empirically examine the validity of their manipulations whenever they exist. These ad hoc procedures appear to acutely afflict experimental designs, and future work is needed to determine the reasons underlying this disproportionate practice.

The field's reliance on creating manipulations de novo is concerning. This practice entails that much time and many resources are spent creating new manipulations instead of implementing and improving upon existing, validated manipulations. The tendency towards 'on the fly' manipulation may reflect psychological science's bias towards novelty and away from replicating past research (Neuliep & Crandall, 1993), which has known adverse consequences (Open Science Collaboration, 2015). We therefore recommend that experimenters avoid 'on the fly' manipulation and instead employ existing, previously validated manipulations whenever possible (Recommendation 1), though we note that few such manipulations are likely to be available.

Of the relatively small number of manipulations that were acquired from previous research, roughly one-fourth were modified from their original form. This is likely an underestimate of modification rates, as none of the articles we coded explicitly stated that their manipulation was not modified in any way. As such, modification rates may be considerably higher. This practice has consequences: modifying a manipulation undermines its established validity, just as modifying a questionnaire often requires it to be re-validated (Flake et al., 2017). These issues are compounded when the original manipulation being modified was never validated in the first place. We therefore recommend that experimenters avoid modifying previously validated manipulations whenever possible (Recommendation 2A). When modification is unavoidable, we recommend that investigators re-validate the modified manipulation prior to implementation (Recommendation 2B).

We realize that Recommendations 1 and 2 are likely to be difficult to adhere to given the pessimistic nature of our findings. Indeed, it is difficult to avoid ‘on the fly’ manipulation development and modification when there are no validated versions of a given manipulation already in existence. However, we are optimistic that if experimenters begin to improve their validation practices, this will not be an issue for long. These recommendations are given with that bright future in mind.

Pilot Validity Testing

Approximately one in five manipulations were associated with a pilot validity study prior to implementation in hypothesis testing. This low adoption rate of pilot validity studies suggests that the practice of pilot validity testing is somewhat rare, which is problematic as such testing is a critical means of establishing the construct validity of a manipulation ( Ellsworth & Gonzalez, 2003 ; Wilson et al., 2010 ). Pilot validity testing has several advantages over simply including manipulation checks during hypothesis testing. First, pilot validity testing prevents unwanted effects of a manipulation check from intruding upon other aspects of the study ( Hauser et al., 2018 ). Second, pilot validity studies allow for changes to be made to the manipulation to optimize its effects before it is implemented. Pilot validity testing would further ensure that time and resources are not wasted on testing hypotheses with manipulations of unknown construct validity. We therefore recommend that experimenters conduct well-powered pilot validity studies for each manipulation prior to implementation in hypothesis testing (Recommendation 3A).

These relatively rare reports of pilot validity studies may have been artificially suppressed by the practice of not publishing pilot validity evidence ( Westlund & Stuart, 2017 ). However, all pilot validity evidence should be published alongside the later studies it was used to develop in order to transparently communicate the evidence for and against the validity of the given manipulation ( Asendorpf et al., 2013 ). Keeping pilot validity studies behind a veil may also reflect a broader culture that under-values this crucial phase of the manipulation validation process. Pilot validity studies should not be viewed as mere ‘dress rehearsals’ for the main event (i.e., hypothesis testing), but should be granted the same importance, resources, and time as the studies in which they are subsequently employed. Robust training, investment, and transparency in pilot validity testing will produce more valid manipulations and therefore, more valid experimental findings. We therefore recommend that the results of pilot validity studies should be published as validation articles (Recommendation 3B) and these validation articles should be accompanied by detailed protocols and stimuli needed to replicate the manipulation (Recommendation 3C).

On an optimistic note, meta-analyses revealed that pilot validity studies exhibited substantial evidentiary value and a robust meta-analytic effect size. These findings imply that researchers are conducting pilot validity tests that capture real and impactful effects and are not just capitalizing on sources of flexibility or variability. Little evidence of p -hacking ( Simonsohn et al., 2014 ) or publication bias were observed, suggesting that researchers are not simply selectively reporting their pilot validity data to artificially evince an underlying effect, nor are they merely submitting unsuccessful pilot validity studies to the ‘file drawer’ and cherry picking those that obtain effects. These meta-analyses also revealed that these studies were statistically powered to a maximal degree, arguing against characterizations of pilot validity studies as underpowered ( Albers & Lakens, 2018 ; Kraemer et al., 2006).

Manipulation Checks

Approximately one-third of manipulations were paired with a manipulation check measure. This estimate is much lower than those from other meta-analyses. Hauser and colleagues (2018) reported that 63% of articles in the Attitudes & Social Cognition section of 2016's JPSP included at least one manipulation check. Sigall and Mills (1998) reported that 68% of JPSP articles in 1998 reported an experimental manipulation. The differences in our estimates are likely due to our focus on the manipulation level, rather than the article level, which we employed because articles present multiple studies with multiple manipulations, and article-level analyses obscure these statistics. We also applied a strict definition of a manipulation check, whereas the authors of these other investigations may have counted any measure that the authors referred to as a 'manipulation check'. It is also possible that manipulation check prevalence rates have actually decreased in recent years, due to published critiques of manipulation checks (e.g., Fayant et al., 2017; Sigall & Mills, 1998).

A central issue with manipulation checks is that they intrude upon the experiment, calling participants’ attention and suspicion to the manipulation and subsequently to the construct under study ( Hauser et al. 2018 ). For instance, asking participants how rejected they felt may raise suspicions about the ball-tossing task they were just excluded from. Such effects can be manifold and insidious, causing participants to guess at the experimenters’ hypotheses, heighten their suspicion, change their thoughts or feelings by reflecting upon them, or change the nature of the manipulation itself ( Hauser et al., 2018 ). However, the concerns raised by these critiques are obviated if the manipulation check is administered during the pilot validation of the manipulation and excluded during implementation of the manipulation in hypothesis testing. We therefore recommend that experimenters administer manipulation checks during the pilot validity testing of each manipulation (Recommendation 4A) and post-pilot manipulation checks should only be administered if they do not negatively impact other aspects of the study (Recommendation 4B).

Pilot validity studies may differ substantially from the primary experiments that employ the manipulations they seek to validate. Indeed, the presence of other manipulations, measures, and environmental factors might lead a manipulation that exhibited evidence of possessing construct validity to no longer exert its 'established' effect on the target construct. When such differences occur between pilot validity studies and focal experiments, including a manipulation check in the focal experiment could establish whether these changes have affected the manipulation's construct validity. If there are legitimate concerns that including a manipulation check could negatively impact the validity of the manipulation, then experimenters could randomly assign participants to either receive the check or not in order to estimate the effect that the check has on the manipulation's hypothesized effects (assuming sufficient power to detect such effects).
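A minimal version of that design crosses the manipulation with the presence versus absence of the manipulation check and tests the interaction. The sketch below simulates such a design under invented values; it assumes the pandas and statsmodels libraries and is an illustration of the idea, not a procedure from the reviewed articles.

```python
# Sketch of the design described above: cross the manipulation with the
# presence vs. absence of a manipulation check and test whether the check
# moderates the manipulation's effect on the focal outcome (simulated data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 400
manipulation = rng.integers(0, 2, n)   # 0 = control, 1 = treatment
check_given = rng.integers(0, 2, n)    # 0 = no check, 1 = check administered
outcome = 0.5 * manipulation + rng.normal(size=n)  # no true moderation here

df = pd.DataFrame(dict(manipulation=manipulation,
                       check_given=check_given,
                       outcome=outcome))
model = smf.ols("outcome ~ manipulation * check_given", data=df).fit()
print(model.summary().tables[1])  # the interaction term tests the check's intrusion
```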

As with the manipulations themselves, the overwhelming majority of manipulation checks were created ad hoc for the given manipulation. The purported validity evidence provided for the manipulation checks was often simple face validity and, in some cases, a Cronbach's α. Many were single-item self-report measures. These forms of purported validity evidence are insufficient to establish the construct validity of a measure (Flake et al., 2017). Not knowing whether the check captured the latent construct of interest, or instead tapped into some other construct(s), renders any inferences drawn upon such measures theoretically compromised. We therefore recommend that experimenters validate the instruments they use as manipulation checks prior to use in pilot validity testing (Recommendation 4C). Requiring that manipulation checks be validated would entail a large-scale shift in the practices of experimental social psychologists, who would now often find themselves having to preempt new experiments with the task of creating and validating a new state measure. This would require a new emphasis on training in psychometrics, resources devoted to the manipulation check validation process, and rewards given to those who do so.

Meta-analyses revealed that manipulation checks exhibited evidentiary value and a robust meta-analytic effect size. Though these findings are promising indicators that the manipulations employed in these studies exerted true effects that these checks were able to capture, they cannot speak to the underlying construct validity of these manipulation effects. Indeed, just because manipulations are exerting some effect on their manipulation checks, these findings do not tell us whether the intended aspect of the manipulation exerted the observed effect or whether the manipulation checks measured the target construct. Manipulation check effects were also maximally statistically powered, which implies that manipulations are at least well powered enough to influence their intended constructs. As with pilot validity studies, there was no evidence for publication bias.

Only approximately one-tenth of manipulations assessed the extent to which participants were suspicious of the deceptive elements of the study. Though studies vary in the extent to which they are deceptive, almost all experimental manipulations entail some degree of deception in that participants are being influenced without their explicit awareness of the full nature and intent of the manipulation. As such, the majority of studies were unable to estimate the extent to which participants detected their manipulation procedures. Even fewer adequately described how suspicion was assessed, often referring vaguely to an experimenter interview or an open-ended survey question. No specific criteria were given for what delineated 'suspicious' from 'non-suspicious' participants, and only five studies excluded participants from the former group. Given that no well-validated, standardized suspicion assessment procedures exist and there is little in the way of data on what effect removing 'suspicious' participants from analyses might have on subsequent results (Blackhart et al., 2012), we do not make any recommendations in this domain. Much work is needed to establish the best practices of suspicion assessment and analysis.

Size and Duration of Manipulation Effects

Although many articles established the size of a manipulation's effect on the manipulation check, no manipulation checks repeatedly assessed any manipulation's effect in order to estimate the timecourse of these effects. The effect of a given experimental manipulation wanes over time (e.g., Zadro, Boland, & Richardson, 2006), and its timecourse is a critical element to determine for several reasons. First, experimenters need to know whether the manipulation's effect is still psychologically active at the time point at which they administer their outcome measures, and how strong it is at that timepoint. This would allow experimenters to identify an experimental 'sweet spot' when the manipulation's effect is strongest. Second, for ethical reasons it is crucial to ensure that the manipulation's effect has adequately decayed by the time the study has ended and participants are returned to the real world. This is especially important when the manipulated process is distressing or interferes with daily functioning (Miketta & Friese, 2019). We therefore recommend that, whenever possible, experimenters estimate the timecourse of their manipulation's effect by repeatedly administering manipulation checks during pilot validity testing (Recommendation 5).
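To illustrate Recommendation 5, the sketch below fits a simple exponential decay curve to hypothetical manipulation check means collected at several time points during pilot testing. The functional form, time points, and values are all invented; an exponential decay is one plausible model, not a prescription from the review.

```python
# Sketch of estimating a manipulation's timecourse from repeated manipulation
# checks administered during pilot testing (simulated means; an exponential
# decay model is one plausible functional form among others).
import numpy as np
from scipy.optimize import curve_fit

minutes = np.array([0, 2, 5, 10, 20, 30], dtype=float)
effect = np.array([0.92, 0.80, 0.61, 0.42, 0.20, 0.11])  # mean check difference vs. control

def decay(t, a, k):
    return a * np.exp(-k * t)

(a, k), _ = curve_fit(decay, minutes, effect, p0=(1.0, 0.1))
half_life = np.log(2) / k  # time for the effect to fall to half its initial size
print(f"initial effect = {a:.2f}, half-life ~ {half_life:.1f} minutes")
```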

Estimating the Nomological Shockwave via Discriminant Validity Checks

Across the manipulations we surveyed, construct validity was most often assessed (when it was assessed) by estimating the manipulation's effect on the construct that the manipulation was primarily intended to affect. However, a requisite of construct validity is discriminant validity, such that the given manipulation influences the target construct and not a different, confounding construct (Cronbach & Meehl, 1955). Absent this practice, 'successful' manipulation checks may obscure the possibility that although the manipulation influences the desired construct, it also impacts a related, non-targeted variable to a confounding degree. In this context, discriminant validity can be established by examining the manipulation's nomological shockwave (i.e., the manipulation's effect on other constructs that exist within the target construct's nomological network). This can be done by administering discriminant validity checks, which are measures of constructs within the target construct's nomological network. In its simplest form, the nomological shockwave can be empirically established by demonstrating that the manipulation's largest effect is upon the target construct and that it exerts progressively weaker and non-overlapping effects on theoretically-related constructs as a function of their proximity to the target construct in the nomological network. We therefore recommend that experimenters administer measures of theoretically related constructs in pilot testing (i.e., discriminant validity checks; Recommendation 6A) and that these are used to estimate the nomological shockwave of the manipulation (Recommendation 6B).
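In code, this crude comparison amounts to estimating the manipulation's effect on the target construct and on progressively more distal constructs, then inspecting the ordering of the effect sizes and their confidence intervals. The construct names and data below are hypothetical, purely to make the pattern concrete.

```python
# Sketch of a crude nomological-shockwave check: estimate the manipulation's
# effect on the target construct and on progressively more distal constructs,
# with 95% CIs via Fisher's Z (construct names and data are hypothetical).
import numpy as np

def r_with_ci(condition, measure):
    r = np.corrcoef(condition, measure)[0, 1]   # point-biserial r for a 0/1 condition
    z, se = np.arctanh(r), 1 / np.sqrt(len(condition) - 3)
    return r, np.tanh(z - 1.96 * se), np.tanh(z + 1.96 * se)

rng = np.random.default_rng(3)
n = 300
condition = rng.integers(0, 2, n)                # e.g., rejection manipulation (0/1)
target = 0.6 * condition + rng.normal(size=n)    # e.g., felt rejection
proximal = 0.3 * condition + rng.normal(size=n)  # e.g., negative affect
distal = 0.1 * condition + rng.normal(size=n)    # e.g., state self-esteem

for name, m in [("target", target), ("proximal", proximal), ("distal", distal)]:
    r, lo, hi = r_with_ci(condition, m)
    print(f"{name:8s} r = {r:.2f} [{lo:.2f}, {hi:.2f}]")
```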

Estimating the nomological shockwave by simply comparing effect sizes and their confidence intervals is admittedly a crude empirical approach. Inherently, the shockwave rests on the assumption that the manipulation exerts a causal effect on the target construct and that this target construct, in turn, exerts causal effects on the discriminant validity constructs by virtue of their latent associations. Ideally, causal models could test this sequence of effects, though such quantitative approaches are often limited in their abilities to do so (Fiedler, Schott, & Meiser, 2011). Future research is needed to understand the accuracy and utility of employing causal modeling to estimate nomological shockwaves.

Limitations and Future Directions

This project only examined articles from JPSP and did not include a wider array of publication outlets in social psychology. It may be that our assessment of validation practices would change had we cast a wider meta-analytic net. Future work should test whether our findings replicate in other journals and in other subfields of psychology. Other experimentally focused fields such as cognitive, developmental, and biological psychology may also vary in their approaches to the validation of their experimental manipulations. Future research is needed in these areas to see if this is the case. We also used subjective codes and definitions of the manipulation features that we coded, allowing for our own biases to have influenced our findings. We have made all of our codes publicly available so that interested parties might review them for such biases, modify the codes according to their own sensibilities, and examine the effect on our results. Indeed, we do not see our findings as conclusive; rather, we hope the coded dataset we have created will be a resource for other investigators to examine in the future.

Experimental manipulations are the methodological foundation of much of social psychology. Our meta-analytic review suggests that the construct validity of such manipulations rests on practices that could be improved. We have made recommendations for how to make such changes, which largely revolve around translating the validation approach taken towards personality questionnaires to experimental manipulations. This new model would entail that validated manipulations are used whenever available and when new manipulations are created, they are validated (i.e., pilot validated) prior to implementation in hypothesis testing. Validity would then be established by demonstrating that the manipulation has its strongest effect on the target construct and theoretically appropriate effects on the nomological network surrounding it. Adopting this model would mean a dramatic change in practices for most laboratories in experimental social psychology. The costs inherent in doing so should be counteracted by a rise in replicability and veridicality of the field’s findings. We hope that our assessment of the field’s practices is an important initial step in that direction.

Supplementary Material

Acknowledgments.

Research reported in this publication was supported by the National Institute on Alcohol Abuse and Alcoholism (NIAAA) of the National Institutes of Health under award number K01AA026647 (PI: Chester).

Appendix A. Peer Reviewer Manipulation Information Checklist

Below are pieces of information that should be included for research using experimental manipulations in psychology. If you don’t see them mentioned, consider requesting that the authors ensure that this information is explicitly stated in the manuscript.

  • The number of manipulations in each study.
  • The number of conditions in each manipulation.
  • The definition of the construct that each manipulation was intended to affect.
  • Whether each manipulation was administered between- or within-participants.
  • Whether random assignment (for between-participants designs) or counterbalancing (for within-participants designs) was used in each manipulation.
  • How random assignment or counterbalancing was conducted in each manipulation.
  • Whether each manipulation was acquired from previous research or newly-created for the study.
  • The pre-existing validity evidence for each manipulation that was acquired from previous research.
  • Whether each manipulation that was acquired from previous research was modified from the version of the manipulation detailed in the previous research.
  • The validity evidence for each manipulation that was modified from previous research.
  • Whether each manipulation was pilot tested prior to implementation.
  • The validity evidence for each measure employed in each pilot study.
  • The pilot validity evidence for each manipulation that was pilot tested.
  • The detailed Methods and Results of each pilot study.
  • Whether each manipulation was paired with a manipulation check that quantified the manipulation’s target construct.
  • The validity evidence for each manipulation check.
  • Whether each manipulation was paired with a discriminant validity check that quantified potentially confounding constructs.
  • The validity evidence for each discriminant validity check.
  • Whether deception-by-omission was used for each manipulation (i.e., facts about the manipulation were withheld from participants).
  • Whether deception-by-commission was used for each manipulation (i.e., untrue information about the manipulation was provided to participants).
  • Whether each deceptive manipulation was paired with a suspicion probe.
  • The methodological details of each suspicion probe.
  • The validity evidence for each suspicion probe.
  • How each suspicion probe was scored.
  • How participants were deemed to be suspicious or not for each suspicion probe.
  • How suspicious participants were handled (e.g., excluded from analysis, suspicion used as a covariate) in each manipulation study.
References

  • Aberson CL (2019). Applied power analysis for the behavioral sciences. Routledge.
  • AERA (American Educational Research Association), APA (American Psychological Association), & NCME (National Council on Measurement in Education). (2014). Standards for educational and psychological testing. American Educational Research Association.
  • Albers C, & Lakens D (2018). When power analyses based on pilot data are biased: Inaccurate effect size estimators and follow-up bias. Journal of Experimental Social Psychology, 74, 187–195.
  • Asendorpf JB, Conner M, De Fruyt F, De Houwer J, Denissen JJ, Fiedler K, … & Perugini M (2013). Recommendations for increasing replicability in psychology. European Journal of Personality, 27(2), 108–119.
  • Begg CB, & Mazumdar M (1994). Operating characteristics of a rank correlation test for publication bias. Biometrics, 1088–1101.
  • Benson J (1998). Developing a strong program of construct validation: A test anxiety example. Educational Measurement: Issues and Practice, 17(1), 10–17.
  • Berkowitz L, & Donnerstein E (1982). External validity is more than skin deep: Some answers to criticisms of laboratory experiments. American Psychologist, 37(3), 245–257.
  • Blackhart GC, Brown KE, Clark T, Pierce DL, & Shell K (2012). Assessing the adequacy of postexperimental inquiries in deception research and the factors that promote participant honesty. Behavior Research Methods, 44(1), 24–40.
  • Borsboom D, Mellenbergh GJ, & van Heerden J (2004). The concept of validity. Psychological Review, 111(4), 1061–1071.
  • Brewer MB (2000). Research design and issues of validity. In Reis HT & Judd CM (Eds.), Handbook of research: Methods in social and personality psychology (pp. 3–39). Cambridge University Press.
  • Campbell DT (1957). Factors relevant to the validity of experiments in social settings. Psychological Bulletin, 54(4), 297–312.
  • Campbell DT (1969). Prospective: Artifact and control. In Rosenthal R & Rosnow RL (Eds.), Artifact in behavioral research (pp. 351–382). Academic Press.
  • Chester DS, DeWall CN, & Pond RS (2016). The push of social pain: Does rejection's sting motivate subsequent social reconnection? Cognitive, Affective, & Behavioral Neuroscience, 16(3), 541–550.
  • Clark LA, & Watson D (2019). Constructing validity: New developments in creating objective measuring instruments. Psychological Assessment, 31(12), 1412.
  • Cook TD, & Campbell DT (1979). Quasi-experimentation: Design & analysis issues for field settings. Rand McNally.
  • Cronbach L, & Meehl P (1955). Construct validity in psychological tests. Psychological Bulletin, 52(4), 281–302.
  • Donate APG, Marques LM, Lapenta OM, Asthana MK, Amodio D, & Boggio PS (2017). Ostracism via virtual chat room: Effects on basic needs, anger and pain. PLoS One, 12(9), e0184215.
  • Duval S, & Tweedie R (2000). Trim and fill: A simple funnel-plot–based method of testing and adjusting for publication bias in meta-analysis. Biometrics, 56(2), 455–463.
  • Egger M, Smith GD, Schneider M, & Minder C (1997). Bias in meta-analysis detected by a simple, graphical test. British Medical Journal, 315(7109), 629.
  • Ellsworth PC, & Gonzalez R (2003). Questions and comparisons: Methods of research in social psychology. In Hogg M & Cooper J (Eds.), The Sage handbook of social psychology (pp. 24–42). Sage.
  • Embretson S (1983). Construct validity: Construct representation versus nomothetic span. Psychological Bulletin, 93(1), 179–197.
  • Fayant MP, Sigall H, Lemonnier A, Retsin E, & Alexopoulos T (2017). On the limitations of manipulation checks: An obstacle toward cumulative science. International Review of Social Psychology, 30(1), 125–130.
  • Fiedler K, Schott M, & Meiser T (2011). What mediation analysis can (not) do. Journal of Experimental Social Psychology, 47(6), 1231–1236.
  • Flake JK, & Fried EI (2019). Measurement schmeasurement: Questionable measurement practices and how to avoid them. Unpublished preprint available at https://psyarxiv.com/hs7wm/
  • Flake JK, Pek J, & Hehman E (2017). Construct validation in social and personality research: Current practice and recommendations. Social Psychological and Personality Science, 8(4), 370–378.
  • Garner W, Hake H, & Eriksen C (1956). Operationism and the concept of perception. Psychological Review, 63(3), 149–159.
  • Gill M, & Cerce S (2017). He never willed to have the will he has: Historicist narratives, "civilized" blame, and the need to distinguish two notions of free will. Journal of Personality and Social Psychology, 112(3), 361–382.
  • Grand JA, Ryan AM, Schmitt N, & Hmurovic J (2010). How far does stereotype threat reach? The potential detriment of face validity in cognitive ability testing. Human Performance, 24(1), 1–28.
  • Gray K (2017). How to map theory: Reliable methods are fruitless without rigorous theory. Perspectives on Psychological Science, 12(5), 731–741.
  • Haslam SA, & McGarty C (2004). Experimental design and causality in social psychological research. In Sansone C, Morf CC, & Panter AT (Eds.), Handbook of methods in social psychology (pp. 235–264). Sage.
  • Hauser DJ, Ellsworth PC, & Gonzalez R (2018). Are manipulation checks necessary? Frontiers in Psychology, 9.
  • Head ML, Holman L, Lanfear R, Kahn AT, & Jennions MD (2015). The extent and consequences of p-hacking in science. PLoS Biology, 13(3).
  • Highhouse S (2009). Designing experiments that generalize. Organizational Research Methods, 12(3), 554–566.
  • Lench HC, Taylor AB, & Bench SW (2014). An alternative approach to analysis of mental states in experimental social cognition research. Behavior Research Methods, 46(1), 215–228.
  • Lewin K (1939). Field theory and experiment in social psychology: Concepts and methods. American Journal of Sociology, 44(6), 868–896.
  • Loevinger J (1957). Objective tests as instruments of psychological theory. Psychological Reports, 3, 635–694.
  • Messick S (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50(9), 741–749.
  • Miketta S, & Friese M (2019). Debriefed but still troubled? About the (in)effectiveness of postexperimental debriefings after ego threat. Journal of Personality and Social Psychology, 117(2), 282–309.
  • Mook D (1983). In defense of external invalidity. American Psychologist, 38(4), 379–387.
  • Neuliep JW, & Crandall R (1993). Reviewer bias against replication research. Journal of Social Behavior and Personality, 8(6), 21–29.
  • Nichols AL, & Edlund JE (2015). Practicing what we preach (and sometimes study): Methodological issues in experimental laboratory research. Review of General Psychology, 19(2), 191–202.
  • Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), 253–267.
  • Oppenheimer DM, Meyvis T, & Davidenko N (2009). Instructional manipulation checks: Detecting satisficing to increase statistical power. Journal of Experimental Social Psychology, 45(4), 867–872.
  • Orwin RG (1983). A fail-safe N for effect size in meta-analysis. Journal of Educational Statistics, 8(2), 157–159.
  • Reichardt CS (2006). The principle of parallelism in the design of studies to estimate treatment effects. Psychological Methods, 11(1), 1–18.
  • Shadish WR, Cook TD, & Campbell DT (2002). Experimental and quasi-experimental designs for generalized causal inference. Houghton Mifflin.
  • Sigall H, & Mills J (1998). Measures of independent variables and mediators are useful in social psychology experiments: But are they necessary? Personality and Social Psychology Review, 2(3), 218–226.
  • Simms LJ (2008). Classical and modern methods of psychological scale construction. Social and Personality Psychology Compass, 2(1), 414–433.
  • Simonsohn U, Nelson LD, & Simmons JP (2014). P-curve: A key to the file-drawer. Journal of Experimental Psychology: General, 143(2), 534–547.
  • Smith GT (2005). On construct validity: Issues of method and measurement. Psychological Assessment, 17(4), 396–408.
  • Smith RA, Levine TR, Lachlan KA, & Fediuk TA (2002). The high cost of complexity in experimental design and data analysis: Type I and type II error rates in multiway ANOVA. Human Communication Research, 28(4), 515–530.
  • Strauss ME, & Smith GT (2009). Construct validity: Advances in theory and methodology. Annual Review of Clinical Psychology, 5, 1–25.
  • Watson D, Clark LA, & Tellegen A (1988). Development and validation of brief measures of positive and negative affect: The PANAS scales. Journal of Personality and Social Psychology, 54(6), 1063–1070.
  • Westlund E, & Stuart EA (2017). The nonuse, misuse, and proper use of pilot studies in experimental evaluation research. American Journal of Evaluation, 38(2), 246–261.
  • Williams KD (2009). Ostracism: Effects of being ignored and excluded. In Zanna M (Ed.), Advances in experimental social psychology (Vol. 41, pp. 279–314). Academic Press.
  • Williams K, Cheung C, & Choi W (2000). Cyberostracism: Effects of being ignored over the internet. Journal of Personality and Social Psychology, 79(5), 748–762.
  • Wilson TD, Aronson E, & Carlsmith K (2010). The art of laboratory experimentation. In Fiske ST, Gilbert DT, & Lindzey G (Eds.), Handbook of social psychology (Vol. 1, pp. 51–81). Wiley.
  • Zadro L, Boland C, & Richardson R (2006). How long does it last? The persistence of the effects of ostracism in the socially anxious. Journal of Experimental Social Psychology, 42(5), 692–697.
  • Zumbo BD, & Chan EKH (2014). Setting the stage for validity and validation in social, behavioral, and health sciences: Trends in validation practices. In Zumbo BD & Chan EKH (Eds.), Validity and validation in social, behavioral, and health sciences (pp. 3–8). Springer.

How the Experimental Method Works in Psychology



The experimental method is a type of research procedure that involves manipulating variables to determine if there is a cause-and-effect relationship. The results obtained through the experimental method are useful but do not prove with 100% certainty that a singular cause always creates a specific effect. Instead, they show the probability that a cause will or will not lead to a particular effect.

At a Glance

While there are many different research techniques available, the experimental method allows researchers to look at cause-and-effect relationships. Using the experimental method, researchers randomly assign participants to a control or experimental group and manipulate levels of an independent variable. If changes in the independent variable lead to changes in the dependent variable, it indicates there is likely a causal relationship between them.

What Is the Experimental Method in Psychology?

The experimental method involves manipulating one variable to determine if this causes changes in another variable. This method relies on controlled research methods and random assignment of study subjects to test a hypothesis.

For example, researchers may want to learn how different visual patterns may impact our perception. Or they might wonder whether certain actions can improve memory . Experiments are conducted on many behavioral topics.

The scientific method forms the basis of the experimental method. This is a process used to determine the relationship between two variables—in this case, to explain human behavior .

Positivism is also important in the experimental method. It refers to factual knowledge that is obtained through observation, which is considered to be trustworthy.

When using the experimental method, researchers first identify and define key variables. Then they formulate a hypothesis, manipulate the variables, and collect data on the results. Unrelated or irrelevant variables are carefully controlled to minimize the potential impact on the experiment outcome.

History of the Experimental Method

The idea of using experiments to better understand human psychology began toward the end of the nineteenth century. Wilhelm Wundt established the first formal psychology laboratory in 1879.

Wundt is often called the father of experimental psychology. He believed that experiments could help explain how psychology works, and used this approach to study consciousness .

Wundt championed the term "physiological psychology" for this hybrid of physiology and psychology, which examines how the body affects the mind.

Other early contributors to the development and evolution of experimental psychology as we know it today include:

  • Gustav Fechner (1801-1887), who helped develop procedures for measuring sensations according to the size of the stimulus
  • Hermann von Helmholtz (1821-1894), who analyzed philosophical assumptions through research in an attempt to arrive at scientific conclusions
  • Franz Brentano (1838-1917), who called for a combination of first-person and third-person research methods when studying psychology
  • Georg Elias Müller (1850-1934), who performed an early experiment on attitude which involved the sensory discrimination of weights and revealed how anticipation can affect this discrimination

Key Terms to Know

To understand how the experimental method works, it is important to know some key terms.

Dependent Variable

The dependent variable is the effect that the experimenter is measuring. If a researcher was investigating how sleep influences test scores, for example, the test scores would be the dependent variable.

Independent Variable

The independent variable is the variable that the experimenter manipulates. In the previous example, the amount of sleep an individual gets would be the independent variable.

Hypothesis

A hypothesis is a tentative statement or a guess about the possible relationship between two or more variables. In looking at how sleep influences test scores, the researcher might hypothesize that people who get more sleep will perform better on a math test the following day. The purpose of the experiment, then, is to either support or reject this hypothesis.

Operational Definitions

Operational definitions are necessary when performing an experiment. When we say that something is an independent or dependent variable, we must have a very clear and specific definition of the meaning and scope of that variable.
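To see how these terms fit together, here is a toy simulation of the sleep-and-test-scores example. All numbers are invented, and the code is only a sketch of how a manipulated independent variable, a measured dependent variable, and a hypothesis test relate; it is not from any actual study.

```python
# Toy illustration of the sleep-and-test-scores example: the independent
# variable (hours of sleep) is manipulated via random assignment, the
# dependent variable (test score) is measured, and the hypothesis is tested.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_per_group = 50

# Random assignment operationalizes the IV as 4 vs. 8 hours of sleep;
# the DV is operationalized as score on a standardized math test.
scores_4h = rng.normal(loc=70, scale=10, size=n_per_group)
scores_8h = rng.normal(loc=76, scale=10, size=n_per_group)

t, p = stats.ttest_ind(scores_8h, scores_4h)
print(f"t = {t:.2f}, p = {p:.4f}")  # supports or fails to support the hypothesis
```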

Extraneous Variables

Extraneous variables are other variables that may also affect the outcome of an experiment. Types of extraneous variables include participant variables, situational variables, demand characteristics, and experimenter effects. In some cases, researchers can take steps to control for extraneous variables.

Demand Characteristics

Demand characteristics are subtle hints that indicate what an experimenter is hoping to find in a psychology experiment. This can sometimes cause participants to alter their behavior, which can affect the results of the experiment.

Intervening Variables

Intervening variables are factors that can affect the relationship between two other variables. 

Confounding Variables

Confounding variables are variables that can affect the dependent variable, but that experimenters cannot control for. Confounding variables can make it difficult to determine if the effect was due to changes in the independent variable or if the confounding variable may have played a role.

The Experimental Process

Psychologists, like other scientists, use the scientific method when conducting an experiment. The scientific method is a set of procedures and principles that guide how scientists develop research questions, collect data, and come to conclusions.

The five basic steps of the experimental process are:

  • Identifying a problem to study
  • Devising the research protocol
  • Conducting the experiment
  • Analyzing the data collected
  • Sharing the findings (usually in writing or via presentation)

Most psychology students are expected to use the experimental method at some point in their academic careers. Learning how to conduct an experiment is important to understanding how psychologists prove and disprove theories in this field.

Types of Experiments

There are a few different types of experiments that researchers might use when studying psychology. Each has pros and cons depending on the participants being studied, the hypothesis, and the resources available to conduct the research.

Lab Experiments

Lab experiments are common in psychology because they allow experimenters more control over the variables. These experiments can also be easier for other researchers to replicate. The drawback of this research type is that what takes place in a lab is not always what takes place in the real world.

Field Experiments

Sometimes researchers opt to conduct their experiments in the field. For example, a social psychologist interested in researching prosocial behavior might have a person pretend to faint and observe how long it takes onlookers to respond.

This type of experiment can be a great way to see behavioral responses in realistic settings. But it is more difficult for researchers to control the many variables existing in these settings that could potentially influence the experiment's results.

Quasi-Experiments

While lab experiments are known as true experiments, researchers can also utilize a quasi-experiment. Quasi-experiments are often referred to as natural experiments because the researchers do not have true control over the independent variable.

A researcher looking at personality differences and birth order, for example, is not able to manipulate the independent variable in the situation (personality traits). Participants also cannot be randomly assigned because they naturally fall into pre-existing groups based on their birth order.

So why would a researcher use a quasi-experiment? This is a good choice in situations where scientists are interested in studying phenomena in natural, real-world settings. It's also beneficial if there are limits on research funds or time.

Field experiments can be either quasi-experiments or true experiments.

Examples of the Experimental Method in Use

The experimental method can provide insight into human thoughts and behaviors. Researchers use experiments to study many aspects of psychology.

A 2019 study investigated whether splitting attention between electronic devices and classroom lectures had an effect on college students' learning abilities. It found that dividing attention between these two mediums did not affect lecture comprehension. However, it did impact long-term retention of the lecture information, which affected students' exam performance.

An experiment used participants' eye movements and electroencephalogram (EEG) data to better understand cognitive processing differences between experts and novices. It found that experts had higher power in their theta brain waves than novices, suggesting that they also had a higher cognitive load.

A study looked at whether chatting online with a computer via a chatbot changed the positive effects of emotional disclosure often received when talking with an actual human. It found that the effects were the same in both cases.

One experimental study evaluated whether exercise timing impacts information recall. It found that engaging in exercise prior to performing a memory task helped improve participants' short-term memory abilities.

Sometimes researchers use the experimental method to get a bigger-picture view of psychological behaviors and impacts. For example, one 2018 study examined several lab experiments to learn more about the impact of various environmental factors on building occupant perceptions.

A 2020 study set out to determine the role that sensation-seeking plays in political violence. This research found that sensation-seeking individuals have a higher propensity for engaging in political violence. It also found that providing access to a more peaceful, yet still exciting political group helps reduce this effect.

Potential Pitfalls of the Experimental Method

While the experimental method can be a valuable tool for learning more about psychology and its impacts, it also comes with a few pitfalls.

Experiments may produce artificial results, which are difficult to apply to real-world situations. Similarly, researcher bias can impact the data collected. Results may not be reproducible, meaning they have low reliability.

Since humans are unpredictable and their behavior can be subjective, it can be hard to measure responses in an experiment. In addition, political pressure may alter the results. The subjects may not be a good representation of the population, or groups used may not be comparable.

And finally, since researchers are human too, results may be degraded due to human error.

What This Means For You

Every psychological research method has its pros and cons. The experimental method can help establish cause and effect, and it's also beneficial when research funds are limited or time is of the essence.

At the same time, it's essential to be aware of this method's pitfalls, such as how biases can affect the results or the potential for low reliability. Keeping these in mind can help you review and assess research studies more accurately, giving you a better idea of whether the results can be trusted or have limitations.

Colorado State University. Experimental and quasi-experimental research .

American Psychological Association. Experimental psychology studies human and animals .

Mayrhofer R, Kuhbandner C, Lindner C. The practice of experimental psychology: An inevitably postmodern endeavor . Front Psychol . 2021;11:612805. doi:10.3389/fpsyg.2020.612805

Mandler G. A History of Modern Experimental Psychology .

Stanford University. Wilhelm Maximilian Wundt . Stanford Encyclopedia of Philosophy.

Britannica. Gustav Fechner .

Britannica. Hermann von Helmholtz .

Meyer A, Hackert B, Weger U. Franz Brentano and the beginning of experimental psychology: implications for the study of psychological phenomena today . Psychol Res . 2018;82:245-254. doi:10.1007/s00426-016-0825-7

Britannica. Georg Elias Müller .

McCambridge J, de Bruin M, Witton J.  The effects of demand characteristics on research participant behaviours in non-laboratory settings: A systematic review .  PLoS ONE . 2012;7(6):e39116. doi:10.1371/journal.pone.0039116

Laboratory experiments . In: The Sage Encyclopedia of Communication Research Methods. Allen M, ed. SAGE Publications, Inc. doi:10.4135/9781483381411.n287

Schweizer M, Braun B, Milstone A. Research methods in healthcare epidemiology and antimicrobial stewardship — quasi-experimental designs . Infect Control Hosp Epidemiol . 2016;37(10):1135-1140. doi:10.1017/ice.2016.117

Glass A, Kang M. Dividing attention in the classroom reduces exam performance . Educ Psychol . 2019;39(3):395-408. doi:10.1080/01443410.2018.1489046

Keskin M, Ooms K, Dogru AO, De Maeyer P. Exploring the cognitive load of expert and novice map users using EEG and eye tracking . ISPRS Int J Geo-Inf . 2020;9(7):429. doi:10.3390/ijgi9070429

Ho A, Hancock J, Miner A. Psychological, relational, and emotional effects of self-disclosure after conversations with a chatbot . J Commun . 2018;68(4):712-733. doi:10.1093/joc/jqy026

Haynes IV J, Frith E, Sng E, Loprinzi P. Experimental effects of acute exercise on episodic memory function: Considerations for the timing of exercise . Psychol Rep . 2018;122(5):1744-1754. doi:10.1177/0033294118786688

Torresin S, Pernigotto G, Cappelletti F, Gasparella A. Combined effects of environmental factors on human perception and objective performance: A review of experimental laboratory works . Indoor Air . 2018;28(4):525-538. doi:10.1111/ina.12457

Schumpe BM, Belanger JJ, Moyano M, Nisa CF. The role of sensation seeking in political violence: An extension of the significance quest theory . J Personal Social Psychol . 2020;118(4):743-761. doi:10.1037/pspp0000223

By Kendra Cherry, MSEd Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

Research Design


Unlike a descriptive study, an experiment is a study in which a treatment, procedure, or program is intentionally introduced and a result or outcome is observed. The American Heritage Dictionary of the English Language defines an experiment as “A test under controlled conditions that is made to demonstrate a known truth, to examine the validity of a hypothesis, or to determine the efficacy of something previously untried.”

True experiments have four elements: manipulation, control, random assignment, and random selection. The most important of these elements are manipulation and control. Manipulation means that something is purposefully changed by the researcher in the environment. Control is used to prevent outside factors from influencing the study outcome. When something is manipulated and controlled and then the outcome happens, it makes us more confident that the manipulation "caused" the outcome. In addition, experiments involve highly controlled and systematic procedures in an effort to minimize error and bias, which also increases our confidence that the manipulation "caused" the outcome.

Another key element of a true experiment is random assignment. Random assignment means that if there are groups or treatments in the experiment, participants are assigned to these groups or treatments by chance, or randomly (like the flip of a coin). This means that no matter who the participant is, he or she has an equal chance of getting into any of the groups or treatments in an experiment. This process helps to ensure that the groups or treatments are similar at the beginning of the study, so that there is more confidence that the manipulation (group or treatment) "caused" the outcome. More information about random assignment may be found in a later section.


Experimental Studies and Observational Studies


Martin Pinquart


Synonyms

Experimental studies: Experiments, Randomized controlled trials (RCTs); Observational studies: Non-experimental studies, Non-manipulation studies, Naturalistic studies

Definitions

The experimental study is a powerful methodology for testing causal relations between one or more explanatory variables (i.e., independent variables) and one or more outcome variables (i.e., dependent variables). In order to accomplish this goal, experiments have to meet three basic criteria: (a) experimental manipulation (variation) of the independent variable(s); (b) randomization, in which participants are randomly assigned to one of the experimental conditions; and (c) experimental control for the effects of third variables, by eliminating them or keeping them constant.

In observational studies, investigators observe or assess individuals without manipulation or intervention. Observational studies are used for assessing the mean levels, the natural variation, and the structure of variables, as well as...



Cite this entry:

Pinquart, M. (2021). Experimental Studies and Observational Studies. In: Gu, D., Dupre, M.E. (eds) Encyclopedia of Gerontology and Population Aging. Springer, Cham. https://doi.org/10.1007/978-3-030-22009-9_573


Controlled Experiment

Saul McLeod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul McLeod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

A controlled experiment is an investigation in which a hypothesis is scientifically tested.

In a controlled experiment, an independent variable (the cause) is systematically manipulated, and the dependent variable (the effect) is measured; any extraneous variables are controlled.

The researcher can operationalize (i.e., define) the studied variables so they can be objectively measured. The quantitative data can be analyzed to see if there is a difference between the experimental and control groups.
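As a rough illustration of this comparison, the following sketch (assuming Python with the scipy library is available) operationalizes the dependent variable as a numeric score and compares the experimental and control groups; all scores are invented:

```python
# A minimal sketch: compare an experimental group with a control group
# using an independent-samples t-test. Scores are invented for illustration.
from scipy import stats

experimental = [12, 14, 11, 15, 13, 16, 12, 14]  # scores with the manipulation
control = [10, 11, 9, 12, 10, 11, 10, 9]         # scores without it

t, p = stats.ttest_ind(experimental, control)
print(f"t = {t:.2f}, p = {p:.4f}")  # a small p-value suggests a genuine group difference
```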


What is the control group?

In experiments, scientists compare a control group and an experimental group that are identical in all respects except for one difference: experimental manipulation.

Unlike the experimental group, the control group is not exposed to the independent variable under investigation and so provides a baseline against which any changes in the experimental group can be compared.

Since experimental manipulation is the only difference between the experimental and control groups, we can be confident that any differences between the two are due to the experimental manipulation rather than to chance or other factors.

Randomly allocating participants to independent variable groups means that all participants should have an equal chance of participating in each condition.

The principle of random allocation is to avoid bias in how the experiment is carried out and limit the effects of participant variables.


What are extraneous variables?

The researcher wants to ensure that it is the manipulation of the independent variable that has caused the changes in the dependent variable.

Hence, all other variables that could cause the dependent variable to change must be controlled. These other variables are called extraneous or confounding variables.

Extraneous variables should be controlled where possible, as they might be important enough to provide alternative explanations for the effects.


In practice, it would be difficult to control all the variables influencing a child’s educational achievement. For example, it would be difficult to control variables that have happened in the past.

A researcher can only control the current environment of participants, such as time of day and noise levels.


Why conduct controlled experiments?

Scientists use controlled experiments because they allow for precise control of extraneous and independent variables. This allows a cause-and-effect relationship to be established.

Controlled experiments also follow a standardized step-by-step procedure. This makes it easy for another researcher to replicate the study.

Key Terminology

Experimental group

The group being treated or otherwise manipulated for the sake of the experiment.

Control Group

The group that receives no treatment; it is used as a comparison group.

Ecological validity

The degree to which an investigation represents real-life experiences.

Experimenter effects

These are the ways that the experimenter can accidentally influence the participant through their appearance or behavior.

Demand characteristics

The clues in an experiment that lead participants to think they know what the researcher is looking for (e.g., the experimenter’s body language).

Independent variable (IV)

The variable the experimenter manipulates (i.e., changes); it is assumed to have a direct effect on the dependent variable.

Dependent variable (DV)

Variable the experimenter measures. This is the outcome (i.e., the result) of a study.

Extraneous variables (EV)

All variables that are not independent variables but could affect the results (DV) of the experiment. Extraneous variables should be controlled where possible.

Confounding variables

Variable(s) that have affected the results (DV), apart from the IV. A confounding variable could be an extraneous variable that has not been controlled.

Random Allocation

Randomly allocating participants to independent variable conditions means that all participants should have an equal chance of participating in each condition.

Order effects

Changes in participants’ performance due to their repeating the same or similar test more than once; a common remedy, counterbalancing, is sketched after this list. Examples of order effects include:

(i) practice effect: an improvement in performance on a task due to repetition, for example, because of familiarity with the task;

(ii) fatigue effect: a decrease in performance of a task due to repetition, for example, because of boredom or tiredness.
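A minimal Python sketch of counterbalancing, assuming two hypothetical tasks and six hypothetical participants; rotating task order across participants spreads practice and fatigue effects evenly over conditions:

```python
# A minimal sketch of counterbalancing: rotate task order across
# participants so no single order dominates a condition.
from itertools import permutations

tasks = ["A", "B"]                  # hypothetical tasks
orders = list(permutations(tasks))  # [("A", "B"), ("B", "A")]

participants = [f"P{i}" for i in range(1, 7)]  # hypothetical participant IDs
for i, person in enumerate(participants):
    order = orders[i % len(orders)]  # alternate the two orders
    print(person, "->", " then ".join(order))
```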

What is the control in an experiment?

In an experiment, the control is a standard or baseline group not exposed to the experimental treatment or manipulation. It serves as a comparison group to the experimental group, which does receive the treatment or manipulation.

The control group helps to account for other variables that might influence the outcome, allowing researchers to attribute differences in results more confidently to the experimental treatment.

This comparison is critical for establishing a cause-and-effect relationship between the manipulated variable (independent variable) and the outcome (dependent variable).

What is the purpose of controlling the environment when testing a hypothesis?

Controlling the environment when testing a hypothesis aims to eliminate or minimize the influence of extraneous variables, that is, variables other than the independent variable that might affect the dependent variable and potentially confound the results.

By controlling the environment, researchers can ensure that any observed changes in the dependent variable are likely due to the manipulation of the independent variable, not other factors.

This enhances the experiment’s validity, allowing for more accurate conclusions about cause-and-effect relationships.

It also improves the experiment’s replicability, meaning other researchers can repeat the experiment under the same conditions to verify the results.

Why are hypotheses important to controlled experiments?

Hypotheses are crucial to controlled experiments because they provide a clear focus and direction for the research. A hypothesis is a testable prediction about the relationship between variables.

It guides the design of the experiment, including what variables to manipulate (independent variables) and what outcomes to measure (dependent variables).

The experiment is then conducted to test the validity of the hypothesis. If the results align with the hypothesis, they provide evidence supporting it.

The hypothesis may be revised or rejected if the results do not align. Thus, hypotheses are central to the scientific method, driving the iterative inquiry, experimentation, and knowledge advancement process.

What is the experimental method?

The experimental method is a systematic approach in scientific research where an independent variable is manipulated to observe its effect on a dependent variable, under controlled conditions.



Experimental Research

23 Experiment Basics

Learning Objectives

  • Explain what an experiment is and recognize examples of studies that are experiments and studies that are not experiments.
  • Distinguish between the manipulation of the independent variable and control of extraneous variables and explain the importance of each.
  • Recognize examples of confounding variables and explain how they affect the internal validity of a study.
  • Define what a control condition is, explain its purpose in research on treatment effectiveness, and describe some alternative types of control conditions.

What Is an Experiment?

As we saw earlier in the book, an  experiment is a type of study designed specifically to answer the question of whether there is a causal relationship between two variables. In other words, whether changes in one variable (referred to as an independent variable ) cause a change in another variable (referred to as a dependent variable ). Experiments have two fundamental features. The first is that the researchers manipulate, or systematically vary, the level of the independent variable. The different levels of the independent variable are called conditions . For example, in Darley and Latané’s experiment, the independent variable was the number of witnesses that participants believed to be present. The researchers manipulated this independent variable by telling participants that there were either one, two, or five other students involved in the discussion, thereby creating three conditions. For a new researcher, it is easy to confuse these terms by believing there are three independent variables in this situation: one, two, or five students involved in the discussion, but there is actually only one independent variable (number of witnesses) with three different levels or conditions (one, two or five students). The second fundamental feature of an experiment is that the researcher exerts control over, or minimizes the variability in, variables other than the independent and dependent variable. These other variables are called extraneous variables . Darley and Latané tested all their participants in the same room, exposed them to the same emergency situation, and so on. They also randomly assigned their participants to conditions so that the three groups would be similar to each other to begin with. Notice that although the words  manipulation  and  control  have similar meanings in everyday language, researchers make a clear distinction between them. They manipulate  the independent variable by systematically changing its levels and control  other variables by holding them constant.

Manipulation of the Independent Variable

Again, to  manipulate an independent variable means to change its level systematically so that different groups of participants are exposed to different levels of that variable, or the same group of participants is exposed to different levels at different times. For example, to see whether expressive writing affects people’s health, a researcher might instruct some participants to write about traumatic experiences and others to write about neutral experiences. The different levels of the independent variable are referred to as conditions , and researchers often give the conditions short descriptive names to make it easy to talk and write about them. In this case, the conditions might be called the “traumatic condition” and the “neutral condition.”

Notice that the manipulation of an independent variable must involve the active intervention of the researcher. Comparing groups of people who differ on the independent variable before the study begins is not the same as manipulating that variable. For example, a researcher who compares the health of people who already keep a journal with the health of people who do not keep a journal has not manipulated this variable and therefore has not conducted an experiment. This distinction  is important because groups that already differ in one way at the beginning of a study are likely to differ in other ways too. For example, people who choose to keep journals might also be more conscientious, more introverted, or less stressed than people who do not. Therefore, any observed difference between the two groups in terms of their health might have been caused by whether or not they keep a journal, or it might have been caused by any of the other differences between people who do and do not keep journals. Thus the active manipulation of the independent variable is crucial for eliminating potential alternative explanations for the results.

Of course, there are many situations in which the independent variable cannot be manipulated for practical or ethical reasons and therefore an experiment is not possible. For example, whether or not people have a significant early illness experience cannot be manipulated, making it impossible to conduct an experiment on the effect of early illness experiences on the development of hypochondriasis. This caveat does not mean it is impossible to study the relationship between early illness experiences and hypochondriasis—only that it must be done using nonexperimental approaches. We will discuss this type of methodology in detail later in the book.

Independent variables can be manipulated to create two conditions, and experiments involving a single independent variable with two conditions are often referred to as a single factor two-level design. However, sometimes greater insights can be gained by adding more conditions to an experiment. When an experiment has one independent variable that is manipulated to produce more than two conditions, it is referred to as a single factor multi-level design. So rather than comparing a condition in which there was one witness to a condition in which there were five witnesses (which would represent a single-factor two-level design), Darley and Latané’s experiment used a single factor multi-level design, by manipulating the independent variable to produce three conditions (a one witness, a two witnesses, and a five witnesses condition).

Control of Extraneous Variables

As we have seen previously in the chapter, an  extraneous variable  is anything that varies in the context of a study other than the independent and dependent variables. In an experiment on the effect of expressive writing on health, for example, extraneous variables would include participant variables (individual differences) such as their writing ability, their diet, and their gender. They would also include situational or task variables such as the time of day when participants write, whether they write by hand or on a computer, and the weather. Extraneous variables pose a problem because many of them are likely to have some effect on the dependent variable. For example, participants’ health will be affected by many things other than whether or not they engage in expressive writing. This influencing factor can make it difficult to separate the effect of the independent variable from the effects of the extraneous variables, which is why it is important to control extraneous variables by holding them constant.

Extraneous Variables as “Noise”

Extraneous variables make it difficult to detect the effect of the independent variable in two ways. One is by adding variability or “noise” to the data. Imagine a simple experiment on the effect of mood (happy vs. sad) on the number of happy childhood events people are able to recall. Participants are put into a negative or positive mood (by showing them a happy or sad video clip) and then asked to recall as many happy childhood events as they can. The two leftmost columns of  Table 5.1 show what the data might look like if there were no extraneous variables and the number of happy childhood events participants recalled was affected only by their moods. Every participant in the happy mood condition recalled exactly four happy childhood events, and every participant in the sad mood condition recalled exactly three. The effect of mood here is quite obvious. In reality, however, the data would probably look more like those in the two rightmost columns of  Table 5.1 . Even in the happy mood condition, some participants would recall fewer happy memories because they have fewer to draw on, use less effective recall strategies, or are less motivated. And even in the sad mood condition, some participants would recall more happy childhood memories because they have more happy memories to draw on, they use more effective recall strategies, or they are more motivated. Although the mean difference between the two groups is the same as in the idealized data, this difference is much less obvious in the context of the greater variability in the data. Thus one reason researchers try to control extraneous variables is so their data look more like the idealized data in  Table 5.1 , which makes the effect of the independent variable easier to detect (although real data never look quite  that  good).

Table 5.1 Hypothetical data on the effect of mood on memory: idealized data (left two columns) and realistic data (right two columns)

Idealized data            Realistic data
Happy mood   Sad mood     Happy mood   Sad mood
4            3            3            1
4            3            6            3
4            3            2            4
4            3            4            0
4            3            5            5
4            3            2            7
4            3            3            2
4            3            1            5
4            3            6            1
4            3            8            2
M = 4        M = 3        M = 4        M = 3
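A short Python sketch (values taken from Table 5.1) makes the same point numerically: the mean difference is identical in both data sets, but the realistic columns have far more spread, which is what obscures the effect:

```python
# Compare the idealized and realistic data from Table 5.1: same mean
# difference, very different variability ("noise").
from statistics import mean, stdev

ideal_happy, ideal_sad = [4] * 10, [3] * 10
real_happy = [3, 6, 2, 4, 5, 2, 3, 1, 6, 8]
real_sad = [1, 3, 4, 0, 5, 7, 2, 5, 1, 2]

for label, happy, sad in [("idealized", ideal_happy, ideal_sad),
                          ("realistic", real_happy, real_sad)]:
    diff = mean(happy) - mean(sad)  # 1.0 in both data sets
    print(f"{label}: mean difference = {diff}, "
          f"SD happy = {stdev(happy):.2f}, SD sad = {stdev(sad):.2f}")
```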

One way to control extraneous variables is to hold them constant. This technique can mean holding situation or task variables constant by testing all participants in the same location, giving them identical instructions, treating them in the same way, and so on. It can also mean holding participant variables constant. For example, many studies of language limit participants to right-handed people, who generally have their language areas isolated in their left cerebral hemispheres [1] . Left-handed people are more likely to have their language areas isolated in their right cerebral hemispheres or distributed across both hemispheres, which can change the way they process language and thereby add noise to the data.

In principle, researchers can control extraneous variables by limiting participants to one very specific category of person, such as 20-year-old, heterosexual, female, right-handed psychology majors. The obvious downside to this approach is that it would lower the external validity of the study—in particular, the extent to which the results can be generalized beyond the people actually studied. For example, it might be unclear whether results obtained with a sample of younger lesbian women would apply to older gay men. In many situations, the advantages of a diverse sample (increased external validity) outweigh the reduction in noise achieved by a homogeneous one.

Extraneous Variables as Confounding Variables

The second way that extraneous variables can make it difficult to detect the effect of the independent variable is by becoming confounding variables. A confounding variable  is an extraneous variable that differs on average across  levels of the independent variable (i.e., it is an extraneous variable that varies systematically with the independent variable). For example, in almost all experiments, participants’ intelligence quotients (IQs) will be an extraneous variable. But as long as there are participants with lower and higher IQs in each condition so that the average IQ is roughly equal across the conditions, then this variation is probably acceptable (and may even be desirable). What would be bad, however, would be for participants in one condition to have substantially lower IQs on average and participants in another condition to have substantially higher IQs on average. In this case, IQ would be a confounding variable.

To confound means to confuse , and this effect is exactly why confounding variables are undesirable. Because they differ systematically across conditions—just like the independent variable—they provide an alternative explanation for any observed difference in the dependent variable.  Figure 5.1  shows the results of a hypothetical study, in which participants in a positive mood condition scored higher on a memory task than participants in a negative mood condition. But if IQ is a confounding variable—with participants in the positive mood condition having higher IQs on average than participants in the negative mood condition—then it is unclear whether it was the positive moods or the higher IQs that caused participants in the first condition to score higher. One way to avoid confounding variables is by holding extraneous variables constant. For example, one could prevent IQ from becoming a confounding variable by limiting participants only to those with IQs of exactly 100. But this approach is not always desirable for reasons we have already discussed. A second and much more general approach—random assignment to conditions—will be discussed in detail shortly.
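A simple balance check captures this idea. In the hypothetical Python sketch below, the IQ scores are invented; roughly equal means across conditions suggest IQ is only noise, while a large gap would mark it as a confound:

```python
# A minimal balance check, assuming (invented) IQ scores were recorded
# for each participant in each mood condition.
from statistics import mean

iq_by_condition = {
    "positive_mood": [104, 98, 110, 102, 95, 108],
    "negative_mood": [101, 99, 97, 105, 100, 96],
}
for condition, scores in iq_by_condition.items():
    print(condition, "mean IQ:", round(mean(scores), 1))
# Roughly equal means suggest IQ is mere noise; a large gap would make IQ
# a confounding variable and an alternative explanation for the results.
```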

Figure 5.1 Hypothetical Results From a Study on the Effect of Mood on Memory. Because IQ also differs across conditions, it is a confounding variable.

Treatment and Control Conditions

In psychological research, a treatment is any intervention meant to change people’s behavior for the better. This intervention includes psychotherapies and medical treatments for psychological disorders but also interventions designed to improve learning, promote conservation, reduce prejudice, and so on. To determine whether a treatment works, participants are randomly assigned to either a treatment condition , in which they receive the treatment, or a control condition , in which they do not receive the treatment. If participants in the treatment condition end up better off than participants in the control condition—for example, they are less depressed, learn faster, conserve more, express less prejudice—then the researcher can conclude that the treatment works. In research on the effectiveness of psychotherapies and medical treatments, this type of experiment is often called a randomized clinical trial .

There are different types of control conditions. In a no-treatment control condition , participants receive no treatment whatsoever. One problem with this approach, however, is the existence of placebo effects. A placebo is a simulated treatment that lacks any active ingredient or element that should make it effective, and a placebo effect is a positive effect of such a treatment. Many folk remedies that seem to work—such as eating chicken soup for a cold or placing soap under the bed sheets to stop nighttime leg cramps—are probably nothing more than placebos. Although placebo effects are not well understood, they are probably driven primarily by people’s expectations that they will improve. Having the expectation to improve can result in reduced stress, anxiety, and depression, which can alter perceptions and even improve immune system functioning (Price, Finniss, & Benedetti, 2008) [2] .

Placebo effects are interesting in their own right (see Note “The Powerful Placebo” ), but they also pose a serious problem for researchers who want to determine whether a treatment works. Figure 5.2 shows some hypothetical results in which participants in a treatment condition improved more on average than participants in a no-treatment control condition. If these conditions (the two leftmost bars in Figure 5.2 ) were the only conditions in this experiment, however, one could not conclude that the treatment worked. It could be instead that participants in the treatment group improved more because they expected to improve, while those in the no-treatment control condition did not.

Figure 5.2 Hypothetical Results From a Study Including Treatment, No-Treatment, and Placebo Conditions

Fortunately, there are several solutions to this problem. One is to include a placebo control condition, in which participants receive a placebo that looks much like the treatment but lacks the active ingredient or element thought to be responsible for the treatment’s effectiveness. When participants in a treatment condition take a pill, for example, then those in a placebo control condition would take an identical-looking pill that lacks the active ingredient in the treatment (a “sugar pill”). In research on psychotherapy effectiveness, the placebo might involve going to a psychotherapist and talking in an unstructured way about one’s problems. The idea is that if participants in both the treatment and the placebo control groups expect to improve, then any improvement in the treatment group over and above that in the placebo control group must have been caused by the treatment and not by participants’ expectations. This difference is what is shown by a comparison of the two outer bars in Figure 5.2.
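The logic of this comparison can be reduced to simple arithmetic. In the Python sketch below, the mean improvement scores are invented: the placebo-minus-no-treatment gap estimates the expectation effect, and the treatment-minus-placebo gap estimates the effect of the active ingredient itself:

```python
# The placebo logic as arithmetic, with invented mean improvement scores.
improvement = {"treatment": 8.0, "placebo": 5.0, "no_treatment": 2.0}

expectation_effect = improvement["placebo"] - improvement["no_treatment"]
active_effect = improvement["treatment"] - improvement["placebo"]

print("Improvement from expectations alone:", expectation_effect)     # 3.0
print("Improvement beyond expectations (treatment):", active_effect)  # 3.0
```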

Of course, the principle of informed consent requires that participants be told that they will be assigned to either a treatment or a placebo control condition—even though they cannot be told which until the experiment ends. In many cases the participants who had been in the control condition are then offered an opportunity to have the real treatment. An alternative approach is to use a wait-list control condition, in which participants are told that they will receive the treatment but must wait until the participants in the treatment condition have already received it. This disclosure allows researchers to compare participants who have received the treatment with participants who are not currently receiving it but who still expect to improve (eventually). A final solution to the problem of placebo effects is to leave out the control condition completely and compare any new treatment with the best available alternative treatment. For example, a new treatment for simple phobia could be compared with standard exposure therapy. Because participants in both conditions receive a treatment, their expectations about improvement should be similar. This approach also makes sense because once there is an effective treatment, the interesting question about a new treatment is not simply “Does it work?” but “Does it work better than what is already available?”

The Powerful Placebo

Many people are not surprised that placebos can have a positive effect on disorders that seem fundamentally psychological, including depression, anxiety, and insomnia. However, placebos can also have a positive effect on disorders that most people think of as fundamentally physiological. These include asthma, ulcers, and warts (Shapiro & Shapiro, 1999) [3] . There is even evidence that placebo surgery—also called “sham surgery”—can be as effective as actual surgery.

Medical researcher J. Bruce Moseley and his colleagues conducted a study on the effectiveness of two arthroscopic surgery procedures for osteoarthritis of the knee (Moseley et al., 2002) [4] . The control participants in this study were prepped for surgery, received a tranquilizer, and even received three small incisions in their knees. But they did not receive the actual arthroscopic surgical procedure. Note that the IRB would have carefully considered the use of deception in this case and judged that the benefits of using it outweighed the risks and that there was no other way to answer the research question (about the effectiveness of a placebo procedure) without it. The surprising result was that all participants improved in terms of both knee pain and function, and the sham surgery group improved just as much as the treatment groups. According to the researchers, “This study provides strong evidence that arthroscopic lavage with or without débridement [the surgical procedures used] is not better than and appears to be equivalent to a placebo procedure in improving knee pain and self-reported function” (p. 85).

  • Knecht, S., Dräger, B., Deppe, M., Bobe, L., Lohmann, H., Flöel, A., . . . Henningsen, H. (2000). Handedness and hemispheric language dominance in healthy humans. Brain: A Journal of Neurology, 123(12), 2512–2518. http://dx.doi.org/10.1093/brain/123.12.2512
  • Price, D. D., Finniss, D. G., & Benedetti, F. (2008). A comprehensive review of the placebo effect: Recent advances and current thought. Annual Review of Psychology, 59, 565–590.
  • Shapiro, A. K., & Shapiro, E. (1999). The powerful placebo: From ancient priest to modern physician. Baltimore, MD: Johns Hopkins University Press.
  • Moseley, J. B., O’Malley, K., Petersen, N. J., Menke, T. J., Brody, B. A., Kuykendall, D. H., … Wray, N. P. (2002). A controlled trial of arthroscopic surgery for osteoarthritis of the knee. The New England Journal of Medicine, 347, 81–88.

Glossary

Experiment: A type of study designed specifically to answer the question of whether there is a causal relationship between two variables.

Independent variable: The variable the experimenter manipulates.

Dependent variable: The variable the experimenter measures (it is the presumed effect).

Conditions: The different levels of the independent variable to which participants are assigned.

Control: Holding extraneous variables constant in order to separate the effect of the independent variable from the effect of the extraneous variables.

Extraneous variable: Any variable other than the dependent and independent variable.

Manipulation: Changing the level, or condition, of the independent variable systematically so that different groups of participants are exposed to different levels of that variable, or the same group of participants is exposed to different levels at different times.

Single factor two-level design: An experiment design involving a single independent variable with two conditions.

Single factor multi-level design: An experiment design in which one independent variable is manipulated to produce more than two conditions.

Confounding variable: An extraneous variable that varies systematically with the independent variable, and thus confuses the effect of the independent variable with the effect of the extraneous one.

Treatment: Any intervention meant to change people’s behavior for the better.

Treatment condition: The condition in which participants receive the treatment.

Control condition: The condition in which participants do not receive the treatment.

Randomized clinical trial: An experiment that tests the effectiveness of psychotherapies and medical treatments.

No-treatment control condition: The condition in which participants receive no treatment whatsoever.

Placebo: A simulated treatment that lacks any active ingredient or element that is hypothesized to make the treatment effective, but is otherwise identical to the treatment.

Placebo effect: An effect that is due to the placebo rather than the treatment.

Placebo control condition: A condition in which the participants receive a placebo rather than the treatment.

Wait-list control condition: A condition in which participants are told that they will receive the treatment but must wait until the participants in the treatment condition have already received it.

Research Methods in Psychology Copyright © 2019 by Rajiv S. Jhangiani, I-Chant A. Chiang, Carrie Cuttler, & Dana C. Leighton is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.



4 - Introductory Concepts of Experimental Design

Published online by Cambridge University Press:  05 June 2012

Introduction

To generate hypotheses you often sample different groups or places (which is sometimes called a mensurative experiment because you usually measure something, such as height or weight, on each experimental unit) and explore these data for patterns or associations. To test hypotheses you may do mensurative experiments, or manipulative experiments where you change a condition and observe the effect of that change upon each experimental unit (like the experiment with millipedes and light described in Chapter 2). Often you may do several experiments of both types to test a particular hypothesis. The quality of your sampling and the design of your experiment can have an effect upon the outcome and determine whether your hypothesis is rejected or not. Therefore it is important to have an appropriate and properly designed experiment.

First, you should attempt to make your measurements as accurate and precise as possible so they are the best estimates of actual values.

Accuracy is the closeness of a measured value to the true value.

Precision is the ‘spread’ or variability of repeated measures of the same value.

For example, a thermometer that consistently gives a reading corresponding to a true temperature (e.g. 20℃) is both accurate and precise. Another that gives a reading consistently higher (e.g. + 10℃) than a true temperature is not accurate, but it is very precise.
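The distinction is easy to compute. In this Python sketch, the readings are invented: the bias (mean reading minus true value) measures accuracy, and the standard deviation of repeated readings measures precision:

```python
# Bias (accuracy) and spread (precision) of two hypothetical thermometers,
# assuming a true temperature of 20 °C; all readings are invented.
from statistics import mean, stdev

true_temp = 20.0
thermometer_a = [20.1, 19.9, 20.0, 20.1, 19.9]  # accurate and precise
thermometer_b = [30.1, 29.9, 30.0, 30.1, 29.9]  # precise but about 10 °C high

for name, readings in [("A", thermometer_a), ("B", thermometer_b)]:
    bias = mean(readings) - true_temp  # closeness to the true value
    spread = stdev(readings)           # variability of repeated measures
    print(f"Thermometer {name}: bias = {bias:+.2f} °C, spread = {spread:.2f} °C")
```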


  • Introductory concepts of experimental design
  • Steve McKillup , Central Queensland University
  • Book: Statistics Explained
  • Online publication: 05 June 2012
  • Chapter DOI: https://doi.org/10.1017/CBO9780511815935.005

5.1 Experiment Basics


Key Takeaways

  • An experiment is a type of empirical study that features the manipulation of an independent variable, the measurement of a dependent variable, and control of extraneous variables.
  • An extraneous variable is any variable other than the independent and dependent variables. A confound is an extraneous variable that varies systematically with the independent variable.
  • Practice: List five variables that can be manipulated by the researcher in an experiment. List five variables that cannot be manipulated by the researcher in an experiment.
  • Practice: For each of the following topics, decide whether it could be studied using an experimental research design and explain why or why not.
  • Effect of parietal lobe damage on people’s ability to do basic arithmetic.
  • Effect of being clinically depressed on the number of close friendships people have.
  • Effect of group training on the social skills of teenagers with Asperger’s syndrome.
  • Effect of paying people to take an IQ test on their performance on that test.



Chapter 6: Experimental Research

Experiment Basics

Learning Objectives

  • Explain what an experiment is and recognize examples of studies that are experiments and studies that are not experiments.
  • Explain what internal validity is and why experiments are considered to be high in internal validity.
  • Explain what external validity is and evaluate studies in terms of their external validity.
  • Distinguish between the manipulation of the independent variable and control of extraneous variables and explain the importance of each.
  • Recognize examples of confounding variables and explain how they affect the internal validity of a study.


Four Big Validities

When we read about psychology experiments with a critical view, one question to ask is “Is this study valid?” However, that question is not as straightforward as it seems because, in psychology, there are many different kinds of validity. Researchers have focused on four validities to help assess whether an experiment is sound (Judd & Kenny, 1981; Morling, 2014)[1][2]: internal validity, external validity, construct validity, and statistical validity. We will explore each validity in depth.

Internal Validity

Recall that two variables being statistically related does not necessarily mean that one causes the other. “Correlation does not imply causation.” For example, if it were the case that people who exercise regularly are happier than people who do not exercise regularly, this would not necessarily mean that exercising increases people’s happiness. It could mean instead that greater happiness causes people to exercise (the directionality problem) or that something like better physical health causes people to exercise and be happier (the third-variable problem).

The purpose of an experiment, however, is to show that two variables are statistically related and to do so in a way that supports the conclusion that the independent variable caused any observed differences in the dependent variable. The logic is based on this assumption: if the researcher creates two or more highly similar conditions and then manipulates the independent variable to produce just one difference between them, then any later difference between the conditions must have been caused by the independent variable. For example, because the only difference between Darley and Latané’s conditions was the number of students that participants believed to be involved in the discussion, this difference in belief must have been responsible for differences in helping between the conditions.

An empirical study is said to be high in  internal validity  if the way it was conducted supports the conclusion that the independent variable caused any observed differences in the dependent variable. Thus experiments are high in internal validity because the way they are conducted—with the manipulation of the independent variable and the control of extraneous variables—provides strong support for causal conclusions.

External Validity

At the same time, the way that experiments are conducted sometimes leads to a different kind of criticism. Specifically, the need to manipulate the independent variable and control extraneous variables means that experiments are often conducted under conditions that seem artificial (Bauman, McGraw, Bartels, & Warren, 2014)[3]. In many psychology experiments, the participants are all undergraduate students and come to a classroom or laboratory to fill out a series of paper-and-pencil questionnaires or to perform a carefully designed computerized task. Consider, for example, an experiment in which researcher Barbara Fredrickson and her colleagues had undergraduate students come to a laboratory on campus and complete a math test while wearing a swimsuit (Fredrickson, Roberts, Noll, Quinn, & Twenge, 1998)[4]. At first, this manipulation might seem silly. When will undergraduate students ever have to complete math tests in their swimsuits outside of this experiment?

The issue we are confronting is that of external validity. An empirical study is high in external validity if the way it was conducted supports generalizing the results to people and situations beyond those actually studied. As a general rule, studies are higher in external validity when the participants and the situation studied are similar to those that the researchers want to generalize to and that participants encounter every day, a property often described as mundane realism. Imagine, for example, that a group of researchers is interested in how shoppers in large grocery stores are affected by whether breakfast cereal is packaged in yellow or purple boxes. Their study would be high in external validity and have high mundane realism if they studied the decisions of ordinary people doing their weekly shopping in a real grocery store. If the shoppers bought much more cereal in purple boxes, the researchers would be fairly confident that this increase would be true for other shoppers in other stores. Their study would be relatively low in external validity, however, if they studied a sample of undergraduate students in a laboratory at a selective university who merely judged the appeal of various colours presented on a computer screen. Such a study could still have high psychological realism, where the same mental process is used in both the laboratory and in the real world. If the students judged purple to be more appealing than yellow, the researchers would not be very confident that this preference is relevant to grocery shoppers’ cereal-buying decisions because of low external validity, but they could be confident that the visual processing of colours has high psychological realism.

We should be careful, however, not to draw the blanket conclusion that experiments are low in external validity. One reason is that experiments need not seem artificial. Consider that Darley and Latané’s experiment provided a reasonably good simulation of a real emergency situation. Or consider field experiments that are conducted entirely outside the laboratory. In one such experiment, Robert Cialdini and his colleagues studied whether hotel guests choose to reuse their towels for a second day as opposed to having them washed as a way of conserving water and energy (Cialdini, 2005)[5]. These researchers manipulated the message on a card left in a large sample of hotel rooms. One version of the message emphasized showing respect for the environment, another emphasized that the hotel would donate a portion of their savings to an environmental cause, and a third emphasized that most hotel guests choose to reuse their towels. The result was that guests who received the message that most hotel guests choose to reuse their towels reused their own towels substantially more often than guests receiving either of the other two messages. Given the way they conducted their study, it seems very likely that their result would hold true for other guests in other hotels.

A second reason not to draw the blanket conclusion that experiments are low in external validity is that they are often conducted to learn about psychological processes  that are likely to operate in a variety of people and situations. Let us return to the experiment by Fredrickson and colleagues. They found that the women in their study, but not the men, performed worse on the math test when they were wearing swimsuits. They argued that this gender difference was due to women’s greater tendency to objectify themselves—to think about themselves from the perspective of an outside observer—which diverts their attention away from other tasks. They argued, furthermore, that this process of self-objectification and its effect on attention is likely to operate in a variety of women and situations—even if none of them ever finds herself taking a math test in her swimsuit.

Construct Validity

In addition to the generalizability of the results of an experiment, another element to scrutinize in a study is the quality of the experiment’s manipulations, or the construct validity. The research question that Darley and Latané started with is “does helping behaviour become diffused?” They hypothesized that participants in a lab would be less likely to help when they believed there were more potential helpers besides themselves. This conversion from research question to experiment design is called operationalization (see Chapter 2 for more information about the operational definition). Darley and Latané operationalized the independent variable of diffusion of responsibility by increasing the number of potential helpers. In evaluating this design, we would say that the construct validity was very high because the experiment’s manipulations very clearly speak to the research question: there was a crisis, there was a way for the participant to help, and, by increasing the number of other students involved in the discussion, the researchers provided a way to test diffusion.

What if the number of conditions in Darley and Latané’s study changed? Consider if there were only two conditions: one student involved in the discussion or two. Even though we might see a decrease in helping by adding another person, it may not be a clear demonstration of diffusion of responsibility, but merely a response to the presence of others. We might think it was a form of Bandura’s social inhibition (discussed in Chapter 4). The construct validity would be lower. However, had there been five conditions, perhaps we would see the decrease continue with more people in the discussion, or perhaps it would plateau after a certain number of people. In that situation, we would not necessarily be learning more about diffusion of responsibility, or it may have become a different phenomenon altogether. Adding more conditions does not necessarily raise construct validity. When designing your own experiment, consider how well the research question is operationalized by your study.

Statistical Validity

A common critique of experiments is that a study did not have enough participants. The main reason for this criticism is that it is difficult to generalize about a population from a small sample. At the outset, this critique seems to be about external validity, but there are studies where small sample sizes are not a problem (Chapter 10 will discuss how small samples, even of only one person, can still be very illuminating for psychology research). Therefore, small sample sizes are actually a critique of statistical validity. The statistical validity speaks to whether the statistics conducted in the study support the conclusions that are made.

Proper statistical analysis should be conducted on the data to determine whether the difference or relationship that was predicted was found. The expected size of the effect, together with the number of conditions and the number of participants, determines how likely the study is to detect a real difference; with this information, a power analysis can be conducted to ascertain whether you are likely to find one. When designing a study, it is best to run the power analysis up front so that the appropriate number of participants can be recruited and tested (more on effect sizes in Chapter 12). To design a statistically valid experiment, thinking about the statistical tests at the beginning of the design will help ensure the results can be believed.
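
A minimal sketch of such an up-front power analysis, assuming a simple two-condition between-subjects design and using the statsmodels library (the effect size, alpha, and power values below are conventional placeholders, not numbers from any study discussed here), might look like this:

```python
# Sketch of a pre-study power analysis for a two-condition experiment.
# All inputs are placeholder conventions: a medium expected effect size
# (Cohen's d = 0.5), alpha = .05, and a target power of .80.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.5,  # expected standardized mean difference
    alpha=0.05,       # Type I error rate
    power=0.80,       # desired probability of detecting a real effect
)
print(f"Participants needed per condition: {n_per_group:.0f}")  # about 64
```

Running a calculation like this before data collection tells you roughly how many participants to recruit, rather than leaving the statistical validity of the study to chance.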

Prioritizing Validities

These four big validities (internal, external, construct, and statistical) are useful to keep in mind both when reading about other experiments and when designing your own. However, researchers must prioritize, and often it is not possible to have high validity in all four areas. In Cialdini’s study on towel usage in hotels (Goldstein, Cialdini, & Griskevicius, 2008)[6], the external validity was high but the statistical validity was more modest. This discrepancy does not invalidate the study, but it shows where there may be room for improvement in future follow-up studies. Morling (2014) points out that most psychology studies have high internal and construct validity but sometimes sacrifice external validity.

Manipulation of the Independent Variable

Again, to manipulate an independent variable means to change its level systematically so that different groups of participants are exposed to different levels of that variable, or the same group of participants is exposed to different levels at different times. For example, to see whether expressive writing affects people’s health, a researcher might instruct some participants to write about traumatic experiences and others to write about neutral experiences. As discussed earlier in this chapter, the different levels of the independent variable are referred to as conditions, and researchers often give the conditions short descriptive names to make it easy to talk and write about them. In this case, the conditions might be called the “traumatic condition” and the “neutral condition.”

Notice that the manipulation of an independent variable must involve the active intervention of the researcher. Comparing groups of people who differ on the independent variable before the study begins is not the same as manipulating that variable. For example, a researcher who compares the health of people who already keep a journal with the health of people who do not keep a journal has not manipulated this variable and therefore not conducted an experiment. This distinction is important because groups that already differ in one way at the beginning of a study are likely to differ in other ways too. For example, people who choose to keep journals might also be more conscientious, more introverted, or less stressed than people who do not. Therefore, any observed difference between the two groups in terms of their health might have been caused by whether or not they keep a journal, or it might have been caused by any of the other differences between people who do and do not keep journals. Thus the active manipulation of the independent variable is crucial for eliminating the third-variable problem.

Of course, there are many situations in which the independent variable cannot be manipulated for practical or ethical reasons and therefore an experiment is not possible. For example, whether or not people have a significant early illness experience cannot be manipulated, making it impossible to conduct an experiment on the effect of early illness experiences on the development of hypochondriasis. This caveat does not mean it is impossible to study the relationship between early illness experiences and hypochondriasis—only that it must be done using nonexperimental approaches. We will discuss this type of methodology in detail later in the book.

In many experiments, the independent variable is a construct that can only be manipulated indirectly. For example, a researcher might try to manipulate participants’ stress levels indirectly by telling some of them that they have five minutes to prepare a short speech that they will then have to give to an audience of other participants. In such situations, researchers often include a manipulation check  in their procedure. A manipulation check is a separate measure of the construct the researcher is trying to manipulate. For example, researchers trying to manipulate participants’ stress levels might give them a paper-and-pencil stress questionnaire or take their blood pressure—perhaps right after the manipulation or at the end of the procedure—to verify that they successfully manipulated this variable.
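
To make the idea concrete, here is a sketch (with invented data and variable names, not a procedure from any study discussed here) of how a stress manipulation check might be analyzed, comparing self-reported stress between a speech-preparation condition and a control condition:

```python
# Hypothetical manipulation check: did the speech manipulation actually
# raise self-reported stress? The ratings (0-10 scale) are invented.
from scipy import stats

stress_speech = [7, 8, 6, 9, 7, 8, 6, 7]   # rated stress after speech prep
stress_control = [3, 4, 2, 5, 3, 4, 3, 2]  # rated stress, no speech

t, p = stats.ttest_ind(stress_speech, stress_control)
print(f"t = {t:.2f}, p = {p:.4f}")
# A large, reliable difference suggests the manipulation worked as intended.
```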

Control of Extraneous Variables

As we have seen previously in the chapter, an extraneous variable is anything that varies in the context of a study other than the independent and dependent variables. In an experiment on the effect of expressive writing on health, for example, extraneous variables would include participant variables (individual differences) such as their writing ability, their diet, and their shoe size. They would also include situational or task variables such as the time of day when participants write, whether they write by hand or on a computer, and the weather. Extraneous variables pose a problem because many of them are likely to have some effect on the dependent variable. For example, participants’ health will be affected by many things other than whether or not they engage in expressive writing. This influencing factor can make it difficult to separate the effect of the independent variable from the effects of the extraneous variables, which is why it is important to control extraneous variables by holding them constant.

Extraneous Variables as “Noise”

Extraneous variables make it difficult to detect the effect of the independent variable in two ways. One is by adding variability or “noise” to the data. Imagine a simple experiment on the effect of mood (happy vs. sad) on the number of happy childhood events people are able to recall. Participants are put into a negative or positive mood (by showing them a happy or sad video clip) and then asked to recall as many happy childhood events as they can. Table 6.1 shows what the data might look like if there were no extraneous variables and the number of happy childhood events participants recalled was affected only by their moods. Every participant in the happy mood condition recalled exactly four happy childhood events, and every participant in the sad mood condition recalled exactly three. The effect of mood here is quite obvious.

Table 6.1 Hypothetical Noiseless Data
Number of happy childhood events recalled when in a happy mood | Number of happy childhood events recalled when in a sad mood
4 | 3
4 | 3
4 | 3
4 | 3
4 | 3
4 | 3
4 | 3
4 | 3
4 | 3
4 | 3
M = 4 | M = 3

In reality, however, the data would probably look more like those in Table 6.2. Even in the happy mood condition, some participants would recall fewer happy memories because they have fewer to draw on, use less effective recall strategies, or are less motivated. And even in the sad mood condition, some participants would recall more happy childhood memories because they have more happy memories to draw on, they use more effective recall strategies, or they are more motivated.

Table 6.2 Realistic Noisy Data
Number of happy childhood events recalled when in a happy mood | Number of happy childhood events recalled when in a sad mood
3 | 1
6 | 3
2 | 4
4 | 0
5 | 5
2 | 7
3 | 2
1 | 5
6 | 1
8 | 2
M = 4 | M = 3

Although the mean difference between the two groups is the same as in the idealized data, this difference is much less obvious in the context of the greater variability in the data. Thus one reason researchers try to control extraneous variables is so their data look more like the idealized data in Table 6.1, which makes the effect of the independent variable easier to detect (although real data never look quite that good).
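
To see this numerically, the following sketch recomputes the group means and the overall variability for the hypothetical data in Tables 6.1 and 6.2 (the code is illustrative only; the numbers come straight from the tables):

```python
# The same one-point mean difference is much harder to see in noisy data.
from statistics import mean, stdev

happy_clean, sad_clean = [4] * 10, [3] * 10    # Table 6.1
happy_noisy = [3, 6, 2, 4, 5, 2, 3, 1, 6, 8]   # Table 6.2
sad_noisy = [1, 3, 4, 0, 5, 7, 2, 5, 1, 2]

for label, happy, sad in [("noiseless", happy_clean, sad_clean),
                          ("noisy", happy_noisy, sad_noisy)]:
    diff = mean(happy) - mean(sad)
    spread = stdev(happy + sad)  # variability across all participants
    print(f"{label}: mean difference = {diff:.1f}, overall SD = {spread:.2f}")
```

The mean difference is 1.0 in both versions, but the overall standard deviation is roughly four times larger in the noisy version, which is exactly why the same effect is harder to detect there.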

One way to control extraneous variables is to hold them constant. This technique can mean holding situation or task variables constant by testing all participants in the same location, giving them identical instructions, treating them in the same way, and so on. It can also mean holding participant variables constant. For example, many studies of language limit participants to right-handed people, who generally have their language areas isolated in their left cerebral hemispheres. Left-handed people are more likely to have their language areas isolated in their right cerebral hemispheres or distributed across both hemispheres, which can change the way they process language and thereby add noise to the data.

In principle, researchers can control extraneous variables by limiting participants to one very specific category of person, such as 20-year-old, heterosexual, female, right-handed psychology majors. The obvious downside to this approach is that it would lower the external validity of the study—in particular, the extent to which the results can be generalized beyond the people actually studied. For example, it might be unclear whether results obtained with a sample of younger heterosexual women would apply to older homosexual men. In many situations, the advantages of a diverse sample outweigh the reduction in noise achieved by a homogeneous one.

Extraneous Variables as Confounding Variables

The second way that extraneous variables can make it difficult to detect the effect of the independent variable is by becoming confounding variables. A confounding variable is an extraneous variable that differs on average across levels of the independent variable. For example, in almost all experiments, participants’ intelligence quotients (IQs) will be an extraneous variable. But as long as there are participants with lower and higher IQs at each level of the independent variable so that the average IQ is roughly equal, then this variation is probably acceptable (and may even be desirable). What would be bad, however, would be for participants at one level of the independent variable to have substantially lower IQs on average and participants at another level to have substantially higher IQs on average. In this case, IQ would be a confounding variable.

To confound means to confuse, and this effect is exactly why confounding variables are undesirable. Because they differ across conditions—just like the independent variable—they provide an alternative explanation for any observed difference in the dependent variable. Figure 6.1 shows the results of a hypothetical study, in which participants in a positive mood condition scored higher on a memory task than participants in a negative mood condition. But if IQ is a confounding variable—with participants in the positive mood condition having higher IQs on average than participants in the negative mood condition—then it is unclear whether it was the positive moods or the higher IQs that caused participants in the first condition to score higher. One way to avoid confounding variables is by holding extraneous variables constant. For example, one could prevent IQ from becoming a confounding variable by limiting participants only to those with IQs of exactly 100. But this approach is not always desirable for reasons we have already discussed. A second and much more general approach—random assignment to conditions—will be discussed in detail shortly.

""

Key Takeaways

  • An experiment is a type of empirical study that features the manipulation of an independent variable, the measurement of a dependent variable, and control of extraneous variables.
  • Studies are high in internal validity to the extent that the way they are conducted supports the conclusion that the independent variable caused any observed differences in the dependent variable. Experiments are generally high in internal validity because of the manipulation of the independent variable and control of extraneous variables.
  • Studies are high in external validity to the extent that the result can be generalized to people and situations beyond those actually studied. Although experiments can seem “artificial”—and low in external validity—it is important to consider whether the psychological processes under study are likely to operate in other people and situations.
  • Practice: List five variables that can be manipulated by the researcher in an experiment. List five variables that cannot be manipulated by the researcher in an experiment.
  • Practice: For each of the following topics, decide whether it could be studied using an experimental research design; in other words, could the variable of interest be manipulated by the researcher?
  • Effect of parietal lobe damage on people’s ability to do basic arithmetic.
  • Effect of being clinically depressed on the number of close friendships people have.
  • Effect of group training on the social skills of teenagers with Asperger’s syndrome.
  • Effect of paying people to take an IQ test on their performance on that test.
  • Judd, C.M. & Kenny, D.A. (1981). Estimating the effects of social interventions . Cambridge, MA: Cambridge University Press. ↵
  • Morling, B. (2014, April). Teach your students to be better consumers. APS Observer . Retrieved from http://www.psychologicalscience.org/index.php/publications/observer/2014/april-14/teach-your-students-to-be-better-consumers.html ↵
  • Bauman, C.W., McGraw, A.P., Bartels, D.M., & Warren, C. (2014). Revisiting external validity: Concerns about trolley problems and other sacrificial dilemmas in moral psychology. Social and Personality Psychology Compass, 8/9 , 536-554. ↵
  • Fredrickson, B. L., Roberts, T.-A., Noll, S. M., Quinn, D. M., & Twenge, J. M. (1998). The swimsuit becomes you: Sex differences in self-objectification, restrained eating, and math performance. Journal of Personality and Social Psychology, 75 , 269–284. ↵
  • Cialdini, R. (2005, April). Don’t throw in the towel: Use social influence research. APS Observer . Retrieved from http://www.psychologicalscience.org/index.php/publications/observer/2005/april-05/dont-throw-in-the-towel-use-social-influence-research.html ↵
  • Goldstein, N. J., Cialdini, R. B., & Griskevicius, V. (2008). A room with a viewpoint: Using social norms to motivate environmental conservation in hotels. Journal of Consumer Research, 35 , 472–482. ↵

Glossary

Experiment: A study in which the researcher manipulates the independent variable.

Conditions: The different levels of the independent variable.

Extraneous variable: Anything that varies in the context of a study other than the independent and dependent variables.

Internal validity: When the way an experiment was conducted supports the conclusion that the independent variable caused observed differences in the dependent variable. Such studies provide strong support for causal conclusions.

External validity: When the way a study is conducted supports generalizing the results to people and situations beyond those actually studied.

Mundane realism: When the participants and the situation studied are similar to those that the researchers want to generalize to and that participants encounter every day.

Psychological realism: When the same mental process is used in both the laboratory and in the real world.

Construct validity: The quality of the experiment’s manipulations.

Operationalization: The conversion of a research question into an experiment design.

Statistical validity: Whether the statistics conducted in the study support the conclusions that are made.

Manipulate: To change an independent variable’s level systematically so that different groups of participants are exposed to different levels of that variable, or the same group of participants is exposed to different levels at different times.

Manipulation check: A separate measure of the construct the researcher is trying to manipulate.

Control: The method of holding extraneous variables constant.

Confounding variable: An extraneous variable that differs on average across levels of the independent variable.

Research Methods in Psychology - 2nd Canadian Edition Copyright © 2015 by Paul C. Price, Rajiv Jhangiani, & I-Chant A. Chiang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


What is a Manipulated Variable? (Definition & Example)

An experiment is a controlled scientific study. In statistics, we often conduct experiments to understand how changing one variable affects another variable.

A manipulated variable is a variable that we change or “manipulate” to see how that change affects some other variable. A manipulated variable is also sometimes called an independent variable.

A response variable is the variable that changes as a result of the manipulated variable being changed. A response variable is sometimes called a dependent variable because its value often depends on the value of the manipulated variable.


Often in experiments there are also controlled variables, which are variables that are intentionally kept constant.

The goal of an experiment is to keep all variables constant except for the manipulated variable so that we can attribute any change in the response variable to the changes made in the manipulated variable.

Let’s check out a couple examples of different experiments to gain a better understanding of manipulated variables.

Example 1: Free-Throw Shooting

Suppose a basketball coach wants to conduct an experiment to determine if three different shooting techniques affect the free-throw percentage of his players.

He divides his team into three groups and has each group use a different technique to shoot 100 free-throws. He then records the average free-throw percentage for each group.

In this experiment, we would have the following variables:

  • Manipulated variable: The shooting technique. This is the variable that we manipulate to see how it affects free-throw percentage.
  • Response variable: The free-throw percentage. This is the variable that changes as a result of the manipulated variable being changed.
  • Controlled variables: We would want to make sure that each of the three groups shoots free throws under the same conditions. So, variables that we might control include (1) gym lighting, (2) time of day, and (3) gym temperature (see the sketch after this list).
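
A minimal simulation sketch of this design (the three techniques and their assumed true success rates are invented placeholders) might look like:

```python
# Simulated free-throw experiment: each group uses one technique and
# attempts 100 free throws. The "true" success rates are invented.
import random

random.seed(0)
true_rates = {"technique_A": 0.70, "technique_B": 0.75, "technique_C": 0.72}

for technique, rate in true_rates.items():
    makes = sum(random.random() < rate for _ in range(100))  # 100 attempts
    print(f"{technique}: {makes}% of free throws made")
```

Here the shooting technique is the only thing that varies across groups; everything else about the simulated attempts is held constant, mirroring the role of the controlled variables above.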


Example 2: Exam Scores

Suppose a teacher wants to understand how the number of hours spent studying affects exam scores. She intentionally has groups of students study for 1, 2, 3, 4, or 5 hours prior to an exam. She then has each group take the same exam and records the average scores for each group.

  • Manipulated variable: The number of hours spent studying. This is the variable that the teacher manipulates to see how it affects exam scores.
  • Response variable: The exam scores. This is the variable that changes as a result of the manipulated variable being changed.
  • Controlled variables: We would want to make sure that each of the groups of students takes the exam under the same conditions. So, variables that we might control include (1) time available to complete the exam, (2) number of breaks given during the exam, and (3) time of day when the exam is administered (see the sketch after this list).
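
In the same spirit, a sketch of this design might simply summarize the response variable at each level of the manipulated variable (the exam scores below are invented placeholders):

```python
# Average exam score at each level of the manipulated variable
# (hours studied). All scores are invented placeholder data.
scores_by_hours = {
    1: [62, 65, 58, 70],
    2: [68, 72, 66, 71],
    3: [74, 78, 70, 77],
    4: [80, 76, 83, 79],
    5: [85, 88, 82, 86],
}

for hours, scores in scores_by_hours.items():
    print(f"{hours} h studied: mean score = {sum(scores) / len(scores):.1f}")
```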



