Abstract
Classical fear conditioning provides a powerful model to explain the acquisition of fear in humans and nonhuman animals (LeDoux, 2014) and is often used to explain the development of anxiety disorders (Lissek et al., 2005; Mineka & Oehlberg, 2008). For example, in this framework, the emergence of a dog phobia could be explained as a consequence of being bitten by a dog; the dog would be the conditioned stimulus (CS), and the bite would be the aversive unconditioned stimulus (US) that becomes associated with the CS. A critical issue with this model, however, is that many patients with anxiety disorders do not recall such experiences (e.g., an aversive US such as a dog bite) in their past (Murray & Foote, 1979; Rachman, 1977), and this raises the question of whether an aversive US must physically occur in order for fear learning to occur.
Observational or vicarious fear-conditioning studies suggest that merely observing someone else receiving an aversive stimulation after CS presentation rather than experiencing the aversive stimulation oneself may suffice for fear learning to occur (Mineka, Davidson, Cook, & Keir, 1984), possibly because of overlapping neural representations of observing someone else in pain and experiencing pain oneself (Morrison, Lloyd, Di Pellegrino, & Roberts, 2004; Olsson & Phelps, 2007). Given that the neural representation of pain may also be activated by merely imagining a painful stimulation (Fairhurst, Fairhurst, Berna, & Tracey, 2012; Jackson, Brunet, Meltzoff, & Decety, 2006; Ogino et al., 2007), associative fear learning could also be based on mental images of the US (King, 1973; Lewis, O’Reilly, Khuu, & Pearson, 2013) and could even occur in the total absence of any physical or observed aversive stimulation. If this is the case, stimulus-contingent aversive imagery could provide an explanation for how fear may develop without aversive in vivo experiences and thus be of high relevance for understanding and treating anxiety disorders.
Previous studies that have investigated the role of an imagined US in conditioning (for a review, see Dadds, Bovbjerg, Redd, & Cutmore, 1997) either provided explicit instructions on CS–US contingencies (e.g., as used by Arabian, 1982; Soeter & Kindt, 2012) or first conditioned the CS with the physical US and then modulated an existing CS–US association with further US imagery (Jones & Davey, 1990). An open question is whether mental images of an aversive US may cause de novo fear conditioning in the total absence of any physically aversive stimulation, explicit instructions, or previously established CS–US associations. Translated to everyday life, can a person who was never bitten by a dog (and who neither observed how someone else was bitten nor was informed that dogs may bite) develop dog phobia, only because of aversive imagery when seeing a dog?
To investigate this hypothesis, we trained participants to produce specific mental images at the presentation of particular imagery cues. In a subsequent differential fear-conditioning procedure, we systematically paired CSs with these imagery cues but not with an actual US. Two different positive CSs (CSs+) were presented to disentangle CS responses related to aversive imagery from CS responses related to imagery per se. One CS+ was paired with a cue for aversive imagery (aversive CS+) and the other CS+ was paired with a cue for neutral imagery (neutral CS+). In addition, a negative CS (CS–) was presented and paired with an irrelevant stimulus that was physically similar to the imagery cues but was not supposed to prompt any imagery. After an acquisition phase, participants underwent an extinction phase in which the CSs were presented without the respective cues, to further investigate whether imagery-based conditioned fear is extinguished in the same manner as conditioned fear with a physical US (Dadds et al., 1997).
Study 1: Conditioning With an Imagined Thumbtack
Method
Participants and procedure
A total of 45 individuals (age:
Participants signed informed consent, filled out a battery of questionnaires to test hypotheses unrelated to the current study, and completed a brief interview, after which they had electrodes attached for recording of the electrocardiogram (ECG) and electrodermal activity (EDA). Afterward, they were seated for a 5-min resting phase. Following an imagery training (see below), participants underwent the imagery-based fear-conditioning paradigm (Fig. 1). At the end, electrodes were detached, a postexperimental interview was conducted, and participants were debriefed and compensated for participation. The study was approved by the local ethics committee of the University of Marburg Psychology Department.

Schematic depiction of the experimental protocol used in both studies. During imagery training (a), participants were informed of the association between cues (e.g., square, ellipse, hexagon) and imagery scenarios of aversive valence (e.g., stepping on a thumbtack) or neutral valence (e.g., stepping on a coin). In a third condition, participants were instructed not to imagine anything. In Study 2, a circle and a triangle were used instead of the ellipse and hexagon cues, respectively. During the imagery-based differential-conditioning procedure (b), each of three neutral faces (aversive conditioned stimulus, or CS+, neutral CS+, and CS–) was paired with the corresponding cue (aversive cue, neutral cue, no-image cue). All conditioned stimuli (CSs) were presented for 10 s (Study 1) or 8 s (Study 2). CS presentations coterminated with the imagery cue centrally superimposed on the CS for the last 3 s in 80% of the trials. In Study 2, acoustic startle probes were presented during 50% of CS presentations (potential window: 2–4 s after CS onset, i.e., prior to the onset of the imagery cue) and during six intertrial intervals (ITIs). The number and types of stimuli used during the habituation, acquisition, and extinction phases are shown in (c). The extinction phase was identical to the acquisition phase, except that the imagery cues were never shown. In Study 2, participants saw the aversive-imagery cue once between two extinction blocks (reinstatement cue).
CSs and imagery cues
Three different faces with a neutral expression served as the aversive CS+, neutral CS+, and CS–, respectively (faces obtained from the Ekman faces series; Ekman & Friesen, 1976). The particular CS type of each face was counterbalanced across participants. Three different geometric shapes (a red square, a blue ellipse, and a yellow hexagon) served as imagery cues (aversive cue, neutral cue, no-image cue); assignment of shape and cue type was counterbalanced.
Imagery scripts
Prior to the current study, an online survey with 29 individuals had been conducted to determine a scenario that was considered highly aversive and could be vividly imagined by most individuals. From 10 different scenarios, participants found the “thumbtack-in-the-heel” scenario (see below) to be both highly aversive (
The script for the thumbtack-in-the-heel scenario was as follows: Imagine the following situation: You walk barefoot through a room, and your right foot steps on a thumbtack. You can feel the thin needle sinking into your heel as you step on the pin with your entire weight. The pain is piercing and intense and spreads from your heel into your leg. Every [red square/blue ellipse/yellow hexagon] that appears on the screen evokes the feeling of the needle pushing into your heel and the piercing and intense pain going through your body. The stinging pain is extremely unpleasant and barely tolerable. Focus on the pain you are experiencing. You can feel how it spreads from your right heel and you are cramping. You do not want to experience the stinging pain again. With every [red square/blue ellipse/yellow hexagon], you feel the thumbtack pushing into your heel.
The script for the stepping-on-a-coin scenario was as follows: Imagine the following situation: You walk barefoot through a room, and your right foot steps on a 1-cent coin. You can feel the round metal under your heel when you step on it. The coin feels cool but it is not unpleasant. Every [red square/blue ellipse/yellow hexagon] that appears on the screen evokes the feeling of the round, cool coin under your heel. The contact is not unpleasant and is easily tolerable. Focus on the contact; you are relaxed. With every [red square/blue ellipse/yellow hexagon] that appears on the screen, you feel the round, cool coin under your heel.
The script for the control cue was as follows: Whenever this [red square/blue ellipse/yellow hexagon] appears on the screen, you do not have to imagine anything. Just sit in your chair, observe the [red square/blue ellipse/yellow hexagon], and think of nothing in particular.
Imagery training
The imagery training was completed prior to the imagery-based-conditioning procedure and started with an auditory recording of the imagery scripts (recordings in German are available at https://doi.org/10.5281/zenodo.2591593). After the auditory instructions were given, participants were reminded two times about each cue–scenario association by instructions on the screen. If necessary, this reminder was repeated until participants were able to report the correct associations.
Imagery-based-conditioning paradigm
The imagery-based-conditioning paradigm consisted of an initial habituation phase, a subsequent acquisition phase, and a final extinction phase. The habituation phase consisted of three presentations of each CS for 10 s (intertrial interval, or ITI, jittered from 8 to 10 s) in random order. During each of two sequential acquisition blocks, every CS was presented 10 times for 10 s each, again with a jittered ITI of 8 to 10 s. Of the 10 CS presentations, 8 coterminated with the imagery cue centrally superimposed on the CS for the last 3 s (80% reinforcement). The two extinction blocks were identical to the acquisition blocks, except that the imagery cues were never shown.
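The trial structure described above can be illustrated with a short Python sketch. It is not the presentation software used in the study. The function and field names (`make_block`, `make_session`, `cue_onset_s`) are hypothetical, and several details are assumptions because the text does not specify them: trials are fully shuffled within each block, the ITI is drawn from a uniform distribution, and no cues are shown during habituation.

```python
import random

CS_TYPES = ("aversive_cs_plus", "neutral_cs_plus", "cs_minus")


def make_block(n_per_cs=10, n_cued=8, cs_s=10.0, cue_s=3.0,
               iti_range=(8.0, 10.0), cues_shown=True, rng=None):
    """Build one block: each CS appears n_per_cs times, n_cued of them with a cue."""
    if rng is None:
        rng = random.Random()
    trials = []
    for cs in CS_TYPES:
        cued_flags = [True] * n_cued + [False] * (n_per_cs - n_cued)
        for cued in cued_flags:
            show_cue = cued and cues_shown
            trials.append({
                "cs": cs,
                "cs_duration_s": cs_s,
                # The cue is superimposed for the last cue_s seconds of the CS.
                "cue_onset_s": cs_s - cue_s if show_cue else None,
                # Assumption: uniform jitter within the reported ITI range.
                "iti_s": rng.uniform(*iti_range),
            })
    rng.shuffle(trials)  # assumption: fully random order within a block
    return trials


def make_session(seed=0):
    """Habituation, two acquisition blocks, two extinction blocks (Study 1 timing)."""
    rng = random.Random(seed)
    return {
        # Assumption: no imagery cues during habituation.
        "habituation": make_block(n_per_cs=3, n_cued=0, rng=rng),
        "acquisition": [make_block(rng=rng) for _ in range(2)],
        "extinction": [make_block(cues_shown=False, rng=rng) for _ in range(2)],
    }
```

Calling `make_session(seed=1)` yields 9 habituation trials and 30 trials per acquisition or extinction block. In acquisition, 8 of the 10 presentations of each CS carry a cue starting 7 s after CS onset.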
Ratings
Before and after each phase and block, participants rated the valence and arousal of each of the three CSs on a 5-point Likert-type scale. In addition, participants reported their subjective experience of fear, anger, and disgust when looking at the CSs. Furthermore, participants were shown the three cues and asked to indicate whether they associated an image with each cue and, if so, to rate the unpleasantness of that image (from 0,
Psychophysiological-data recording and reduction
The ECG and EDA were recorded with a BioSemi ActiveTwo system (BioSemi, Amsterdam, The Netherlands) with the Common Mode Sense and the Driven Right Leg electrodes attached to the right leg (sampling rate = 1024 Hz). For ECG measurement, Ag/AgCl electrodes (4-mm diameter) were applied in a lead-two configuration. In BrainVision Analyzer 2 (Brain Products, Gilching, Germany), the ECG was band-pass filtered (−3 dB at 1 Hz and 30 Hz, fourth-order two-way Butterworth filter, 24 dB/octave roll-off), and R spikes were detected automatically with the EKG Markers solution in the Analyzer software. R spikes were corrected manually if necessary, and nonusable data (e.g., premature systoles, excessive movement artifacts) were removed. Using custom-made MATLAB scripts (MATLAB Version 9.2; The MathWorks, Natick, MA), we then converted the ECG to a time course of interbeat intervals (IBIs), in which the value at each time point reflected the latency between the preceding and the next R spike (Mueller, Stemmler, Hennig, & Wacker, 2013). The IBI time series was then segmented into epochs ranging from −1,000 to 7,000 ms relative to CS onset (CS-evoked IBI) or from −1,000 to 10,000 ms relative to cue onset (cue-evoked IBI), baseline-corrected relative to −1,000 to 0 ms, downsampled to 2 Hz, and averaged across all trials by block and condition.
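As a rough illustration of the IBI conversion and epoching described above, the following Python sketch computes an evoked IBI time course from a list of R-spike times. It is not the authors' MATLAB code, and the function names (`ibi_at`, `evoked_ibi`) are hypothetical. Two simplifications are assumptions: the IBI series is sampled directly at 2 Hz instead of being downsampled from the recording rate, and the baseline is the mean of the samples before stimulus onset.

```python
from bisect import bisect_right


def ibi_at(r_peaks_ms, t_ms):
    """IBI (ms) at time t: latency between the preceding and the next R spike."""
    i = bisect_right(r_peaks_ms, t_ms)
    if i == 0 or i == len(r_peaks_ms):
        raise ValueError("time point is not enclosed by two R spikes")
    return r_peaks_ms[i] - r_peaks_ms[i - 1]


def evoked_ibi(r_peaks_ms, onset_ms, start_ms=-1000, stop_ms=7000, step_ms=500):
    """Baseline-corrected IBI epoch around one stimulus onset, sampled at 2 Hz."""
    times = list(range(start_ms, stop_ms + 1, step_ms))
    raw = [ibi_at(r_peaks_ms, onset_ms + t) for t in times]
    # Assumption: baseline is the mean of all samples before stimulus onset.
    pre = [v for t, v in zip(times, raw) if t < 0]
    baseline = sum(pre) / len(pre)
    return times, [v - baseline for v in raw]
```

Epochs obtained this way for single trials would then be averaged by block and condition. For cue-evoked IBIs, `stop_ms=10000` reproduces the longer window.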
Heart rate responses to a CS during fear acquisition typically show a triphasic response pattern (Lipp, 2007) consisting of an initial deceleration (D1), a transient acceleration (A1), and a second deceleration (D2). For analysis of CS-evoked IBIs, the maximum values were extracted for the time periods from 0 ms to 2,000 ms (D1) and 5,000 ms to 7,000 ms (D2), and the minimum values were extracted from 2,000 ms to 5,000 ms (A1). To remove the influence of the preceding components, we then computed peak-to-peak values: The value for A1 was referenced to D1 (corrected A1 = A1 – D1), and the value for D2 was referenced to A1 (corrected D2 = D2 – A1). In addition to analyzing the three components separately, we also analyzed the mean IBI for the entire epoch from 0 ms to 7,000 ms (results are provided in the Supplemental Material available online). For cue-evoked IBIs, in which only a biphasic response was observed, the maximum value from 0 ms to 4,000 ms (D1) and the minimum value from 4,000 ms to 10,000 ms (A1) were taken.
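The component scoring for CS-evoked IBIs in Study 1 can be sketched in Python as follows. The function name `cs_components` is hypothetical, and treating both window boundaries as inclusive is an assumption. The input is a baseline-corrected IBI epoch with its sample times in milliseconds.

```python
def cs_components(times_ms, ibi):
    """Score D1, A1, and D2 from a CS-evoked IBI epoch (Study 1 windows)."""
    def window(lo, hi):
        # Assumption: both window boundaries are inclusive.
        return [v for t, v in zip(times_ms, ibi) if lo <= t <= hi]

    d1 = max(window(0, 2000))      # initial deceleration: longest IBI
    a1 = min(window(2000, 5000))   # transient acceleration: shortest IBI
    d2 = max(window(5000, 7000))   # second deceleration: longest IBI
    # Peak-to-peak correction: each component is referenced to the preceding one.
    return {"D1": d1, "A1": a1 - d1, "D2": d2 - a1}
```

For an epoch peaking at 30 ms in the D1 window, dipping to −20 ms in the A1 window, and rising to 25 ms in the D2 window, the corrected scores are D1 = 30, A1 = −50, and D2 = 45.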
EDA was recorded at the thenar and hypothenar of the nondominant hand with two Ag/AgCl electrodes (5-mm diameter, exosomatic measurement, 1 µA at 16 Hz AC). Electrodes were filled with isotonic (0.5% NaCl) electrolyte medium. Raw EDA was low-pass filtered off-line (1 Hz, same filter specifics as for ECG) and downsampled to 128 Hz. Ledalab 3.4.9 (implemented in MATLAB 9.2) was used for artifact correction and trough-to-peak analyses (Benedek & Kaernbach, 2010a, 2010b). All data were visually screened, and technical artifacts were interpolated with spline or cubic interpolation.
Skin conductance responses (SCRs) were defined as the sum of SCR amplitudes of significant SCRs between 1,000 and 5,000 ms after CS or cue onset. SCRs smaller than 0.01 µS were considered zero responses. SCRs were log-transformed, ln(µS + 1), before averaging to approximate a normal distribution. Finally, as in the ECG analysis, SCR trough-to-peak scores were averaged within blocks and conditions. Additional, more fine-grained SCR analyses with range correction and exclusion of nonresponders are provided in the Supplemental Material.
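The SCR scoring rule can be expressed as a minimal Python sketch. It does not reproduce Ledalab's detection of responses. The function name `score_scr` and the input format (a list of response onsets and amplitudes) are hypothetical. Two points are assumptions because the text leaves them open: the 0.01-µS criterion is applied to each response separately, and the time window refers to response onset.

```python
import math


def score_scr(responses, event_ms, lo_ms=1000, hi_ms=5000, min_amp=0.01):
    """Sum qualifying SCR amplitudes after an event and return ln(sum + 1).

    responses: iterable of (onset_ms, amplitude_uS) pairs from a prior
    trough-to-peak detection step.
    """
    total = sum(
        amp for onset, amp in responses
        # Assumptions: window applies to response onset; threshold is per response.
        if lo_ms <= onset - event_ms <= hi_ms and amp >= min_amp
    )
    return math.log(total + 1.0)
```

A trial without any qualifying response scores ln(0 + 1) = 0, so zero responses enter the block averages as zeros.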
Statistical analyses
For analyzing responses to the imagery cues, repeated measures analyses of variance (ANOVAs) with the factors cue type (aversive vs. neutral vs. none) and block of acquisition (first vs. second) were conducted. For analyzing responses to CSs, the repeated measures ANOVAs included the factors CS type (aversive CS+ vs. neutral CS+ vs. CS–) and block of acquisition (first vs. second). Main effects were followed up by pairwise post hoc
Results
Responses to imagery cues
Subjective ratings
A Block (after first acquisition vs. after second acquisition) × Cue Type (aversive vs. neutral vs. none) ANOVA on the unpleasantness of mental images revealed a main effect of cue type (

Responses to imagery cues. The mean unpleasantness rating of mental images in Study 1 is shown in (a) for responses to each of the three cue types. Ratings were made on an 11-point Likert-type scale ranging from 0 (
Physiological responses
The Block × Cue Type ANOVA on the D1 component revealed no main effects or interactions (all
Together, analyses of ratings and physiological responses thus confirmed that the aversive cue evoked imagery that was perceived as highly unpleasant and was accompanied by increased heart rate and SCR.
Responses to CSs
Subjective ratings
ANOVAs on self-rated fear after habituation confirmed that participants rated all faces to be similarly fear evoking prior to conditioning (

Fear ratings in Study 1. The graph on the left shows participants’ mean ratings of experienced fear when viewing each conditioned stimulus (CS) type during baseline (BL), habituation (HAB), the first acquisition block (ACQ1), the second acquisition block (ACQ2), the first extinction block (EXT1), and the second extinction block (EXT2). Ratings were made on a 5-point Likert-type scale ranging from 1 (

Responses to conditioned stimuli (CSs). Mean arousal rating, negative-valence rating, and CS-evoked interbeat interval (IBI) in Study 1 are shown in the top row, and mean arousal rating, startle response, and CS-evoked IBI in Study 2 are shown in the bottom row. Startle response was normalized relative to the first startle response during acquisition. In all graphs, results are shown for each CS type during three phases: habituation (HAB), first acquisition block (ACQ1), and second acquisition block (ACQ2). CS-evoked IBIs are shown only for the deceleration time window (D1). Error bars show repeated measures standard errors of the mean (Masson & Loftus, 2003).
Similar to the ANOVAs during acquisition, results of the Block × CS Type ANOVA on ratings during extinction (i.e., after termination of cue presentation) revealed main effects of CS type for fear, anger, disgust, arousal, and valence ratings (all
To test whether extinction reduced negative feelings relative to acquisition, we additionally performed a Phase × CS Type ANOVA in which the factor phase consisted of the last block of acquisition versus the last block of extinction. This ANOVA yielded a significant Phase × CS Type interaction for fear (
Physiological responses
During habituation, the D1 component did not differ as a function of CS type (

Mean evoked interbeat interval (IBI) in Study 1 (a) and Study 2 (b) in response to each of the three conditioned stimulus (CS) types during the deceleration time windows (D1 and D2) and the acceleration time window (A1). Results are shown separately for the first acquisition block (ACQ1) and second acquisition block (ACQ2).
Consistent with successful extinction, the Block × CS Type ANOVA on D1 during extinction did not reveal any main effects or interactions (
Study 2: Conditioning With an Imagined Electric Shock
The first study showed that, when contingently paired with aversive mental images, CSs elicit fear responses at the subjective and cardiovascular levels. The aim of Study 2 was to determine whether imagery-based fear conditioning would also work with shorter CS durations and a US that is more typical for classical fear-conditioning studies (i.e., imagery of an electric shock). Moreover, to rule out demand effects, fear-potentiated startle, which is a physiological marker outside of conscious control (Hamm & Weike, 2005; Lipp, 2007), was assessed in Study 2. Finally, to further explore similarities between imagery-based fear conditioning and fear conditioning based on physical USs, we tested whether imagery of the US after extinction triggers a return of fear (i.e., reinstatement), as is commonly observed after physical US presentations (Hermans et al., 2005).
Method
Participants and procedure
In Study 2, 41 individuals (age:
Imagery scripts
In contrast to participants in Study 1, participants in Study 2 were instructed to imagine receiving a strong electric shock on the forearm (aversive imagery) or receiving a mild vibration on the forearm (neutral imagery). The script for feeling a painful electric shock was as follows: Imagine the following situation: You sit in a chair; your hands are on the arm rests. An electrode is attached to your left wrist. The electrode provides a short but powerful electric shock whenever a [red square/blue triangle/yellow circle] appears on the screen. The shock spreads throughout your whole body. Every time the [red square/blue triangle/yellow circle] appears on the screen, you receive a painful electric shock. The pain is extremely uncomfortable and barely tolerable. Focus on the pain you are experiencing. You can feel how it spreads throughout your entire body and how your muscles are cramping. You do not wish to experience this pain again. With every [red square/blue triangle/yellow circle], you experience the electric shock again.
The script for feeling a vibration was as follows: Imagine the following situation: You sit in a chair; your hands are on the arm rests. A wristband is attached to your left wrist. The wristband provides a short vibration whenever a [red square/blue triangle/yellow circle] appears on the screen. Every time the [red square/blue triangle/yellow circle] appears on the screen, you experience this vibration. You can feel the vibration spread throughout your whole body. The sensation is not at all uncomfortable and is easily tolerable. Focus on the vibration you are experiencing. You can feel how it spreads throughout your entire body, and your muscles are relaxed. With every [red square/blue triangle/yellow circle], you experience the vibration again.
The script for the control cue was as follows: Whenever this [red square/blue triangle/yellow circle] appears on the screen, you do not have to imagine anything. Just sit in your chair, observe the [red square/blue triangle/yellow circle], and think of nothing in particular.
Original imagery scripts in the German language are available at https://doi.org/10.5281/zenodo.2591593.
Imagery-based-conditioning paradigm
The imagery-based-conditioning paradigm was identical to that in Study 1 with the following exceptions. First, the CS was presented for 8 s instead of 10 s. Second, participants saw the aversive-imagery cue once after the first extinction block (reinstatement cue). Third, a circle and a triangle were used instead of the ellipse and hexagon cues, respectively. The ITI, reinforcement rate, cue presentation time, and number of CS presentations were identical to those in Study 1.
Dependent variables
Ratings
As the effects of conditioning on different affect-rating scales were largely redundant in Study 1, we collected only CS-associated arousal and valence in Study 2. CS ratings (i.e., arousal and valence) from only 40 participants were analyzed because 1 participant claimed after the experiment to have misunderstood the questions. The assessment and analysis were identical to those in Study 1.
Physiological responses
Procedures for SCR and ECG recording and analysis were largely identical to those in Study 1. However, because of the shorter CS presentation duration, only the ECG recording from −1,000 ms to 5,000 ms relative to CS and cue onset was analyzed. Accordingly, the CS-evoked IBI included a D1 component from 0 ms to 2,000 ms, an A1 component from 2,000 ms to 4,000 ms, and a D2 component from 4,000 ms to 5,000 ms. The cue-evoked IBI included a D1 component that was measured as the maximum IBI from 0 ms to 2,000 ms and an A1 component that was measured as the minimum IBI from 2,000 ms to 5,000 ms. To remove the influence of the preceding components in the CS-evoked IBI, we then referenced the value for A1 to D1 (corrected A1 = A1 – D1) and the value for D2 to A1 (corrected D2 = D2 – A1). Only trials not containing startle probes (see the next section) were used for SCR and IBI analyses. One participant had to be excluded from SCR analyses because of missing data, and 1 participant had to be excluded from IBI analyses for the extinction phase because of excessive artifacts in the ECG recording during that phase.
Fear-potentiated startle
After the resting phase, the startle probe (a 50-ms, 85-dB(A) white-noise burst with a 1-ms rise–fall time) was presented five times to allow for an initial startle habituation. In the acquisition and extinction phases, the startle probe was presented during five presentations of each CS in each block (potential window: 2–4 s after CS onset) and during six ITIs in each block (between 2 s into the ITI and 1 s before its end). Electromyography (EMG) was measured below the left eye on the musculus orbicularis oculi using two Ag/AgCl electrodes (4-mm diameter) and analyzed according to recommendations from Blumenthal et al. (2005). The EMG signal was first band-pass filtered from 28 Hz to 500 Hz, rectified, and low-pass filtered with a time constant of 10 ms; it was then segmented from −50 ms to 250 ms relative to startle-probe onset and baseline-corrected from −50 ms to 0 ms.
Because of the good signal-to-noise ratio of startle responses, data were not aggregated within blocks but instead analyzed at the single-trial level (Sevenster, Beckers, & Kindt, 2013; Soeter & Kindt, 2010, 2012) to allow visualization of the learning dynamics during imagery-based fear conditioning. To this end, the maximum value between 20 ms and 150 ms was assessed for each trial in which a startle response was observed that did not begin earlier than 20 ms after startle onset. Single-trial startle magnitudes were T standardized (
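The single-trial scoring step can be sketched in Python under several stated assumptions. The input is an already filtered, rectified, and smoothed EMG epoch that starts 50 ms before probe onset. The function names (`startle_magnitude`, `t_scores`) are hypothetical. The T transformation uses the conventional definition (mean of 50, standard deviation of 10) with the sample standard deviation, and the set of trials over which values are standardized is not specified here. The response-onset criterion mentioned above is not implemented.

```python
from statistics import mean, stdev


def startle_magnitude(emg, fs_hz, pre_ms=50, lo_ms=20, hi_ms=150):
    """Baseline-corrected peak of one EMG epoch between lo_ms and hi_ms.

    emg: rectified, smoothed samples; the epoch begins pre_ms before probe onset.
    """
    n_pre = round(pre_ms * fs_hz / 1000)
    baseline = mean(emg[:n_pre])              # mean of the -50 to 0 ms interval
    lo = n_pre + round(lo_ms * fs_hz / 1000)
    hi = n_pre + round(hi_ms * fs_hz / 1000)
    return max(emg[lo:hi + 1]) - baseline


def t_scores(values):
    """Conventional T scores: 50 + 10 * z (assumption: sample standard deviation)."""
    m, s = mean(values), stdev(values)
    return [50 + 10 * (v - m) / s for v in values]
```

With this convention, a magnitude one standard deviation above the mean of the standardized set receives a T score of 60.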
Results
Responses to imagery cues
Subjective ratings
The Block × Cue Type ANOVA on the unpleasantness ratings of the cue-related image revealed a main effect of cue (
Physiological responses
As in Study 1, the Block × Cue Type ANOVA on the A1 revealed a significant main effect of cue type (
Responses to CSs
Subjective ratings
At baseline and after habituation, participants rated all faces to be similarly arousing and pleasant (
During extinction, the Block (before reinstatement vs. after reinstatement) × CS Type ANOVAs on arousal and valence ratings showed a significant main effect of CS type only for the arousal ratings (
Peripheral measures
During habituation, there was no effect of CS type in any of the three cardiac components (
Fear-potentiated startle
The CS Type × Trial ANOVA revealed a main effect of CS type (
Trial-wise

Mean normalized single-trial eyelid startle responses to the 85-dB noise burst as a function of the conditioned stimulus (CS) type during the first acquisition block (ACQ1), second acquisition block (ACQ2), first extinction block (EXT1), and second extinction block (EXT2). The aversive cue was presented between EXT1 and EXT2 to test for reinstatement of fear. Error bars show repeated measures standard errors of the mean (Masson & Loftus, 2003). Asterisks above the lines indicate significant differences between the aversive CS+ and the neutral CS+, and asterisks below the lines indicate significant differences between the aversive CS+ and the CS– (
The CS Type × Trial ANOVA for the startle responses during extinction revealed only main effects of trial (
Discussion
The goal of the current research was to test whether fear can be conditioned de novo with aversive mental images as USs only. To this end, we conducted two studies in which different neutral face photographs were contingently paired with specific cues that had been previously trained to prompt aversive, neutral, or no imagery in 45 and 41 participants, respectively. Across studies, participants rated neutral faces as more fear evoking, unpleasant, and arousing, and they responded with relative cardiac deceleration and fear-potentiated startle if the faces had been paired with aversive imagery compared with neutral or no imagery. Because these findings indicate that associative fear learning may occur in the total absence of aversive physical stimulation, vicarious experiences, or explicit instructions, our results are relevant for understanding how phobias and anxiety disorders may develop in the absence of prior physically aversive experiences.
Most importantly, CS ratings revealed that faces, which were initially perceived as neutral, were later rated as more unpleasant, arousing, and fear evoking if they had been paired with cues for aversive as opposed to neutral or no imagery. It can be assumed that these cues prompted participants to produce the intended mental images, given that participants rated the mental images in response to the aversive cue as highly unpleasant and showed cardiac acceleration and increased SCRs to the aversive cue. Because (a) none of the faces had ever been paired with an aversive physical stimulus, (b) no instructions regarding the faces had ever been given, and (c) participants had not observed anyone else receiving an aversive stimulation in response to the faces, the higher arousal, negative valence, and fear ratings to the aversive CS+ than to the neutral CS+ and the CS– in Studies 1 and 2 can be ascribed only to the different mental images prompted by the associated cues.
At the cardiac level, the aversive CS+ evoked more cardiac deceleration compared with the neutral CS+ after the first acquisition block in Study 1 and (marginally significantly) in Study 2. Because fear-conditioned CSs+ generally evoke cardiac deceleration or “fear bradycardia” (Notterman et al., 1952; Panitz et al., 2015; Sperl et al., 2016), this finding further supports successful imagery-based fear learning from Block 1 to Block 2. In addition, the two nonthreatening stimuli differed from each other; there was a relative acceleration to the neutral CS+ as opposed to the CS–, consistent with imagery tasks evoking cardiac acceleration (Vrana & Lang, 1990).
With regard to the eyelid startle magnitude, which is believed to be a relatively pure correlate of stimulus valence in both classical fear conditioning and fear imagery (Hamm & Weike, 2005; Vrana & Lang, 1990), noise bursts given during the aversive CS+ evoked stronger startle responses than bursts given during the neutral CS+ or CS–. As with cardiac deceleration, this effect increased throughout the course of learning and was particularly pronounced in the second half of the acquisition phase. At the same time, we did not observe higher electrodermal responses to the CS+ than to the CS– as we have found with physical USs using the same type of CS (Mueller, Panitz, Hermann, & Pizzagalli, 2014; Panitz et al., 2018; Sperl et al., 2016). Furthermore, in Study 2, imagery of a US did not trigger a reinstatement as would be expected with a physical US presentation (Hermans et al., 2005). Together, these findings suggest that de novo fear acquisition based on imagery mirrors physical-US-based fear conditioning with regard to some factors (i.e., subjective report, fear-potentiated startle, fear bradycardia) but not all factors (i.e., EDA, reinstatement).
An open question is whether the fear conditioning, as observed in both studies, was actually caused by the mental images that were paired with the CSs. Alternatively, the aversive-imagery cues themselves may have acquired aversive properties during the initial imagery-training procedure and served as a second-order conditioning US (Rizley & Rescorla, 1972). Although this is somewhat speculative at this point, such a mechanism may have far-reaching clinical implications because it would suggest that cues remotely associated with aversive imagery (rather than aversive imagery per se) may cause new fear learning. To probe the involvement of second-order conditioning, researchers may in the future control for cue valence, for example, by collecting cue valence ratings or by applying more indirect approaches to assess stimulus valence. Alternatively, researchers may include a control group that receives the initial imagery training but is instructed to not engage in imagining when cues are presented during conditioning.
Furthermore, the observed fear responses to the aversive CS+ may not have been caused by the actual imagery of an aversive event but may instead relate to propositional knowledge. Although this is a general issue of human associative-learning studies (Mitchell, De Houwer, & Lovibond, 2009), the startle potentiation during the aversive CS+ compared with both the neutral CS+ and CS– in Study 2 shows that associative learning could also be observed with regard to threat responses that are largely outside of cognitive control (Hamm & Weike, 2005; Lipp, 2007). Moreover, the observed relative fear bradycardia to the aversive CS+ in Studies 1 and 2 supports the notion that the aversive CS+ indeed triggered fear responses across multiple response systems, suggesting that the acquired association of the aversive CS+ and aversive imagery goes beyond merely propositional knowledge.
The type of learning that is captured with this novel paradigm is potentially relevant for anxiety disorders and other aspects of human functioning, particularly in light of the relevance of imagery for mental disorders and their treatment (Pearson, Naselaris, Holmes, & Kosslyn, 2015). This type of imagery-based learning connects truly existing external stimuli to threatening images or, by extension, reality to fantasy. With such connections, the emergence of dog phobia does not require being bitten by a dog; it would suffice to merely imagine being bitten when encountering dogs. Similarly, imagery of social embarrassment, suffocation, back pain, or even terrorist attacks may be highly relevant for the emergence and treatment of social phobia, agoraphobia, pain disorder, and social prejudice, when contingently paired with seeing other individuals, subway trains, movements, or foreigners, respectively.
It should be noted that the content and time course of experimentally induced imagery cannot be perfectly controlled. After contingencies are learned, participants may initiate imagery before cues are presented. As a consequence, recordings of conditioned responses after CS presentations may have been confounded with unconditioned responses to the mental images. In contrast to this assumption, however, we observed a dissociation of unconditioned responses and conditioned responses at the cardiovascular level, which is typically found in classical fear-conditioning studies (Lipp, 2007): relative cardiac deceleration to the aversive CS+, but cardiac acceleration to the US or, in the current studies, the aversive cue. Moreover, participants may have visualized an image when US presentations were not intended (e.g., during the CS–, nonreinforced trials, or the extinction phase) or, alternatively, may have avoided unpleasant mental images by not vividly imagining the US or not imagining the US at all. Because such behavior may have led to enhanced or reduced imagery-based conditioning, respectively, the reported effect sizes may not accurately reflect the actual potential of aversive images to induce fear learning in real life.
Taken together, the present studies showed that subjective and physiological fear responses were evoked by neutral faces, which were never paired with any aversive physical stimuli, any observations, or any explicit instructions but only with cues for aversive imagery. When contingently paired with neutral stimuli, particular images may thus lead to de novo conditioning, which is of potential relevance for anxiety disorders, social prejudice, and other dimensions of human functioning.
Supplemental Material
MuellerSupplementalMaterial_rev – Supplemental material for Aversive Imagery Causes De Novo Fear Conditioning
Supplemental material, MuellerSupplementalMaterial_rev for Aversive Imagery Causes De Novo Fear Conditioning by Erik M. Mueller, Matthias F. J. Sperl and Christian Panitz in Psychological Science
Supplemental Material
Mueller_OpenPracticesDisclosure_rev – Supplemental material for Aversive Imagery Causes De Novo Fear Conditioning
Supplemental material, Mueller_OpenPracticesDisclosure_rev for Aversive Imagery Causes De Novo Fear Conditioning by Erik M. Mueller, Matthias F. J. Sperl and Christian Panitz in Psychological Science
