Abstract
Introduction
Sexual violence is broadly defined as any sexual acts obtained from a person without their consent (Gavey, 2014). When used as an umbrella term, sexual violence includes a range of experiences such as rape, sexual assault, sexual abuse, and sexual harassment (Gavey, 2014). This article uses sexual violence to refer to completed or attempted penetration (oral, vaginal, or anal) against individuals who are past the age of consent. In the literature, this form of sexual violence is often termed “rape,” “sexual assault,” or “sexual coercion” (Abbey et al., 2007; Koss et al., 1987; Struckman-Johnson et al., 2003). Though sexual violence can be experienced and perpetrated by anyone, most victims are women, and most perpetrators are men (Basile et al., 2022; Gavey, 2014; Martin et al., 2022).
Measuring men’s sexual violence perpetration has been challenging. Studies that concurrently assessed men’s self-reported perpetration and women’s self-reported victimization have found that perpetration rates were lower than victimization rates (Koss et al., 1987, 2022; Struckman-Johnson et al., 2003). While it has been postulated that this discrepancy between perpetration and victimization rates was due to men perpetrating sexual violence against multiple women, longitudinal evidence has suggested that perpetrators who committed sexual violence repeatedly were not the majority of perpetrators (Abbey et al., 2012; Abbey & McAuslan, 2004; Swartout et al., 2015). Thus, the discrepancy between perpetration rates and victimization rates seems to be a methodological artifact rather than a true discrepancy. Indeed, scholars have argued that a gold standard measure of sexual violence perpetration is still lacking (Anderson et al., 2017, 2021).
Multiple measures of sexual violence perpetration exist in the literature, and they tend to produce different perpetration rates (Anderson et al., 2021; Cook, 2002; Testa et al., 2015). The lack of a consistent and well-validated perpetration measure makes a comparison of prevalence rates across studies challenging. When prevalence rates cannot be compared meaningfully across samples, it is more difficult to evaluate the effectiveness of perpetration prevention programs. Thus, there is a need to gain better knowledge of measures of perpetration to advance the field.
Tactics of Sexual Violence Perpetration
There is a consensus that a good measure of sexual violence perpetration should use behaviorally specific language to describe both the sexual act and the behavior or tactic used to obtain it (Kolivas & Gross, 2007; Koss et al., 2007). For example, instead of asking “Have you ever raped someone?” measures should use language such as “Have you ever physically forced someone into having vaginal sexual intercourse with you?” The reason for using behaviorally specific language is that terms such as “rape” or “sexual assault” are differentially defined by respondents and can lead to underreporting (Koss et al., 1987, 2007; Koss & Oros, 1982). Historically, “rape” or “sexual assault” only referred to using physical force to obtain sexual intercourse from a woman (Bienen, 1980). Sexual assault using other tactics, such as verbal coercion and substances, was excluded from this definition. Studies have found that women and men were less likely to label victimization or perpetration as “sexual assault” or “rape” when it did not involve physical force (Abbey et al., 2004; Jeffrey & Barata, 2017, 2019; Littleton & Axsom, 2003), and measures that used behaviorally specific language detected higher prevalence rates than measures that did not (Anderson et al., 2023; Koss, 1993).
Generally, measures of sexual violence perpetration tended to include items that assessed the use of physical force and substances, as the use of these two types of tactics corresponded to most legal definitions of sexual offense (Gylys & McNamara, 1996; Koss et al., 2007; Peterson, 2022; Peterson et al., 2021) and was often termed rape or sexual assault by researchers (DeGue & DiLillo, 2005; Koss et al., 2007). However, social and legal developments in the past decade in the U.S., such as the affirmative consent standard (California Senate Bill SB-967, 2014) and the updating of the FBI’s definition of rape (U.S. Department of Justice, 2012), have removed the focus on the use of force and emphasized lack of consent, meaning that the use of verbal coercion and other non-forceful tactics legally constitutes sexual assault as well. Research has also documented the pervasiveness and harm of verbal coercion and other non-forceful tactics (Abbey et al., 2004; Brown et al., 2009; Edwards et al., 2014; Jeffrey & Barata, 2017). The use of verbal coercion or other non-forceful tactics was usually labeled sexual coercion by scholars (DeGue & DiLillo, 2005; Koss et al., 2007). Studies have found that compared to the use of physical force and substances, men more often used verbal coercion or other non-forceful tactics, such as those used with the intention to sexually arouse the unwilling partner, to perpetrate sexual violence against women (Edwards et al., 2014; Livingston et al., 2004; Struckman-Johnson et al., 2003). Thus, the inclusion of these tactics is warranted in measures of perpetration. However, verbal coercion has been inconsistently defined and operationalized in the literature (Pugh & Becker, 2018).
For instance, in the Sexual Strategies Scale (SSS; Strang et al., 2013), verbal coercion included questioning a partner’s commitment to the relationship and threatening self-harm; however, these tactics were not included in the Sexual Experiences Survey Short Form Perpetration (SES-SFP; Koss et al., 2007). Furthermore, scholars have pointed out that the selection and categorization of verbally coercive tactics in perpetration measures were not informed by theory (Abbey et al., 2021). Similarly, scholars have debated whether the use of attempted sexual arousal tactics should be considered coercive (Camilleri et al., 2009; Testa et al., 2015). A consensus on what tactics should be included in a measure of sexual violence perpetration is lacking.
Reliability and Validity of Perpetration Measures
While including a comprehensive range of tactics can strengthen the content validity of a measure and thus produce higher self-reported perpetration rates among men (Strang et al., 2013; Testa et al., 2015), a high-quality perpetration measure should have good reliability and other types of validity as well. There is a paucity of research validating measures of sexual violence perpetration, and scholars have called for more research examining the psychometric properties of perpetration measures and improving the measurement of perpetration (Abbey et al., 2021; Anderson et al., 2017). For example, the original and revised versions of the Sexual Experiences Survey (SES; Koss & Oros, 1982; Koss et al., 1987, 2007) were the most common measures of men’s sexual violence perpetration (Anderson et al., 2021; Kolivas & Gross, 2007), but the psychometric evidence for the earlier versions of the SES (Koss & Oros, 1982; Koss et al., 1987) has been criticized as being obtained with inappropriate methodology (Koss et al., 2007). The article (Koss et al., 2007) that published the most recent version of the SES, the SES-SFP, did not include any psychometric evidence for it, and the SES-SFP was not formally evaluated until 10 years later (Anderson et al., 2017).
Additionally, there was an assumption that the elements that constituted a reliable and valid measure of sexual violence victimization worked the same for a measure of perpetration (Anderson et al., 2021; Kolivas & Gross, 2007). Kolivas and Gross (2007) pointed out that the questions in the SES were written primarily from the perspective of a female victim, and the same questions were reworded to assess men’s perpetration. They argued that this strategy might require additional cognitive effort from respondents to a perpetration measure. This could lead to inaccurate responses because when responding to a question in the SES, both victims and perpetrators must recall the behavior of both parties, but victims only need to recall their own desire to engage in sex, while perpetrators need to recall and interpret the victims’ desire.
There is some preliminary evidence supporting Kolivas and Gross’s (2007) argument. In a study that compared men’s responses to two perpetration measures, a common reason for false negative responses on the SES was that men believed they had consent from the woman (Strang & Peterson, 2017). In another study (Rueff & Gross, 2017), researchers replaced “When she didn’t want to” in the SES (Koss et al., 1987) with the description of explicit behavior (e.g., resistance), and they found that this modified version of the SES yielded a higher self-reported prevalence of perpetration than the unmodified SES. Therefore, there is a clear need to explicitly investigate false positives and false negatives to obtain evidence of the validity of any perpetration measure.
The Current Review
To my knowledge, only one comprehensive review of sexual violence perpetration measures has been conducted. However, this review is more than 30 years old and only included studies in college populations (Porter & Critelli, 1992). Reviews on sexual violence perpetration prevalence or risk factors sometimes included findings regarding perpetration measures but did not provide a comprehensive overview of their assessment of tactics and psychometric properties (e.g., Anderson et al., 2021; Yapp & Quayle, 2018). Given the inconsistencies in the measurement of sexual violence perpetration and their implications for sexual violence prevention, the current review aims to answer the following questions: What measures of men’s sexual violence perpetration against women have been published? What types of perpetration tactics are included in these measures? What reliability and validity evidence do these measures have?
Method
Systematic Review Protocol
This review was conducted following the guidelines of the updated Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA; Page et al., 2021). A search was conducted in the following databases to identify articles published on or before April 18, 2023: Scopus, Web of Science, ScienceDirect, MEDLINE via Ovid, PsycINFO, PsycTESTS, Sociological Abstracts, Social Work Abstracts, and Social Services Abstracts. These databases were searched using the following search string: (“sexual violence” OR “sexual assault” OR “sexual coercion” OR “rape” OR “sexual aggression”) AND perpetrat* AND (“measure” OR “scale” OR “measurement” OR “test”).
Eligibility Criteria
This review included articles published in peer-reviewed journals. Studies were included if (a) they included self-reported sexual violence perpetration data by adult men against adolescent or adult women and (b) the perpetration data was obtained with a standardized quantitative measure. Studies were excluded if they (a) only included perpetration data by men against men or minors (e.g., same-sex perpetration and child sexual abuse), perpetration data by adolescent boys (e.g., teen dating violence and delinquency), or perpetration data restricted to men’s current or former intimate partners (e.g., sexual violence as a form of intimate partner violence); (b) did not include self-reported perpetration data (i.e., data were obtained via criminal records and interviews); (c) did not use a standardized quantitative measure (i.e., used a measure designed for a single study’s purpose or a modified measure); (d) were not published in English; or (e) did not have full text available.
Screening Procedure
Figure 1 provides an outline of the screening process. A total of 2,679 records were identified from the databases. These records were imported to Zotero and screened. After removing duplicate records, the author screened the remaining 1,626 records by title, abstract, and keywords to determine eligibility. This resulted in 167 articles for full-text review. The reference lists of these articles were searched, which identified eight additional relevant articles for full-text review. Ninety articles were ineligible and were excluded. The final sample included 85 articles.

Figure 1. PRISMA flow diagram.
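The counts reported in the screening procedure can be checked with simple arithmetic. The Python sketch below is only a bookkeeping aid: the variable names are illustrative, and the two derived exclusion counts (duplicates removed and records excluded at title, abstract, and keyword screening) are computed here from the reported totals rather than stated in the text.

```python
# Counts as reported in the screening procedure.
records_identified = 2679          # records retrieved from all databases
records_after_deduplication = 1626 # records screened by title, abstract, keywords
full_text_from_screening = 167     # articles passed on to full-text review
full_text_from_references = 8      # extra articles found via reference lists
full_text_excluded = 90            # articles judged ineligible at full-text review

# Derived values (assumption: each stage simply subtracts from the previous one).
duplicates_removed = records_identified - records_after_deduplication
excluded_at_screening = records_after_deduplication - full_text_from_screening
full_text_reviewed = full_text_from_screening + full_text_from_references
final_sample = full_text_reviewed - full_text_excluded

print("Duplicates removed:", duplicates_removed)
print("Excluded at title/abstract/keyword screening:", excluded_at_screening)
print("Full-text articles reviewed:", full_text_reviewed)
print("Final sample:", final_sample)
```

The derived final sample agrees with the 85 articles reported above.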
Data Extraction
A data extraction form was created by the author. The author first coded the metadata of the articles (e.g., year of publication, journal, and country). All articles were coded for the name and source of the measure of sexual violence perpetration they used. Articles that published the identified perpetration measures were used to extract information regarding the number of items, response scale, scoring method, and tactics included in the measure. Based on the literature review (e.g., DeGue & DiLillo, 2005; Koss et al., 2007; Struckman-Johnson et al., 2003), it was expected that most perpetration tactics could be categorized into four common groups: physical force, substances, verbal coercion, and attempted sexual arousal. To facilitate comparison across measures, the original categorization of tactics used by the authors of these measures was not used. Rather, tactics were assigned to the four common types found in the literature. Tactics that could not be assigned to any of the existing categories were assigned to a “miscellaneous” category. Psychometric evidence was extracted only from articles that specifically focused on evaluating the psychometric properties of one or more perpetration measures and from the articles in which the measures were published. The specific types of psychometric evidence extracted included reliability, comparison between different perpetration measures, relationships between responses on perpetration measures and social desirability, false positives and false negatives, and analysis of the internal structure of the measures. Given that the primary concern in the literature was whether perpetration measures detect perpetration accurately (Kolivas & Gross, 2007; Koss et al., 2007; Willis & Germann, 2016), psychometric evidence regarding relationships between responses on perpetration measures and risk factors of sexual violence perpetration (e.g., rape myth acceptance) was not extracted.
In addition, some studies modified the perpetration measures they validated, such as by changing the description of non-consent or the response scale. Psychometric evidence for these modified measures was not extracted because research has found that such modifications could alter participants’ responses significantly (Anderson et al., 2021; Willis & Germann, 2016). Finally, some psychometric evidence was only available in mixed-gender samples and was thus excluded.
Results
In the final sample, 13 unique measures were identified (see Table 1 for details). The most frequently used measures among these studies were the Sexual Experiences Survey Short Form Perpetration (SES-SFP; Koss et al., 2007), the Sexual Experiences Survey 1987 version (Koss et al., 1987), the Sexual Aggression and Victimization Scale (SAV-S; Krahé & Berger, 2013), the Sexual Experiences Survey 1982 version (Koss & Oros, 1982), and the Post-Refusal Sexual Persistence Scale (PRPS; Struckman-Johnson et al., 2003). Two measures were subscales of measures for broader constructs that subsumed sexual violence perpetration. They were the sexual aggression subscales from the Sexual Experiences Questionnaire (SEQ; Lisak & Roth, 1988) and the Lifetime Perpetrator Sexual Violence subscale from the Cumulative Lifetime Violence Severity Scale (CLVS; Scott-Storey et al., 2020). The number of items included in each measure ranged from 5 to 55, with a mean of 18 items, and a median of 13 items. The most common response scale used in these measures was a binary response scale (i.e., yes or no) (
Table 1. Description of Measures.
The number of times used in the final sample of this review (
Type of Tactics
Table 2 provides an overview of the types of tactics included in the 13 measures. Measures varied greatly in terms of the types of tactics included in their items. Most measures (
Table 2. Types of Tactics in the Measures.
Other than the four aforementioned types of tactics, four measures included tactics that could not be easily classified into these four types. Examples of these tactics were taking away someone’s car keys or ignoring someone’s protests. None of the authors of the four measures that included these miscellaneous tactics labeled them as a distinct group of tactics. Instead, these authors grouped them with other types of tactics such as verbal coercion or physical force to form a broader group of tactics that constitute sexual assault or sexual coercion (e.g., SISS; Peterson, 2022).
Among the 13 measures, the SISS was the only measure that included all subtypes of tactics listed in Table 2. The PRPS and the SSS included the most tactic subtypes, except those labeled miscellaneous. The SISS, PRPS, and SSS were also the only three measures that included all verbal coercion subtypes. All other measures lacked one or more tactic subtypes within physical force, substances, or verbal coercion.
Psychometric Evidence
Among the 13 measures, 12 had published evaluations. Table 3 provides a summary of the findings of the evaluation studies. The SES-SFP was the most frequently evaluated measure. Four studies evaluated the psychometric properties of the SES-SFP. The only other measure that was evaluated in more than one study was the SSS. All other measures were included in only one evaluation study. The SES-SFP and the SSS were also the only two measures that had been evaluated with both student and community samples. All evaluation studies used North American samples.
Table 3. Findings of Reliability and Validity.
The name of the measure that appeared in the article was Sexual Experiences Survey Long Form Perpetration (SES-LFP; Koss et al., 2007). However, the items validated were equivalent to those in the SES-SFP.
The frame of reference was changed from the age of 16 to the age of 14 to be comparable to the frame of reference in the SES-SFP.
The gender-neutral language in the SES-SFP was changed to gender-specific language. One study has found that this difference in language did not affect men’s responses (Anthony & Cook, 2012).
Reliability
Findings regarding reliability were reported for seven measures. The most commonly reported type of reliability was internal consistency, which was available for six measures. Only two measures have been evaluated for test-retest reliability. Evaluation studies often reported different types of coefficients, which makes direct comparisons across studies challenging. While the results suggest that these seven measures have moderate to strong internal consistency or test-retest reliability, none of the findings were directly replicated by another study.
Validity
Findings of validity were reported for 11 measures. Most commonly, studies examined the association between self-reported perpetration and social desirability. Such an association was evaluated for four measures, and the results were inconclusive due to a lack of replication or conflicting findings. The internal structures of four measures have been investigated. All had different internal structures, and only one measure had tactics that loaded onto three factors corresponding to physical force, substances, and verbal coercion (SCS; Tyler et al., 1998). Relationships between responses to different perpetration measures were examined in four studies. The results showed that responses on the SES-SFP, PRPS, and SSS had moderate correlations, but the SES-SFP detected significantly different perpetration rates than the two modified versions of SES. Findings regarding false positives and false negatives were available for four measures, but the results were largely incomparable due to the different methods used to assess false positives and false negatives. Similar to the findings of reliability, none of the findings regarding the validity of these measures were directly corroborated by another study.
Discussion
Sexual violence against women is a persistent problem. A well-validated measure of men’s sexual violence perpetration against women would help advance research and practice in combating sexual violence. Yet, a consensus on what tactics should be included in a perpetration measure is lacking, and scholars have voiced concerns over the validity of existing perpetration measures. Following the systematic review protocol (Page et al., 2021), 13 measures that had been used to assess men’s sexual violence against women were identified from the published literature. The findings of this review provide an up-to-date overview of the type of tactics and psychometric evidence of these measures.
The measures included in this review varied greatly in terms of their content and format, specifically the type of tactics they included. Most commonly, the tactics found in these measures could be categorized into three types of tactics found in the literature: physical force, substances, and verbal coercion. Almost every measure included items representing these general tactics. However, authors differed in their inclusion of subtypes within these tactics, especially for verbal coercion. For example, all measures included verbal coercion items, but only three of them included all verbally coercive tactic subtypes identified across measures. Scholars have voiced concerns over the lack of a widely accepted taxonomy of verbally coercive tactics (Abbey et al., 2021). Without a standard taxonomy, measures include verbally coercive tactics inconsistently, and this can lead to underreporting of verbal coercion.
Compared to physical force, substances, and verbal coercion, attempted sexual arousal was included in fewer measures. Only four measures included attempted sexual arousal. The omission of this type of tactic in most measures is concerning because qualitative studies have found that women often reported men using these tactics to coerce them into having sex (Jeffrey & Barata, 2017; Livingston et al., 2004), and quantitative studies have found that more perpetrators used these tactics than other types of tactics (Struckman-Johnson et al., 2003; Testa et al., 2015). The omission of attempted sexual arousal tactics in most perpetration measures may be due to the judgment that these tactics (e.g., starting to undress oneself or the other person with the intention to arouse) are also used commonly in consensual sex. However, a key element that separates consensual sex and sexual violence is whether the sexual act is obtained without consent; thus, any tactic used to acquire sexual acts without consent should be included in a perpetration measure, regardless of whether such behaviors can also be involved in consensual sex. Measures of perpetration should include these tactics to increase validity. However, since attempted sexual arousal can be used in consensual sex, researchers should emphasize the use of these tactics without consent in the wording of the measure to reduce false positives.
Some tactic items were categorized as “miscellaneous” in this review as they did not seem to fall into any of the aforementioned types of tactics. Miscellaneous tactics were present in four measures. These tactics did not form a distinct category in any measure, likely due to their heterogeneity. For example, tactics in this group include having sex with someone who is asleep (SISS; Peterson, 2022), ignoring someone’s protests (CSS; Rapaport & Burkhart, 1984), taking away someone’s car keys (SISS; Peterson, 2022), and giving someone a gift (PSCS; Mathes & McCoy, 2011). Some of these tactics, such as ignoring someone’s protest and having sex with someone without asking for consent first, have been reported by women who experienced coerced sex (Jeffrey & Barata, 2017, 2019) or were deemed as constituting sexual offense by attorneys (Peterson et al., 2021). Hence, the inclusion of these tactics may strengthen the validity of a perpetration measure.
Including a comprehensive range of tactics is only one prerequisite of a reliable and valid measure of perpetration. Researchers have voiced concerns over the lack of available psychometric evidence for perpetration measures (Anderson et al., 2017; Kolivas & Gross, 2007). This review found that among the 13 measures identified from the literature, 12 had psychometric data available, and only two had their psychometric properties evaluated in more than one study. Reliability results were available for seven measures, and validity results were available for 11 measures. The available results for the 12 measures show moderate-to-strong evidence of reliability and validity. Among the 12 measures, the SES-SFP was the most evaluated measure. When responses on the SES-SFP were compared with responses on other perpetration measures, moderate to strong correlations were reported, but significantly different perpetration rates were also found. Yet, among the measures that have been compared to the SES-SFP, only the SSS had data for false positives and false negatives, and there is no clear evidence of which of them has a better balance for detecting true positives and true negatives. Without further research examining false positives and false negatives, the lower perpetration rates detected by the SES-SFP are not evidence of poor validity. When comparing perpetration rates detected by different measures, future research should gather data on false positives and false negatives to contextualize the differences found in perpetration rates.
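The point that a lower detected perpetration rate is not, by itself, evidence of poorer validity can be illustrated with a small Python sketch. All counts below are hypothetical and chosen only for illustration; they do not come from any study in this review, and the function name `detection_summary` is likewise an assumption. The sketch assumes a criterion exists against which each response can be classified as a true or false positive or negative.

```python
def detection_summary(true_pos, false_neg, false_pos, true_neg):
    """Summarize how a measure classifies respondents against a criterion."""
    total = true_pos + false_neg + false_pos + true_neg
    return {
        # Share of all respondents the measure flags as perpetrators.
        "reported_rate": (true_pos + false_pos) / total,
        # Share of actual perpetrators the measure detects.
        "sensitivity": true_pos / (true_pos + false_neg),
        # Share of non-perpetrators the measure correctly leaves unflagged.
        "specificity": true_neg / (true_neg + false_pos),
    }

# Hypothetical sample: 200 respondents, 40 of whom actually perpetrated.
measure_a = detection_summary(true_pos=20, false_neg=20, false_pos=0, true_neg=160)
measure_b = detection_summary(true_pos=36, false_neg=4, false_pos=24, true_neg=136)

for name, summary in (("Measure A", measure_a), ("Measure B", measure_b)):
    print(name, {key: round(value, 2) for key, value in summary.items()})
```

In this hypothetical case, Measure B reports a perpetration rate three times that of Measure A, but part of the difference comes from false positives. The two rates alone cannot show which measure is more accurate; only the false positive and false negative counts can.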
This review identified several areas of improvement in evaluating the psychometric properties of perpetration measures. First, more research is needed to gather evidence for test-retest reliability. Only two measures, the SES-SFP and the SISS, had evidence of test-retest reliability. In contrast, evidence of internal consistency was reported for six measures. The use of test-retest reliability is recommended over internal consistency when evaluating perpetration measures (Anderson et al., 2017; Koss et al., 2007). This is because internal consistency is used to evaluate measures that are conceptualized to assess a latent construct (e.g., a trait), but perpetration measures are often conceptualized as behavioral sampling measures (Peterson, 2022). Second, evaluation studies should include more types of perpetration tactics. Most evaluation studies identified in this review reported psychometric data for responses to physical force, substances, and verbal coercion items. Attempted sexual arousal and other miscellaneous tactics are under-represented and under-evaluated. Third, more replication is needed to establish strong validity for the existing measures. The reviewed studies often used different indicators of psychometric properties, such as Cohen’s kappa (Anderson et al., 2017) and the phi correlation coefficient (Peterson, 2022) for test-retest reliability, making the results not directly comparable. Thus, the evidence regarding the psychometric properties of these measures remains inconclusive, and more research is needed to determine which perpetration measure has the strongest psychometric properties. In addition, all studies were conducted with North American samples, which provides no clear support for their use outside of North American populations.
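To illustrate why test-retest results reported with different coefficients are not directly comparable, the Python sketch below computes both Cohen’s kappa and the phi coefficient from the same 2 × 2 table of binary (yes/no) responses at two time points. The cell counts are hypothetical and are not taken from any reviewed study; the function names are illustrative.

```python
import math

def cohens_kappa(yes_yes, yes_no, no_yes, no_no):
    """Chance-corrected agreement between time 1 and time 2 responses."""
    n = yes_yes + yes_no + no_yes + no_no
    observed = (yes_yes + no_no) / n
    # Agreement expected by chance, based on the marginal totals.
    expected = ((yes_yes + yes_no) * (yes_yes + no_yes)
                + (no_yes + no_no) * (yes_no + no_no)) / n ** 2
    return (observed - expected) / (1 - expected)

def phi_coefficient(yes_yes, yes_no, no_yes, no_no):
    """Pearson correlation between two binary variables."""
    numerator = yes_yes * no_no - yes_no * no_yes
    denominator = math.sqrt((yes_yes + yes_no) * (no_yes + no_no)
                            * (yes_yes + no_yes) * (yes_no + no_no))
    return numerator / denominator

# Hypothetical test-retest table for 100 respondents:
# rows are time 1 (yes, no), columns are time 2 (yes, no).
cells = dict(yes_yes=20, yes_no=5, no_yes=10, no_no=65)

kappa = cohens_kappa(**cells)
phi = phi_coefficient(**cells)
print(f"kappa = {kappa:.3f}, phi = {phi:.3f}")
```

With these hypothetical counts the two coefficients are close but not identical, because the proportion of “yes” responses differs between the two time points. A kappa from one study and a phi from another therefore cannot be treated as values on the same scale.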
The current review is not without limitations. First, this review excluded dissertations, theses, and gray literature. There may be other measures of perpetration or psychometric evidence in these types of literature. Second, this review did not include measures designed for single studies or modified measures. Modified measures may, in some cases, have stronger psychometric evidence, and both types of measures may include types of tactics that are overlooked in the current review. Third, articles not published in English were excluded. Measures developed or validated outside of North America may be published in non-English languages. Finally, psychometric evidence obtained with mixed-gender samples was not included in this review. It is possible that this evidence could be generalized to all-male samples.
Despite these limitations, this review has several critical findings regarding the measures of men’s sexual violence perpetration against women (see Table 4). There is room for improvement in the measurement of sexual violence perpetration. The current evidence supports the use of the SES-SFP as a self-reported measure of perpetration, but its validity can be further strengthened by including more perpetration tactics, such as the ones identified in this review. Additionally, more research is needed to evaluate perpetration measures in general, as the current evidence is insufficient for establishing a gold standard measure. The SES-SFP was the most evaluated measure, yet it often produced different perpetration rates than other measures, and only one study has investigated false positives and false negatives to contextualize the varying perpetration rates. The SSS was the second most evaluated measure, but it still lacks validity data regarding its attempted sexual arousal items, as evaluation studies only examined physical force, substance, and verbal coercion items. Based on the evidence found in this review, conclusions cannot be made regarding whether the SES-SFP or the SSS detects perpetration more accurately. Future research should gather more data to facilitate comparisons of validity for these two measures. It would also be particularly useful to compare the SES-SFP and the SSS with the SISS since the latter measure is the most recently developed measure and includes the broadest range of perpetration tactics. It is also important to obtain validity data for these measures outside North America. A summary of implications for practice, policy, and research is provided in Table 5.
Table 4. Critical Findings.
Table 5. Implications for Practice, Policy, and Research.
Ngai Lam Mou, M.A., is a PhD candidate in Applied Social Psychology at the University of Windsor. Her research focuses on sexual violence against women. She is particularly interested in studying men’s perpetration behavior, such as risk factors and patterns of perpetration.
Advancement in the science of measurement of perpetration is crucial to preventing sexual violence against women. It will only be possible to stop sexual violence when we can accurately screen perpetrators and evaluate the effectiveness of prevention programs with reliable and valid measures.
Footnotes
Acknowledgements
I would like to thank Dr. Charlene Senn for her help and support for this manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
