Introduction
Sociological research often collects data on private, illegal, and unsocial behavior or extreme attitudes via survey interviews. For example, the German General Social Survey (ALLBUS) asks respondents to self-report on several offenses such as dodging the fare, drunk driving, tax evasion, and shoplifting. In the United States, the National Survey on Drug Use and Health (NSDUH) and the General Social Survey (GSS) regularly ask respondents to self-report on sensitive topics such as drug use or sexual habits. The GSS also asks about very sensitive topics such as prostitution (“Thinking about the time since your 18th birthday, have you ever had sex with a person you paid or who paid you for sex?”). Some survey studies also investigate the incidence of socially undesirable opinions such as xenophobia, racism, and anti-Semitism (Krumpal, 2012; Ostapczuk et al., 2009; Stocké, 2007b).
Cumulative evidence in survey methodologists’ research literature indicates that self-reports on sensitive topics often do not reflect the truth (Jann et al., 2019; Krumpal, 2013; Tourangeau & Yan, 2007). Sensitive questions pose a trust problem for the respondent. Besides the trust problem, there could be other factors explaining why self-reports on sensitive topics do not reflect the truth, for example, self-deception, rationalization, or the fact that recalling information and reporting about unpleasant events can have a subjective cost in itself for the respondent (see Näher & Krumpal, 2012; Tourangeau & Yan, 2007). In this article, however, we focus on the trust problem.
Due to fear of negative consequences, respondents are unwilling to reveal deviant and norm-violating behaviors. They misreport in a survey (systematically underreport socially undesirable behaviors and overreport socially desirable ones) to avoid subjective costs such as embarrassment in the interview situation or sanctions from third parties beyond the interview setting (Rasinski et al., 1999). Such misreporting leads to invalid survey estimates, which are distorted by social desirability bias. To combat misreporting and to obtain more valid answers to sensitive questions, survey researchers have developed different data collection approaches designed to reduce social influence in the data collection process, to guarantee anonymity of the respondent’s answers and to reduce the respondent’s self-presentation concerns (Lee, 1993).
The Randomized Response Technique (RRT)
The RRT is a method to elicit more honest answers in sensitive surveys (Warner, 1965). Warner’s original method relies on the pairing of two statements, both relating to the sensitive attribute (statement and negation of the statement). The respondent uses a randomization device (e.g., cards, coins, dice) to select which of the two statements he or she will answer. For example:
I sometimes smoke marijuana (selected with probability p).
I never smoke marijuana (selected with probability 1 − p).
Without telling the interviewer which statement was chosen, respondents answer “Yes” or “No” according to their marijuana smoking habits. Because only the respondent knows the outcome of the randomization device, a specific answer is always ambiguous to the interviewer. The interviewer cannot infer the respondent’s true status from a given answer and, under idealized assumptions, the respondent trusts in his or her data protection. Probability theory is used to derive an unbiased estimator π̂ of the prevalence π of the sensitive behavior in the population of interest. The expected value ϕ of observing a “Yes” answer can be written as ϕ = pπ + (1 − p)(1 − π), which, for p ≠ .5, yields the estimator π̂Warner = (ϕ̂ − (1 − p))/(2p − 1), where ϕ̂ denotes the observed proportion of “Yes” answers.
Furthermore, the sampling variance of π̂Warner can be estimated by Var̂(π̂Warner) = ϕ̂(1 − ϕ̂)/[(n − 1)(2p − 1)²], where n denotes the sample size.
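These moment estimators can be checked numerically. The following sketch is our illustration (not from the original article; the parameter values are hypothetical): it simulates fully compliant respondents under Warner’s design and applies the estimator derived above.

```python
import random

def warner_estimate(answers, p):
    """Moment estimator for Warner's RRT.

    answers : list of 1 ("Yes") / 0 ("No") responses
    p       : probability that the sensitive statement was selected (p != .5)
    """
    n = len(answers)
    phi_hat = sum(answers) / n                    # observed "Yes" rate
    pi_hat = (phi_hat - (1 - p)) / (2 * p - 1)    # invert phi = p*pi + (1-p)*(1-pi)
    var_hat = phi_hat * (1 - phi_hat) / ((n - 1) * (2 * p - 1) ** 2)
    return pi_hat, var_hat

# Simulate fully compliant respondents with (hypothetical) true prevalence .30.
random.seed(1)
TRUE_PI, P, N = 0.30, 0.7, 20000
answers = []
for _ in range(N):
    carrier = random.random() < TRUE_PI           # has the sensitive trait?
    sensitive_selected = random.random() < P      # outcome of the device
    # "Yes" iff the selected statement is true for this respondent.
    answers.append(int(carrier if sensitive_selected else not carrier))

pi_hat, var_hat = warner_estimate(answers, P)
print(round(pi_hat, 3))  # close to the true prevalence of 0.30
```

With full compliance, the estimate recovers the true prevalence up to sampling error, at the price of a larger variance than direct questioning would have.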
Different modifications of Warner’s original method have been developed and empirically applied (overviews of designs and estimators for different RRT schemes can be found in Blair et al., 2015; Chaudhuri et al., 2016; Fox & Tracy, 1986; Krumpal et al., 2015; Lensvelt-Mulders, Hox, & van der Heijden, 2005). For example, the “forced-choice design,” which is one of the most widely applied RRT schemes, works as follows (Boruch, 1971): A randomization device determines whether the respondent is supposed to answer the sensitive question truthfully (with probability p) or to give an automatic answer independent of his or her true status. For example, if the respondent tosses a coin three times, the design yields the
probability λ of being directed to give an automatic “Yes” answer (three tails) = .5³ = .125,
probability of being directed to give an automatic “No” answer (three heads) = .5³ = .125, and
probability p of being directed to answer the sensitive question truthfully (all other outcomes) = 1 − .125 − .125 = .75.
The expected value ϕ of observing a “Yes” answer can be written as ϕ = pπ + λ, where p denotes the probability of a truthful answer and λ the probability of a forced “Yes,” so that π̂FC = (ϕ̂ − λ)/p.
Furthermore, the sampling variance of π̂FC can be estimated by Var̂(π̂FC) = ϕ̂(1 − ϕ̂)/[(n − 1)p²].
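The three-coin forced-choice example admits the same kind of numerical check. Again, this is our illustrative sketch with hypothetical parameter values, assuming for the moment that all respondents comply with the instructions.

```python
import random

def fc_estimate(answers, p_truth, lam):
    """Moment estimator for the forced-choice RRT.

    p_truth : probability of being directed to answer truthfully
    lam     : probability of a forced "Yes"
    """
    n = len(answers)
    phi_hat = sum(answers) / n
    pi_hat = (phi_hat - lam) / p_truth            # invert phi = p*pi + lambda
    var_hat = phi_hat * (1 - phi_hat) / ((n - 1) * p_truth ** 2)
    return pi_hat, var_hat

# Three coin tosses: forced "Yes" on three tails, forced "No" on three heads.
LAM, LAM_NO = 0.125, 0.125
P_TRUTH = 1 - LAM - LAM_NO                        # = .75
random.seed(2)
TRUE_PI, N = 0.20, 20000                          # hypothetical prevalence
answers = []
for _ in range(N):
    u = random.random()
    if u < LAM:
        answers.append(1)                         # forced "Yes"
    elif u < LAM + LAM_NO:
        answers.append(0)                         # forced "No"
    else:
        answers.append(int(random.random() < TRUE_PI))  # truthful answer

pi_hat, var_hat = fc_estimate(answers, P_TRUTH, LAM)
print(round(pi_hat, 3))  # close to the true prevalence of 0.20
```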
In general, all variants of the RRT share the common feature that, by deliberately introducing a random element in the question-and-answer process, respondents’ answers do not reveal anything definite to the interviewer (see Nayak, 1994, for a generalized approach for integrating and comparing different RRT designs). The advantage of “protection via randomization” comes with several drawbacks: Compared with direct questioning, the RRT imposes a higher cognitive burden on the respondent. Landsheer et al. (1999) show empirically that respondents with a low degree of understanding of the RRT procedure also have less trust in the method compared with respondents who have a higher degree of understanding of the instructions. However, Landsheer et al.’s results seem incompatible with results of a more recent study by Hoffmann et al. (2017), who did not find a correlation between comprehension of the RRT and perceived privacy protection.
Empirical evidence indicates that a substantial proportion of respondents do not comply with the RRT instructions (Ostapczuk et al., 2009). They give self-protective “No” answers even if the outcome of the randomization device instructs them to answer “Yes.” Statistical models have been developed to account for such self-protective response behavior (Cruyff et al., 2007). Note that some designs, including Warner’s original RRT as well as the crosswise model (Yu et al., 2008), do not feature a safe, self-protective response option. Thus, in these specific (non-)RRT designs, noncompliance is not clearly associated with a specific response option, which makes a cheating correction more difficult compared with RRT schemes with an unambiguous self-protective response option (such as the typical forced choice design or the unrelated question model; see Krumpal et al., 2015, for an overview of different RRT schemes).
A meta-analysis conducted by Lensvelt-Mulders, Hox, van der Heijden, and Maas (2005) suggests that self-reports of self-stigmatizing behavior are overall more accurate with RRT than with direct questioning. However, several other studies indicate that there are serious difficulties of using the RRT (such as higher item nonresponse, negative prevalence estimates, or increased break-off rates) and that the superiority of the RRT should not be taken for granted in any case (Coutts et al., 2011; Coutts & Jann, 2011; Höglinger et al., 2016; Höglinger & Jann, 2018; Holbrook & Krosnick, 2010; Kirchner, 2015; Stem & Steinhorst, 1984; Weissman et al., 1986; Wolter & Preisendörfer, 2013).
John et al. (2018) give a useful overview of previous validation studies demonstrating at best mixed evidence on the performance of RRT versus direct questioning. Based on ideas from cognitive psychology and on experimental evidence, the authors conjecture that RRT may fail because of respondents’ concern over response misinterpretation. In particular, innocent respondents may be concerned that complying to the RRT instructions (e.g., to answer “Yes”) will be misinterpreted as indicating that one belongs to the group of people with sensitive trait A. We argue that even perfectly rational and self-regarding respondents will be (
From a sociological perspective, one fundamental question of the research on sensitive topics is still unresolved: Why do survey respondents answer truthfully to sensitive questions? Esser (1986, 1990) argues that respondent reactions to the measurement process (e.g., truthful vs. socially desirable answering) could be explained by general behavioral regularities, by habits, and by norms that are activated in social interactions in secondary relations (e.g., presentation and deference).
Respondents’ Behavior as a Rational Choice
Former research often assumed that the RRT procedure guarantees complete privacy of answers. The respondent is expected to self-report sensitive information truthfully without fear of negative consequences and, thus, social desirability bias in survey estimates should decrease. However, this expectation is questionable as will be demonstrated. In the following, we present an attempt to model the interview situation as a social interaction via a simple game theoretic analysis. Comments on the RRT research indeed suggest that game theoretic thinking may “be a valuable contribution to the field” (Rao & Rao, 2016, p. 7). However, research along these lines is extremely rare. Because we do not yet have a comprehensive and empirically valid psychological theory of respondent behavior in various interview situations, the purpose of this analysis is to work out the conditions for truthful answers by using an idealized model of rational behavior. There is some previous research in this field within the framework of a rational choice analysis of respondents’ behavior that assumes (expected) utility maximization (Ljungqvist, 1993). Ljungqvist (1993) alludes to the possibility of using theoretical tools from game theory in this area. However, this work implicitly assumes that respondents perceive the interview as a parametric (nonstrategic) situation but not as a social interaction.
Behavioral Assumptions
In addition to consistency assumptions about desires (preferences), game theory postulates that expectations (beliefs) are rational in the sense of objective or of Bayesian (subjective) probabilities. In this way, one can analyze games with complete information and also games with incomplete information. The rationality assumption will be used throughout the article. In game theory, rationality assumptions do not imply that agents are self-interested. Altruism, fairness, or other kinds of other-regarding “social preferences” and normative orientations may well be represented by consistent preferences. In the following, we first use the motivational assumption that agents (respondents) are completely self-regarding. In other words, we first use a kind of rational egoism (or “homo economicus”) model. The motivational assumption of complete self-interestedness will be relaxed in a second step, in that we consider respondents who are endowed with social preferences. That is, they are not merely motivated by their own material payoffs but consider fairness or reciprocity criteria, or they are intrinsically motivated to act in accordance with certain social norms.
Why Do Respondents Participate in Surveys?
There are useful applications of rational choice concepts in previous survey research such as leverage–salience (Groves et al., 2000), risk-of-disclosure (Couper et al., 2008), or benefit–cost theories of survey participation (Singer, 2011). These contributions explain the respondents’ choice of whether to participate in a survey or not. The following theoretical ideas advance these contributions.
Our analysis of respondents’ behavior obviously depends on their willingness to participate in a survey. We assume that the survey contains questions about sensitive items. Any participation in a survey yields
Given these costs, it is tempting to ask whether a rational egoist would ever participate. Even agents who are completely self-regarding, however, may consider
There may be thus conditions (rewards compensate expected costs) such that rational egoists are willing to participate. Given that an agent has social preferences, there are additional rewards and additional costs. As to the costs, there are expected informal sanctions and psychological costs of being detected as someone with the sensitive trait A. With regard to the rewards, there are some further commodities, which may motivate participation: Survey participation can be due to “warm-glow” altruism (in the sense of Andreoni, 1990). It may also be that the participant perceives a moral or other normative obligation to cooperate. Survey participation can also stem from “positive reciprocity” (Fehr & Gächter, 2000; Gouldner, 1960), in particular in face-to-face interviews, if the respondent reciprocates the interviewer’s kindness.
Our presentation rests on certain assumptions, which will be introduced in each of the following paragraphs and which will be modified step by step subsequently. Our contribution is based on the idea that surveys that include sensitive items generate trust problems. There can be trust problems on both sides of the survey relation: The interviewer has a trust problem that arises because the respondent may not give truthful answers, in particular with respect to sensitive items. In this article, however, we focus on the respondents’ perspective: Respondents may distrust whether the interviewer (or the organization that administers the interview and controls the collected data) in fact is willing to protect the respondent’s privacy. We also develop our argument by comparing incentives to answer truthfully in RRT surveys with surveys that employ the direct mode of questioning. We furthermore demonstrate the impact of several motivational assumptions in these survey modes. In contrast to prior contributions to the field (e.g., Ljungqvist, 1993), we argue that respondents’ behavior depends not only on preferences and beliefs with respect to the stigmatizing trait but also on subjective estimates with respect to the interviewer’s trustworthiness. We share the assumptions that participants indeed (a) are willing to participate in the survey, (b) are able to act
Analysis of Respondents’ Behavior in the Direct Mode (Rational Egoism)
Assumptions 1 and 2 are constant across all presented situations. To reduce repetitiveness, they will not be repeated in the following different situations under consideration.
Note that these assumptions refer to the respondent’s subjective beliefs about the interview situation. It is not necessary that these assumptions are veridical representations of the “true” properties of the interviewer’s preferences. Assumptions 4 and, in particular, 5 represent extremely pessimistic beliefs of the respondent with regard to the interviewer’s type (these assumptions will be relaxed in the “Relaxing Pessimistic Assumptions About the Interviewer’s Trustworthiness: The Incomplete Information Game” section). We propose to represent the interview situation as a trust relation. In sociology, trust has been seminally analyzed by Coleman (1990, chapter 5), who models the investment of trust as a rational decision under risk. Coleman’s account has been subject to the criticism of neglecting the strategic nature of the investment decision. Both agents, interviewer and respondent, must instead be modeled as being rational agents, which can be accomplished by using game theory. The most elementary game theoretic model of the interview situation is depicted in the game tree of Figure 1. The social interaction between an interviewer and a respondent type A in a sensitive survey can be conceived as a game that is akin to a

Figure 1. Respondent type A in direct mode (simple trust game).
The Degree of Privacy Disclosure in the RRT Mode
To analyze respondents’ behavior in the RRT mode, it is useful to specify a measure for the degree of privacy disclosure in RRT surveys. Recall that RRT surveys are designed to increase the degree of privacy protection, that is, to decrease the degree of privacy disclosure. If the respondent is convinced that privacy protection is perfect, there is, in principle, no positive incentive to lie.
Note that the following analysis and the proof (see the appendix) hold for RRT designs offering an unambiguous self-protective response option, that is, the forced-choice design or related designs (such as the unrelated question model). The following statements do not hold for symmetric RRT designs, in which noncompliance is not clearly associated with a specific response option (e.g., Warner’s original RRT or the crosswise model; see Yu et al., 2008).
For simplicity, but without loss of generality, only dichotomous items with possible answers “Yes” or “No” in a typical “forced-choice” RRT design will be considered. Although many alternative privacy measures have been discussed and used in the literature, for the purposes of our analysis we assume that the degree of privacy disclosure depends on the difference between the conditional probabilities of being perceived as belonging to the sensitive group A given a “Yes” answer and given a “No” answer.
Thus, the difference
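This disclosure measure can be made concrete with Bayes’ rule. The following sketch is our illustration (function name and parameterization are hypothetical, not from the article); it computes P(A | “Yes”) − P(A | “No”) for a forced-choice design under the idealized assumption that all respondents comply.

```python
def disclosure_fc(pi, p_truth, lam_yes):
    """Degree of privacy disclosure P(A|"Yes") - P(A|"No") in a
    forced-choice design, assuming all respondents comply.

    pi      : population prevalence of trait A
    p_truth : probability of a truthful answer
    lam_yes : probability of a forced "Yes"
    """
    p_yes_A = p_truth + lam_yes          # carriers: truthful "Yes" or forced "Yes"
    p_yes_nonA = lam_yes                 # noncarriers say "Yes" only when forced
    p_yes = pi * p_yes_A + (1 - pi) * p_yes_nonA
    post_A_yes = pi * p_yes_A / p_yes                 # Bayes' rule
    post_A_no = pi * (1 - p_yes_A) / (1 - p_yes)
    return post_A_yes - post_A_no

# Direct questioning corresponds to p_truth = 1, lam_yes = 0: full disclosure.
d_direct = disclosure_fc(0.2, 1.0, 0.0)
# The three-coin forced-choice design discloses strictly less.
d_fc = disclosure_fc(0.2, 0.75, 0.125)
print(round(d_direct, 3), round(d_fc, 3))
```

Under direct questioning the difference equals 1 (a truthful answer reveals the trait with certainty), whereas the randomization pushes it strictly below 1 without reducing it to 0.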
In the next section, some elementary game theoretic arguments to explain a respondent’s tendency to answer truthfully and/or to follow the RRT instructions are presented.
Analysis of Respondents’ Behavior in the RRT Mode (Rational Egoism)
Truthful answers of a respondent with trait A reveal trait A with
Let us now examine the case of asking a sensitive question in the

Figure 2. Respondent types A and non-A in RRT mode (simple trust game).
In addition, rational non-As (who do not have the sensitive trait) may be reluctant to follow the RRT instructions. They are tempted to give a protective “No” answer even if the result of the randomizing device instructs them to answer “Yes.” This is so because only the protective answer will secure that respondents do not become suspect of having sensitive trait A. In other words, both types of respondents, As and non-As, have an incentive to lie or to disregard the RRT instructions, respectively. Assuming rationality, both types of respondents will recognize that “Yes” answers (which would be stigmatizing in the case of direct questioning) reveal trait A with probabilities
Note that the modified structure of the trust game in Figure 2 predicts that even respondents type non-A have an incentive to disregard the RRT instructions and to give evasive “No” answers even if the result of the randomizing device instructs them to answer “Yes.” This corresponds to qualitative observations in former RRT surveys. Some exemplary respondents’ statements were “I only said ‘Yes’ because I tossed heads three times” or “what I tossed does not reflect my true opinion.” Especially with items reflecting xenophobic and anti-Semitic attitudes, respondents were reluctant to give a surrogate “Yes” answer independent of their personal opinions (Krumpal, 2010). The unique Nash equilibrium is not to give a truthful answer (and not to follow the RRT instructions, respectively) and not to protect privacy. Because bias is introduced by both types of respondents, As and non-As, the potential for overall social desirability bias is higher in the RRT mode. It is important to notice that if a proportion of type non-A respondents does not follow the RRT instructions, there will, ceteris paribus and even if—counterfactually—all A types answer truthfully, be an underestimation of ϕ and, therefore, also of the true population prevalence π of the sensitive trait. If there is a considerable fraction of rational egoists among respondents, there will be many false negatives and even negative prevalence estimates (as reported, on the basis of experimental data, in Coutts & Jann, 2011). In contrast, only respondents of type A introduce social desirability bias into prevalence estimates in the direct questioning mode.
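The underestimation and the possibility of negative prevalence estimates can be illustrated numerically. In the following sketch (our illustration with hypothetical parameter values), all type-A respondents comply, but all type non-A respondents give a self-protective “No” even when the device forces a “Yes”; the naive forced-choice estimator then turns negative.

```python
import random

def fc_point_estimate(answers, p_truth, lam):
    """Naive forced-choice point estimator, assuming full compliance."""
    phi_hat = sum(answers) / len(answers)
    return (phi_hat - lam) / p_truth

# Assumption: type A complies; type non-A always answers "No".
random.seed(3)
TRUE_PI, P_TRUTH, LAM, N = 0.05, 0.75, 0.125, 20000
answers = []
for _ in range(N):
    if random.random() < TRUE_PI:         # type A, fully compliant
        u = random.random()
        if u < LAM:
            answers.append(1)             # forced "Yes"
        elif u < 2 * LAM:
            answers.append(0)             # forced "No"
        else:
            answers.append(1)             # truthful "Yes"
    else:                                 # type non-A, self-protective
        answers.append(0)                 # always "No", even if forced "Yes"

est = fc_point_estimate(answers, P_TRUTH, LAM)
print(round(est, 3))  # negative, although the true prevalence is .05
```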
In conclusion, our analysis predicts that rational and self-regarding respondents (under standard “homo economicus” rationality assumptions) in general will not participate and (if so) not answer truthfully in sensitive surveys. To elaborate conditions under which respondents answer truthfully (and comply with the RRT instructions, respectively) in sensitive surveys, different motivational assumptions have to be introduced into the model.
Relaxing Pessimistic Assumptions About the Interviewer’s Trustworthiness: The Incomplete Information Game
Our results about behavior in the direct and in the RRT modes critically depend on respondents’ extremely pessimistic beliefs about the type of the interviewer. However, respondents may be more optimistic, in that, they know (in game theoretic terms: have a common prior probability estimate) that a fraction µ (1 > µ > 0) of interviewers is trustworthy. Thus, we employ the following (modified) behavioral assumptions in the
The rationale for Assumptions 5’ and 6 is that some proportion of interviewers, or of the organizations that administer the surveys, behave in a trustworthy manner, either because they are intrinsically motivated to do so or because they want to acquire a good reputation. However, according to Assumption 7, respondents are not able to evaluate the trustworthiness of individual interviewers. Figure 3 depicts the basic structure of the incomplete information game representing the direct mode.

Figure 3. Incomplete information in direct mode for respondent type A (extended trust game).
Examining the incomplete information game for the direct mode is straightforward. Because lying is weakly dominant whenever (as has been assumed) µ < 1, there is no incentive to give a truthful answer—irrespective of how large the prior µ is. For µ = 1, however, the game is equivalent to the situation with complete information and with an interviewer who is considered perfectly trustworthy.
A respondent type non-A (without sensitive trait) obviously (due to the assumption
Figure 4 shows the game tree of this incomplete information game in the RRT mode. Because the game is structurally identical to the game in Figure 3, analogous results apply.

Figure 4. Incomplete information in RRT mode for respondent types A and non-A (extended trust game).
Introducing Social Preferences and Norms
There is by now a comprehensive literature in behavioral game theory indicating the effects of social preferences and of norms on cooperative behavior (see, for example, Camerer, 2003; Diekmann, 2004). In the survey methodology literature, a great deal of work assumes that participating in an interview may depend on rewards (such as approval from the interviewer) such that a motive of positive reciprocity is elicited on the part of the respondent. Sometimes this reciprocity is associated with the activation of a “norm of truthful answering” (Esser, 1990) prescribing that someone should be honest and cooperative in a social interaction (e.g., in a survey interview). This norm may possibly interfere with another norm, which is relevant in this realm, namely, the “norm of social desirability,” specifying that certain kinds of behavior are negatively valued by society. Given this norm, respondents with sensitive trait A will incur costs of embarrassment if they answer truthfully in the direct mode or if there is a positive probability that the trait will be detected by the interviewer in the RRT mode. This may in particular be the case in face-to-face interview situations.
There is of course a plethora of possible ways to model social preferences and internalized norms in a game theoretic context. Because, in this article, we do only want to use the most elementary modeling tools, we can represent these ideas by the following assumptions, which apply to the interview situation in the
First, consider the direct mode under complete information conditions including social norms, which is represented in Figure 5. The figure covers two types of respondents: Either

Figure 5. Respondent type A in direct mode (simple trust game including social norms).
Introducing more optimistic beliefs as before to the direct mode situation leads to our next result. We assume that there is a nonzero probability of a trustworthy interviewer as before.
The game model for the direct mode under incomplete information conditions including social norms is depicted in Figure 6.

Figure 6. Incomplete information in direct mode for respondent type A (extended trust game including social norms).
For a respondent type A, the following predictions could be derived with respect to the direct mode: If µ exceeds the critical probability µ*: = 1 − (
This result can again be applied to two types of respondents: Either
If
The larger the strength of the intrinsic motivation to tell the truth
Conformity to the “the norm of truthfulness” is immediately recognized by the interviewer if the respondent gives a self-stigmatizing “Yes” answer in the direct mode. Furthermore, a “Yes” answer in the direct mode can be interpreted as a strong signal to the interviewer that the respondent values the “norm of truthfulness” highly. In contrast, a respondent type non-A will always give a truthful “No” answer in the direct mode.
Let us finally examine the RRT situation under conditions of more optimistic beliefs about the interviewer and for respondents with internalized norms. Now the following assumptions apply:
The respondent with trait A will, therefore, incur costs
The cost may (in addition to material sanctions) be related to the cost of violating “the norm of social desirability.”
The game model for the RRT mode under incomplete information conditions including social norms is depicted in Figure 7:

Figure 7. Incomplete information in RRT mode for respondent types A and non-A (extended trust game including social norms).
With respect to the RRT mode, the following predictions could be derived for both types of respondents A and non-A: If µ exceeds the critical probability µ**: = 1− (
Conformity to the “norm of truthfulness” is not directly recognized by the interviewer if the respondent answers “Yes” in the RRT mode. For respondent type A, a truthful answer is less costly in terms of the subjective risk of being punished (if the interviewer is opportunistic) compared with the direct mode. Furthermore, a “Yes” answer in the RRT mode may be interpreted as a weak signal to the interviewer that the respondent values the “norm of truthfulness” highly.
Comparing Propositions 6 and 7 yields the following proposition with respect to the probability to answer truthfully: The probability for As to answer truthfully is (holding constant
Summary
In summary, Table 1 gives an overview of the conditions for giving truthful answers under incomplete information for respondent type and interview mode.
Conditions for Giving Truthful Answers Under Incomplete Information for Respondent Type and Interview Mode.
Our approach reveals that rational
Introducing
Discussion
In this article, a simple game theoretic approach to the survey interview has been presented. Our analysis is based on certain assumptions, which may be targets of critical comments. With regard to the assumption, implied in most theoretical work on this subject, that respondents are able to act

The degree of privacy disclosure as a function of the base rate
In the following, some further ideas are outlined: One possible model extension is relaxing the assumption that
Thus, one can think of a “
A rational choice analysis of the social interaction in sensitive surveys shows that modelling
Furthermore, future theoretical and empirical studies could focus on the impact of the RRT scheme on the innocuous (type non-A) respondents’ tendency to answer truthfully and comply with the RRT instructions, respectively. Whereas respondents of type A might benefit from the RRT mode, respondents type non-A might not: In our discussion of preliminary research, we reviewed empirical studies documenting noncompliance with the RRT rules, self-protective “No” answers, and negative prevalence estimates (Coutts & Jann, 2011; Holbrook & Krosnick, 2010). It is likely that these problems are primarily driven by respondents type non-A. This result is in accordance with our game theoretic model predicting that the probability for non-As to answer truthfully and comply with the RRT instructions, respectively, is lower in the RRT mode than in the direct mode or equal in both modes (depending on the respondent’s preferences, either
In regard to prevalence estimation using different data collection methods, one could hypothesize that RRT failures are more likely to occur with sensitive characteristics that are less prevalent (e.g., heroin use) compared with ones that are highly prevalent (e.g., alcohol use). This is because, in the former case, a higher share of respondents type non-A exists, for which the use of the RRT mode might be less beneficial as our theoretical model suggests. In future empirical studies focusing on different sensitive characteristics with varying prevalence rates, this prediction could be directly tested. However, note that the suggested manipulations will, in many cases, affect not only the prevalence rates (and thus the influence of self-protective answer behavior by respondents type non-A) but also the costs: Attributes with low prevalence rates are also often very sensitive (e.g., heroin use), whereas attributes with high prevalence rates tend to be less sensitive (e.g., alcohol use). With increasing item’s sensitivity, the extent of self-protective answer behavior (and also the risk of RRT failure) is expected to increase. Researchers designing an experimental test of our model’s prediction should be aware of the potential of confounding between the prevalence rate (i.e., the share of respondents type non-A) and the item’s sensitivity.
Finally, possibilities and limits of game theoretic analyses of the survey response process in sensitive surveys could be further explored. In our article, we explicate and discuss the theoretical foundation of the research on sensitive topics and social desirability bias in the context of a general theory of social interactions. Taking into account the interactive nature of the interview situation in sensitive surveys, our work advances former theoretical contributions (i.e., parametric models of decision making; see Esser, 1986; Stocké, 2007b), which conceptualized the choice whether or not to answer truthfully as a parametric decision problem of the respondent and not as a strategic situation. We think that our game theoretic model contributes to a better understanding of the psychological processes and social interactions between the actors (respondents, interviewers, and data collection institutions) that are involved in the collection of sensitive data.
Empirical researchers could also benefit from our insights providing them with a substantiated theoretical basis for optimizing the survey design to achieve high-quality data: Former theoretical papers assumed that all respondents answer truthfully and follow the RRT procedure (e.g., Nayak, 1994). In contrast, our theoretical model argues that these assumptions are questionable and predicts that truthful responding is less likely for innocuous (type non-A) respondents in the RRT mode than in the direct mode. To increase the respondents’ motivation to comply with the RRT instructions, careful design and pretesting of the concrete RRT implementation as well as thorough interviewer training seem reasonable strategies to generate better data. RRT surveys should always be pretested very carefully. If the pretests of a specific study indicate severe problems in regard to the implementation of the RRT, alternative methods of privacy protection might be considered (e.g., self-administered data collection, mixed mode designs, sealed envelope techniques, or special wording approaches; for an overview, see Krumpal, 2013; Tourangeau & Yan, 2007).
In regard to prevalence estimation, statistical methods using a cheating extension of the RRT (e.g., Ostapczuk et al., 2011; Reiber et al., 2020) should be used to account for self-protective response behavior, especially in surveys in which the characteristic under investigation is very sensitive or has a low prevalence rate (i.e., in populations in which the share of respondents type non-A is high). These considerations regarding survey design and analysis are quite general in nature. They are based on predictions of the proposed theory that should be tested empirically in future research studies.
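The cheating extensions cited above build on the cheating-detection model of Clark and Desharnais (1998), which can be sketched in its simplest form. The following is our illustration under simplified assumptions (two independent samples with different probabilities of a truthful instruction, otherwise a forced “Yes”; “cheaters” always answer “No”; all names and parameter values are hypothetical); the moment estimators follow from ϕᵢ = π + β(1 − pᵢ).

```python
import random

def cdm_estimates(phi1, p1, phi2, p2):
    """Moment estimators for a simple cheating-detection model:
    two samples with truthful-answer probabilities p1 != p2,
    otherwise a forced "Yes"; cheaters always answer "No"."""
    beta = (phi1 - phi2) / (p2 - p1)      # honest noncarriers
    pi = phi1 - beta * (1 - p1)           # honest carriers
    gamma = 1 - pi - beta                 # cheaters
    return pi, beta, gamma

# Simulate: 10% honest carriers, 70% honest noncarriers, 20% cheaters.
random.seed(4)
PI, BETA, GAMMA = 0.10, 0.70, 0.20

def simulate(p_truth, n):
    yes = 0
    for _ in range(n):
        u = random.random()
        if u < PI:
            yes += 1                              # honest carrier: always "Yes"
        elif u < PI + BETA:
            yes += random.random() > p_truth      # "Yes" only when forced
        # cheaters never answer "Yes"
    return yes / n

phi1, phi2 = simulate(0.75, 20000), simulate(0.55, 20000)
pi_hat, beta_hat, gamma_hat = cdm_estimates(phi1, 0.75, phi2, 0.55)
print(round(pi_hat, 2), round(beta_hat, 2), round(gamma_hat, 2))
```

The two observed “Yes” rates identify π and β, so the share γ of self-protective respondents can be estimated instead of silently biasing the prevalence estimate downward.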
