Introduction
Given the ubiquity of personal information and online behavior collected, one of the biggest challenges facing firms is large-scale data breaches, in which a significant amount of data is either accidentally or deliberately released to external parties (Goode et al., 2017). In light of recent scandals such as the Cambridge Analytica breach, there is growing concern about social media privacy. In 2018, the Breach Level Index (https://breachlevelindex.com), a database monitoring worldwide data breaches, highlighted that data breaches are becoming more frequent and larger in scope. Identity theft was once again the most prevalent data breach type, accounting for approximately 83% of the accounts breached in H1 2018, a massive growth of 757% over the previous year. Social media accounted for the majority of data breaches in 2018 (56.2%), compared to only 1.5% in 2017. In a survey conducted by Pew Research, a majority of Americans (64%) had personally experienced a major data breach and lacked trust in key institutions—especially the federal government and social media sites—to protect their personal information (Arli & Dietrich, 2017).
The two key problems facing an organization after a data breach are (1) financial losses and (2) customer misgivings (such as brand equity loss and customer turnover). The average cost of a data breach worldwide reached US$3.92 million, while the US has the highest average cost at US$8.19 million (Ponemon Institute, 2019). Lost business from customer turnover appears to be the biggest cost component of a data breach (Ponemon Institute, 2019). A key problem for firms is that affected customers often discontinue their relationship with the organization. The 2019 global average customer turnover rate is 3.9%, an increase from 3.4% in 2018. This is despite the fact that organizations notify affected customers, offer an apology, explain the nature of the breach, and recommend steps that customers can follow to protect their information.
Although information campaigns are required and customer compensation can be effective in recovery, firms still face the added challenge of the effects these data breaches might have on their customers, especially on consumer psychology and expectations. While data breaches have been examined by different disciplines such as information systems (Lowry et al., 2017), public policy (Caudill & Murphy, 2000), and law and ethics (Spiekermann et al., 2015), very little research has focused on consumer outcomes. Goode et al. (2017) found that most, if not all, studies on consumer outcomes have been post hoc, that is, conducted after the breach occurred, and they usually suffer from various biases such as recall bias. Our study addresses this weakness by examining the impact of data breaches on consumer outcomes, comparing attitudes and behavior before, during, and after a data breach. Fortuitously, we had been studying data breaches and privacy one year before Facebook's data scandal, and were then able to collect data during and after the data breach to make multi-period comparisons.
Researchers have examined the effects of data breaches and the effectiveness of recovery efforts, for example, customer compensation (Goode et al., 2017), customer spending (Janakiraman et al., 2018), corporate reputation (Gwebu et al., 2018), investor confidence and stock performance (Martin et al., 2017), and consumer perceptions such as trust and perceived vulnerability (Chatterjee et al., 2019). The goal of this research is to address the impact of Facebook's data breach on consumers' trust and their motivation to engage in protection behavior. To achieve this, we compare the effects over three time periods: before, during, and after the data breach.
Five major streams of research inform our work in this paper: (1) the technology acceptance model (TAM), (2) the consumer privacy paradox, (3) service failure, (4) protection motivation theory (PMT), and (5) trust. First, digital life has become such an integral part of consumers' existence that it is hard to separate the two. Second, while consumers value their personal data, there is an inconsistency between people's concerns regarding privacy and their actual behavior (Palmatier & Martin, 2019). Research has found that consumers are willing to trade personal data for perceived benefits (Norberg et al., 2007). Third, data breaches are a form of electronically mediated service failure (Bolton, 1998). Fourth, PMT (Rogers, 1975, 1983) explains how consumers are motivated to protect themselves based on both threat appraisal and coping appraisal. Fifth, consumers' online behavior presupposes a certain degree of trust and confidence in the privacy of their personal information.
Our study makes three major contributions. First, although research on data breaches is critical, most research has focused on the organizational response; very little has looked at the consumer side and how consumers react to such breaches (Goode et al., 2017). This study examines the mechanisms that motivate people to protect themselves from perceived risks, making social media safer for them. Second, our research adds to the service failure literature on how organizations should handle a public relations crisis to restore consumer confidence. Third, most research on data breaches has focused mainly on post-breach analysis, that is, the impact of the data breach. In our study, we were able to measure consumers' perceptions and behavioral intentions before, during, and after a data breach using cross-sectional panel studies.
Literature Review
The TAM is an information systems model, based on the theory of reasoned action (Ajzen & Fishbein, 1970), that examines how users come to accept and use a technology (Davis, 1989; Venkatesh & Bala, 2008). Two key factors determine users' willingness to use the technology: perceived usefulness and perceived ease of use. Behavioral intention is influenced by users' attitude towards the general use of the technology. Attitude in turn is influenced by many external factors such as social influence and design characteristics. There has been substantial empirical support for the TAM over the last few decades. Venkatesh and Bala (2008) found that the TAM consistently explains about 40% of the variance in individuals' intention to use a new technology. There have been many studies adopting the TAM in online behavior contexts, such as identifying key drivers of Facebook usage (e.g., Rauniar et al., 2014).
Researchers generally define consumer privacy as a consumer's ability to control when, how, and to what extent their personal information is transmitted to others (G. R. Milne & Culnan, 2004). Past research has looked at the relationship between privacy concerns and various variables, for example, consumer attitudes and behavior (Tsay-Vogel et al., 2018). Consumers' privacy concerns have become an important issue in light of the Facebook–Cambridge Analytica data scandal in early 2018, in which Cambridge Analytica harvested the personal data of 87 million Facebook profiles without the users' consent and used it for political purposes. In addition, Facebook also revealed other data breaches, such as a software bug that may have exposed the posts of up to 14 million users and a security hack that allowed an unknown party to take over 50 million accounts (Abbruzzese & Boyce, 2018).
The outcome of a service failure is usually a negative experience for consumers. Service failure and recovery research has matured over the last three decades, and there is a considerable amount of research on how firms that suffer a service failure should recover from it (e.g., Bitner et al., 1990). In a review of 44 empirical studies, Goode et al. (2017) found that the majority of research has focused on recovery strategies, such as apology and explanation, and on the effects of compensation on consumer reaction. However, there has been no empirical data from affected customers covering the periods before, during, and after a service failure. What research does look at pre- and post-service failure is usually done in a simulated lab environment (e.g., Du et al., 2011). There is a gap in the literature on the long-term effects of service failure, especially on how consumers engage in protection behavior.
The PMT was first introduced by Rogers (1975, 1983) as a framework for the prediction of and intervention in health-related behavior (S. Milne et al., 2002). It explains why people engage in unhealthy practices and offers suggestions on how to change these behaviors. According to PMT, an individual's motivation to protect oneself from risks is influenced by four cognitive assessments: the vulnerability to the risk, the severity of the risk, response efficacy, and self-efficacy. The theory states that people's motivations to protect themselves are weakened by the perceived benefits of risky behaviors and the perceived costs of advocated risk-reducing behaviors. These assessments fall into two cognitive processes: threat appraisal (severity, vulnerability, and benefits) and coping appraisal (self-efficacy, response efficacy, and costs). Vulnerability refers to the perceived likelihood that the risk will occur to oneself; severity refers to the seriousness of the consequences if it does. Response efficacy refers to the perception that the protection behavior is effective in reducing the risk, and self-efficacy refers to one's ability to perform the desired behavior. This study adopts the PMT as the research framework for understanding behavioral choices before, during, and after a data breach, in which the probability of the maladaptive response is increased by benefits (both intrinsic and extrinsic) and decreased by severity and vulnerability. This study thus assumes four conditions sufficient to elicit protection motivation: (1) the threat of data breach is severe, (2) one is personally vulnerable to the consequences of the data breach, (3) one has the ability to perform the coping responses, and (4) the coping response is effective in protecting oneself.
According to social contract theory, users voluntarily provide personal information in exchange for the ability to connect socially; therefore, they perceive benefits as well as risks regarding online self-disclosure (Okazaki et al., 2009). Research shows that only when perceived benefits outweigh risks will users practice this social contract (Culnan & Armstrong, 1999). On the other hand, recent studies also show that online disclosure is influenced by both perceived benefits and perceived risks (Krasnova et al., 2010).
In addition to consumers' protection motivation, a data breach may have a severe effect on their trust, which in turn affects a firm's reputation. Jøsang et al. (2007) argue that while trust and reputation are closely linked, there are important differences between the two concepts. While reputation is earned mainly from public belief, trust is an internal, personal, and subjective phenomenon. For example, consumers may choose not to trust a reputable firm after a data breach. When it comes to trust, consumers' personal experience tends to override public reputation. In marketing, trust is defined as "a willingness to rely on an exchange partner in whom one has confidence" because of their expertise or reliability (Moorman et al., 1992). In virtual communication, the absence of social cues necessitates reliance on trust (Ridings et al., 2002). Research has found that individuals do not normally disclose information about themselves to others they do not trust (Wheeless & Grotz, 1977). Building on social exchange theory, Metzger (2004) argues that trust reduces perceived risks, that is, individuals with high trust perceive a low risk in interpersonal exchange and perceive the exchange to be beneficial (Dwyer et al., 2007). In addition, higher levels of trust increase the probability of more self-disclosure (Wheeless, 1976).
Research Framework
In Figure 1, we propose a theoretical model based on the PMT that looks at the impact of perceived risks and benefits on protection behavior, self-disclosure, and message valence, with trust as a mediator. Self-disclosure refers to what users reveal about themselves on Facebook. Message valence refers to the intrinsic, emotional goodness (positive valence) or averseness (negative valence) of the postings. Protection behavior refers to the likelihood that users control their postings on Facebook. Perceived risks refer to the perceived negative consequences (vulnerability and severity) resulting from information disclosure in three areas: social, psychological, and physical. Perceived benefits include the benefits of posting on Facebook: entertainment and maintaining social relationships.

Figure 1. Conceptual model.
Research shows that consumers feel vulnerable when marketers collect their personal data (Martin et al., 2017). This may lead consumers to lose trust in a company. Research has found that users rarely-to-occasionally engage in protecting their privacy online (Boerman et al., 2021), and Boerman et al. (2021) found that perceived severity significantly predicted protection behavior. Palmatier and Martin (2019) argue that consumers' perceived vulnerability can be mitigated by firms offering a sense of transparency and giving consumers a degree of control over their personal data. Both perceived risks and trust have been identified as two key antecedents of privacy concerns and intentions (Norberg et al., 2007). The primary focus of this study is the association between risks and benefits (perceptions), trust, and privacy protection (intentions). The theory of reasoned action (Ajzen & Fishbein, 1970) proposes that behavioral intention is a product of one's relevant attitudes and beliefs. Youn (2009) found that perceived risks increase privacy concerns while perceived benefits decrease them. Z. T. Chen and Cheung (2018) found that once users entrench their social profile in a platform, the inertia to remain outweighs their need to secure their privacy; ergo, the privacy paradox. Therefore, the following research question is proposed:
RQ1: What effects do perceived risks and perceived benefits have on privacy protection?
Researchers have looked at the relationships between self-disclosure and social media, for example, motivation to post (Bazarova & Choi, 2014), effects of self-disclosure (Luo & Hancock, 2020), privacy risks (Krämer & Schäwel, 2020), and privacy protection behavior (Boerman et al., 2021). For the self-disclosure outcome variable, perceived risks are expected to operate in a negative direction, while perceived benefits are expected to operate in a positive direction. Liang et al. (2017) argued that content sharing behavior is potentially in conflict with the need to reduce privacy risk on the Internet. On the other hand, Tsay-Vogel et al. (2018) found that social media has cultivated more relaxed privacy attitudes, subsequently increasing self-disclosure in both offline and online contexts. Applying these assumptions to the issue of privacy, one might predict that those who hold more relaxed views of privacy have stronger intentions to self-disclose and vice versa. Therefore, the following research question is proposed:
RQ2: What effects do perceived risks and perceived benefits have on self-disclosure?
According to Petronio’s (2002) communication privacy management (CPM) theory, individuals maintain privacy boundaries (the limits of what they are willing to share) with various partners depending on the perceived benefits and costs of information disclosure. It has been used recently to explain social media privacy management such as message valence. Message valence in previous studies is often treated as a predictor. Studies have found that positive consumer reviews have a positive impact on product sales; in contrast, negative consumer reviews are more likely to influence brand evaluation and product judgement (Z. F. Chen et al., 2017). However, it is not clear whether users who perceive social media as risky will share more positive than negative postings. Therefore, our proposed research question is:
RQ3: What effects do perceived risks and perceived benefits have on message valence?
In a virtual world, individuals have to interact with others with few social cues, especially when posting on social media. This necessitates reliance on trust between communication partners. Drawing on social exchange theory, Metzger (2004) argues that the presence of trust reduces perceived risks when posting private information, that is, an individual with higher trust perceives a lower cost of interpersonal exchange and perceives the exchange to be beneficial (Dwyer et al., 2007). Wheeless and Grotz (1977) found that higher levels of trust increase the probability of more self-disclosure. The importance of trust has been widely recognized in many studies (Norberg et al., 2007). As Facebook users create and share their information, they expect their privacy to be protected. Rauniar et al. (2014) found that trust is a critical determinant in minimizing security and privacy concerns for Facebook users. Therefore, the following research question is proposed:
RQ4: Trust mediates between perceived risks/benefits and protection behavior, self-disclosure and message valence.
Methodology
Sample
For this study, the data came from a nationally representative consumer panel provider, Innovate MR (http://www.innovatemr.com/), which has over 3.5 million panelists in the U.S. An online survey was sent to prequalified Facebook users over three time periods. Wave 1 was conducted one year before the Facebook data breach, in April 2017, with 859 respondents. Wave 2 was conducted right after the data breach, in May 2018, with 807 respondents. Wave 3 was conducted about 18 months after the data breach, in October 2019, with 512 respondents. For each wave, a standard data screening analysis was used to drop respondents with excessive missing data, disengaged response patterns, or missing data for sex. Our final usable sample sizes are 822 for Wave 1, 773 for Wave 2, and 500 for Wave 3. Series medians or means were imputed for missing data among the remaining respondents. For Wave 1, our sample is made up of users who had been on Facebook around 7.2 years, with an average of 389.6 friends, spending half an hour to one hour on Facebook a day; the average age is 39.8 years and 55.2% are female. For Wave 2, the sample had been on Facebook 8.5 years, with an average of 481.2 friends, spending half an hour to one hour on Facebook a day; the average age is 35.6 years and 79.2% are female. For Wave 3, the average time on Facebook is 9.1 years, with an average of 429.2 friends, spending half an hour to one hour on Facebook a day; the average age is 52.6 years and 67% are female. Since the data were collected over a two-and-a-half-year period, it would be expected that respondents in Wave 3 had been on Facebook longer.
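The series-median/mean imputation described above can be sketched in a few lines; this is a minimal illustration with made-up item values, not the study's actual data:

```python
from statistics import mean, median

def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    med = median(observed)
    return [med if v is None else v for v in values]

def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    avg = mean(observed)
    return [avg if v is None else v for v in values]

# Hypothetical 7-point Likert responses with one missing value.
risk_item = [5, 3, None, 6]
print(impute_median(risk_item))  # [5, 3, 5, 6]
```

Median imputation is the safer default for ordinal Likert items, since the mean of an ordinal scale can fall between scale points.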
Measurement
Four items were used to measure each of three types of online risks: social, psychological, and physical. The scales are adapted from Johnston and Warkentin (2010). Research has identified that two of the most important benefits of engaging on Facebook are social communication and entertainment (Wang et al., 2014). We measure perceived benefits of using Facebook based on social communication or relationship (three items) and social entertainment with friends (three items). We adapt Lee et al.’s (2008) measure of self-disclosure, using three items to measure the self-presentation motivation to intentionally present oneself to others in a favorable style. User protection behavior was measured using two items. Message valence of postings, that is, whether they will post more positive or good things about themselves on Facebook was measured using two items. We measure trust as generalized trust that people have about fellow members of society using the three-item disposition to trust scale by Ridings et al. (2002). All items except for demographic variables are measured on a 7-point Likert scale.
Results
We use structural equation modeling (SEM) and multi-group analysis to explore whether the measurement models are similar across the three time periods. We first conducted an exploratory factor analysis (EFA) on the 38 items used. Our factor retention criteria were an eigenvalue greater than one (the Kaiser criterion) and a factor loading of at least .4 for factor interpretation. Principal component analysis (PCA) with promax rotation initially yielded seven factors. Ten items were eliminated due to low or cross-loadings, and the resulting structure explained 76.3 percent of the variance. However, one factor had both perceived social and psychological risks loading together, and it was decided that it made more theoretical sense to separate them into two factors. Table 1 shows the loadings and Cronbach's alphas for all scales, which indicate both internal consistency and reliability for all constructs.
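The eigenvalue-greater-than-one retention rule can be illustrated with a small simulation; this is a hedged sketch using simulated responses (not the study's items), where six items are generated from two underlying factors:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate 200 respondents answering 6 items that load on two
# underlying factors (three items per factor); purely illustrative.
factors = rng.normal(size=(200, 2))
noise = 0.5 * rng.normal(size=(200, 6))
items = np.hstack([factors[:, [0]].repeat(3, axis=1),
                   factors[:, [1]].repeat(3, axis=1)]) + noise

# Kaiser criterion: retain components whose eigenvalue of the
# item correlation matrix exceeds one.
corr = np.corrcoef(items, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]
n_factors = int((eigenvalues > 1.0).sum())
print(n_factors)  # 2
```

With two strong factors, the first two eigenvalues are well above one and the remaining four fall well below it, so the criterion recovers the simulated structure.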
Measures and Factor Loadings.
Measured from 1 = strongly disagree to 7 = strongly agree.
The factor structure was further tested via confirmatory factor analysis (CFA) that examined whether each of the dimensions had good measurement properties and was distinct from the other dimensions. Using AMOS 26, the CFA shows a good fit was achieved [χ2 = 1,338.6 (
Reliability and Validity.
We also ran a more rigorous metric invariance test by constraining factor loadings to be equal (fully constrained model) across the three time periods analyzed (Vandenberg & Lance, 2000). The χ2 difference between the unconstrained model [χ2 = 2,001.9 (
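The χ2 difference test behind this invariance comparison is straightforward to compute; here is a sketch using SciPy, with the fit statistics below purely hypothetical rather than the study's actual values:

```python
from scipy.stats import chi2

def chi_square_difference(chi2_constrained, df_constrained,
                          chi2_unconstrained, df_unconstrained):
    """Chi-square difference test for two nested SEMs.

    A non-significant p-value means the constrained (equal-loadings)
    model fits no worse than the unconstrained model, supporting
    metric invariance across groups.
    """
    delta_chi2 = chi2_constrained - chi2_unconstrained
    delta_df = df_constrained - df_unconstrained
    p_value = chi2.sf(delta_chi2, delta_df)  # survival function: P(X > delta)
    return delta_chi2, delta_df, p_value

# Hypothetical fit statistics for a constrained vs. unconstrained model.
delta, ddf, p = chi_square_difference(1225.0, 420, 1200.0, 400)
```

With Δχ2 = 25 on 20 degrees of freedom the difference is non-significant, which in an invariance test would support constraining the loadings to be equal.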
Since we used online surveys to collect all our data, it may introduce systematic response bias or single (common) method bias. We tested for common method bias using the “single unmeasured latent method factor method” suggested by Podsakoff et al. (2003) to extract the common variance. We included an unmeasured latent factor to the measurement model during the CFA that includes all indicators from all other latent factors. We then compared the unconstrained common factor model to the zero-constrained model. The χ2 difference between the two models is 333.7 (
To test the conceptual model across the three time periods, two separate analyses were conducted mainly due to the way the data was collected. In the first analysis, because trust was not measured in the pre-scandal stage, we were not able to test the full conceptual model in Figure 1 across all three time periods. Trust was measured only in stage 2 (during data breach) and stage 3 (post data breach). In the first model test, we tested using only variables that were available in all three time periods, that is, perceived benefits and perceived risks on protection behavior, self-disclosure and message valence (Figure 2). The overall causal model shows a good fit [χ2 = 31.3 (

Figure 2. Structural model without trust as mediating variable*.
Structural Model Results (Without Trust as Mediator).
Looking at the standardized weights, the results suggest that the social benefits of posting online outweigh perceived risks in determining consumers engaging in protection behavior, disclosing more about themselves and posting more positive messages. This is consistent with the consumer privacy paradox (Palmatier & Martin, 2019) and other research where consumers are reluctant to self-disclose except when it involves the norm of reciprocity (Hill & Stull, 1982; Moon, 2000; Shaffer & Tomarelli, 1989). The reciprocity effect assumes that people engage in self-disclosure if they believe that their disclosure is returned in kind from their partners. In addition, self-disclosure is a key component in relationship development and maintenance (Derlega et al., 1993).
To test for time effects, all constructs were compared across the three periods with ANOVA, using the post-hoc Scheffé test to identify differences in means. As shown in Table 4, there is no statistical difference in any of the perceived risks across the time periods, and the means suggest that consumers are aware of the risks of their social media behavior. There is, however, a time effect for perceived benefits and protection behavior. For both perceived benefits, there is a significant drop in the means during and after the data breach, that is, consumers have become more skeptical about the benefits of posting online. This is also true for protection behavior and self-disclosure. This pattern suggests that the data breach significantly affected consumers' perceived benefits of posting on Facebook, making them more likely to engage in protection behavior and less likely to self-disclose. Next, the paths of the model were tested for time differences using multi-group analysis. The results in Table 5 comparing the model (unconstrained) across the three time periods show that there is no statistical difference (χ2 = 47.94,
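The wave-by-wave mean comparison can be sketched with a one-way ANOVA; this is an illustrative example using SciPy with made-up perceived-benefit scores, and since SciPy offers no built-in Scheffé procedure, only the omnibus F-test is shown:

```python
from scipy.stats import f_oneway

# Made-up perceived-benefit scores (7-point scale) for the three waves.
wave1 = [5.8, 6.0, 5.5, 6.2, 5.9, 5.7]  # before the breach
wave2 = [4.9, 5.1, 4.7, 5.0, 4.8, 5.2]  # during the breach
wave3 = [4.8, 5.0, 4.6, 5.1, 4.9, 4.7]  # after the breach

f_stat, p_value = f_oneway(wave1, wave2, wave3)
# A significant F suggests the mean differs for at least one pair of
# waves; a post-hoc test (e.g., Scheffe) would then locate which pairs.
print(p_value < 0.05)  # True
```

Here the clear drop from wave 1 to waves 2 and 3 yields a significant omnibus test, mirroring the pattern reported for perceived benefits.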
Difference in Means Before, During, and After Data Breach.
Measured from 1 = strongly disagree to 7 = strongly agree.
Time Period Analysis (Without Trust as Mediator).
For the second test of the conceptual model with trust as the mediating variable, we analyzed the relationships specified in the causal model in Figure 3. The overall structural model shows a good fit [χ2 = 66.2 (

Figure 3. Structural model with trust as mediating variable*.
Structural Model Results (With Trust as Mediator).
Next, the paths for the model were tested for time differences using multi-group analysis. The results comparing the model (unconstrained) across the two time periods show that there is no statistical difference (χ2 = 32.88,
Indirect Effects Mediated by Trust.

Final conceptual model.
Conclusion and Implications
Consumer data privacy has recently become more salient, especially in the aftermath of the massive data breach at Facebook. Our study is the first to address the "privacy paradox" among users by examining whether their online behavior (protection, disclosure, and message valence) changed over time due to Facebook's data breach. We examine this relationship across three periods—one year before the data breach (wave 1), during (wave 2), and a year and a half after the data breach (wave 3)—to make a temporal comparison. While consumers want more control over their personal data, they do not necessarily scale back their online time, especially on social media, as digital media is such an integral part of their daily lives. Our results are consistent with other research showing that users are willing to provide personal information in exchange for the ability to connect socially (Okazaki et al., 2009).
Our study found that the structural paths of our model have not changed over the three time periods, suggesting that Facebook's data breach scandal has not changed users' pattern of behavior in terms of protection behavior, self-disclosure, and message valence. Perceived risks directly affect protection behavior, self-disclosure, and message valence. Perceived benefits directly affect protection behavior and self-disclosure. Trust mediates between perceived benefits and both protection behavior and message valence. Overall, our study found that perceived benefits outweigh perceived risks in determining consumers' engagement in protection behavior, self-disclosure, and message valence. This is consistent with prior research showing that perceived benefits reduce privacy concerns (Youn, 2009). However, the ANOVA results show that the data breach did raise users' concern about the benefits of posting online, increase their likelihood of engaging in protection behavior, and decrease their likelihood of self-disclosure. Our study has found that Facebook users do balance perceived risks (particularly social and psychological) against social benefits when posting online.
There are a few implications from the results of this study. Companies such as Facebook are under a lot of pressure to secure and protect user data. They need to, whether through legal or regulatory means, do a better job of convincing users that their personal information is secure and being used in a judicious and ethical way. Our study shows that if users lack confidence and trust in Facebook, they may limit their activities and, at worst, abandon Facebook completely. On the other hand, firms such as Facebook need to balance their business models (ad revenue) against this external privacy demand. After its data breach, Facebook made many changes to its business model, including granting more privacy control to users and limiting third-party advertisers' access to information. However, this has a huge impact on both its top- and bottom-line results. Advertisers now have a harder time using user data to target their audience at a fine-grained individual level and instead have to do so at a more aggregated level. More firms are taking advantage of Facebook's data breach to argue for more consumer choice and more transparency in the way firms handle user data, and some competitors are updating their own privacy policies in ways that could seriously undercut how Facebook makes money. For example, Apple plans to update its privacy policy to require app developers to get users' permission to collect data used for targeted advertising and to allow users to opt out of this type of tracking (Hartmans, 2021; Leskin, 2021). This could mean an extensive cut to Facebook's advertising revenue and its dominance in the social media advertising space.
The principle behind such regulations is that consumers have rights over their personal information and over how it is transferred and protected. However, these regulations are aimed at those who collect the information, not at those who disclose it. Hence, motivating consumers to protect themselves is necessary. For example, social media networks could develop algorithms that detect risks in consumers' posts and notify them to protect themselves. This self-regulation practice would reduce the need for further government regulation, which helps lower compliance costs, and could position companies as pro-consumer market leaders.
Finally, privacy advocates and government agencies will also have an important role to play. They have to come up with a sensible set of laws and regulations that keep all stakeholders in mind. For example, California passed the California Consumer Privacy Act (CCPA), which took effect in 2020 and allows any California consumer to see all the information a company has saved on them, together with a full list of all the third parties that data is shared with (Korolov, 2019). The CCPA and the European Union's General Data Protection Regulation (GDPR) share similarities as well as differences; for example, the CCPA gives consumers more access to their data.
Limitations and Future Research
Even though our research is the first to look at changes in perception and behavior before, during and after Facebook’s data breach, and it offers valuable insights for researchers, policy makers, and practitioners, it has limitations that may offer opportunities for future research. While we were able to show the structural invariance of our conceptual model, we did not measure trust prior to the data breach and could only test the mediating effect of trust during and after the data breach. This limits our study’s ability to have a benchmark to test the temporal validity of the full model. While our conceptual model is based on PMT, we focused mainly on the risk appraisal and benefits to determine self-disclosure, message valence, and protection behavior. Other researchers may want to include coping appraisal (e.g., self-efficacy) in the future. In addition to PMT, future research may also want to use other social exchange theories to drive the understanding of disclosure and protection behavior. Future research may also want to look at other socio-psychological traits that are important in explaining online behavior such as personality, group dynamics and network analysis. Finally, future research may look at other moderators and/or mediators that may enhance or mitigate the relationships; for example, cross-cultural effects.
