Introduction
Communication scholarship has long been interested in which particular media channels people use to get information (e.g., Heeter, 1985; Phalen & Ducey, 2012) and to interact with others (e.g., Haythornthwaite, 2001) in a media-rich environment. Despite this long-held understanding that people turn to multiple media for their communication needs, most research on social media use examines use of one such platform in isolation from the use of other such platforms. Even when studies do look at the adoption of multiple social network sites (SNSs) (e.g., Blank & Lutz, 2017; Hargittai, 2007), they do not connect the dots from use of one to another. Although some work has suggested the need to differentiate by both diversity and frequency of platform usage (Hargittai & Hsieh, 2010), little empirical research has taken such an approach. In this article, we apply the media repertoire perspective to the adoption of social media platforms by using social network analysis, which has previously been used to study news consumption patterns (e.g., Mukerjee et al., 2018; Taneja et al., 2012), but not SNS usage.
With the majority of the population in the United States using social media (Pew Research Center, 2019), it is easy to assume that everyone uses such services. Yet the diffusion of SNSs is far from universal. For example, less than a quarter of Americans visit Twitter or Reddit, which is not the impression one gets from the amount of attention they receive both in the popular press and in academic research (Blank & Lutz, 2017), the latter mostly due to ease of data collection rather than representativeness (Tufekci, 2014). In fact, research has found that people do not select into the use of SNSs randomly (Blank, 2017; Hargittai, 2015; Hargittai & Litt, 2011). In other words, different social media platforms have varying user bases, with implications both for the generalizability of research that uses such sites as its sampling frame and for how those wishing to reach diverse audiences can do so. For example, key findings about disease tracking (Paul & Dredze, 2021; Sarker et al., 2020), crisis response (Sutton et al., 2015; Vieweg et al., 2010), activism and political participation (Budak & Watts, 2015; Conover et al., 2011; González-Bailón & Wang, 2016; Hemphill & Schöpke-Gonzalez, 2020), and spreading of (mis)information (Bakshy et al., 2011; Huang et al., 2015; Jiang et al., 2020; Romero et al., 2011; Starbird, 2017) rely largely on research that has been conducted on single SNSs.
While research efforts focused on one platform are helpful, their insights may have limited applicability during major communication challenges such as natural disasters (Hargittai, 2020, p. 20) and pandemics (Barry, 2009). Such situations require effective and simultaneous communication with the entire population regardless of demographics and media use (Gans, 2020). The World Health Organization (2017) itself notes the importance of social media communication in crisis situations yet does not give concrete recommendations as to which channels to pursue for effective population-wide communication. This article adds to the existing literature on social media adoption and crisis communication by exploring how use of one SNS relates to the use of another, which has both scholarly implications for expanding understanding of SNS use across the population, and practical implications for advising organizations (whether governmental, nonprofit, or commercial) on how to reach diverse constituents.
Prior research on social media has shown that people select into their uses at different rates. Understanding patterns of social media adoption across sites is important for several reasons. First, methodologically speaking, many studies use just one SNS as their sampling frame (e.g., Twitter or Facebook), thereby biasing their samples (e.g., Bakshy et al., 2015; Romero et al., 2013; Ugander et al., 2012; Weng et al., 2013). By knowing which such services have overlaps in their user base and, importantly, which do not, researchers can diversify their samples without having to include too many sites in their sampling frame. Second, for digital inequality scholarship, it is important to go beyond understanding inequalities in the use of one site to see whether variations across population groups exist even when considering the use of more than one site. In other words, are people siloing themselves across platforms, and not just with respect to individual sites? Third, for campaigns, whether commercial (Liang & Turban, 2011), health-related (Neiger et al., 2012), political (Klinger & Svensson, 2015), or otherwise, it is important to know how they can maximize reaching different types of people on social media (Thackeray et al., 2012). If platform A and platform B have largely overlapping user bases, then there is not much point in spending resources on both; rather, reaching out through platforms whose user bases overlap less is a better way to reach a broader public.
We address this gap in the literature by analyzing a national survey about American adults’ social media uses to see how the user base of various such platforms may overlap. We answer this question by identifying pairs of SNSs that have a significant shared user base. Our computational approach is rooted in work on product associations such as “customers who bought product
We rely on data about 1,512 Internet users and their experiences with 10 popular SNSs at the time of data collection in 2016. We show that women and men inhabit different social media universes when going beyond the use of just one such service, which is what most research tends to do. We also find similar variations across age, educational levels, and Internet skills. We end by discussing the implications of these findings both for research methodologies and for substantive questions ranging from digital inequality scholarship to communication that concerns reaching diverse publics.
Media Repertoires and Differentiated SNS Adoption
Both with respect to how people get their information (Kim, 2016; Mukerjee et al., 2018; Taneja, 2017; Taneja & Webster, 2016; Taneja et al., 2012; Wu et al., 2020) and how they communicate with others (Haythornthwaite, 2001; Hsieh, 2012), communication research has highlighted that it is important to recognize that people do not use one or another medium in isolation of others. Despite this uncontested premise, relatively little empirical work takes a media repertoire perspective beyond the topical focus of news consumption (Dvir-Gvirsman, 2020; Yuan, 2011). While theoretically it is a convincing proposition, empirically it is certainly more complex to investigate the use of multiple channels of communication than it is to focus on one. In work on social media, most scholarship tends to focus on just one platform with relatively rare exceptions (e.g., Blank & Lutz, 2017; Hargittai, 2015; 2020), and such work has rarely used network analytical methods with no work known to us doing so to map the shared user base of different platforms. Platforms differ in their affordances (van Dijck, 2013), which means that although part of the same genre of media, they nonetheless represent diverse options. To this end, there is value in seeing how much their user bases overlap or diverge.
In another stream of communication scholarship, digital inequality research has long studied how user background relates to what people do online. This interest in identifying variations in usage by sociodemographics expanded to SNSs as those started gaining traction. Haight and colleagues (2014) identified differences by gender and education when considering use of SNSs generally speaking without disaggregating the question to specific platforms. Other work has explored differences by specific service.
When MySpace and Facebook were the most popular such sites, boyd (2011) and Hargittai (2007) both documented socioeconomic differences in their adoption finding that those from less privileged backgrounds were more likely to be on MySpace, whereas those from higher socioeconomic status (SES, measured by parental education and income) adopted Facebook at higher rates. As Twitter gained traction, using panel data about a diverse group of young adults collected in 2009 and then again in 2010, Hargittai and Litt (2011) showed that African American young adults were more likely to start using the site than young people from other racial and ethnic backgrounds. They also found socioeconomic differences (measured by parental education) whereby those from the least privileged backgrounds were considerably less likely to adopt the platform than the most privileged (p. 836). A few years later, Blank (2017) investigated this same question on data collected in 2013 from a national sample of British and American adults also finding that those from higher SES were more likely to adopt the service. As social media platforms started to proliferate, the studies about their adoption expanded to additional services. A study of undergraduate students in 2014–2015 at a US university found that women were considerably more likely to use Instagram than men (Sheldon & Bryant, 2016).
Although Hargittai’s (2007) early paper on SNSs included several platforms, this was likely viable at the time due to that study’s focus on college students. No other work at the time compared several platforms. This changed as social media diffused to the larger population. Blank and Lutz (2017) analyzed data from the 2013 Oxford Internet Surveys to examine how various factors were linked to the use of Facebook, LinkedIn, Twitter, Pinterest, Google+, and Instagram. They found differences by age, gender, and income, which is similar to what Hargittai (2015) found analyzing panel data about a group of 25- to 26-year-olds with 2012 data about SNS use. Unfortunately, neither of these papers reports regression results on SNS adoption before controlling for a host of factors beyond sociodemographics that may be explaining differences by such characteristics. That is, Blank and Lutz (2017) control for autonomy of use, Internet skills, self-efficacy, and privacy concerns, while Hargittai (2015) also controls for autonomy of use, Internet skills, frequency of use, and number of use years, so it is impossible to tell whether there are differences across population groups concealed due to the model specifications.
Analyzing data from the 2015 British Election Study, Mellon and Prosser (2017) find age, gender, and education differences between Facebook and Twitter users compared with the general population. Survey data about Belgian adults from 2017, 2018, and 2019 also showed that age, gender, and other factors were related to the propensity of using Facebook, Twitter, and Instagram (Hellemans et al., 2020). Gazit and colleagues (2019) examined whether the use of Facebook, Instagram, Twitter, and WhatsApp varied by gender among Israeli college students in 2017–2018. They found large variation in the popularity of these platforms, with WhatsApp being the most popular and Twitter the least. Popularity also varied by gender: women used WhatsApp and Instagram more, whereas men used Twitter more.
In sum, while the specific sociodemographics that matter may differ, the overwhelming consensus of this work is that people do not select into the use of such sites randomly; rather, various socioeconomic factors—especially age, gender, and education—relate to who ends up on which site. In addition, those with higher Internet skills are more likely to use social media (Hargittai, 2020). While the above work has been helpful in pointing out such differences, the literature so far has only looked at site adoption in isolation of other site adoption. That is, while papers may have compared the users of various SNSs, they have not looked at shared users. That is the gap in the literature this article fills. We add an important new dimension to the literature on SNS use by asking the research questions: Which SNSs share a user base? What user characteristics are associated with pairs of SNSs?
We draw on methods used in the fields of data mining, information retrieval, and network science to do this. The approach mirrors techniques used for recommender systems such as “Customers who bought book
Data and Methods
We draw on a national survey study to explore SNS associations. The data set is based on a national sample of US adults 18 years old and over. Data collection took place online in summer 2016 through the University of Chicago’s NORC research unit using their AmeriSpeak panel. The panel is representative of the US population using “area probability sampling and includes additional coverage of hard-to-survey population segments such as rural and low-income households that are underrepresented in surveys relying on address-based sampling” (NORC, n.d.). After pretesting the survey with 23 respondents and updating it based on the results in early May 2016, we ran the survey from 25 May to 5 July 2016. (We note the change in popularity of the various services since 2016 when we discuss the results below.) For survey quality, we included an attention-check question and only analyze responses from participants who passed this question. The 1,512 participants reflect a 37.8% survey response rate.
Measures: Independent Variables
Background variables about respondents were supplied by NORC based on their earlier data collection about the AmeriSpeak panel. In this article, we analyze SNS associations by age, gender, education, and Internet skills, variables that past work on both digital inequality and SNS adoption has identified as important correlates of SNS adoption. Gender was collected and coded as binary. The survey asked respondents their date of birth, which was used to calculate their age. We created three education categories: high school or less, some college, and college degree or more. We collected data on Internet skills through a widely used measure (Hargittai & Hsieh, 2012) that asks respondents their level of understanding of various Internet-related terms on a 1–5 scale; averaging these responses creates an index score (Cronbach’s α = .94).
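The index construction and reliability check described above can be illustrated with a short sketch. This is a hedged Python illustration rather than the authors' code (their scripts are in R); the `ratings` values are hypothetical, the helper name `cronbach_alpha` is our own, and the formula used is the standard definition of Cronbach's α based on item variances and the variance of the summed scores.

```python
import statistics

# Hypothetical 1-5 ratings: one row per respondent, one column per term.
ratings = [
    [5, 4, 5],
    [3, 3, 2],
    [1, 2, 1],
    [4, 4, 5],
]

# Index score: each respondent's mean rating across the items.
index_scores = [statistics.mean(row) for row in ratings]


def cronbach_alpha(rows):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(rows[0])  # number of items
    item_vars = [statistics.variance(col) for col in zip(*rows)]
    total_var = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


print(index_scores)
print(round(cronbach_alpha(ratings), 2))
```

Values of α close to 1 indicate that the items move together, which is what justifies averaging them into a single skills score.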
Measures: Dependent Variable
To measure whether people use SNSs, we asked respondents whether they visit various sites with the following answer options: “No, have never visited it,” “Yes, have visited it in the past, but do not visit it nowadays,” “Yes, currently visit it sometimes,” “Yes, currently visit it often.” We recoded the answers into a binary no (for people who picked the first and second response options) and yes for those who use it currently (respondents who picked the third and fourth options). We collected this information for the following social media platforms, chosen for inclusion due to their popularity during or before the time of data collection: Facebook, GooglePlus, Instagram, LinkedIn, MySpace, Pinterest, Reddit, Snapchat, Tumblr, and Twitter.
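The recoding step described above can be sketched as follows. This is a minimal Python illustration, not the authors' code; the names `CURRENT_USE`, `NOT_CURRENT_USE`, and `recode_use` are hypothetical, the example respondent is invented, and only the answer wording is taken from the survey options quoted above.

```python
# Answer options as quoted in the text; the last two count as current use.
CURRENT_USE = {
    "Yes, currently visit it sometimes",
    "Yes, currently visit it often",
}

NOT_CURRENT_USE = {
    "No, have never visited it",
    "Yes, have visited it in the past, but do not visit it nowadays",
}


def recode_use(answer: str) -> int:
    """Map a four-option survey answer to 1 (current user) or 0 (not)."""
    if answer in CURRENT_USE:
        return 1
    if answer in NOT_CURRENT_USE:
        return 0
    raise ValueError(f"Unexpected answer option: {answer!r}")


# Hypothetical answers from one respondent, for illustration only.
answers = {
    "Facebook": "Yes, currently visit it often",
    "MySpace": "Yes, have visited it in the past, but do not visit it nowadays",
    "Reddit": "No, have never visited it",
}
binary_use = {site: recode_use(a) for site, a in answers.items()}
print(binary_use)
```

Raising an error on unknown strings, rather than silently coding them as 0, keeps missing or malformed answers from being counted as non-use.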
Sample Descriptives
Table 1 shows sample descriptives including figures on the right side of the table from the Pew Research Center’s 2015 Spring Tracking survey to offer comparisons of the sample composition. Almost the same number of men and women participated in this study, and the average age was 48.7. Just under 30% of respondents were ethnic and racial minorities (11.7% Hispanic, 11.5% African American, 3.1% Asian American, and 1.7% Native American). A quarter had no more than a high school education, just under a third (31.4%) completed some college, and 43.5% had at least a college degree. The average household income was US$71,478. Just under two-thirds were employed, and 13.4% lived in a rural area. Overall, it is a diverse sample, although somewhat less educated, less rural, and more racially diverse than Internet user figures according to the Pew Research Center.
Table 1. Sample Descriptives With Pew 2015 Data as Comparison.
Note. Income was measured differently, whereby Pew’s highest category was considerably lower than NORC’s.
Regarding their online experiences, respondents on average have been using the Internet for just over 11 years, have 4.8 locations where they can go online, and spend 14.7 hours on the Web weekly. Their Internet skills are varied; on a 1–5 scale, they average a 3.4 score.
Analysis
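We generate the association between two SNSs based on whether they have a similar set of users. Specifically, to compute the association between SNS $i$ and SNS $j$, we use the cosine similarity of their user bases:
$$\mathrm{sim}(i, j) = \frac{|U_i \cap U_j|}{\sqrt{|U_i| \cdot |U_j|}},$$
where $U_i$ and $U_j$ denote the sets of respondents who use SNS $i$ and SNS $j$, respectively.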

Figure 1. Number of social network sites people use.
To develop an understanding of the similarity score, consider the following examples from our data set. LinkedIn and Instagram have comparable numbers of users (501 vs 517). Of these people, 245 used both platforms, yielding a similarity score of .48. Had there been 500 people who used both platforms, the similarity score would be nearly perfect (.98), whereas if the number of shared users had been small, say 50, then the similarity score would drop to .10. Looking next at two SNSs with very different popularity such as Facebook and Reddit (1,201 vs 182 users), with 159 shared users their similarity score is .34. In this case, we would reach a maximal similarity score if all 182 Reddit users were also on Facebook. Yet, this maximum value would only be .39 since, generally speaking, Reddit has considerably fewer users. If the two platforms had shared 18 users, the similarity would become .04. Note that 18 users in the latter case means 10% of the maximum 182 possible, just like 50 shared users means 10% of the maximum possible in the LinkedIn–Instagram case. However, the penalty was higher, as indicated by the resulting similarity values of .04 versus .10, due to the different user base sizes of the platforms being compared.
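The reported scores can be checked with a few lines of code. The sketch below is in Python rather than the authors' R, and it assumes that the similarity is the standard cosine similarity between binary usage vectors, which for two platforms reduces to the number of shared users divided by the square root of the product of the two user counts. The function name `cosine_similarity` is our own; the counts are those given in the text.

```python
import math


def cosine_similarity(n_a: int, n_b: int, n_shared: int) -> float:
    """Cosine similarity of two binary user vectors, from user counts."""
    return n_shared / math.sqrt(n_a * n_b)


# Counts reported in the text.
linkedin_instagram = cosine_similarity(501, 517, 245)
facebook_reddit = cosine_similarity(1201, 182, 159)
# Upper bound for Facebook-Reddit: every Reddit user is also on Facebook.
facebook_reddit_max = cosine_similarity(1201, 182, 182)

print(round(linkedin_instagram, 2))   # 0.48
print(round(facebook_reddit, 2))      # 0.34
print(round(facebook_reddit_max, 2))  # 0.39
```

The Facebook–Reddit ceiling of .39 shows why the score penalizes pairs with very unequal user bases: even complete overlap cannot reach 1 when one platform is much smaller than the other.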
We computed the cosine similarity and created the networks using custom scripts written in the statistical programming language R. Our full code is publicly available on GitHub at https://github.com/LINK-NU/SMS21-Horvat-and-Hargittai. For network visualization, we used a freely available graph editor called yEd. This software offers a wide variety of layout algorithms, including ones based on the force-directed layout paradigm, and it allows users to tune several parameters such as desired edge length, edge routing, and node labeling. To ensure that our network visualizations are not cluttered, we chose a cut-off of 10 SNS associations to show per network such that figures and tables convey information only about the strongest associations. This cut-off is not inherently motivated by the analytic method and can be freely changed based on application and visualization needs.
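To make the pipeline concrete, the following Python sketch scores every pair of platforms and keeps only the strongest associations, which would then serve as the edges of the network. It is an illustration under stated assumptions and does not reproduce the authors' R scripts: the `respondents` data are hypothetical, and `top_associations` and its cut-off parameter `k` are names introduced here.

```python
import math
from itertools import combinations

# Hypothetical binary usage data: one dict per respondent (1 = current user).
respondents = [
    {"Facebook": 1, "Instagram": 1, "Reddit": 0},
    {"Facebook": 1, "Instagram": 0, "Reddit": 0},
    {"Facebook": 1, "Instagram": 1, "Reddit": 1},
    {"Facebook": 0, "Instagram": 1, "Reddit": 1},
]


def top_associations(rows, k=10):
    """Return the k platform pairs with the highest cosine similarity."""
    sites = sorted(rows[0])
    # Set of respondent indices using each platform.
    users = {s: {i for i, r in enumerate(rows) if r[s]} for s in sites}
    scored = []
    for a, b in combinations(sites, 2):
        if users[a] and users[b]:  # skip platforms with no users
            shared = len(users[a] & users[b])
            score = shared / math.sqrt(len(users[a]) * len(users[b]))
            scored.append((a, b, score))
    # Strongest associations first; these become the edges of the network.
    return sorted(scored, key=lambda t: t[2], reverse=True)[:k]


edges = top_associations(respondents, k=2)
for a, b, score in edges:
    print(f"{a} - {b}: {score:.2f}")
```

Changing `k` corresponds to changing the visualization cut-off; the underlying scores are unaffected.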
SNS Use
Eleven percent of respondents do not use any social media while an additional 19% use only one. These participants do not provide any relevant data for examining SNS associations and thus are excluded from the analyses that follow. Figure 1 shows the total number of SNSs people use indicating that use of multiple platforms is common. Figure 2 displays the SNSs in order of popularity with over three-quarters of respondents using the most popular platform, Facebook, at the time of data collection and only 2% using the least popular MySpace. Given that some time has elapsed since these data were collected, we note here changes in popularity of these sites based on data collected by the Pew Research Center (2019) as comparison. (Because their figures are based on all American adults rather than Internet users only like our study, we are only noting changes over time, not use percentages.) According to Pew, Facebook use has been stable since 2016 as has Pinterest with LinkedIn and Twitter gaining just a few percentage points. Instagram has seen the most significant gains. Pew did not measure the popularity of the remaining services until 2018 or later so comparisons are hard to show, but they all remain below LinkedIn in diffusion.

Figure 2. Percentage of respondents who use various social network sites.
SNS associations quantified by the cosine similarity can be used to create networks in which SNSs are linked according to the similarity of their user bases. Figure 3 shows the top 10 SNS associations for the entire sample. The darker and thicker the line between two sites, the higher their similarity score, that is, the larger their shared user base. A similarity score of 1 would indicate that two platforms have the exact same users, while a score of 0 would mean that the sites do not share any users. We find that the similarity scores in our full data sample range from 0.43 to 0.65. The site pairs Facebook–Pinterest (similarity = 0.65), Instagram–Snapchat (0.63), Facebook–Instagram (0.61), and Twitter–Instagram (0.61) are the strongest by association, sharing the most users. Conversely, Twitter–GooglePlus (similarity = 0.43) and Tumblr–Reddit (0.44) are the least similar based on their shared user base in the top 10. For a list of associations between all possible pairs of SNSs, see the table in the Appendix. This table indicates that the overall weakest associations all involve MySpace (similarities between 0.1 and 0.22). The next smallest shared user bases are registered between GooglePlus and Reddit (similarity = 0.28), and Pinterest and Reddit (similarity = 0.28).

Figure 3. Network of the top 10 SNS associations.
This initial, full-sample analysis is helpful for identifying general patterns of pairs. However, because we know from prior literature that people of different backgrounds select into the use of social media platforms at different rates, it is also important to examine SNS pairings by user type. Next, we look at whether the association between SNSs is stronger or weaker for certain types of users. We do this by calculating the association scores by age, gender, education, and Internet skills to show to what extent different types of people use different mixes of platforms.
Table 2 lists the top site associations for men and women, while Figure 4 shows them graphically. We find that the user base of various SNSs varies considerably by gender. While some SNS associations are similar for both men and women (e.g., Facebook–Instagram, Twitter–Instagram), others are mostly relevant for just one or the other group. For example, Reddit–Snapchat have the 12th highest association among men, while that association is not among the top 20 for women. In contrast, the highest similarity score (0.76) for women is Facebook–Pinterest, which is only in the eighth spot with a considerably lower score (0.49) for men. These numbers arise from the fact that while 86.5% of women use Facebook and 58.8% use Pinterest, 54.1% use both SNSs. At the same time, 72.1% of men use Facebook, 23.1% use Pinterest, and 19.9% use both. Accordingly, the similarity between Facebook and Pinterest in the full sample (0.65) is driven mostly by the female sub-population. This is an important nuance that the full-sample analysis would not have revealed. Our gender-based SNS association analysis also makes it possible to identify the pairs of platforms that could be used in a communication campaign to target the two groups while minimizing redundancy. For instance, relying on Pinterest and Snapchat would minimize the male users who are exposed to the message twice (23.1% of men use Pinterest, 17% use Snapchat, only 6.9% use both, similarity is 0.35), while using Snapchat and LinkedIn would minimize the overlap in the female sample (31.2% of women use LinkedIn, 23.3% use Snapchat, 8.8% use both, similarity is 0.33).
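The subgroup scores above follow directly from the reported percentages, since the group size cancels out of the cosine similarity. The Python sketch below checks three of them and adds one quantity that the text does not report: the share of a group reached by at least one of two platforms, computed by inclusion-exclusion. The function names `similarity_from_shares` and `combined_reach` are hypothetical, and the reach figure is our own illustrative calculation, not a result from the study.

```python
import math


def similarity_from_shares(pct_a: float, pct_b: float, pct_both: float) -> float:
    """Cosine similarity computed from usage percentages within one group."""
    return pct_both / math.sqrt(pct_a * pct_b)


def combined_reach(pct_a: float, pct_b: float, pct_both: float) -> float:
    """Share of the group reached by at least one of the two platforms."""
    return pct_a + pct_b - pct_both


# Percentages reported in the text.
women_fb_pin = similarity_from_shares(86.5, 58.8, 54.1)
men_fb_pin = similarity_from_shares(72.1, 23.1, 19.9)
men_pin_snap = similarity_from_shares(23.1, 17.0, 6.9)

print(round(women_fb_pin, 2))  # 0.76
print(round(men_fb_pin, 2))    # 0.49
print(round(men_pin_snap, 2))  # 0.35
# Men reached by Pinterest or Snapchat, counting double-exposed users once.
print(round(combined_reach(23.1, 17.0, 6.9), 1))  # 33.2
```

Looking at similarity and combined reach together separates two goals that a campaign may weigh differently: avoiding repeated exposure and covering as much of a group as possible.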
Table 2. SNS Similarity Scores by Gender (Female on Left, Male on Right).

Figure 4. Network of the top SNS associations by gender (female on left, male on right).
Next, we investigate SNS associations by age. When comparing the youngest (age 18–33 years) and oldest (age 62–94 years) respondents (chosen as the lowest and highest quartiles in the sample’s age distribution), we find that the younger sample has systematically higher associations, ranging between 0.79 and 0.56, than the older sample, with associations between 0.52 and 0.35 (see Table 3). Among younger people, the most strongly associated SNS pair is Facebook–Instagram, followed by the Instagram–Snapchat pair (similarity is 0.76). Among the older age group, the strongest association is between Facebook and Pinterest, a pair that is in top positions for several user types, as we will see below. Figure 5 shows how the two groups vary in their SNS associations.
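Selecting the lowest and highest age quartiles can be sketched as follows. This is a Python illustration with hypothetical ages; the authors' exact quartile convention is not stated in the text, so the `"inclusive"` method used here is an assumption, and the variable names are our own.

```python
import statistics

# Hypothetical respondent ages, for illustration only.
ages = [19, 24, 27, 31, 35, 42, 48, 53, 57, 61, 66, 72]

# Three cut points that divide the sample into four equal-sized groups.
q1, _median, q3 = statistics.quantiles(ages, n=4, method="inclusive")

# Keep only the two extreme groups for the comparison.
youngest = [a for a in ages if a <= q1]
oldest = [a for a in ages if a >= q3]

print(q1, q3)
print(youngest)
print(oldest)
```

The similarity scores would then be computed separately within each of the two groups, so that each group gets its own network of associations.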
Table 3. SNS Similarity Scores by Age.
SNS: social network site.

Figure 5. Network of the top SNS associations by age.
Table 4 lists the top SNS associations by level of education (also see Figure 6). Again, we observe considerable variation by user background. For example, while Facebook and LinkedIn are the most highly associated for those with a college degree or more (78.4% use Facebook, 48.2% use LinkedIn, 43.2% use both, similarity is 0.7), this pair is not in the top 10 associations among those with no more than a high school education (76.5% use Facebook, 14.5% use LinkedIn, 12.5% use both, similarity is 0.37). For the latter group, the highest association is for the Instagram–Snapchat pair (30.6% use Instagram, 21.4% use Snapchat, 16.6% use both, similarity is 0.65), which is also in the top 2 among those with some college education (34.6% use Instagram, 23.3% use Snapchat, 18.4% use both, similarity is 0.65), but only in sixth place for those with a college degree (36.1% use Instagram, 17.1% use Snapchat, 14.9% use both, similarity is 0.6). There is, however, one association that is in a top position regardless of education: Facebook–Pinterest.
Table 4. SNS Similarity Scores by Level of Education.
SNS: social network site.

Figure 6. Network of the top SNS associations by education.
Finally, we look at how SNS associations may vary by Internet skills (see Table 5 and Figure 7). Here, we find that the Facebook–Instagram association is especially strong between users with high Internet skills (87.8% use Facebook, 54.4% use Instagram, 51.6% use both, similarity is 0.75). The associations between Facebook–LinkedIn (0.72), Facebook–Pinterest (0.7), and Twitter–Instagram (0.7) are also high and actually the top eight associations are stronger than any associations based on the full sample. This indicates that there is a high agreement among Internet-savvy users in terms of the SNSs they use. In contrast, SNS associations between users with the lowest Internet skills are significantly weaker, that is, they are equal to or
Table 5. SNS Similarity Scores by Internet Skills.
SNS: social network site.

Figure 7. Network of the top SNS associations by Internet skills.
Discussion and Conclusion
Inspired by the media repertoire and media multiplexity perspectives, this article takes as its premise that most people use a combination of social media, and thus research looking at their adoption should do the same. While previous research has shown that different people select into the use of various SNSs at different rates, such work has only looked at the adoption of individual sites in isolation from the adoption of other such sites (e.g., Blank & Lutz, 2017; Hargittai, 2020; Hargittai & Litt, 2011). Knowing that certain types of people are more or less likely to use various social media does not allow us to determine whether this selection also applies to SNS pairs. This article contributes to the literatures on media multiplexity and on SNS use by showing the shared user bases of various social media platforms by user attributes.
The networks mapped out in this study reflect how dis/similar SNSs are in terms of appeal and affordances to the same pairs of users. Our research highlights the fact that more platforms do not necessarily mean more diverse voices represented on platforms if they are being used by similar people. The findings heighten potential concerns associated with questions of social inclusion, segregation and empowerment through social media (e.g., Costanza-Chock, 2020; Jackson et al., 2020; Tufekci, 2017), because they show that not using one platform can be non-trivially connected to not using a suite of other platforms.
This study thus adds a new dimension of applying social network analysis to media use scholarship about media repertoires and the multiplexity of people’s media uses by presenting connections between social media platforms based on their usage patterns. While prior work has done this in the realm of news and journalism (e.g., Mukerjee et al., 2018; Taneja & Webster, 2016), we know of no other work to have done so for SNS associations. Our networks map out the structural underpinnings of SNS use. Moreover, we investigate and find striking differences in the associations between SNSs based on user attributes. More broadly, our work emphasizes how individual SNS uses are compounding to system-level SNS networks with clusters that end up being more or less accessible for people with different backgrounds.
In addition to the theoretical implications of the findings, there are also practical ones. For example, if an organization wishes to reach people across the population such as at the time of a crisis or for political campaigning purposes, it may think that targeting different social media platforms will meet its needs. However, as we show, which particular mix of social media it targets is crucial to reaching its goals. For example, basing such outreach on Twitter and Snapchat would oversample men without guaranteeing to reach women just as going for a mix of Facebook and Pinterest would leave out a considerable portion of men. Rather, our results indicate that the communication strategy would be more successful by choosing only one of the SNSs from the highly similar pairs to save resources and avoid disseminating information to the same people multiple times. Extending prior work, our results also suggest that not including multiple SNSs in information campaigns might lead to a large-scale and systematic exclusion of certain demographic groups which in turn can increase harmful information gaps (Gans, 2020). It is essential to target users efficiently through multiple SNSs and our results inform such efforts by highlighting which SNS combinations will increase coverage of different demographic groups the most.
Our findings also have important implications for data-collection projects that rely on SNSs as their sampling frames. Prior work has already noted that basing a study’s sampling frame on only one platform poses major limitations, as the data will be biased against populations that are systematically less likely to be on said platform (Blank & Lutz, 2017; Hargittai, 2015; Tufekci, 2014). What our study shows is that simply casting a wider net to include more than one platform may not in and of itself address the sampling biases of individual platforms. If projects target sites that have high associations for particular users, then they will likely reach the same people rather than a diverse set. Our findings suggest that not only are individual SNSs biased in whose voices they represent, so are SNS pairs.
Future work could extend the method used here from dyads to larger clusters of SNSs. A generalized approach focusing on clusters of three or more SNSs would allow communication campaigns to find their audiences more flexibly depending on the size of their effort (i.e., the number of SNSs whose users they can target) and the expected overlaps between those sites’ user bases.
While the article makes important novel contributions to the literature on social media adoption and optimal communication strategy using such channels, it also has limitations. Our measure of social media use is a basic one simply gauging whether someone visits a platform at least sometimes or not. What people do on such sites, how they approach the content they see, and how it fits into their broader information-seeking repertoires are factors we are not able to consider. In addition, people’s use of various platforms fluctuates over time, and so findings about specific sites and their associations with others may be different as people abandon some sites and join others (Lazer et al., 2021). Nonetheless, the general point about platform associations holds regardless of specific usage levels.
With these limitations in mind, the most important contribution of this article is the proof of concept regarding bias by SNS associations, not which specific associations may be more or less popular across sample groupings at any particular time. Theoretically, the findings support the importance of the media repertoire and media multiplexity perspectives, showing that there is important insight to be gained from focusing on more than one platform when studying people’s communication practices. Given that we show considerable variation in SNS dyads across population groups, it is important that future research with up-to-date platform popularity also investigates such divergences so as to avoid the biases that can stem from relying on sites that cater to the same user groups.
