Sage Journals: Discover world-class research

Abstract

The COVID-19 pandemic in the United States has been characterized by political partisan differences and opinions that influenced individual behavior and political policies, correlating with varying outcomes. Measuring these differences in opinion using traditional methods, such as opinion polls, can be costly and provides only a snapshot of sentiment at a given time. Social media platforms like X (formerly Twitter) may offer a real-time tool for assessing opinions to explore the relationship between differing viewpoints about the pandemic and how these differences impact the occurrence and severity of disease burden. Utilizing an open-source data set of 1.75 million keyword-selected X (Twitter) posts, updated weekly from January 1, 2020, through December 31, 2021, along with publicly available COVID-19 case and death counts and county-level data on election outcomes from the 2020 election, we analyzed how the volume of tweets on the X (Twitter) platform related to COVID-19 has changed over time and how patterns of use vary across the partisan divide. We discovered that Democratic-leaning counties exhibited higher X (Twitter) volume associated with COVID-19 topics compared to Republican-leaning counties in response to changes in case or death rates. In addition, we found a higher proportion of tweets in urban counties compared to rural ones.

Keywords

partisanship, engagement, X (Twitter), COVID-19

The SARS-CoV-2 (COVID-19) pandemic has reached every corner of the world. Understanding the spatial and temporal patterns of behavior, social dynamics, and policies, as well as their interrelations, can provide critical knowledge for preparatory measures and effective responses to future pandemics. One puzzling area, regardless of how well our preparatory and response mechanisms are established, is the variability in people’s perceptions and reactions to a pandemic and its mitigation policies.

Views on the response to the COVID-19 pandemic have varied at the individual level, and state-level policies have reflected different political landscapes, resulting in stark differences of opinion regarding vaccine uptake (Sehgal et al., 2022), social distancing (Bisbee & Lee, 2021), and mask use (Young et al., 2022). Even before the vaccine rollout, political affiliation was associated with differing perspectives on vaccine policy—by May 2020, the odds of a Democratic-leaning respondent holding a favorable view of a COVID-19 vaccine were 5.4 times that of a Republican respondent, although this odds ratio decreased to 2.4 by October 2020 (Golos et al., 2022). Partisan differences were evident even in the earliest days of the pandemic, with Democrats 8.8% more likely to wash their hands and 12% more likely to avoid gatherings in March 2020 (Gadarian et al., 2021).

The initial months of the COVID-19 pandemic had a more significant impact on Democratic-leaning counties in terms of cases and deaths, particularly in large urban centers (Kaashoek et al., 2022). Later, Republican counties experienced higher rates of COVID-19 cases and deaths (Chen & Karim, 2022; Desmet & Wacziarg, 2022). These Republican counties faced greater COVID-19 impacts due to a combination of behavioral, structural, and policy differences, including lower vaccine uptake (Sehgal et al., 2022).

A major challenge in studying the social and behavioral aspects of COVID-19 is that traditional data collection methods are slow, labor-intensive, and expensive. Moreover, these methods may inadequately measure key population characteristics and behaviors, even with robust probability sampling. In contrast, social media data, including information from social networking services (e.g. X (Twitter), Facebook, and Instagram), provide vast amounts of rich textual data in real time. For instance, researchers conducted sentiment analysis on Facebook and X (Twitter) posts to gauge public attitudes toward COVID-19 vaccines (Hussain et al., 2021). Among these data sources, X (Twitter) offers a highly accessible Big Data stream and has attracted attention from researchers across various disciplines (Ruths & Pfeffer, 2014). By analyzing tweet mentions over space and time, researchers can track phenomena such as public awareness of COVID-19 (Sakun & Skunkan, 2020), attitudes toward social distancing and lockdowns (Jia et al., 2020), and concerns about reopening communities and the economy (Rahman et al., 2021), as well as events related to public health and safety, such as anti-Asian sentiment (Jang et al., 2021) and mental health problems (Valdezten et al., 2020). In addition, because the data set is large, investigators can extract meaningful information from even small geographic units (e.g. counties and neighborhoods) and fine time intervals (e.g. specific days).

X (Twitter) may serve as a tool for exploring political partisanship and social responses to COVID-19. One study identified a positive correlation between COVID-19-related deaths and tweets at state and county levels during the early phases of the pandemic, from January 25 through May 10, 2020 (Feng & Zhou, 2022). An analysis of individual X (Twitter) accounts from February to July 2020 revealed that national and state politicians differ in their presentation of information on the platform, with Democratic users emphasizing the pandemic and government response, while Republican users focused more on individual actions, support for businesses, and external political entities such as other countries (Jing & Ahn, 2021). A 2020 study noted that political messaging varied by party, even in the first year of the pandemic (Engel-Rebitzer et al., 2022). Leading up to the 2020 U.S. election, there was a significant partisan divide in X (Twitter) discussions, with Democrats concentrating more on health and COVID-19 as an election issue, while Republicans prioritized the economy (VanDusky-Allen et al., 2022). There is some evidence of geographic differences in partisan responses to COVID-19; one study examining sentiment in non-metropolitan and metropolitan areas on X (Twitter) found that tweets from rural locations often expressed more negative sentiment regarding prevention-related topics such as vaccines, demonstrating polarized reactions to politicians (Liu et al., 2023). This supports the notion of an urban/rural divide on X (Twitter).

However, the studies examining discourse on X (Twitter) regarding COVID-19 were conducted at the national or state level rather than at more granular levels, and they cover limited time periods; no existing research has studied ongoing activity over the months and years that followed the early phases of the pandemic. In this study, we use geotagged tweets collected in 2020 and 2021 to compare the frequency of tweets on COVID-19 topics with COVID-19 case and death rates. We investigate how X (Twitter) usage related to COVID-19 varies across counties based on their political affiliation, as defined by voting behavior in the 2020 presidential election. Although X (Twitter) Inc. has restricted data downloads by imposing a high access fee (X (Twitter) Developer Platform, 2024), leveraging the extensive data collected during the pandemic can provide insights into the use of future (potentially vast) social media data to study people’s perceptions and reactions to future pandemics and mitigation policies.

Data and Methods

We downloaded daily counts of COVID-19 cases and deaths at the county level from The New York Times (2021), which were based on reports from state and local health agencies, and calculated the daily changes in cases and deaths per county. Between January 1, 2020, and December 31, 2021, we collected 53,472,344 geotagged tweets with their locations set to the centroid of each county in the United States. From this collection, we created a subset of 1,744,099 tweets tagged with COVID-19-specific keywords. Methods for collecting the X (Twitter) data set are discussed in E-Supplement 1.
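The step of calculating daily changes can be sketched as follows, assuming the downloaded county series are cumulative totals (an assumption about the file layout, not something stated above). The function name and the sample values are hypothetical and serve only as illustration.

```python
def daily_changes(cumulative):
    """Convert a cumulative count series into day-over-day changes.

    Assumption: the change on the first day equals the first cumulative value.
    """
    changes = []
    previous = 0
    for total in cumulative:
        changes.append(total - previous)
        previous = total
    return changes


# Hypothetical cumulative case counts for one county over five days.
cumulative_cases = [0, 3, 3, 10, 8]
print(daily_changes(cumulative_cases))  # prints [0, 3, 0, 7, -2]
```

The negative value on the last day shows how a downward revision by a health agency would surface as a negative daily change; how such corrections were handled in the study is not described here.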

To investigate data on partisanship in these counties, we used county-level data from the 2020 election accessed through the Massachusetts Institute of Technology (MIT) Election Data and Science Lab (MIT Election Data and Science Lab, 2018). We classified each county as partisan Republican, partisan Democrat, or politically neutral based on the percentage of votes cast for each party. A county with more than 60% of votes for the Republican candidate (Donald Trump) was categorized as a Republican-leaning county, a county with more than 60% of votes for the Democratic candidate (Joseph Biden) was categorized as a Democratic-leaning county, and the remaining counties were classified as politically neutral. We also repeated the analysis with a lower cutoff of 55%. In addition, counties were classified as either metropolitan or non-metropolitan (rural) based on Census definitions (Hall et al., 2006). The counts of COVID-19 cases, deaths, and tweets were averaged over the preceding 7 days to obtain a rolling average, and we created a “tweet fraction” as the number of COVID-19-tagged tweets per day divided by the total number of tweets per day for each county. For comparison across counties with different populations, we used US Census data to standardize the number of COVID-19 cases and deaths, as well as daily tweet counts, per 100,000 population (Schroeder et al., 2025).
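The derived variables described above can be illustrated with a minimal Python sketch. All function names are hypothetical. Two details are assumptions rather than documented choices: the 7-day window is taken to include the current day, and the first six days use whatever shorter window is available.

```python
def classify_county(rep_share, dem_share, cutoff=0.60):
    """Label a county by 2020 vote share; shares are fractions of votes cast."""
    if rep_share > cutoff:
        return "Republican-leaning"
    if dem_share > cutoff:
        return "Democratic-leaning"
    return "neutral"


def trailing_mean(values, window=7):
    """Average each day with up to `window - 1` preceding days (assumed window)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def tweet_fraction(covid_tweets, all_tweets):
    """COVID-19-tagged tweets divided by all tweets; None on days with no tweets."""
    return covid_tweets / all_tweets if all_tweets else None


def per_100k(count, population):
    """Standardize a count to a rate per 100,000 residents."""
    return count / population * 100_000


# Hypothetical county: 58% Republican vote, 40% Democratic vote.
print(classify_county(0.58, 0.40))               # neutral at the 60% cutoff
print(classify_county(0.58, 0.40, cutoff=0.55))  # Republican-leaning at 55%
```

Returning `None` for a day without any tweets keeps an undefined fraction distinct from a true zero, which is one possible way to avoid treating missing collection days as days with no COVID-19 discussion.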

We created plots that illustrate temporal trends in nationwide tweets and COVID-19 cases over the entire study period. We then conducted a simple ordinary least squares (OLS) regression to examine whether COVID-19-related X (Twitter) frequency was influenced by the rates of COVID-19 cases and deaths during the 2-year period from January 2020 through December 2021. We analyzed both the total number of tweets (averaged over the previous 7 days) and the proportion of COVID-19-tagged tweets. To account for changing engagement with COVID-19 on X (Twitter) over time, we included a quarterly fixed effect to accommodate temporal variation in a time-stratified model. We repeated this analysis using a definition of “partisanship” based on the partisan divide of the 2020 election outcome and further explored results by level of partisanship, with separate analyses for Democratic, Republican, and politically neutral counties. Results were also replicated for the 60% cutoff and 55% cutoff. As a robustness check, we examined whether the association varied by the metropolitan/non-metropolitan status of the counties. In addition, we reran our analysis using a monthly term and a natural cubic spline with 7 degrees of freedom per year to address temporal patterns. We tested this model with an interaction term, rather than stratifying, to further investigate potential differences in the associations between COVID-19 cases or deaths and tweeting patterns across metropolitan/non-metropolitan or partisan divides. Finally, we examined a model that included both metropolitan and partisan status, along with case or death rates, to assess the relative impact of these factors.
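To make the time-adjusted regression concrete, here is a minimal Python sketch of OLS with quarter indicator variables, solved through the normal equations. It is not the study's Stata code: the helper names, the reference-quarter coding, and the toy data are all assumptions, and the sketch returns point estimates only, without the standard errors or confidence intervals reported in the tables.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def design_row(rate, quarter, quarters):
    """Intercept, exposure rate, and one dummy per quarter except the first (reference)."""
    return [1.0, rate] + [1.0 if quarter == q else 0.0 for q in quarters[1:]]


# Toy data (hypothetical): tweet fraction built from a known slope of 0.02.
quarters = ["2020Q2", "2020Q3", "2020Q4"]
effects = {"2020Q2": 0.0, "2020Q3": -0.01, "2020Q4": -0.02}
observations = [(0.1, "2020Q2"), (0.4, "2020Q2"), (0.2, "2020Q3"),
                (0.5, "2020Q3"), (0.3, "2020Q4"), (0.6, "2020Q4")]
rows = [design_row(rate, q, quarters) for rate, q in observations]
y = [0.05 + 0.02 * rate + effects[q] for rate, q in observations]

beta = ols(rows, y)  # [intercept, slope on rate, Q3 effect, Q4 effect]
print([round(b, 4) for b in beta])
```

Because the toy outcome is an exact linear function of the inputs, the fitted slope on the rate should recover the value used to build the data; with real county-day data the same design would include partisanship indicators as additional columns.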

Analyses were carried out in Stata/MP 17.0.

Results

COVID-19 and Tweet Behavior

Figure 1a illustrates national trends in COVID-19 cases per 100,000 population and COVID-19-tagged tweets per 100,000 population from January 1, 2020, through December 31, 2021. Figure 1b presents trends in COVID-19 deaths per 100,000 and COVID-19-tagged tweets per 100,000 population. Figure 1a indicates that COVID-19 cases rose over time in a series of waves, with COVID-19-tagged tweets peaking early and then stabilizing. In Figure 1b, we observe that the pattern of COVID-19 deaths also followed a series of waves, but deaths did not continue to rise over time, likely because of vaccines and prior immunity.

Figure 1.

(a) Daily rate of COVID-19 cases per 100,000 population and daily rate of COVID-19 keyword geotagged tweets per 100,000 US population. (b) COVID-19 deaths per 100,000 US population and rate of COVID-19 tweets per 100,000 US population. Cases, deaths, and tweets are smoothed with a 7-day rolling average.

There was significant variability in day-to-day reporting of both COVID-19 cases and tweet mentions, so we included a smoothed 7-day rolling average of cases (Bergman et al., 2020). The volume of COVID-19-tagged tweets surged with the initial onset of COVID-19 in Spring 2020, gradually decreasing to a relative plateau during the second wave before dropping to lower levels over the following 2 years, measured both in absolute terms and as a fraction of the total number of tweets. The irregular pattern of tweets is due to missing days of X (Twitter) data, which are indistinguishable from days with zero tweets. The spike in deaths in Fall 2021 is an artifact of a large number of earlier deaths being reported on a single date.

Partisanship and COVID-19-Related Tweet Behavior

We summarized the total count of geotagged tweets containing COVID-19 keywords (i.e. COVID-19-tagged tweets), the total count of all daily geotagged tweets, the proportion of tweets coded as COVID-19-tagged, and the total daily counts of COVID-19 cases and deaths. In addition, we calculated cases, deaths, and tweets per 100,000 population for the nation, as well as separately for counties that were Democratic-leaning, Republican-leaning, or politically neutral (Table 1). Seven-day rolling averages of COVID-19-tagged tweets per 100,000 population are presented for metropolitan and non-metropolitan counties, and separately for Democratic, Republican, and neutral counties in Figure 2a and b. Cases per 100,000 population are displayed by metro/non-metro status (Figure 2c) and by Democratic, Republican, or neutral status (Figure 2d), while deaths by metro/non-metro status and partisan status are shown in Figure 2e and f.

Table 1.

Total Number of COVID-19-Tagged Tweets Collected From January 1, 2020, Through December 31, 2021, Mean of Total Daily Tweets, Fraction of all Tweets Coded as COVID-19 Related, Total Daily COVID-19 Cases, and Total Daily COVID-19 Deaths, as Well as Cases, Deaths, and Tweets per 100,000 Population for the Nation as a Whole and Separately for Democratic-Leaning, Republican-Leaning, and Politically Neutral Counties.

	Total COVID-19-tagged tweets	Total daily COVID-19 cases	Total daily COVID-19 deaths	Fraction of tweets related to COVID-19	Total daily COVID-19-related tweets
		Mean (SD)	Mean (SD)	Mean (SD)	Mean (SD)
All Counties	1728546	74158.2 (75708.4)	1121.5 (1078.0)	.043 (.026)	2377.6 (2072.2)
Neutral	585531	31210.9 (32393.3)	438.9 (475.832)	.039 (.02)	805.4 (680.9)
Democratic (> 60% D)	945367	23875.3 (27134.9)	357.40 (363.4)	.049 (.033)	1300.4 (1177.1)
Republican (> 60% R)	197648	19098.2 (19909.2)	325.6 (365.4)	.032 (.017)	272.2 (221.4)
Non-metro	139334	11185.2 (11571.8)	200.7 (213.1)	.031 (.018)	191.9 (160.7)
Metro	1589212	62988.4 (65718.5)	921.1 (912.3)	.044 (.028)	2186.0 (1915.2)
	Cases/100 K pop (SD)		Deaths/100 K pop (SD)		COVID-19 tweets/100 K pop (SD)
All Counties	21.5 (18.4)		.332 (.247)		.702 (.612)
Neutral	22.9 (19.3)		.329 (.248)		.603 (.510)
Democratic	19.8 (18.3)		.307 (.277)		1.114 (1.008)
Republican	25.2 (22.7)		.432 (.375)		.361 (.293)
Non-metro	24.3 (21.6)		.438 (.366)		.419 (.351)
Metro	22.0 (19.0)		.330 (.251)		.781 (.684)

Figure 2.

Rates of tweets per 100,000 population by (a) metro/non-metro and (b) county-level partisanship (Democratic/Republican/politically neutral); cases per 100,000 population by (c) metro/non-metro and (d) county-level partisanship; and deaths per 100,000 population by (e) metro/non-metro and (f) county-level partisanship.

Results revealed that Republican-leaning counties had higher daily case rates per 100,000 (25.2; SD 22.7) than politically neutral counties (22.9; SD 19.3) (p = .036), and Democratic-leaning counties (19.8; SD 18.3) had lower case rates per 100,000 than politically neutral counties (22.9; SD 19.3) (p = .003). For deaths per 100,000 population, we similarly found that Republican-leaning counties showed higher death rates (.432; SD .375) than politically neutral counties (.329; SD .248) (p < .001). Democratic-leaning counties had daily mortality rates per 100,000 (.307; SD .277) comparable to those of politically neutral counties (.329; SD .248) (p = .168). Furthermore, Democratic-leaning counties had a higher rate of COVID-19-tagged tweets per 100,000 population (1.114; SD 1.008) than politically neutral counties (.603; SD .510) (p < .001), while Republican-leaning counties had a lower rate of daily tweets per 100,000 population (.361; SD .293) than politically neutral counties (.603; SD .510) (p < .001).

We conducted simple OLS regressions to examine the association between daily COVID-19 case rates per 100,000 and tweet behavior, adjusting for partisanship at the county level. To account for the long-term trend of decreasing frequency of COVID-19-related terms on X (Twitter), we included a variable for the quarter of the year to capture these temporal trends. In addition, we tested this association using daily deaths per 100,000 population. We then performed these regressions separately based on the partisan status of each county. The regression results are presented in Table 2.

Table 2.

OLS Regression Results for Relationship Between Change in Fraction of all Geotagged Tweets With a COVID-19 Keyword per 100,000 Population and Cases or Deaths per 100,000 Population for all Counties Combined, Adjusted for Partisanship and Temporal Term for Quarter of Year, and by Partisan Subgroup Adjusted for Quarter; and Change Association in Daily Total of Tweets per 100,000 Population and Cases or Deaths per 100,000 Population for all Counties Combined Adjusted for County-Level Partisanship and Quarterly Temporal Term, and Separate Results by Partisanship Adjusted for Quarter.

		COVID-19 tweets per 100 K population		Fraction of total tweets with a COVID-19 keyword
		Estimate (95% CI)	p	Estimate (95% CI)	p
All Counties
	Deaths per 100 K population	.699 [.617, .780]	< .001	.024 [.021, .027]	< .001
	Cases per 100 K population	.004 [.003, .005]	< .001	.0001 [.0001, .0002]	< .001
Metro/Non-Metro Characteristic of County
Non-Metro	Deaths per 100 K population	−.007 [−.079, .066]	.855	−.002 [−.006, .003]	.473
	Cases per 100 K population	−.000 [−.002, .001]	.537	.000 [−.0001, .0001]	.525
Metro	Deaths per 100 K population	.545 [.359, .731]	< .001	.024 [.016, .032]	< .001
	Cases per 100 K population	.002 [−.0002, .005]	.071	.0001 [.0000, .0002]	.027
Partisan Characteristic of County
Neutral	Deaths per 100 K population	.186 [.048, .324]	.008	.009 [.003, .016]	.004
	Cases per 100 K population	.001 [−.001, .003]	.226	.000 [.000, .000]	.402
Democratic-Leaning	Deaths per 100 K population	1.443 [1.181, 1.705]	< .001	.0543 [.0455, .0631]	< .001
	Cases per 100 K population	.005 [.001, .008]	.006	.0002 [.0001, .0003]	< .001
Republican-Leaning	Deaths per 100 K population	.011 [−.0458, .068]	.702	.0001 [−.004, .004]	.965
	Cases per 100 K population	.000 [−.001, .001]	.952	.000 [−.0001, .0001]	.724

Regression results indicate that, after adjusting for county-level partisanship and a quarterly fixed-effect term for time, each increase of one daily case per 100,000 corresponds to an overall increase in the tweet fraction of .0001 (95% CI = [.0001, .0002]). When we examined the results by partisanship, Democratic-leaning counties exhibited a stronger association, with a .0002 higher fraction of COVID-19-tagged tweets for each additional daily case per 100,000 (95% CI = [.0001, .0003]). There was no evidence of an association between changes in daily cases per 100,000 and tweet fraction for politically neutral counties (.00004; 95% CI = [−.00005, .0001]) or Republican-leaning counties (−.00001; 95% CI = [−.0001, .0001]). This pattern was consistent for the daily count of tweets per 100,000 population, with an increase of .004 tweets per 100,000 population for each additional daily case per 100,000 (95% CI = [.003, .005]) after adjusting for county partisanship and quarter. Democratic-leaning counties showed an additional .005 tweets per 100,000 population per day (95% CI = [.001, .008]), while there was no evidence of an association for Republican-leaning (−.00002; 95% CI = [−.001, .001]) or politically neutral counties (.001; 95% CI = [−.001, .003]).

When we examined the change in tweet fraction associated with daily COVID-19 deaths per 100,000, we again found the strongest correlation in Democratic-leaning counties. Overall, after adjusting for quarter and county partisanship, each additional death per 100,000 was associated with a .024 increase in the fraction of COVID-19-tagged tweets (95% CI = [.021, .027]). When we analyzed counties separately by partisan level, Democratic-leaning counties exhibited a .054 increase in the fraction of COVID-19-tagged tweets for each additional daily death per 100,000 (95% CI = [.045, .063]), with a lower association observed in politically neutral counties (.009; 95% CI = [.003, .016]) and no evidence of association in Republican-leaning counties (.0001; 95% CI = [−.004, .004]).
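As a rough guide to the scale of these coefficients, the worked arithmetic below applies the reported slopes to an increase of 0.1 daily deaths per 100,000. That increment is a hypothetical value chosen for illustration, since the mean daily death rate in Table 1 is about .33 per 100,000, and the calculation assumes the linear association holds over that range.

```latex
\begin{align*}
\text{All counties:} \quad & 0.024 \times 0.1 = 0.0024 \\
\text{Democratic-leaning:} \quad & 0.054 \times 0.1 = 0.0054 \\
\text{Politically neutral:} \quad & 0.009 \times 0.1 = 0.0009
\end{align*}
```

For comparison, the mean fraction of tweets related to COVID-19 across all counties in Table 1 is .043.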

When we examined total COVID-19-tagged tweets per 100,000 population and death rates per 100,000 for all counties, we found that each additional daily death per 100,000 population corresponded to an increase of .699 tweets per 100,000 population, adjusted for the county’s partisanship and the quarter of the year (95% CI = [.617, .780]). Democratic-leaning counties experienced an increase of 1.44 tweets per 100,000 population for each additional daily death per 100,000 population (95% CI = [1.18, 1.70]), politically neutral counties had .186 more tweets per 100,000 population for each additional death per 100,000 population (95% CI = [.048, .324]), and there was no evidence of association in Republican-leaning counties (.011; 95% CI = [−.046, .068]).

Metropolitan and Non-Metropolitan Differences in COVID-19 Cases, Deaths, and COVID-19-Related Tweet Activity

As part of our sensitivity analysis, we further divided tweets and COVID-19 cases during our study period into metropolitan and non-metropolitan counties. Figure 2a displays daily geotagged tweet mentions of COVID-19 per 100,000 population for both metropolitan and non-metropolitan counties. The tweet mention rate was approximately twice as high in metropolitan counties compared to non-metropolitan counties throughout the study period.

Figure 2c shows daily COVID-19 cases per 100,000 population for metropolitan and non-metropolitan counties. The overall trend in COVID-19 case rates was similar between metropolitan and non-metropolitan areas. Although the case rate was higher in metropolitan counties prior to August 2020, it was subsequently surpassed by the rate in non-metropolitan counties. In Table 1, we see that non-metropolitan counties experienced higher daily cases per 100,000 population (24.3; SD 21.6) compared to metropolitan counties (22.0; SD 19.0), higher daily deaths per 100,000 population (.438; SD .366) than metropolitan counties (.330; SD .251), and fewer tweets per 100,000 population (.419; SD .351) than metropolitan counties (.781; SD .684).

An OLS regression showed no association between COVID-19 case rates or death rates and either COVID-19-related tweets per 100,000 population or the fraction of all tweets that were COVID-19 related for non-metro areas (Table 2). In metropolitan areas, each case per 100,000 population was associated with a nonsignificant increase in the rate of COVID-19-related tweets per 100,000 population (.002; 95% CI = [−.0002, .0045]). However, in these metropolitan counties, the fraction of tweets increased by .0001 for each case per 100,000 (95% CI = [.0000, .0002]), and each additional death per 100,000 population was associated with .55 additional daily COVID-19-related tweets per 100,000 population (95% CI = [.36, .73]) and a .024 increase in the proportion of total tweets that were COVID-19 related (95% CI = [.016, .032]).

Robustness Checks

Models using a .55 cutoff to define partisan counties showed a slightly reduced effect size compared to models with a .60 threshold. Stratified models using monthly terms for time and a 7-degree-of-freedom natural cubic spline term demonstrated some variation in the stratified effects. These results are presented in Supplemental Table 1. Models testing associations between case rates and partisan status, case rates and metropolitan status, death rates and partisan status, and death rates and metropolitan status—both for the frequency of COVID tweets and the fraction of COVID tweets, using a 7-degree-of-freedom cubic spline and a 55% cutoff for partisanship—are shown in Supplemental Table 2. We also examined the interaction terms in these models and used the results to generate marginal plots for analyses with the fraction of tweets as the outcome, which are displayed in Figure 3. In these models, we found that cases per 100,000 and deaths per 100,000 are independent predictors of increasing tweet frequency and the fraction of tweets associated with COVID-19, adjusted for metro/non-metro status or partisanship of the county. In addition, metro counties exhibited higher tweet frequency or COVID keyword fraction when adjusted for both death and case rates. In each of these models, at the .55 cutoff for partisanship, Democratic-leaning counties had higher tweet frequency and tweet fraction than neutral counties, adjusted for either case rates or death rates, while Republican-leaning counties had lower tweet frequency or fraction than neutral counties, adjusted for case or death rates. The significance of the interaction terms varied, as illustrated in the marginal plot (Figure 3). Overall, the fraction of tweets containing COVID-19-related terms increased as rates of COVID-19 cases or deaths rose, and generally, Democratic or metropolitan counties exhibited a higher fraction based on cases or deaths. 
The marginal plots suggest some convergence in proportions for increasing rates of cases by metropolitan status and partisanship, and some divergence for increasing rates of death, although not all interaction terms are significant.
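One way to write an interaction model of this kind is sketched below. The notation is an assumption introduced for illustration, since the exact specification is not given here: $y_{ct}$ is the tweet fraction (or tweet rate) in county $c$ on day $t$, $r_{ct}$ is the case or death rate per 100,000, $g_c$ is a binary group indicator such as metropolitan status, and $f(t)$ is the temporal term (for example, the natural cubic spline).

```latex
\begin{equation*}
y_{ct} = \beta_0 + \beta_1 r_{ct} + \beta_2 g_c + \beta_3 \,(r_{ct} \times g_c) + f(t) + \varepsilon_{ct}
\end{equation*}
```

Under this form, the slope on the rate is $\beta_1$ in the reference group and $\beta_1 + \beta_3$ in the other group, so the gap between groups is $\beta_2 + \beta_3 r_{ct}$. The lines in a marginal plot then converge as rates rise when $\beta_3$ has the opposite sign to $\beta_2$ and diverge when the signs match. For the three-category partisanship variable, $g_c$ would be replaced by two indicators, each with its own interaction coefficient.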

Figure 3.

Change in proportion of tweets with COVID keywords associated with increasing cases per 100,000 population by (a) metro/non-metro counties and (b) county-level partisanship (Democratic/Republican/neutral); change in proportion of tweets with COVID keywords associated with increasing deaths per 100,000 population by (c) metro/non-metro counties and (d) county-level partisanship (Democratic/Republican/neutral).

As a final model, we included both the partisanship of counties (using the 55% threshold) and metropolitan versus non-metropolitan status, allowing counties to be categorized as Democratic/metropolitan, Democratic/non-metropolitan, neutral/metropolitan, neutral/non-metropolitan, Republican/metropolitan, or Republican/non-metropolitan. We estimated the effect of COVID rates, measured by cases or deaths, on the frequency of COVID tweets and the fraction of tweets containing COVID keywords, adjusting for both metropolitan/non-metropolitan residence and partisanship. These results are presented in Supplemental Table 3. The effects of cases and deaths on both COVID tweets per 100,000 population and the fraction of tweets associated with COVID are robust to adjustments for both metropolitan status and partisanship. As in our main models, fully adjusted models suggest that higher rates of COVID-19 cases or deaths predict increases in tweeting, as does metropolitan residence, with counties that are partisan Democratic showing a higher tweet response than those that are partisan Republican. Each additional death per 100,000 is a predictor of both increased tweet rates per 100,000 population and a higher fraction of tweets. In these full models, metropolitan counties exhibit a higher frequency and proportion of COVID-19 tagged tweets than non-metropolitan counties. Compared to neutral counties, those with a 55% or higher Democratic vote have a greater frequency and fraction of COVID-19 tweets. To illustrate this, we conducted a three-way interaction and plotted the margins stratified by metropolitan status in Figure 4. As shown in the marginal plot, tweet activity increases with rising cases and deaths, but the response is most pronounced in metropolitan, Democratic counties.

Figure 4.

Marginal plots of the change in X (Twitter) activity associated with changes in rates of COVID, metropolitan status, and partisanship, stratified by metropolitan status, for models of the proportion of tweets with a COVID keyword with (a) cases*metropolitan*partisan and (b) deaths*metropolitan*partisan interactions, and for models of COVID-tagged tweets per 100,000 population with (c) cases*metropolitan*partisan and (d) deaths*metropolitan*partisan interactions.

Discussion

Our work aligns with previous studies suggesting that the COVID-19 pandemic has disproportionately affected individuals in non-metropolitan counties and in counties that voted Republican in the 2020 election, resulting in higher case rates and higher death rates from COVID-19 (Wallace et al., 2022). The parallel findings regarding metropolitan versus non-metropolitan areas and partisanship support research indicating an increasing geographical partisan divide, with urban core areas becoming increasingly Democratic and rural areas becoming more conservative (Scala & Johnson, 2017). Our study adds detail and context to the limited body of previous work suggesting a connection between X (Twitter) activity and public response to COVID-19; it is the first study to investigate the association between COVID-19 impacts and X (Twitter) behavior using a multi-year data set at the county level, which allows us to see how this association varies across both the partisan divide, and by metro/non-metropolitan counties at a level of detail that would not be possible at the state or national level. The results indicate that tweeting behavior was associated with responses to COVID-19, particularly in counties that voted for the Democratic candidate in the 2020 election and in metropolitan counties. The total number of geotagged tweets did not vary with changes in daily cases per 100,000 population or overall in relation to COVID-19 deaths per 100,000 population. However, Democratic-leaning counties exhibited a higher overall number of tweets on days with higher reported COVID-19 deaths. Specifically examining COVID-19-tagged tweets, we found that Democratic-leaning counties had a higher total of COVID-19-tagged tweets during periods of elevated COVID-19 case rates and death rates, as well as a higher fraction of COVID-19-tagged tweets. 
In Republican-leaning counties, tweet patterns showed no association with COVID-19 cases and deaths, regardless of whether we looked at total tweets, total COVID-19-tagged tweets, or the fraction of COVID-19-tagged tweets. In politically neutral counties, tweeting about COVID-19 both in total and as a fraction of total tweets increased as COVID-19 death rates rose. However, there was no evidence of an association between COVID-19 case rates and tweeting behavior, as measured by total tweets, total COVID-19-tagged tweets, or the fraction of COVID-19-tagged tweets.

Based on existing literature suggesting that Democratic-leaning X (Twitter) users are more likely to support intervention measures (Gadarian et al., 2021; Golos et al., 2022), including mask wearing (Callaghan et al., 2021) and social distancing (Zang et al., 2021), it is possible that residents of Democratic-leaning counties are generally more responsive to COVID-19 concerns than those in Republican-leaning counties. Previous studies indicate that Democratic-majority counties had higher vaccination rates (Sun & Monnat, 2022; Ye, 2023) and that Republican-leaning counties experienced higher mortality rates (Aron & Muellbauer, 2022). Combined with these findings, our data open up rich avenues for future research questions that could be theoretically valuable—that is, do social media have an effect on shaping partisan responses? Or is partisanship an agenda set outside social media but reflected in social media feeds? Or do social media perhaps simply reflect personal values? The mediation of social media likely occurs through a series of theoretical mechanisms, ranging from selective exposure to partisan sources (Romer & Jamieson, 2021) to communication about party norms by political elites with significant followings (Shin et al., 2022), to beliefs about the nature of the disease, its progression, and potential remedies (Meirick, 2023), and to the cascading strength of bandwagon support (Wang et al., 2024) surrounding specific claims about the state and outlook of the pandemic. All these factors reflect the “new political communication ecology” (Shah et al., 2017), where the concept of “conversation” has shifted from interpersonal interactions to social media posts and reactions from followers.

The asymmetry observed in our study between the responses of Republican-leaning and Democratic-leaning counties aligns with recent findings on the differing levels of exposure to homogeneous information and politically motivated misinformation among conservatives and liberals in the U.S. (González-Bailón et al., 2023) and the degree to which they share that information on social media (Sundar et al., 2025). These findings may suggest that, while all partisans engage in motivated reasoning and selective exposure, the degree to which they do so depends on their party affiliation. It is also possible that Democrats and Republicans differ in how much they trust official COVID-19 information. This could arise from how opinion leaders on either side communicate with their followers on social media (Gallagher et al., 2021; Hodson et al., 2022), including how and to what extent they amplify populist sentiments (Rojecki et al., 2024). Further research, including qualitative methods, may help to better explore this phenomenon.

Our study has several limitations. First, this work examines correlations and cannot establish causality. Second, it does not clarify the tone or sentiment of the tweets, only that they occurred. Consequently, tweets mentioning COVID-19-related topics should not be interpreted as inherently supportive of public health measures. Mentions of COVID-19 may reflect a wide range of perspectives, including supportive, critical, conspiratorial, humorous, or neutral commentary. Future research incorporating sentiment analysis could better characterize the nature of this tweeting behavior. In addition, the county-level exposure rates do not provide insight into the behavior of individual people; they describe only the average number of tweets and the average number of COVID-19 cases or deaths in each county.

The choice of temporal controls for time series data can affect results and lead to over- or under-fitting. We used three different temporal controls (quarterly, monthly, and spline) and found similar results, but other methods for temporal control, such as Fourier terms or SARIMAX models, could be tested in future research. Although our results were consistent across models, we do not claim that they predict outcomes outside of our data set. That said, we feel that the consistency of our results under different controls demonstrates the robustness of our main findings.
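To make the trade-off between temporal controls concrete, the following minimal Python sketch shows how quarterly and monthly period labels, and the Fourier terms mentioned as a possible alternative, could be derived from a date. It is illustrative only: the function names, the annual period, and the harmonic order are assumptions made here, not the specification used in this study, and the spline control is not shown.

```python
import math
from datetime import date, timedelta


def quarter_label(d):
    """Label for a quarterly fixed effect, e.g. '2020Q2'."""
    return f"{d.year}Q{(d.month - 1) // 3 + 1}"


def month_label(d):
    """Label for a monthly fixed effect, e.g. '2020-05'."""
    return f"{d.year}-{d.month:02d}"


def fourier_terms(d, start, period_days=365.25, order=2):
    """Sine/cosine pairs for harmonics 1..order of an assumed annual cycle."""
    t = (d - start).days
    terms = []
    for k in range(1, order + 1):
        angle = 2 * math.pi * k * t / period_days
        terms.extend([math.sin(angle), math.cos(angle)])
    return terms


# Study window stated in the abstract: January 1, 2020, through December 31, 2021.
start, end = date(2020, 1, 1), date(2021, 12, 31)
days = [start + timedelta(days=i) for i in range((end - start).days + 1)]

# Count how many distinct period categories each control introduces.
n_quarters = len({quarter_label(d) for d in days})
n_months = len({month_label(d) for d in days})
print(n_quarters, n_months)
```

Over this window the quarterly control implies 8 period categories and the monthly control 24, whereas two Fourier harmonics would add only four columns. This is one way to see how the choice of control trades flexibility against the risk of over-fitting.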

Decisions related to tweeting may be influenced by demographics, as X (Twitter) users tend to be younger than the overall U.S. population and more Democratic (36%) than non-X (Twitter) users (30%) (Mitchell, 2019). These demographics, particularly age, could impact COVID-19 mortality figures. However, we found that the proportion of COVID-19-tagged tweets was lower in Republican-leaning counties, suggesting reduced X (Twitter) usage overall and indicating that X (Twitter) users in these counties may have differing levels of interest in COVID-19-related topics. It is important to note that only a small percentage of X (Twitter) users enable geotagging on their posts, which decreases the overall sample size compared to the entire population of X (Twitter) users. Previous research indicates that geotagged tweets tend to over-represent certain user demographics and behaviors and may also reflect geographic biases. For example, geotagging rates are typically higher in urban areas and among younger users, and lower in rural areas (Yin, Chi, & Van Hook, 2018; Yin, Gao, & Chi, 2022). Moreover, geotagging may correlate with political attitudes in ways that are challenging to fully control. Consequently, urban-rural or partisan disparities in our findings should be interpreted cautiously, as they may partially reflect differences in geotagging propensity rather than solely differences in underlying behaviors. While we account for population size and urbanization level in our models, we cannot completely mitigate this sampling bias, which represents a significant limitation of this type of analysis. In addition, this study did not examine how tweet-sharing dynamics (i.e., retweeting) influence interactions on X (Twitter), as our data set captured tweets in real time during an evolving event but lacked temporally resolved retweet cascade data (i.e., the complete tree of retweeting over time).

Future work will examine how tweet sentiment varies across counties. In addition, it will explore the demographic characteristics of those counties to identify other factors related to COVID-19 outcomes, such as age and economic conditions. A more in-depth network-based analysis of retweet dynamics and follower influence should be conducted to understand the factors contributing to the county-level differences in information spread and influence. This research could incorporate geospatial methods, including county-level clustering, to investigate potential spatial heterogeneity.

Conclusion

This is the longest-duration study to date exploring X (Twitter) engagement with COVID-19 topics, and it is the first of its kind to examine X (Twitter) engagement and COVID-19 outcomes across the political spectrum at the U.S. county level, building on previous research investigating how social media can influence, or be influenced by, the policy agenda. Partisan politics have been linked to different COVID-19 outcomes. In this study, we found that in addition to known factors such as varying vaccine uptake and differing COVID-19 practices, X (Twitter) users in Republican-leaning counties demonstrated less responsiveness to information about COVID-19 related to changes in cases or deaths. This divide underscores the value of using X (Twitter) to study social responses to significant events, but it also suggests that X (Twitter) is not utilized for communication equally across different groups in society.

Statistical Analysis

We ran simple ordinary least squares (OLS) regressions to test whether changes in county-level tweet behavior were associated with changes in COVID-19 cases and deaths. We examined both total tweets (averaged over the previous 7 days) and the fraction of total tweets containing a COVID-19-related keyword. To account for the changing activity associated with COVID-19 on X (Twitter) over time, we included a quarter fixed effect to allow for temporal variation. We repeated this analysis using a variable for “partisanship” based on the partisan divide of the 2020 election outcome and further explored results by level of partisanship, conducting separate analyses for Democratic-leaning, Republican-leaning, and politically neutral counties. As tests of robustness, we retested the results for partisanship using a lower threshold and fit additional models with monthly fixed effects and a natural cubic spline to further investigate temporal effects. Analyses were carried out in Stata/MP 17.0.
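As a rough illustration of what a quarter fixed effect does in such a regression, the sketch below estimates a single slope after removing quarter-specific means. For one regressor, this is numerically equivalent to OLS with one indicator per quarter. It is written in Python rather than Stata, uses invented toy numbers, and the variable names (`death_rate_change`, `covid_tweet_fraction`) are hypothetical; it is not the model specification used in this study.

```python
from collections import defaultdict


def within_slope(x, y, groups):
    """Slope of y on x from OLS with one intercept per group.

    Demeaning x and y within each group gives the same slope as
    including a dummy variable for every group (fixed effects).
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # group -> [sum_x, sum_y, n]
    for xi, yi, g in zip(x, y, groups):
        s = sums[g]
        s[0] += xi
        s[1] += yi
        s[2] += 1

    num = den = 0.0
    for xi, yi, g in zip(x, y, groups):
        sum_x, sum_y, n = sums[g]
        dx = xi - sum_x / n  # deviation from the group mean of x
        dy = yi - sum_y / n  # deviation from the group mean of y
        num += dx * dy
        den += dx * dx
    if den == 0.0:
        raise ValueError("x does not vary within any group")
    return num / den


# Toy data (invented): y = 2 * x plus a quarter-specific shift.
quarters = ["2020Q1"] * 3 + ["2020Q2"] * 3
death_rate_change = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
covid_tweet_fraction = [12.0, 14.0, 16.0, 3.0, 5.0, 7.0]

fe_slope = within_slope(death_rate_change, covid_tweet_fraction, quarters)
# Treating all observations as one group reproduces pooled OLS.
pooled_slope = within_slope(death_rate_change, covid_tweet_fraction, ["all"] * 6)
print(fe_slope, pooled_slope)
```

In this toy example the fixed-effect estimate recovers the within-quarter slope, while the pooled estimate has the opposite sign because it mixes the relationship of interest with the shift between quarters. That is the kind of temporal variation a period fixed effect is meant to absorb.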

Supplemental Material

Supplemental material, sj-docx-1-sms-10.1177_20563051261419387 for County Partisanship Affects Social Media Posting Behavior During a Pandemic by M. Luke Smith, Guangqing Chi, Junjun Yin, Yosef Bodovski and S. Shyam Sundar in Social Media + Society

Footnotes

Acknowledgements

We thank Susan H. McHale for her suggestions and comments on early versions of the work.

ORCID iDs

M. Luke Smith

Guangqing Chi

Junjun Yin

S. Shyam Sundar

Ethical considerations

The X (Twitter) data used in this study are classified as non-human-subject research because the data, which are publicly available, did not include private identifiable information.

Author contributions

G.C. and M.L.S. conceived the initial concept. G.C., M.L.S., and J.Y. developed the research design and obtained the data. M.L.S., J.Y., and Y.B. conducted data analysis and interpretation. G.C., J.Y., M.L.S., and Y.B. contributed to the initial draft of the manuscript. S.S. reviewed the manuscript. All authors edited the manuscript for important intellectual content.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported in part by the National Science Foundation (Awards # SES-1823633 and # OPP-2032790), the College of Arts and Sciences at Indiana University, the Social Science Research Institute and the Population Research Institute of The Pennsylvania State University, and the MSIT (Ministry of Science, ICT), Korea, under the Global Scholars Invitation Program (RS-2024-00459638) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).

Data availability statement

The data sets and code used in this paper for the creation of the data set are openly accessible from a Harvard Dataverse repository (https://doi.org/10.7910/DVN/YCWUC7). The first data set is a static copy of geotagged tweets related to COVID-19 stored in .csv text files collected from January 15 forward (currently through November 30, 2021). The second data set is a .csv file summarizing the cumulative and daily numbers of tweets by country and subnational level in the same period. A live version of the data record, which will be updated on a bimonthly basis, can be found in the same repository. Although the data are open source, all users must agree to the terms listed in the data-usage license included in the repository. All code developed for this study is openly accessible from the GitHub repository named “covid-19_geo_tweets”. The code covers geotagged X (Twitter) data collection, extraction of tweets related to COVID-19 based on keywords, spatial aggregation based on a “point-in-polygon” operation, and a statistical summary of the number of tweets by country and date. All code was developed using the programming language Python 3 (VanRossum and Drake 2016).
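For readers unfamiliar with the two processing steps named above, the following minimal Python sketch illustrates keyword-based extraction followed by a point-in-polygon aggregation, using the standard ray-casting test. It is not the code from the repository: the keyword list, the function names, and the square "regions" are hypothetical, and a real pipeline would use actual administrative boundaries and a GIS library.

```python
from collections import Counter

# Hypothetical keyword list; the study's actual keywords are not reproduced here.
KEYWORDS = ("covid", "coronavirus")


def mentions_keyword(text, keywords=KEYWORDS):
    """Case-insensitive substring match against the keyword list."""
    lowered = text.lower()
    return any(k in lowered for k in keywords)


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: is (lon, lat) inside a polygon given as (x, y) vertices?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through the point.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def count_by_region(tweets, regions):
    """Count keyword-matching tweets per region.

    tweets: iterable of (lon, lat, text); regions: dict of name -> polygon.
    """
    counts = Counter()
    for lon, lat, text in tweets:
        if not mentions_keyword(text):
            continue
        for name, polygon in regions.items():
            if point_in_polygon(lon, lat, polygon):
                counts[name] += 1
                break  # assign each tweet to at most one region
    return counts


# Toy example with two adjacent square regions (invented coordinates).
regions = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
tweets = [
    (1.0, 1.0, "COVID cases rising"),
    (3.0, 1.0, "nice weather today"),
    (3.0, 1.5, "coronavirus update"),
    (5.0, 5.0, "covid news"),
]
counts = count_by_region(tweets, regions)
print(dict(counts))
```

Tweets that fall outside every polygon, or that match no keyword, are simply not counted in this sketch.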

Supplemental material

Supplemental material for this article is available online.

Author biographies

M. Luke Smith (PhD, University of Minnesota) is an assistant research professor of environmental epidemiology at The Pennsylvania State University, USA. His research interests are in climate change and health.

Guangqing Chi (PhD, University of Wisconsin-Madison) is a Provost Professor of Geography at Indiana University, USA. His research interests include environmental demography, climate migration, and the use of social media data for social science research.

Junjun Yin (PhD, Dublin Institute of Technology, Ireland) is an assistant professor of data science at The George Washington University, USA. His research interests center on GIScience with a focus on understanding human dynamics in urban environments.

Yosef Bodovski (MS, Pennsylvania State University) is a research associate of spatial analysis at The Pennsylvania State University, USA. His research interests are in the spatial analysis of social data.

S. Shyam Sundar (PhD, Stanford University) is an Evan Pugh University Professor and James P. Jimirro Professor of media effects at The Pennsylvania State University, USA. His research interests include extensive examination of user responses to online sources, including machine sources such as chatbots, smart speakers, and other forms of artificial intelligence.

References

Aron & Muellbauer (2022). Excess mortality versus COVID-19 death rates: A spatial analysis of socioeconomic disparities and political allegiance across U.S. states. Review of Income and Wealth, 68(2), 348–392. https://doi.org/10.1111/roiw.12570

Bergman, Sella, Agre, & Casadevall (2020). Oscillations in U.S. COVID-19 incidence and mortality data reflect diagnostic and reporting factors. mSystems, 5, 10.1128/msystems.00544-20. https://doi.org/10.1128/msystems.00544-20

Bisbee & Lee, D. D. I. (2021). Objective facts and elite cues: Partisan responses to COVID-19. The Journal of Politics, 84(3), 1278–1291. https://doi.org/10.1086/716969

Callaghan, Lueck, J. A., Lunz Trujillo, & Ferdinand, A. O. (2021). Rural and urban differences in COVID-19 prevention behaviors. The Journal of Rural Health, 37(2), 287–295. https://doi.org/10.1111/JRH.12556

Chen, H.-F., & Karim, S. A. (2022). Relationship between political partisanship and COVID-19 deaths: Future implications for public health. Journal of Public Health, 44(3), 716–723. https://doi.org/10.1093/pubmed/fdab136

Desmet & Wacziarg (2022). JUE insight: Understanding spatial variation in COVID-19 across the United States. Journal of Urban Economics, 127, 103332. https://doi.org/10.1016/j.jue.2021.103332

Engel-Rebitzer, Stokes, D. C., Meisel, Z. F., Purtle, Doyle, & Buttenheim, A. M. (2022). Partisan differences in legislators’ discussion of vaccination on Twitter during the COVID-19 era: Natural language processing analysis. JMIR Infodemiology, 2(1), Article e32372. https://doi.org/10.2196/32372

Feng & Zhou (2022). Work from home during the COVID-19 pandemic: An observational study based on a large geo-tagged COVID-19 Twitter dataset (UsaGeoCov19). Information Processing & Management, 59(2), 102820. https://doi.org/10.1016/j.ipm.2021.102820

Gadarian, S. K., Goodman, S. W., & Pepinsky, T. B. (2021). Partisanship, health behavior, and policy attitudes in the early stages of the COVID-19 pandemic. PLOS ONE, 16(4), Article e0249596.

Gallagher, R. J., Doroshenko, Shugars, Lazer, & Welles, B. F. (2021). Sustained online amplification of COVID-19 elites in the United States. Social Media + Society, 7(2), 20563051211024956. https://doi.org/10.1177/20563051211024957

Golos, A. M., Hopkins, D. J., Bhanot, S. P., & Buttenheim, A. M. (2022). Partisanship, messaging, and the COVID-19 vaccine: Evidence from survey experiments. American Journal of Health Promotion, 36(4), 602–611. https://doi.org/10.1177/08901171211049241

González-Bailón, Lazer, Barberá, Zhang, Allcott, Brown, Crespo-Tenorio, Freelon, Gentzkow, Guess, A. M., Iyengar, Kim, Y. M., . . . Tucker, J. A. (2023). Asymmetric ideological segregation in exposure to political news on Facebook. Science, 381(6656), 392–398. https://doi.org/10.1126/science.ade7138

Hall, S. A., Kaufman, J. S., & Ricketts, T. C. (2006). Defining urban and rural areas in U.S. epidemiologic studies. Journal of Urban Health, 83(2), 162–175. https://doi.org/10.1007/s11524-005-9016-3

Hodson, O’Meara, Thompson, Houlden, Gosse, & Veletsianos (2022). “My people already know that”: The imagined audience and COVID-19 health information sharing practices on social media. Social Media + Society, 8(3), 20563051221122464. https://doi.org/10.1177/20563051221122463

Hussain, Tahir, Hussain, Sheikh, Gogate, Dashtipour, Ali, & Sheikh (2021). Artificial intelligence-enabled analysis of public attitudes on Facebook and Twitter toward COVID-19 vaccines in the United Kingdom and the United States: Observational study. Journal of Medical Internet Research, 23(4), Article e26627. https://doi.org/10.2196/26627

Jang, Rempel, Roth, Carenini, & Janjua, N. Z. (2021). Tracking COVID-19 discourse on Twitter in North America: Infodemiology study using topic modeling and aspect-based sentiment analysis. Journal of Medical Internet Research, 23(2), Article e25431. https://www.jmir.org/2021/2/e25431

Jia, Chen, Zheng, & Zhu (2020). Twitter discussions and emotions about the COVID-19 pandemic: Machine learning approach. Journal of Medical Internet Research, 22(11), Article e20550. https://doi.org/10.2196/20550

Jing & Ahn, Y.-Y. (2021). Characterizing partisan political narrative frameworks about COVID-19 on Twitter. EPJ Data Science, 10(1), 53. https://doi.org/10.1140/epjds/s13688-021-00308-4

Kaashoek, Testa, Chen, J. T., Stolerman, L. M., Krieger, Hanage, W. P., & Santillana (2022). The evolving roles of US political partisanship and social vulnerability in the COVID-19 pandemic from February 2020–February 2021. PLOS Global Public Health, 2(12), Article e0000557. https://doi.org/10.1371/journal.pgph.0000557

Liu, Yin, Yan, Wan, & Malin (2023). Examining rural and urban sentiment difference in COVID-19-related topics on Twitter: Word embedding-based retrospective study. Journal of Medical Internet Research, 25, Article e42985. https://doi.org/10.2196/42985

Meirick (2023). News sources, partisanship, and political knowledge in COVID-19 beliefs. American Behavioral Scientist, 0(0). https://doi.org/10.1177/00027642231164047

MIT Election Data and Science Lab. (2018). County presidential election returns 2000–2020. Harvard Dataverse. https://doi.org/10.7910/DVN/VOQCHQ

Mitchell (2019). Sizing up Twitter users. Pew Research Center: Internet, Science & Tech.

The New York Times. (2021). Coronavirus (Covid-19) data in the United States. https://github.com/nytimes/covid-19-data

Rahman, M. M., Ali, G. G. M. N., X. J., Samuel, Paul, K. C., Chong, P. H. J., & Yakubov (2021). Socioeconomic factors analysis for COVID-19 US reopening sentiment with Twitter and census data. Heliyon, 7(2), Article e06200. https://doi.org/10.1016/J.HELIYON.2021.E06200

Rojecki, Conner, V. A., & Royal (2024). Live free and die: How social media amplify populist vaccine resistance. Social Media + Society, 10(3), 20563051241277292. https://doi.org/10.1177/20563051241277293

Romer & Jamieson, K. H. (2021). Conspiratorial thinking, selective exposure to conservative media, and response to COVID-19 in the US. Social Science & Medicine, 291, 114480. https://doi.org/10.1016/j.socscimed.2021.114480

Ruths & Pfeffer (2014). Social media for large studies of behavior. Science, 346(6213), 1063–1064. https://doi.org/10.1126/science.346.6213.1063

Sakun, B. I., & Skunkan (2020). Public perception of the COVID-19 pandemic on Twitter: Sentiment analysis and topic modeling study. JMIR Public Health and Surveillance, 6(4), Article e21978. https://publichealth.jmir.org/2020/4/e21978

Scala, D. J., & Johnson, K. M. (2017). Political polarization along the rural-urban continuum? The geography of the presidential vote, 2000–2016. The ANNALS of the American Academy of Political and Social Science, 672(1), 162–184. https://doi.org/10.1177/0002716217712696

Schroeder, Van Riper, Manson, Knowles, Kugler, Roberts, & Ruggles (2025). IPUMS national historical geographic information system: Version 20.0 [Database]. https://doi.org/10.18128/D050.V13.0

Sehgal, N. J., Yue, Pope, Wang, R. H., & Roby, D. H. (2022). The Association between COVID-19 mortality and the county-level partisan divide in the United States. Health Affairs, 41(6), 853–863. https://doi.org/10.1377/hlthaff.2022.00085

Shah, D. V., McLeod, D. M., Rojas, Cho, Wagner, M. W., & Friedland, L. A. (2017). Revising the communication mediation model for a new political communication ecology. Human Communication Research, 43(4), 491–504. https://doi.org/10.1111/hcre.12115

Shin, Yang, Liu, Kim, H. M., Zhou, & Sun (2022). Mask-wearing as a partisan issue: Social identity and communication of party norms on social media among political elites. Social Media + Society, 8(1), 20563051221086230. https://doi.org/10.1177/20563051221086233

Sun & Monnat, S. M. (2022). Rural-urban and within-rural differences in COVID-19 vaccination rates. The Journal of Rural Health, 38, 916–922. https://doi.org/10.1111/JRH.12625

Sundar, S. S., Cho Snyder, Liao, Yin, Wang, & Chi (2025). Sharing without clicking on news in social media. Nature Human Behaviour, 9(1), 156–168. https://doi.org/10.1038/s41562-024-02067-4

Valdez, ten Thij, Bathina, Rutter, L. A., & Bollen (2020). Social media insights into US mental health during the COVID-19 pandemic: Longitudinal analysis of Twitter data. Journal of Medical Internet Research, 22(12), Article e21418. https://www.jmir.org/2020/12/e21418

VanDusky-Allen, J. A., Utych, S. M., & Catalano (2022). Partisanship, policy, and Americans’ evaluations of state-level COVID-19 policies prior to the 2020 election. Political Research Quarterly, 75(2), 479–496.

Wallace, Goldsmith-Pinkham, & Schwartz, J. L. (2022). Excess death rates for republicans and democrats during the COVID-19 pandemic (National Bureau of Economic Research Working Paper Series, No. 30512). https://doi.org/10.3386/w30512

Wang, Sundar, S. S., & Ram (2024). Can social media engagement predict election results? Bandwagon effects of tweets about US senate candidates. Social Media + Society, 10(4), 20563051241298450. https://doi.org/10.1177/20563051241298449

X (Twitter) Developer Platform. (2024). https://developer.twitter.com/

Ye (2023). Exploring the relationship between political partisanship and COVID-19 vaccination rate. Journal of Public Health, 45(1), 91–98. https://doi.org/10.1093/pubmed/fdab364

Yin, Chi, & Van Hook (2018). Evaluating the representativeness in the geographic distribution of twitter user population. In Proceedings of the 12th workshop on geographic information retrieval (pp. 1–2).

Yin, Gao, & Chi (2022). An evaluation of geo-located Twitter data for measuring human migration. International Journal of Geographical Information Science, 36(9), 1830–1852.

Young, D. G., Rasheed, Bleakley, & Langbaum, J. B. (2022). The politics of mask-wearing: Political preferences, reactance, and conflict aversion during COVID. Social Science & Medicine, 298, 114836. https://doi.org/10.1016/j.socscimed.2022.114836

Zang, West, Kim, & Pao (2021). U.S. regional differences in physical distancing: Evaluating racial and socioeconomic divides during the COVID-19 pandemic. PLOS ONE, 16(11), Article e0259665. https://doi.org/10.1371/JOURNAL.PONE.0259665
