Abstract
The SARS-CoV-2 (COVID-19) pandemic has reached every corner of the world. Understanding the spatial and temporal patterns of behavior, social dynamics, and policies, as well as their interrelations, can provide critical knowledge for preparatory measures and effective responses to future pandemics. One puzzling area, regardless of how well our preparatory and response mechanisms are established, is the variability in people’s perceptions and reactions to a pandemic and its mitigation policies.
Views on the response to the COVID-19 pandemic have varied at the individual level, and state-level policies have reflected different political landscapes, resulting in stark differences of opinion regarding vaccine uptake (Sehgal et al., 2022), social distancing (Bisbee & Lee, 2021), and mask use (Young et al., 2022). Even before the vaccine rollout, political affiliation was associated with differing perspectives on vaccine policy—by May 2020, the odds of a Democratic-leaning respondent holding a favorable view of a COVID-19 vaccine were 5.4 times that of a Republican respondent, although this odds ratio decreased to 2.4 by October 2020 (Golos et al., 2022). Partisan differences were evident even in the earliest days of the pandemic, with Democrats 8.8% more likely to wash their hands and 12% more likely to avoid gatherings in March 2020 (Gadarian et al., 2021).
The initial months of the COVID-19 pandemic had a greater impact on Democratic-leaning counties in terms of cases and deaths, particularly in large urban centers (Kaashoek et al., 2022). Later, Republican-leaning counties experienced higher rates of COVID-19 cases and deaths (Chen & Karim, 2022; Desmet & Wacziarg, 2022). These Republican-leaning counties faced greater COVID-19 impacts due to a combination of behavioral, structural, and policy differences, including lower vaccine uptake (Sehgal et al., 2022).
A major challenge in studying the social and behavioral aspects of COVID-19 is that traditional data collection methods are slow, labor-intensive, and expensive. Moreover, these methods may inadequately measure key population characteristics and behaviors, even with robust probability sampling. In contrast, social media data, including information from social networking services (e.g. X (Twitter), Facebook, and Instagram), provide vast amounts of rich textual data in real time. For instance, researchers conducted sentiment analysis on Facebook and X (Twitter) posts to gauge public attitudes toward COVID-19 vaccines (Hussain et al., 2021). Among these data sources, X (Twitter) offers a highly accessible Big Data stream and has attracted attention from researchers across various disciplines (Ruths & Pfeffer, 2014). By analyzing tweet mentions over space and time, researchers can track phenomena such as public awareness of COVID-19 (Sakun & Skunkan, 2020), attitudes toward social distancing and lockdowns (Jia et al., 2020), and concerns about reopening communities and the economy (Rahman et al., 2021), as well as events related to public health and safety, such as anti-Asian sentiment (Jang et al., 2021) and mental health problems (Valdezten et al., 2020). In addition, because the volume of data is large, investigators can extract meaningful information even for small geographic units (e.g. counties and neighborhoods) and fine time intervals (e.g. specific days).
X (Twitter) may serve as a tool for exploring political partisanship and social responses to COVID-19. One study identified a positive correlation between COVID-19-related deaths and tweets at state and county levels during the early phases of the pandemic, from January 25 through May 10, 2020 (Feng & Zhou, 2022). An analysis of individual X (Twitter) accounts from February to July 2020 revealed that national and state politicians differ in their presentation of information on the platform, with Democratic users emphasizing the pandemic and government response, while Republican users focused more on individual actions, support for businesses, and external political entities such as other countries (Jing & Ahn, 2021). A 2020 study noted that political messaging varied by party, even in the first year of the pandemic (Engel-Rebitzer et al., 2022). Leading up to the 2020 U.S. election, there was a significant partisan divide in X (Twitter) discussions, with Democrats concentrating more on health and COVID-19 as an election issue, while Republicans prioritized the economy (VanDusky-Allen et al., 2022). There is some evidence of geographic differences in partisan responses to COVID-19; one study examining sentiment in non-metropolitan and metropolitan areas on X (Twitter) found that tweets from rural locations often expressed more negative sentiment regarding prevention-related topics such as vaccines, demonstrating polarized reactions to politicians (Liu et al., 2023). This supports the notion of an urban/rural divide on X (Twitter).
However, the studies examining discourse on X (Twitter) regarding COVID-19 were conducted at the national or state level rather than at more granular levels, and they cover limited time spans; no existing research has studied ongoing activity over the months and years that followed the early phases of the pandemic. In this study, we use geotagged tweets collected in 2020 and 2021 to compare the frequency of tweets on COVID-19 topics with county-level COVID-19 cases and deaths. We investigate how X (Twitter) usage related to COVID-19 varies across counties based on their political affiliation, as defined by voting behavior in the 2020 presidential election. Although X (Twitter) Inc. has restricted data downloads by imposing a high access fee (X (Twitter) Developer Platform 2024), leveraging the extensive data collected during the pandemic can provide insights into the use of future (potentially vast) social media data to study people’s perceptions and reactions to future pandemics and mitigation policies.
Data and Methods
We downloaded daily counts of COVID-19 cases and deaths at the county level from
To investigate partisanship in these counties, we used county-level data from the 2020 election accessed through the Massachusetts Institute of Technology (MIT) Election Data and Science Lab (MIT Election Data and Science Lab, 2018). We classified each county as partisan Republican, partisan Democrat, or politically neutral based on the percentage of votes cast for each party. A county with more than 60% of votes for the Republican candidate (Donald Trump) was categorized as a Republican-leaning county, a county with more than 60% of votes for the Democratic candidate (Joseph Biden) was categorized as a Democratic-leaning county, and the remaining counties were classified as politically neutral. We also repeated the analysis with a lower cutoff of 55%. In addition, counties were classified as either metropolitan or non-metropolitan (rural) based on Census definitions (Hall et al., 2006). The counts of COVID-19 cases, deaths, and tweets were averaged over the preceding 7 days to obtain a rolling average, and we defined a “tweet fraction” as the number of COVID-19-tagged tweets per day divided by the total number of tweets per day for each county. For comparison across counties with different populations, we used US Census data to standardize the number of COVID-19 cases and deaths, as well as daily tweet counts, per 100,000 population (Schroeder et al., 2025).
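The classification and normalization steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code (the analyses were run in Stata). The function names, the treatment of days with zero total tweets, and the choice to include the current day in the 7-day window are assumptions made for the example.

```python
from collections import deque


def classify_county(rep_share, dem_share, cutoff=0.60):
    """Label a county from its vote shares (0-1 scale).

    The 0.60 default and the 0.55 alternative follow the text;
    the label strings are illustrative.
    """
    if rep_share > cutoff:
        return "Republican"
    if dem_share > cutoff:
        return "Democratic"
    return "Neutral"


def per_100k(count, population):
    """Standardize a daily count to a rate per 100,000 residents."""
    return count * 100_000 / population


def tweet_fraction(covid_tweets, total_tweets):
    """Share of a county's daily tweets that are COVID-19-tagged.

    Assumption: the fraction is treated as undefined (None) on days
    with no tweets at all, rather than as zero.
    """
    if total_tweets == 0:
        return None
    return covid_tweets / total_tweets


def rolling_mean(values, window=7):
    """Trailing mean over the current day and up to window-1 prior days.

    Assumption: the first few days use a shorter window instead of
    being dropped.
    """
    buffer = deque(maxlen=window)
    smoothed = []
    for value in values:
        buffer.append(value)
        smoothed.append(sum(buffer) / len(buffer))
    return smoothed
```

Lowering `cutoff` to 0.55 moves some counties out of the neutral group, which is the comparison the lower-threshold analysis makes.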
We created plots that illustrate temporal trends in nationwide tweets and COVID-19 cases over the entire study period. We then conducted a simple ordinary least squares (OLS) regression to examine whether the frequency of COVID-19-related tweets on X (Twitter) was associated with the rates of COVID-19 cases and deaths during the 2-year period from January 2020 through December 2021. We analyzed both the total number of tweets (averaged over the previous 7 days) and the proportion of COVID-19-tagged tweets. To account for changing engagement with COVID-19 on X (Twitter) over time, we included a quarterly fixed effect to accommodate temporal variation in a time-stratified model. We repeated this analysis using a definition of “partisanship” based on the partisan divide of the 2020 election outcome and further explored results by level of partisanship, with separate analyses for Democratic, Republican, and politically neutral counties. Results were replicated for both the 60% and 55% cutoffs. As a robustness check, we examined whether the association varied by the metropolitan/non-metropolitan status of the counties. In addition, we reran our analysis using a monthly term and a natural cubic spline with 7 degrees of freedom per year to address temporal patterns. We tested this model with an interaction term, rather than stratifying, to further investigate potential differences in the associations between COVID-19 cases or deaths and tweeting patterns across metropolitan/non-metropolitan or partisan divides. Finally, we examined a model that included both metropolitan and partisan status, along with case or death rates, to assess the relative impact of these factors.
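As a rough sketch of what a regression with a quarterly fixed effect estimates, the slope for a single predictor can be obtained by subtracting each quarter's mean from both the outcome and the predictor and then fitting a line through the demeaned values. By the Frisch-Waugh-Lovell theorem this gives the same slope as an OLS fit with one dummy per quarter. The code below is illustrative only: the function names, the `(year, quarter)` key, and the data rows are invented for the example, and it corresponds to the stratified models (one predictor plus quarter terms), not the pooled model that also adjusts for partisanship.

```python
from collections import defaultdict


def quarter_of(month):
    """Map a calendar month (1-12) to its quarter (1-4)."""
    return (month - 1) // 3 + 1


def within_slope(rows):
    """OLS slope of y on x with a fixed effect for each period.

    rows is an iterable of (period, x, y) tuples, e.g. period =
    (year, quarter), x = cases per 100,000, y = tweet fraction.
    """
    rows = list(rows)
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # sum_x, sum_y, n
    for period, x, y in rows:
        entry = totals[period]
        entry[0] += x
        entry[1] += y
        entry[2] += 1

    sxy = 0.0
    sxx = 0.0
    for period, x, y in rows:
        sum_x, sum_y, n = totals[period]
        x_dev = x - sum_x / n  # deviation from the period mean
        y_dev = y - sum_y / n
        sxy += x_dev * y_dev
        sxx += x_dev * x_dev

    if sxx == 0.0:
        raise ValueError("x does not vary within any period")
    return sxy / sxx


# Made-up rows: y = (quarter-specific level) + 0.5 * x
example_rows = [
    ((2020, quarter_of(2)), 1.0, 10.5),
    ((2020, quarter_of(3)), 3.0, 11.5),
    ((2020, quarter_of(5)), 2.0, 3.0),
    ((2020, quarter_of(6)), 6.0, 5.0),
]
slope = within_slope(example_rows)  # 0.5 for these made-up rows
```

The design point is that quarter-to-quarter shifts in overall tweeting (the different levels in the toy rows) do not contaminate the slope, because only within-quarter variation is used.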
Analyses were carried out in Stata/MP 17.0.
Results
COVID-19 and Tweet Behavior
Figure 1a illustrates national trends in COVID-19 cases per 100,000 population and COVID-19-tagged tweets per 100,000 population from January 1, 2020, through December 31, 2021. Figure 1b presents trends in COVID-19 deaths per 100,000 and COVID-19-tagged tweets per 100,000 population. Figure 1a indicates that COVID-19 cases rose over time in a series of waves, with COVID-19-tagged tweets peaking early and then stabilizing. In Figure 1b, we observe that the pattern of COVID-19 deaths also followed a series of waves, but deaths did not continue to rise over time, likely because of vaccines and prior immunity.

Figure 1. (a) Daily rate of COVID-19 cases per 100,000 population and daily rate of COVID-19 keyword geotagged tweets per 100,000 US population. (b) COVID-19 deaths per 100,000 US population and rate of COVID-19 tweets per 100,000 US population. Cases, deaths, and tweets are smoothed with a 7-day rolling average.
There was substantial variability in day-to-day reporting of both COVID-19 cases and tweet mentions, so we included a smoothed 7-day rolling average of cases (Bergman et al., 2020). The volume of COVID-19-tagged tweets surged with the initial onset of COVID-19 in Spring 2020, gradually decreasing to a relative plateau during the second wave before dropping to lower levels over the following 2 years, measured both in absolute terms and as a fraction of the total number of tweets. The irregular pattern of tweets is due to missing days of X (Twitter) data, which are indistinguishable from days with zero tweets. The spike in Fall 2021 is an artifact of a large batch of earlier deaths being reported on a single date.
Partisanship and COVID-19-Related Tweet Behavior
We summarized the total count of geotagged tweets containing COVID-19 keywords (i.e. COVID-19-tagged tweets), the total count of all daily geotagged tweets, the proportion of tweets coded as COVID-19-tagged, and the total daily counts of COVID-19 cases and deaths. In addition, we calculated cases, deaths, and tweets per 100,000 population for the nation, as well as separately for counties that were Democratic-leaning, Republican-leaning, or politically neutral (Table 1). Seven-day rolling averages of COVID-19-tagged tweets per 100,000 population are presented for metropolitan and non-metropolitan counties, and separately for Democratic, Republican, and neutral counties in Figure 2a and b. Cases per 100,000 population are displayed by metro/non-metro status (Figure 2c) and by Democratic, Republican, or neutral status (Figure 2d), while deaths by metro/non-metro status and partisan status are shown in Figure 2e and f.
Table 1. Total Number of COVID-19-Tagged Tweets Collected From January 1, 2020, Through December 31, 2021, Mean of Total Daily Tweets, Fraction of all Tweets Coded as COVID-19 Related, Total Daily COVID-19 Cases, and Total Daily COVID-19 Deaths, as Well as Cases, Deaths, and Tweets per 100,000 Population for the Nation as a Whole and Separately for Democratic-Leaning, Republican-Leaning, and Politically Neutral Counties.

Figure 2. Rates of tweets per 100,000 population by (a) metro/non-metro and (b) county-level partisanship (Democratic/Republican/politically neutral); cases per 100,000 population by (c) metro/non-metro and (d) county-level partisanship; and deaths per 100,000 population by (e) metro/non-metro and (f) county-level partisanship.
Results revealed that Republican-leaning counties had higher daily case rates per 100,000 (25.2;
We conducted simple OLS regressions to examine the association between daily COVID-19 case rates per 100,000 and tweet behavior, adjusting for partisanship at the county level. To account for the long-term trend of decreasing frequency of COVID-19-related terms on X (Twitter), we included a variable for the quarter of the year to capture these temporal trends. In addition, we tested this association using daily deaths per 100,000 population. We then performed these regressions separately based on the partisan status of each county. The regression results are presented in Table 2.
Table 2. OLS Regression Results for the Relationship Between Change in Fraction of all Geotagged Tweets With a COVID-19 Keyword per 100,000 Population and Cases or Deaths per 100,000 Population for all Counties Combined, Adjusted for Partisanship and Temporal Term for Quarter of Year, and by Partisan Subgroup Adjusted for Quarter; and for the Relationship Between Change in Daily Total of Tweets per 100,000 Population and Cases or Deaths per 100,000 Population for all Counties Combined, Adjusted for County-Level Partisanship and Quarterly Temporal Term, and Separate Results by Partisanship Adjusted for Quarter.
Regression results indicate that, after adjusting for county-level partisanship and a quarterly fixed-effect term for time, each one-case increase in daily cases per 100,000 corresponds to an overall increase in the tweet fraction of .0001 (95% CI = [.0001, .0002]). When we examined the results by partisanship, Democratic-leaning counties exhibited a more pronounced effect, with a .0002 higher fraction of COVID-19-tagged tweets for each one-case increase per 100,000 (95% CI = [.0001, .0003]). There was no evidence of an association between changes in daily cases per 100,000 and tweet fraction for politically neutral counties (.00004; 95% CI = [−.00005, .0001]) or Republican-leaning counties (−.00001; 95% CI = [−.0001, .0001]). This pattern was consistent for the daily count of tweets per 100,000 population, with an increase of .004 additional tweets per 100,000 population (95% CI = [.003, .005]) after adjusting for county partisanship and quarter. Democratic-leaning counties showed an additional .005 tweets per 100,000 population per day (95% CI = [.001, .008]), while there was no evidence of an association for Republican-leaning (−.00002; 95% CI = [−.001, .001]) or politically neutral counties (.001; 95% CI = [−.001, .003]).
When we examined the change in tweet fraction associated with daily COVID-19 deaths per 100,000, we again found the strongest association in Democratic-leaning counties. Overall, after adjusting for quarter and county partisanship, each additional death per 100,000 was associated with a .024 increase in the fraction of COVID-19-tagged tweets (95% CI = [.021, .027]). When we analyzed counties separately by partisan level, Democratic-leaning counties exhibited a .054 increase in the fraction of COVID-19-tagged tweets for each additional daily death per 100,000 (95% CI = [.045, .063]), with a weaker association observed in politically neutral counties (.009; 95% CI = [.003, .016]) and no evidence of association in Republican-leaning counties (.0001; 95% CI = [−.004, .004]).
When we examine total tweets per 100,000 population and death rates per 100,000 for all counties, we find that each additional daily death per 100,000 population corresponds to an increase of .699 tweets per 100,000 population, adjusted for the county’s partisanship and the quarter of the year (95% CI = [.617, .780]). Democratic-leaning counties experienced an increase of 1.44 tweets per 100,000 population for each additional daily death per 100,000 population (95% CI = [1.18, 1.70]), politically neutral counties had .185 more tweets per 100,000 population for each additional death per 100,000 population (95% CI = [.047, .325]), while there was no evidence of association in Republican-leaning counties (.011; 95% CI = [−.046, .068]).
Metropolitan and Non-Metropolitan Differences in COVID-19 Cases, Deaths, and COVID-19-Related Tweet Activity
As part of our sensitivity analysis, we further divided tweets and COVID-19 cases during our study period into metropolitan and non-metropolitan counties. Figure 2a displays daily geotagged tweet mentions of COVID-19 per 100,000 population for both metropolitan and non-metropolitan counties. The tweet mention rate was approximately twice as high in metropolitan counties compared to non-metropolitan counties throughout the study period.
Figure 2c shows daily COVID-19 cases per 100,000 population for metropolitan and non-metropolitan counties. The overall trend in COVID-19 case rates was similar between metropolitan and non-metropolitan areas. Although the case rate was higher in metropolitan counties prior to August 2020, it was subsequently surpassed by the rate in non-metropolitan counties. In Table 1, we see that non-metropolitan counties experienced higher daily cases per 100,000 population (24.3;
An OLS regression showed no association between COVID-19 case rates or death rates and either COVID-19-related tweets per 100,000 population or the fraction of all tweets that were COVID-19 related for non-metro areas (Table 2). In metropolitan areas, each case per 100,000 population was associated with a nonsignificant increase in the rate of COVID-19-related tweets per 100,000 population (.002; 95% CI = [−.0002, .0045]). However, in these metropolitan counties, the fraction of tweets increased by .0001 for each case per 100,000 (95% CI = [.0000, .0002]), and each additional death per 100,000 population was associated with .55 additional daily COVID-19-related tweets per 100,000 population (95% CI = [.36, .73]) and a .024 increase in the proportion of total tweets that were COVID-19 related (95% CI = [.016, .032]).
Robustness Checks
Models using a .55 cutoff to define partisan counties showed a slightly reduced effect size compared to models with a .60 threshold. Stratified models using monthly terms for time and a 7-degree-of-freedom natural cubic spline term demonstrated some variation in the stratified effects. These results are presented in Supplemental Table 1. Models testing associations between case rates and partisan status, case rates and metropolitan status, death rates and partisan status, and death rates and metropolitan status—both for the frequency of COVID tweets and the fraction of COVID tweets, using a 7-degree-of-freedom cubic spline and a 55% cutoff for partisanship—are shown in Supplemental Table 2. We also examined the interaction terms in these models and used the results to generate marginal plots for analyses with the fraction of tweets as the outcome, which are displayed in Figure 3. In these models, we found that cases per 100,000 and deaths per 100,000 are independent predictors of increasing tweet frequency and the fraction of tweets associated with COVID-19, adjusted for metro/non-metro status or partisanship of the county. In addition, metro counties exhibited higher tweet frequency or COVID keyword fraction when adjusted for both death and case rates. In each of these models, at the .55 cutoff for partisanship, Democratic-leaning counties had higher tweet frequency and tweet fraction than neutral counties, adjusted for either case rates or death rates, while Republican-leaning counties had lower tweet frequency or fraction than neutral counties, adjusted for case or death rates. The significance of the interaction terms varied, as illustrated in the marginal plot (Figure 3). Overall, the fraction of tweets containing COVID-19-related terms increased as rates of COVID-19 cases or deaths rose, and generally, Democratic or metropolitan counties exhibited a higher fraction based on cases or deaths. 
The marginal plots suggest some convergence in proportions for increasing rates of cases by metropolitan status and partisanship, and some divergence for increasing rates of death, although not all interaction terms are significant.

Figure 3. Change in proportion of tweets with COVID keywords associated with increasing cases per 100,000 population by (a) metro/non-metro counties and (b) county-level partisanship (Democratic/Republican/neutral); change in proportion of tweets with COVID keywords associated with increasing deaths per 100,000 population by (c) metro/non-metro counties and (d) county-level partisanship (Democratic/Republican/neutral).
As a final model, we included both the partisanship of counties (using the 55% threshold) and metropolitan versus non-metropolitan status, allowing counties to be categorized as Democratic/metropolitan, Democratic/non-metropolitan, neutral/metropolitan, neutral/non-metropolitan, Republican/metropolitan, or Republican/non-metropolitan. We estimated the effect of COVID rates, measured by cases or deaths, on the frequency of COVID tweets and the fraction of tweets containing COVID keywords, adjusting for both metropolitan/non-metropolitan residence and partisanship. These results are presented in Supplemental Table 3. The effects of cases and deaths on both COVID tweets per 100,000 population and the fraction of tweets associated with COVID are robust to adjustments for both metropolitan status and partisanship. As in our main models, fully adjusted models suggest that higher rates of COVID-19 cases or deaths predict increases in tweeting, as does metropolitan residence, with counties that are partisan Democratic showing a higher tweet response than those that are partisan Republican. Each additional death per 100,000 is a predictor of both increased tweet rates per 100,000 population and a higher fraction of tweets. In these full models, metropolitan counties exhibit a higher frequency and proportion of COVID-19-tagged tweets than non-metropolitan counties. Compared to neutral counties, those with a 55% or higher Democratic vote have a greater frequency and fraction of COVID-19 tweets. To illustrate this, we fit a model with a three-way interaction and plotted the margins stratified by metropolitan status in Figure 4. As shown in the marginal plot, tweet activity increases with rising cases and deaths, but the response is most pronounced in metropolitan, Democratic counties.

Figure 4. Marginal plots of the change in Twitter activity associated with changes in rates of COVID, metropolitan status, and partisanship, stratified by metropolitan status for models for proportion of tweets with COVID keyword and (a) cases*metropolitan*partisan and (b) deaths*metropolitan*partisan, COVID-tagged tweets per 100,000 population and (c) cases*metropolitan*partisan and (d) deaths*metropolitan*partisan.
Discussion
Our work aligns with previous studies suggesting that the COVID-19 pandemic has disproportionately affected individuals in non-metropolitan counties and in counties that voted Republican in the 2020 election, resulting in higher case rates and higher death rates from COVID-19 (Wallace et al., 2022). The parallel findings regarding metropolitan versus non-metropolitan areas and partisanship support research indicating an increasing geographical partisan divide, with urban core areas becoming increasingly Democratic and rural areas becoming more conservative (Scala & Johnson, 2017). Our study adds detail and context to the limited body of previous work suggesting a connection between X (Twitter) activity and public response to COVID-19. It is the first study to investigate the association between COVID-19 impacts and X (Twitter) behavior using a multi-year data set at the county level, which allows us to see how this association varies across both the partisan divide and the metropolitan/non-metropolitan divide at a level of detail that would not be possible at the state or national level. The results indicate that tweeting behavior was associated with responses to COVID-19, particularly in counties that voted for the Democratic candidate in the 2020 election and in metropolitan counties. The total number of geotagged tweets did not vary with changes in daily cases per 100,000 population or overall in relation to COVID-19 deaths per 100,000 population. However, Democratic-leaning counties exhibited a higher overall number of tweets on days with higher reported COVID-19 deaths. Specifically examining COVID-19-tagged tweets, we found that Democratic-leaning counties had a higher total of COVID-19-tagged tweets during periods of elevated COVID-19 case rates and death rates, as well as a higher fraction of COVID-19-tagged tweets.
In Republican-leaning counties, tweet patterns showed no association with COVID-19 cases and deaths, regardless of whether we looked at total tweets, total COVID-19-tagged tweets, or the fraction of COVID-19-tagged tweets. In politically neutral counties, tweeting about COVID-19 both in total and as a fraction of total tweets increased as COVID-19 death rates rose. However, there was no evidence of an association between COVID-19 case rates and tweeting behavior, as measured by total tweets, total COVID-19-tagged tweets, or the fraction of COVID-19-tagged tweets.
Based on existing literature suggesting that Democratic-leaning X (Twitter) users are more likely to support intervention measures (Gadarian et al., 2021; Golos et al., 2022), including mask wearing (Callaghan et al., 2021) and social distancing (Zang et al., 2021), it is possible that residents of Democratic-leaning counties are generally more responsive to COVID-19 concerns than those in Republican-leaning counties. Previous studies indicate that Democratic-majority counties had higher vaccination rates (Sun & Monnat, 2022; Ye, 2023) and that Republican-leaning counties experienced higher mortality rates (Aron & Muellbauer, 2022). Combined with these findings, our data open up rich avenues for future research questions that could be theoretically valuable—that is, do social media have an effect on shaping partisan responses? Or is partisanship an agenda set outside social media but reflected in social media feeds? Or do social media perhaps simply reflect personal values? The mediation of social media likely occurs through a series of theoretical mechanisms, ranging from selective exposure to partisan sources (Romer & Jamieson, 2021) to communication about party norms by political elites with significant followings (Shin et al., 2022), to beliefs about the nature of the disease, its progression, and potential remedies (Meirick, 2023), and to the cascading strength of bandwagon support (Wang et al., 2024) surrounding specific claims about the state and outlook of the pandemic. All these factors reflect the “new political communication ecology” (Shah et al., 2017), where the concept of “conversation” has shifted from interpersonal interactions to social media posts and reactions from followers.
The asymmetry observed in our study regarding the responses of Republican and Democratic-leaning counties aligns with recent findings on the differing levels of exposure to homogeneous information and politically motivated misinformation among conservatives and liberals in the U.S. (González-Bailón et al., 2023) and the degree to which they share that information on social media (Sundar et al., 2025). One possible interpretation is that, while all partisans may engage in motivated reasoning and selective exposure, the degree to which they do so depends on their party affiliation. It is also possible that Democrats and Republicans differ in how much they trust official COVID-19 information. Such differences could arise from how opinion leaders on either side communicate with their followers on social media (Gallagher et al., 2021; Hodson et al., 2022), including how and to what extent they amplify populist sentiments (Rojecki et al., 2024). Further research, including qualitative methods, may help to better explore this phenomenon.
Our study has several limitations. First, this work examines correlations and cannot establish causality. Second, it does not clarify the tone or sentiment of the tweets, only that they occurred. Consequently, tweets mentioning COVID-19-related topics should not be interpreted as inherently supportive of public health measures. Mentions of COVID-19 may reflect a wide range of perspectives, including supportive, critical, conspiratorial, humorous, or neutral commentary. Future research incorporating sentiment analysis could better characterize the nature of this tweeting behavior. In addition, the county-level exposure rates do not provide insight into the behavior of individual people, but rather into the average number of tweets and the average number of COVID-19 cases or deaths.
The selection of temporal controls with time series data can affect results, resulting in over- or under-fitting. We used three different temporal controls (quarterly, monthly, and spline) and found similar results, but there are other potential methods for temporal controls, such as Fourier terms or SARIMAX models, that could be tested in future research. Although our results across models were similar and showed consistency, we would be careful not to claim that we could predict results outside of our data set. That said, we feel that our consistent results using different controls demonstrate the robustness of our main findings.
Decisions related to tweeting may be influenced by demographics, as X (Twitter) users tend to be younger than the overall U.S. population and more Democratic (36%) compared to non-X (Twitter) users (30%) (Mitchell, 2019). These demographics, particularly age, could impact COVID-19 mortality figures. However, we found that the proportion of COVID-19-tagged tweets was lower in Republican-leaning counties, suggesting reduced Twitter usage overall and indicating that X (Twitter) users in these counties may have differing levels of interest in COVID-19-related topics. It is important to note that only a small percentage of Twitter users enable geotagging on their posts, which decreases the overall sample size compared to the entire population of Twitter users. Previous research indicates that geotagged tweets tend to over-represent certain user demographics and behaviors and may also reflect geographic biases. For example, geotagging rates are typically higher in urban areas and among younger users, and lower in rural areas (Yin, Chi, & Van Hook, 2018; Yin, Gao, & Chi, 2022). Moreover, geotagging may correlate with political attitudes in ways that are challenging to fully control. Consequently, urban-rural or partisan disparities in our findings should be interpreted cautiously, as they may partially reflect differences in geotagging propensity rather than solely differences in underlying behaviors. While we account for population size and urbanization level in our models, we cannot completely mitigate this sampling bias, which represents a significant limitation of this type of analysis. In addition, this study did not examine how tweet-sharing dynamics (i.e. retweeting) influence interactions on X (Twitter), as our data set captured tweets in real time during an evolving event but lacked temporally resolved retweet cascade data (i.e. the complete tree of retweeting over time).
Future work will examine the sentiment of tweets at the county level and investigate how tweet sentiment varies by county. In addition, it will explore the demographic characteristics of those counties to identify other factors related to COVID-19 outcomes, such as age and economic conditions. A more in-depth network-based analysis of retweet dynamics and follower influence should be conducted to understand the factors contributing to the county-level differences in information spread and influence. This research could incorporate geospatial methods, including county-level clustering, to investigate potential spatial heterogeneity.
Conclusion
This is the longest-duration study to date exploring X (Twitter) engagement with COVID-19-related topics, and it is the first of its kind to examine X (Twitter) engagement and COVID-19 outcomes across the political spectrum at the U.S. county level, building on previous research investigating how social media can influence, or be influenced by, the policy agenda. Partisan politics have been linked to different COVID-19 outcomes. In this study, we found that in addition to known factors such as varying vaccine uptake and differing COVID-19 practices, X (Twitter) users in Republican-leaning counties were less responsive to COVID-19 information related to changes in cases or deaths. This divide underscores the value of using X (Twitter) to study social responses to significant events, but it also suggests that X (Twitter) is not utilized for communication equally across different groups in society.
Statistical Analysis
We ran simple ordinary least squares (OLS) regressions to test whether changes in county-level tweet behavior were associated with changes in COVID-19 cases and deaths. We examined both total tweets (averaged over the previous 7 days) and the fraction of total tweets containing a COVID-19-related keyword. To account for the changing activity associated with COVID-19 on X (Twitter) over time, we included a quarter fixed effect to allow for temporal variation. We repeated this analysis using a variable for “partisanship” based on the partisan divide of the 2020 election outcome and further explored results by level of partisanship, conducting separate analyses for Democratic-leaning, Republican-leaning, and politically neutral counties. As tests of robustness, we retested the results for partisanship using a lower threshold and conducted additional models with monthly fixed effects and a natural cubic spline to further investigate temporal effects. Analyses were carried out in Stata/MP 17.0.
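As an illustration of the general approach only, the following minimal Python sketch fits an OLS regression with one indicator variable per quarter, which is one common way to include a quarter fixed effect. It is not the authors' Stata code. The function name, variable names, and all numbers are hypothetical, and the data are synthetic.

```python
import numpy as np

def ols_with_fixed_effects(y, x, groups):
    """OLS of y on x plus one indicator column per group (no separate intercept).

    Returns the slope on x and a dict mapping each group to its fixed effect.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    labels = sorted(set(groups))
    # One indicator column per quarter absorbs quarter-specific shifts in tweeting.
    dummies = np.array([[1.0 if g == lab else 0.0 for lab in labels] for g in groups])
    design = np.column_stack([x, dummies])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], dict(zip(labels, coef[1:]))

# Synthetic data (assumption): made-up values, not drawn from the study.
rng = np.random.default_rng(0)
n = 400
quarters = rng.integers(1, 5, size=n).tolist()           # quarter label 1..4
quarter_shift = {1: 0.5, 2: 2.0, 3: 1.0, 4: 1.5}         # hypothetical quarter effects
case_change = rng.normal(size=n)                         # hypothetical change in cases
tweet_change = (0.3 * case_change
                + np.array([quarter_shift[q] for q in quarters])
                + rng.normal(scale=0.1, size=n))         # hypothetical change in tweets

slope, quarter_effects = ols_with_fixed_effects(tweet_change, case_change, quarters)
print(f"estimated slope: {slope:.3f}")
```

Under these assumptions the estimated slope should land close to the simulated value of 0.3. Running the same function separately on subsets of counties would mirror the stratified analyses described above.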
Supplemental Material
sj-docx-1-sms-10.1177_20563051261419387 – Supplemental material for County Partisanship Affects Social Media Posting Behavior During a Pandemic
Supplemental material, sj-docx-1-sms-10.1177_20563051261419387 for County Partisanship Affects Social Media Posting Behavior During a Pandemic by M. Luke Smith, Guangqing Chi, Junjun Yin, Yosef Bodovski and S. Shyam Sundar in Social Media + Society
Footnotes
Acknowledgements
We thank Susan H. McHale for her suggestions and comments on early versions of the work.
Ethical considerations
The X (Twitter) data used in this study are classified as non-human-subject research because the data are publicly available and do not include private identifiable information.
Author contributions
G.C. and M.L.S. conceived the initial concept. G.C., M.L.S., and J.Y. developed the research design and obtained the data. M.L.S., J.Y., and Y.B. conducted data analysis and interpretation. G.C., J.Y., M.L.S., and Y.B. contributed to the initial draft of the manuscript. S.S. reviewed the manuscript. All authors edited the manuscript for important intellectual content.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported in part by the National Science Foundation (Awards # SES-1823633 and # OPP-2032790), the College of Arts and Sciences at Indiana University, the Social Science Research Institute and the Population Research Institute of The Pennsylvania State University, and the MSIT (Ministry of Science, ICT), Korea, under the Global Scholars Invitation Program (RS-2024-00459638) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).
Data availability statement
The data sets and code used in this paper for the creation of the data set are openly accessible from a Harvard Dataverse repository (https://doi.org/10.7910/DVN/YCWUC7). The first data set is a static copy of geotagged tweets related to COVID-19 stored in .csv text files collected from January 15 forward (currently through November 30, 2021). The second data set is a .csv file summarizing the cumulative and daily numbers of tweets by country and subnational level in the same period. A live version of the data record, which will be updated on a bimonthly basis, can be found in the same repository. Although the data are open source, all users must agree to the terms listed in the data-usage license included in the repository. All code developed for this study is openly accessible from the GitHub repository named “covid-19_geo_tweets”. The code covers geotagged X (Twitter) data collection, extraction of tweets related to COVID-19 based on keywords, spatial aggregation based on a “point-in-polygon” operation, and a statistical summary of the number of tweets by country and date. All code was developed using the programming language Python 3 (Van Rossum & Drake, 2016).
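To make the “point-in-polygon” aggregation step concrete, the following is a minimal sketch in plain Python that uses the standard ray-casting test and then counts tweets by region and date. It is not taken from the repository. The function names, region outlines, and tweet records are hypothetical toy values, and a production pipeline would more plausibly rely on a GIS library and real administrative boundaries.

```python
from collections import Counter

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices in order; the last vertex connects to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def count_tweets_by_region(tweets, regions):
    """Assign each (lon, lat, date) tweet to the first containing region and count."""
    counts = Counter()
    for lon, lat, date in tweets:
        for name, polygon in regions.items():
            if point_in_polygon(lon, lat, polygon):
                counts[(name, date)] += 1
                break
    return counts

# Toy inputs (assumption): two square "regions" and four made-up tweet locations.
regions = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
tweets = [
    (1.0, 1.0, "2020-03-01"),
    (3.0, 1.0, "2020-03-01"),
    (1.5, 0.5, "2020-03-01"),
    (5.0, 5.0, "2020-03-02"),  # falls outside both regions and is not counted
]
counts = count_tweets_by_region(tweets, regions)
print(dict(counts))
```

The loop over every region for every tweet is quadratic and only suitable for a small illustration; at the scale of a national tweet archive a spatial index would normally be used to narrow the candidate polygons first.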
Supplemental material
Supplemental material for this article is available online.
