Differential impact from individual versus collective misinformation tagging on the diversity of Twitter (X) information engagement and mobility

Kim, Junsol; Wang, Zhao; Shi, Haohan; Ling, Hsin-Keng; Evans, James

doi:10.1038/s41467-025-55868-0

Article
Open access
Published: 24 January 2025

Nature Communications volume 16, Article number: 973 (2025)

Abstract

Fears about the destabilizing impact of misinformation online have motivated individuals and platforms to respond. Individuals have increasingly challenged others’ online claims with fact-checks in pursuit of a healthier information ecosystem and to break down echo chambers of self-reinforcing opinion. Using Twitter (now X) data, here we show the consequences of individual misinformation tagging: tagged posters had explored novel political information and expanded topical interests immediately prior, but being tagged caused posters to retreat into information bubbles. These unintended consequences were softened by a collective verification system for misinformation moderation. In Twitter’s new feature, Community Notes, misinformation tagging was peer-reviewed by other fact-checkers before revelation to the poster. With collective misinformation tagging, posters were less likely to retreat from diverse information engagement. Detailed comparison demonstrated differences in toxicity, sentiment, readability, and delay in individual versus collective misinformation tagging messages. These findings provide evidence for differential impacts from individual versus collective moderation strategies on the diversity of information engagement and mobility across the information ecosystem.

Introduction

The visibility of mis- and disinformation online has attracted substantial attention around the world, with demonstrations of their direct influence on major collective action^1,2,3,4,5. These actions range from buying and selling stocks² and avoidance of vaccines³ to the attempted coup and occupation of the U.S. Capitol by rioters⁴. Legitimate fears about the destabilizing influence of false online information have inspired and put pressure on both individuals and platforms to respond. Individuals proactively correct others’ claims by deploying links to fact-checking websites, such as PolitiFact and Snopes^6,7,8,9,10. With the potential for amplifying misinformation through filter bubbles^11,12, social media platforms like Twitter and Facebook have come under public and political pressure to implement misinformation moderation strategies^13,14,15.

Individuals have become empowered to challenge others’ online claims with misinformation tags (or fact-checks) in pursuit of a healthy information ecosystem and to break down ideological echo chambers^6,7,8. These misinformation tags tend to target political outgroups^6,7,9, exposing tagged posters to opposing ideological perspectives. It is less clear, however, whether misinformation tagging motivates targeted posters to explore diverse political content afterward. Earlier research on motivated reasoning suggests that misinformation tags contradicting targeted posters’ beliefs could backfire and reinforce preexisting beliefs^16,17, which could discourage people from exploring diverse information¹⁸. By contrast, a growing body of research argues that misinformation tagging does not backfire, but instead reduces engagement with misinformation and expands engagement with diverse information^13,14,19,20. These mixed findings suggest that the effects of misinformation tagging could depend on the method of correcting misinformation. Individual misinformation tagging by other users often involves toxic and intolerant messages that dehumanize targeted posters^9,21, potentially hindering their willingness to explore diverse information²².

Platforms have experimented with institutionalized systems that verify the accuracy of content through collective inputs from a wider distribution of users. Notably, on Twitter’s new platform, Community Notes (formerly Birdwatch), misinformation tags undergo a formal peer-review process by diverse users before being revealed to the original posters and broader Twitter user community^8,13,14. Other platforms, including YouTube and Facebook, have recently tested or announced plans to implement features similar to Community Notes^23,24. Rather than indiscriminately exposing users to misinformation tags, Community Notes selectively exposes misinformation tags that receive votes from heterogeneous user groups, ensuring that they are verified across a broad spectrum of perspectives¹³ to activate the wisdom of crowds^25,26. The platform also assesses the alignment of users’ prior contributions with the crowd’s decisions, filtering out voters who frequently oppose or push back against valid fact-checks on misinformation. Although individual tags may be noisy and less effective, aggregating them collectively could lead to high-quality crowd judgments that align with expert fact-checks across a range of topics, from COVID-19 to politics^14,27,28,29. Furthermore, the Community Notes platform has specifically instituted norms that deter toxic and intolerant misinformation tagging messages³⁰, potentially enhancing the efficacy of misinformation moderation and gently encouraging posters to leave their echo chambers and explore a broader world of diverse information.

In this study, we explore the impacts of individual and collective misinformation tagging on tagged posters’ echo chambers. Echo chambers refer to “bounded, enclosed media spaces that have the potential to both magnify messages delivered within them and insulate them from rebuttal”^31,32, which could increase susceptibility to misinformation^11,33,34. One indicator of echo chambers is their lack of interaction with politically diverse, cross-cutting sources of information. Prior research has measured echo chambers by selective engagement with like-minded news sources, which insulate people from opposing perspectives that could empower rebuttal^35,36. This measure strongly correlates with other echo chamber indicators, such as intensive interactions with like-minded users (i.e., homophily)^37,38. Literature suggests that lack of exposure to and cross-verification through opposing perspectives could erode the ability to find, evaluate, and use information effectively^11,39,40. It could provide users with the illusion that their views are publicly supported^41,42, weakening their overall immunity against misinformation.

The other key indicator of echo chambers is their absence of content diversity resulting from limited engagement with diverse, unfamiliar topics. Emerging literature has documented the rise of socio-political endogamy, noting that both left and right increasingly develop distinct topical interests, encompassing knowledge bases, cultural tastes, and lifestyles^43,44,45. For example, left-leaning individuals are more likely to engage with basic science books about physics, astronomy, and zoology, while right-leaning individuals prefer those about applied and commercial sciences like criminology, medicine, and geophysics⁴⁵. In this way, political polarization spills over into a variety of other topics, leading to multi-dimensional segregation where opposing political groups share progressively less common ground and inhabit different realities even in topics apparently unrelated to politics^43,46. Topical echo chambers, which magnify topics prevalent within one political group and insulate them from others, can problematize intergroup communication and interaction.

Does exposure to each type of misinformation tagging encourage or discourage posters from exploring diverse information and breaking out of echo chambers? To answer this question, we use large-scale digital traces from the platform formerly known as Twitter (X as of July 2023) to identify posters exposed to each approach of misinformation tagging. First, we identify posters targeted by individual misinformation tags. These posters’ tweets received other individuals’ voluntary replies, citing fact-checking articles from PolitiFact, one of the largest and most studied professional fact-checking organizations in the United States^7,10. Second, we examine posters targeted by collective misinformation tags. These posters’ tweets received notes that contain collectively verified fact-checks through Twitter’s Community Notes platform. Figure 1a visualizes the mechanism of each type of misinformation tagging; together, these two types represent the most prevalent misinformation moderation strategies on Twitter^6,7,8,9,10,13,14,15. Supplementary Fig. 1 presents an example of individual and collective tags that correct topically identical COVID-19 misinformation.

**Fig. 1: Misinformation Tagging and Outcomes Measurement.**

Using 712,948 tweets that cite news sources—including posts, retweets, and quotes—posted by 7733 users before and after they were targeted by misinformation tags, we estimate the effects of these tags on the posters’ echo chambers. Specifically, we measure echo chambers using political and content diversity in their posting and sharing behavior (see Fig. 1b). Political diversity measures whether a poster’s tweet cites a source with opposing political stance (e.g., a right-leaning poster references left-leaning articles)^5,47. Content diversity measures whether a tweet discusses novel topics unfamiliar in the poster’s historical tweets. We apply a transformer-based sentence embedding model (SentenceBERT) to extract a high-dimensional, semantic vector representation for each tweet, and aggregate the vectors of each author’s historical tweets to produce an average semantic vector for each poster. We then measure the distance between a particular tweet and the poster to assess the degree to which this tweet expands the poster’s content diversity. As our data focus on tweets citing news sources, we assume that the increase of content diversity indicates the exploration of novel political news topics. For example, consider a user who regularly consumes and shares news about COVID-19 but begins to discuss U.S. tax and labor issues as well. This shift indicates an increase in the user’s content diversity, as detailed in Supplementary Table 1. We consider both political and content diversity because they represent different dimensions that could reinforce one another in limiting exposure to information and exacerbating echo chambers on social media^17,43,48.
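The content diversity measure described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes tweet embeddings have already been computed (for example, by a SentenceBERT model), it uses cosine distance as the distance function (the text above says only "distance"), and the function names and the toy three-dimensional vectors are hypothetical.

```python
import math

def mean_vector(vectors):
    """Average a list of equal-length vectors component by component."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]

def cosine_distance(a, b):
    """1 - cosine similarity; the choice of cosine distance is an assumption."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def content_diversity(tweet_vec, history_vecs):
    """Distance between one tweet and the poster's average historical vector."""
    return cosine_distance(tweet_vec, mean_vector(history_vecs))

# Toy embeddings (hypothetical): the poster's history sits on the first axis.
history = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]
familiar = [1.0, 0.05, 0.0]   # close to what the poster usually shares
novel = [0.0, 0.0, 1.0]       # a topic absent from the poster's history

print(content_diversity(familiar, history))
print(content_diversity(novel, history))
```

In this toy example the tweet on a new topic lies farther from the poster's average vector than the tweet on a familiar topic, which is the sense in which a tweet "expands" content diversity.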

Results

We aim to investigate the effects of individual and collective misinformation tagging on political and content diversity using large-scale Twitter data. In our observational data, however, treatments (i.e., exposure to misinformation tagging) are not randomly assigned to misinformation posters, which poses challenges for identifying the causal effects of misinformation tagging. To address these concerns, we apply interrupted time series (ITS) and delayed feedback (DF) analysis, which help eliminate non-causal explanations under certain assumptions.

Interrupted time series (ITS) analysis

Interrupted Time Series (ITS) analysis investigates whether the trend in political and content diversity shifts after misinformation tagging. ITS assumes that without the intervention of misinformation tagging, the pre-treatment trend (i.e., before misinformation tagging) would persist, and the immediate change in trend after misinformation tagging is attributed to effects from tagging. We control for user-level fixed effects to correct for time-invariant user characteristics.
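For orientation, a generic segmented-regression form of an ITS model with user fixed effects can be written as below. This is the textbook form, given here as an assumption; the authors' exact specification (time units, how individual and collective tagging enter the model, and any additional controls) is defined in their Methods and is not reproduced here. The symbols are illustrative.

```latex
Y_{it} = \alpha_i + \beta_1 T_{it} + \beta_2 D_{it} + \beta_3 \,(T_{it} \times D_{it}) + \varepsilon_{it}
```

Here $Y_{it}$ is the diversity outcome for poster $i$ at time $t$, $\alpha_i$ is the user-level fixed effect, $T_{it}$ is time measured relative to the tag, and $D_{it}$ equals 1 after the tag and 0 before. Under this form, $\beta_1$ captures the pre-tagging trend, $\beta_2$ the immediate change in level at tagging, and $\beta_3$ the change in slope after tagging.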

Figure 2a and Table 1 report results from our ITS analysis (Political Diversity: R² = 0.173, Content Diversity: R² = 0.243). Posters manifest an increasing tendency to explore novel political information before being fact-checked by misinformation tags. Specifically, before individual and collective misinformation tagging, posters increase the political diversity (β = 0.237, 95% CI = [0.125, 0.349], t(418115) = 4.14, p < 0.001) and content diversity (β = 0.007, 95% CI = [0.004, 0.010], t(418115) = 4.79, p < 0.001) of their information engagement over time.

**Fig. 2: Political and content diversity change with the intervention of individual and collective misinformation tagging.**

Table 1 Interrupted Time Series (ITS) Model Results for Political and Content Diversity

Having their posts criticized by individual misinformation tags, however, causes posters to retreat within an information bubble. Immediately after tagging, posters significantly decrease the political diversity (β = −1.009, 95% CI = [−1.447, −0.571], t(418115) = −4.52, p < 0.001) and content diversity (β = −0.030, 95% CI = [−0.042, −0.019], t(418115) = −5.10, p < 0.001) of their posts. After tagging, the slope becomes nearly flat, indicating that posters’ future posts continue to collapse in both political diversity (β = 0.087, 95% CI = [−0.020, 0.194], t(418115) = 1.60, p = 0.110) and content diversity (β = −0.003, 95% CI = [−0.006, 0.000], t(418115) = −1.95, p = 0.051).

By contrast, there is no statistically significant evidence that collective misinformation tagging causes individuals to retreat within their prior information bubble. The data even reveal a slight, although not significant, increase in political diversity (β = 0.270, 95% CI = [−0.824, 1.363], t(418115) = 0.48, p = 0.629) and a significant increase in content diversity (β = 0.040, 95% CI = [0.012, 0.069], t(418115) = 2.74, p = 0.006) immediately after tagging. Nevertheless, collective misinformation tagging has only a temporary effect on individual posters. In particular, the slope for content diversity changes significantly after tagging (β = −0.014, 95% CI = [−0.024, −0.003], t(418115) = −2.60, p = 0.009), eventually converging to levels experienced before the initial misinformation tags occur. Despite the steepness of the slope following collective tagging, our analysis indicates that content diversity does not significantly drop below the pre-tagged period (see Supplementary Method 1).

We find that the gap between the effects of individual and collective misinformation tagging is significant, particularly regarding the immediate intercept change in political diversity (β_Individual = −1.009, β_Collective = 0.270, β_Collective − β_Individual = 1.279, 95% CI = [0.101, 2.457], t(418115) = 2.13, p = 0.033) and in content diversity (β_Individual = −0.030, β_Collective = 0.040, β_Collective − β_Individual = 0.070, 95% CI = [0.039, 0.102], t(418115) = 4.44, p < 0.001).

Additional analyses reveal the effects of misinformation tagging on the proximity between posters and misinformation taggers. This suggests that Twitter navigation likely makes posters more visible to fact-checkers as they venture into foreign territory (see Fig. 2b). Exposure to fact-checks causes them to retreat back into their information bubbles, distancing them from the foreign stances that fact-checked them (see Supplementary Method 2).

Because time-variant confounders (e.g., viral news, platform algorithm changes, or significant external events) can affect ITS outcomes, we conduct additional analyses to control for these factors. First, we control for major events during the study period through sensitivity analyses. Second, we apply comparative interrupted time series (CITS) analyses. These additional analyses support our initial findings (see Supplementary Method 3). Additionally, to address autocorrelated posting behaviors among social media users, we include autoregressive terms in the ITS models, further enhancing the robustness of our findings (see Supplementary Method 4).

To better understand what happens when posters retreat to their information bubbles, we conduct a series of descriptive analyses (see Supplementary Table 2). When posters reduce their political and content diversity, the number of tweets (comprising posts, retweets, and quotes) posted per day significantly increases, indicating that users are more active within their information bubbles. Specifically, the number of tweets per day is negatively correlated with political diversity (r = −0.107, t(712946) = −90.87, 95% CI = [−0.109, −0.105], p < 0.001) and content diversity (r = −0.052, t(712946) = −43.97, 95% CI = [−0.054, −0.050], p < 0.001). Similarly, we find that the type of posting is different; the proportion of retweets (i.e., tweets simply sharing other users’ tweets) out of all tweets per day is negatively correlated with political diversity (r = −0.046, t(712946) = −38.88, 95% CI = [−0.048, −0.044], p < 0.001) but positively correlated with content diversity (r = 0.012, t(712946) = 10.13, 95% CI = [0.010, 0.014], p < 0.001). This indicates that users actively post tweets rather than passively retweet other users’ tweets when they exhibit low political diversity. To demonstrate the significant effects of misinformation tagging on political and content diversity, irrespective of these factors, we have adjusted for the number of tweets posted per day. We have also controlled for the proportion of retweets per day, which did not meaningfully change our results (see Supplementary Table 3).
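The t statistics reported with these correlations can be checked against the standard relation between a Pearson correlation and its t statistic, t = r·sqrt(df) / sqrt(1 − r²) with df = n − 2. The sketch below applies that relation to the rounded r values from the text; the function name is hypothetical, and because r is reported to only three decimals the recomputed values match the reported ones only approximately.

```python
import math

def t_from_r(r, df):
    """t statistic for a Pearson correlation r with df = n - 2 degrees of freedom."""
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r)

DF = 712946  # degrees of freedom reported in the text

# (label, rounded correlation from the text)
reported = [
    ("tweets per day vs political diversity", -0.107),
    ("tweets per day vs content diversity", -0.052),
    ("retweet share vs political diversity", -0.046),
    ("retweet share vs content diversity", 0.012),
]

for label, r in reported:
    print(f"{label}: r = {r:+.3f}, t = {t_from_r(r, DF):.2f}")
```

With a sample this large, even correlations of magnitude 0.01 to 0.1 produce very large t statistics, so the p-values say little about effect size; the r values themselves are the more informative quantity.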

Delayed feedback (DF) analysis

We employ delayed feedback (DF) analysis to further strengthen our causal inference⁴⁹. In our DF analysis, we estimate baseline changes (i.e., changes in outcomes that occur without tags) to answer the question: “Are shifts in political and content diversity attributable to tagging, or do similar changes occur even without tagging?” Pairs of tweets containing similar misinformation, targeted by misinformation tagging at different times, are matched to construct a control group, consisting of posters whose problematic tweets have not yet been tagged due to delayed feedback, and a treatment group of posters who have. For instance, Supplementary Fig. 2 presents an illustrative example involving a pair of matched tweets and tags.

In Fig. 3a, post-treatment (t₁) represents the time window when treatment tweets are tagged but control tweets are not, and pre-treatment (t₀) represents the time window with equal duration t₁ when both treatment and control tweets are untagged. Changes in the outcomes between t₀ and t₁ in the control group reflect baseline changes, which indicate changes without tags. Changes between t₀ and t₁ in the treatment group reflect treated changes, which indicate changes with tags. We compare the difference in pre-post change between control and treatment groups (i.e., baseline vs. treated changes) to identify the effects of misinformation tagging on political and content diversity. DF analysis assumes that, in the absence of treatment, both control and treatment groups would exhibit parallel trends. We control for user-level fixed effects to control for time-invariant, user-specific characteristics.
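The comparison of baseline and treated changes described above is a difference-in-differences contrast. The sketch below shows that contrast on raw group means only. It is a simplified illustration with hypothetical data and names; the analysis in the paper is regression-based and includes user-level fixed effects, which this sketch omits.

```python
from statistics import mean

def did_estimate(rows):
    """Difference-in-differences from (group, period, outcome) rows.

    group is "treatment" or "control"; period is "t0" (pre) or "t1" (post).
    """
    def cell_mean(group, period):
        return mean(y for g, p, y in rows if g == group and p == period)

    # Change among posters whose tweets were tagged during t1.
    treated_change = cell_mean("treatment", "t1") - cell_mean("treatment", "t0")
    # Change among matched posters whose tweets were not yet tagged.
    baseline_change = cell_mean("control", "t1") - cell_mean("control", "t0")
    return treated_change - baseline_change

# Hypothetical outcome values, for illustration only.
rows = [
    ("control", "t0", 10), ("control", "t0", 12),
    ("control", "t1", 11), ("control", "t1", 13),
    ("treatment", "t0", 10), ("treatment", "t0", 14),
    ("treatment", "t1", 7), ("treatment", "t1", 9),
]

effect = did_estimate(rows)
print(effect)
```

In the toy data the control group rises by 1 while the treatment group falls by 4, so the estimated effect of tagging is the difference between the two changes rather than the treated change alone.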

**Fig. 3: Delayed feedback (DF) analysis.**

Figure 3b and Table 2 present results from the DF analysis (Political Diversity: R² = 0.274, Content Diversity: R² = 0.358). Our DF analysis demonstrates that changes are indeed due to tagging, showing that treated changes are significant above and beyond baseline changes. Consistent with the ITS findings, DF analysis indicates that individual misinformation tags lead to a significant decrease in political diversity (β = −5.886, 95% CI = [−9.633, −2.138], t(8182) = −3.08, p = 0.002). Nevertheless, individual misinformation tagging does not significantly affect content diversity (β = 0.018, p = 0.652). Although ITS analyses show that content diversity decreases after tagging, DF analyses indicate no statistically significant evidence that content diversity decreases beyond baseline changes observed without tags. Collective misinformation tags, by contrast, do not produce a significant decrease in political diversity (β = 1.219, 95% CI = [−4.777, 7.215], t(8182) = 0.40, p = 0.690) and even increase content diversity following tagging (β = 0.274, 95% CI = [0.145, 0.403], t(8182) = 4.17, p < 0.001). The gap between the effects of individual and collective tagging is significant for both political diversity (β = 7.105, 95% CI = [0.069, 14.140], t(8182) = 1.98, p = 0.048) and content diversity (β = 0.256, 95% CI = [0.105, 0.407], t(8182) = 3.32, p = 0.001).

Table 2 Delayed feedback (DF) model results for political and content diversity

Linguistic characteristics of misinformation tags

Individual and collective misinformation tagging messages manifest different linguistic characteristics. As shown in Fig. 4 and Supplementary Table 4, we find that individual misinformation tags exhibit nearly twice the toxic content (Mean_Individual = 0.139, Mean_Collective = 0.076, Mean_Collective − Mean_Individual = −0.063, t(7496) = −9.86, p < 0.001, Cohen’s d = −0.228) and convey more negative sentiment compared to collective misinformation tags (Mean_Individual = −0.082, Mean_Collective = −0.050, Mean_Collective − Mean_Individual = 0.032, t(7731) = 2.14, p = 0.033, Cohen’s d = 0.049). Collective tags express slightly higher positive sentiment and produce messages with more neutral sentiment than individual tags. Furthermore, individual tag messages are much shorter (Mean_Individual = 179.31, Mean_Collective = 288.87, Mean_Collective − Mean_Individual = 109.56, t(7731) = 26.95, p < 0.001, Cohen’s d = 0.613) and more readable (χ²(7) = 155.32, p < 0.001, Cramér’s V = 0.155) than collective tags. While 53.53% of individual tags require college-level reading comprehension or higher, 75.77% of collective tags demand this level. Moreover, the delay between posting misinformation and receiving fact-checks is shorter for individual than collective tagging (Mean_Individual = 3.037, Mean_Collective = 6.322, Mean_Collective − Mean_Individual = 3.285, t(7731) = 2.13, p = 0.033, Cohen’s d = 0.048). These findings demonstrate that individual tags convey their messages quickly through messages that are succinct, straightforward, emotive, and sometimes toxic. In contrast, collective tags are more slowly communicated through lengthy, complex messages, devoid of emotional undertone or toxicity.

**Fig. 4: Linguistic characteristics of fact-checking messages.**

Based on linguistic differences between individual and collective tags, we question whether gaps in the effects of individual versus collective tags persist, even when the linguistic characteristics of these tags are similar. First, we control for toxicity by excluding tags with a toxicity level higher than 0.4 and retain only non-toxic tags. Second, we control for sentiment by removing tags with either positive (> 0.2) or negative (< −0.2) sentiments, keeping only neutral tags. Third, we control for length by excluding tags longer than 400 characters and retaining short tags. Fourth, we control for readability by excluding tags that require college-level or higher readability and selecting tags that are relatively easy to read. Fifth, we control for delay by omitting any tags associated with delays longer than 48 hours (log-transformed delay > 1.10) and focusing on quick tags.
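The five sample restrictions above amount to simple threshold filters. The sketch below encodes them as predicates using the cutoffs stated in the text. The record layout and field names (`toxicity`, `sentiment`, `text`, `college_level`, `delay_hours`) are hypothetical, the example tags are invented, and applying each control separately rather than jointly is an assumption about how the analysis was run.

```python
# Each control maps to a predicate that decides whether a tag is kept.
# Thresholds come from the text; field names are hypothetical.
CONTROLS = {
    "toxicity": lambda tag: tag["toxicity"] <= 0.4,
    "sentiment": lambda tag: -0.2 <= tag["sentiment"] <= 0.2,
    "length": lambda tag: len(tag["text"]) <= 400,
    "readability": lambda tag: not tag["college_level"],
    "delay": lambda tag: tag["delay_hours"] <= 48,
}

def apply_control(tags, control):
    """Return the subset of tags retained under one named control."""
    keep = CONTROLS[control]
    return [tag for tag in tags if keep(tag)]

# Invented example tags, for illustration only.
tags = [
    {"id": "a", "text": "Rated false by PolitiFact.", "toxicity": 0.05,
     "sentiment": 0.0, "college_level": False, "delay_hours": 5},
    {"id": "b", "text": "x" * 450, "toxicity": 0.10,
     "sentiment": -0.1, "college_level": True, "delay_hours": 60},
    {"id": "c", "text": "This is a lie, as usual.", "toxicity": 0.70,
     "sentiment": -0.6, "college_level": False, "delay_hours": 2},
]

for name in CONTROLS:
    kept = [tag["id"] for tag in apply_control(tags, name)]
    print(name, kept)
```

Keeping the predicates in one dictionary makes it easy to rerun the same model on each restricted sample and compare the resulting gap estimates side by side.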

We find that the gap between individual and collective tagging remains statistically significant, except when controlling for length. As shown in Supplementary Table 5, the gap in political diversity is not statistically significant after controlling for length (β = 1.071, 95% CI = [−0.231, 2.373], t(399236) = 1.61, p = 0.107). Even so, controlling for length accounts for only 16.26% of the gap between individual and collective tagging in political diversity. Linguistic characteristics therefore explain a modest but nontrivial portion of the differential impacts between individual and collective tagging, while these measured qualities leave the vast majority of the difference unaccounted for.
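The 16.26% figure is consistent with reading it as the relative reduction in the political diversity gap once length is controlled, using the unconstrained ITS gap of 1.279 and the length-controlled gap of 1.071. This reading is an inference from the reported numbers, not a formula stated by the authors, and the subscripts are illustrative.

```latex
\frac{\beta_{\text{gap}} - \beta_{\text{gap, length-controlled}}}{\beta_{\text{gap}}}
  = \frac{1.279 - 1.071}{1.279}
  = \frac{0.208}{1.279}
  \approx 0.1626
```

On this reading, roughly 84% of the original gap remains after tag length is held comparable.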

Control analyses

In this section, we identify systematic differences in misinformation that receive individual versus collective tagging, as well as differences in the posters corrected by each type. Even after controlling for these differences in additional interrupted time series (ITS) analyses, individual and collective tagging have significantly different effects in the directions identified by our unconstrained analysis.

First, we observe that individual taggers focus more on political topics, while collective taggers correct a more diverse range of topics (see Supplementary Table 6). As shown in Supplementary Table 7, the nine most frequent topics in our dataset include political topics known to trigger divisive, polarized reactions in US politics (see “Methods”: Topic modeling). These topics account for 84.06% of the corrections made through individual tagging but only 59.49% of the corrections made through collective tagging.

Therefore, we control for topics of the corrected misinformation, finding that the gaps between individual and collective tags are significant and even slightly larger when they correct identical topics of misinformation. Specifically, we employ the propensity score weighting (PSW) method (see Supplementary Method 5). The results demonstrate that even when individual and collective tagging correct topically identical messages, the gap between individual and collective tagging is significant, both in the immediate change of political diversity (β = 2.380, 95% CI = [0.200, 4.560], t(296544) = 2.14, p = 0.032) and content diversity (β = 0.048, 95% CI = [0.003, 0.092], t(296544) = 2.11, p = 0.035). We note that collective tagging is less likely to correct political topics than individual tagging but is more effective in causing original posters to explore diverse content when successfully deployed on political topics. Refer to Supplementary Table 8 for details.
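To make the idea of propensity score weighting concrete, the sketch below balances the topic mix of individually and collectively tagged posts. It is a simplified illustration under stated assumptions, not the procedure in Supplementary Method 5: the propensity score is estimated as the share of collective tags within each topic (a single categorical covariate), the weights are plain inverse-probability weights, and all names and records are hypothetical.

```python
from collections import Counter

def inverse_probability_weights(records):
    """records: list of (topic, is_collective). Returns one weight per record.

    Requires every topic to contain both tag types; otherwise the estimated
    propensity is 0 or 1 and the weight is undefined.
    """
    totals = Counter(topic for topic, _ in records)
    collective = Counter(topic for topic, is_coll in records if is_coll)
    weights = []
    for topic, is_coll in records:
        p = collective[topic] / totals[topic]  # P(collective | topic)
        weights.append(1.0 / p if is_coll else 1.0 / (1.0 - p))
    return weights

def weighted_topic_share(records, weights, topic, is_collective):
    """Weighted share of one topic within one tag type."""
    num = sum(w for (t, c), w in zip(records, weights)
              if c == is_collective and t == topic)
    den = sum(w for (t, c), w in zip(records, weights) if c == is_collective)
    return num / den

# Hypothetical records: individual tags skew toward politics.
records = ([("politics", False)] * 3 + [("politics", True)]
           + [("health", False), ("health", True)])
weights = inverse_probability_weights(records)

print(weighted_topic_share(records, weights, "politics", False))
print(weighted_topic_share(records, weights, "politics", True))
```

Before weighting, politics makes up three quarters of the individual tags but only half of the collective tags in this toy sample; after weighting, both groups have the same topic mix, so a remaining difference in outcomes cannot be attributed to topic composition.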

Second, we find that the proportion of right-leaning users among those corrected by individual tagging is 53.17%, while the proportion among those corrected by collective tagging is 44.14%. We also analyze the distribution of political stance among taggers (i.e., those who write individual tags) and voters (i.e., those who vote on the exposure of collective tags) (see Supplementary Method 6 and Supplementary Table 9). We compare the effects of individual and collective tagging in the common scenario where right-leaning posters are corrected by left-leaning ones. Specifically, we focus on cases where right-leaning posters are corrected either by individual tags from left-leaning taggers or by collective tags approved by a majority of left-leaning voters (i.e., Community Notes approved by voters, where at least 50% of those with identifiable political stances are left-leaning). In this analysis, the difference between the effects of individual and collective tagging is still significant, both in the immediate change of political diversity (β = 1.780, 95% CI = [0.118, 3.441], t(238081) = 2.10, p = 0.036) and content diversity (β = 0.076, 95% CI = [0.033, 0.119], t(238081) = 3.46, p = 0.001). Refer to Supplementary Table 10 for details.

Third, we find that popular users are more likely to receive collective tags than individual tags, which is consistent with prior literature (see Supplementary Fig. 3)⁸. To examine the differences between individual and collective tags when focusing on less popular, everyday users, we exclude those whose number of followers exceeds 2967, the average number of followers among users corrected by individual tags. We find the results are consistent overall (see Supplementary Table 11), but suggest that collective tagging of low popularity posters is slightly more effective, relative to individual tagging, than with high popularity users. In particular, the difference between individual and collective tagging is significant, both in the immediate change of political diversity (β = 3.612, 95% CI = [0.824, 6.399], t(235632) = 2.54, p = 0.011) and content diversity (β = 0.081, 95% CI = [0.000, 0.162], t(238081) = 1.97, p = 0.049). This may indicate the inoculation of popular users to critique, an increased sensitivity among unpopular users to collective nudges⁵⁰, or both.

Robustness checks

We verify our findings with a battery of robustness checks. First, we seek to avert concerns over the presence of bots on Twitter by reanalyzing our data excluding identified bot accounts^2,5. Second, we reanalyze the relationship controlling for potentially insincere informational activities, such as citing sources of low credibility and intentionally spreading fake news. Third, we attempt to avoid situations in which posters simply criticize distant information without honest consideration by filtering out posts with negative sentiment. Fourth, we identify all tweets within the sample that mention keywords related to receiving community notes broadly and remove them, as they could confound our measure of content diversity. To address concerns regarding the effect of replying directly to individual taggers, which could confound the measure of political diversity, we also identify and remove all tweets that reply directly to individual taggers. Fifth, to strictly identify individual tags (i.e., PolitiFact links) that correct the original posters, we prompt ChatGPT to annotate whether the links are used to correct the original poster rather than support them. Then, we limit the sample to links that correct the original posters. Sixth, considering the low visibility of individual tags in Twitter’s message-reply interface^6,8, we restrict the sample to original posters who replied to (and thereby read) the individual tags and remove non-responders. These alterations do not meaningfully impact our reported outcomes (see “Methods”: Robustness Checks).

Discussion

This study provides empirical evidence regarding the impact of individual and collective misinformation tagging on echo chambers. Before misinformation tagging, posters show increasing interest in diverse political and topical content. This challenges the conception that misinformation is generated and corrected when people retreat into echo chambers^11,33. On the contrary, posters become fact-checked when they venture outside those bubbles. Why is exploration followed by misinformation tagging? First, posters could misinterpret unfamiliar and diverse information because of a lack of information literacy⁵¹, increasing the chance of posting misinformation that is then tagged. Second, news feed algorithms may increase the probability that posters’ tweets become visible to people from political outgroups, who are highly motivated to fact-check foreign posters^6,7,14. Our analysis shows that posters move closer to misinformation taggers before fact-checks, which could increase the chance of appearing in fact-checkers’ news feeds.

Individual misinformation tagging discourages posters from exploring diverse information. Posters tagged by individuals manifest an immediate drop in political diversity, as evidenced by both interrupted time series (ITS) and delayed feedback (DF) analyses. Content diversity also decreases in ITS analyses, although DF analyses do not reveal a significant drop. This suggests that while content diversity decreases after tagging, it does not fall below the baseline change expected without tags. These unintended consequences are mitigated by collective misinformation tagging. Unlike with individual tagging, there is no statistically significant evidence that collective tagging diminishes political or content diversity in either ITS or DF analyses; moreover, it results in a short-term rise in content diversity.

Our analyses show that individual tagging involves short, toxic, and emotion-driven messages. Collective tagging, on the other hand, involves longer, less toxic, emotionally neutral, and deliberative messages revealed to posters longer after their offending posts. These results suggest a trade-off between the effectiveness of established systems in promoting openness and mobility across the information ecosystem and the efficiency of individuals in cleaning it. Low visibility of individual misinformation tagging in Twitter’s message-reply interface^6,8 may motivate taggers to use short and potentially toxic messages. Community Notes responded by implementing a more visible interface for collaborative tagging, which reduces the tendency to terseness, facilitating long and deliberate discussion. Also, norms and values underlying participation in Community Notes could prevent taggers from disseminating succinct yet inflammatory messages viewed as unhelpful and instead source diverse perspectives¹³.

What mechanisms drive differences in the effects of individual and collective misinformation tagging on echo chambers? We find that linguistic characteristics, such as toxicity, sentiments, and length only partially explain differential impacts between individual and collective tagging. This implies that differences in quality other than linguistic characteristics also exert a direct influence. Literature on the wisdom of crowds suggests that while individual tags are susceptible to biases and noise, aggregating tags collectively could correct individual bias, increasing the quality of nonexpert fact-checks^28,52,53. For example, compared to individual tags, collective tags are more closely aligned with professional fact-checks from experts on a variety of topics, ranging from COVID-19 to politics^14,27,28,29. Even though we focus on individual tags that cite professional fact-checks (i.e., PolitiFact), it is possible that interpretations within individual tags might be less effective when not cross-validated like collective tags. For example, individual tags might fail to convey the key points of PolitiFact articles or clearly articulate the relevance of these articles to the original post. Additionally, when multiple fact-checkers co-validate collective tags, these decisions may be perceived as more legitimate and less susceptible to biases, encouraging the original posters to seek out more diverse and cross-validating information²⁸.

Overall, our findings suggest that misinformation is posted and fact-checked when original posters who were accustomed to like-minded sources associated with low credibility (see Supplementary Table 2) suddenly increase their political and content diversity. In the short term, some might believe that pushing them back into their echo chambers with individual tags seems like an effective way to curb misinformation. Nevertheless, over the long term, this approach could expand the cluster of users immersed in misinformation, depriving them of opportunities to educate themselves with opposing perspectives. The ethical and normative aspects of our research remain open questions, but we suggest that collective tagging encouraging exploration might be better for the long-term health of the information ecosystem.

Our analyses have several notable limitations. First, our method for assessing posters’ political stances is indirect, through their posting behavior⁵. This approach has been successfully applied to predict political party affiliation and self-described ideology in previous literature⁵³, but using a direct measure of political ideology or affiliation with social media and survey data would strengthen our assessments. Second, our quasi-experimental methodologies (ITS and DF) depend on assumptions for causal inference. We employ topic modeling and matching to enhance tweet comparability within treatment and control groups, but acknowledge that unobserved time-variant confounders may influence posters’ responses. Third, although we have employed a popular bot detection algorithm, recent studies have suggested that algorithmic removal of bots is challenging and may introduce additional bias⁵⁴. Therefore, we report the full results with and without the algorithmic removal of bots, demonstrating that our results are consistent. To thoroughly remove bots, future research could match social media data with survey or administrative data (e.g., voter records) to ensure the authenticity of participants⁵⁵. Fourth, Twitter (X as of July, 2023) closed access to the Academic Research API, which had been freely available to eligible researchers until May 2023. This could limit other researchers’ ability to reproduce our findings with recent data after May 2023⁵⁶. Collective tagging systems are increasingly being deployed across social media platforms, such as Twitter’s Community Notes and similar features currently being tested on platforms like Facebook and YouTube^23,24. Future research should examine whether our findings are reproducible across different platforms, time periods, and cultural contexts. 
Fifth, we employ the topic modeling and propensity score weighting (PSW) method to control for semantic differences between tweets tagged by individual and collective tagging (refer to Supplementary Method 5). Nevertheless, PSW might fail to address the confounding effects of unobserved semantic differences beyond topics. Despite these limitations, our study uncovers a significant and substantial relationship between fact-checks and reduced information diversity. We also demonstrate the power of designed institutions, like collective fact-checking on Twitter, to moderate the negative, narrowing effects of fact-checking on information exploration.

Methods

Data

Our study complied with the terms of all data sources used in the study (including but not limited to Twitter/X). Using the Twitter API v2.0 with academic research access, we collected Twitter data to explore the effects of individual and collective misinformation tagging. First, we identified 9,372 users targeted by individual misinformation tagging from 2021/10/1 to 2022/3/25. We selected users whose tweets received fact-checking replies that contain URLs to fact-checking articles from “politifact.com.” Second, we identified 1,465 users targeted by collective tagging from 2022/12/19 to 2023/3/31, when Community Notes were made public to Twitter users globally⁵⁷. In Community Notes, users can flag any tweet as misinformation with a note, and other members vote on the helpfulness of the note. (Users also have the option to flag tweets they believe are free from misinformation; however, these instances have been excluded from our analysis.) Collectively verified notes that received above-threshold helpfulness votes from a diverse set of users are then made public to the original user (who posted the misinformation) and the broad Twitter audience¹³. In our work, we only considered notes with above-threshold helpfulness votes. Note that the platform also assesses the alignment of users’ prior contributions with the crowd’s decisions, filtering out voters who frequently oppose and push back against valid fact-checks on misinformation (see Supplementary Method 7).

Due to the rate limit of the Twitter API, we only collected data from regular Twitter users, excluding organizations’ and celebrities’ accounts with 50,000 or more followers. Additionally, to focus on individual users rather than organizational accounts (e.g., CNN, Fox News, etc.), we removed 1,659 users identified as organization accounts by the M3Inference library^58,59. We further removed 1,445 users who were fact-checked more than once within the period of data collection, because they could have become desensitized to repeated fact-checks. After filtering the data, our final dataset included 7,733 users, of whom 6,760 were targeted by individual misinformation tagging and 973 were targeted by collective misinformation tagging. We found that individual tagging is more frequent than collective tagging in our dataset due to the cross-validation process required to expose collective tags. This leads to an imbalance in group size between users corrected by individual and collective tags. Nevertheless, our statistical models (interrupted time series and delayed feedback models) do not assume equal group size for comparison between the effects of individual and collective tagging. Also, we found that 16.33% of tweets that received individual tags and 15.60% of tweets that received collective tags were removed by Twitter or by the original poster. The probability of removal is similar between individual and collective tags (Difference = 0.73%, z = 0.629, p = 0.529).
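
The removal-rate comparison above reports a z statistic and a p-value. A minimal sketch of how such a comparison can be computed is given below, assuming a pooled two-proportion z-test with a two-sided normal reference; the text does not name the exact test, and the underlying tweet counts are not given here, so the counts in the example are hypothetical. The function names are illustrative only.

```python
import math
from statistics import NormalDist


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z statistic for H0: both rates are equal."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se


def two_sided_p(z):
    """Two-sided p-value for a standard normal z statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts for illustration only (NOT the study's data).
z_demo = two_proportion_z(160, 1000, 150, 1000)
p_demo = two_sided_p(z_demo)

# Converting the reported z statistic to a two-sided p-value.
p_reported = two_sided_p(0.629)
```

Under this assumption, converting the reported z = 0.629 yields a two-sided p-value of about 0.529, in line with the value stated in the text.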

Finally, we collected users’ historical tweets—including posts, retweets, and quotes—which span two months before posting tagged tweets and two months after exposure to misinformation tagging, resulting in 1,409,845 tweets in total. Posts typically indicate active engagement with diverse political sources and topics, allowing users to express their opinions. In contrast, retweets and quotes—which involve sharing others’ tweets—suggest more passive engagement, not necessarily reflecting personal views. We utilize these three types of behaviors for a more comprehensive measurement of users’ information engagement^60,61. We assume that individual misinformation taggings are exposed to users when they are posted, and collective misinformation taggings are exposed to users when they are made public following the above-threshold helpfulness votes. For our statistical analyses, we included 712,948 tweets with observed political and content diversity scores. This research study received a determination from the University of Chicago Social & Behavioral Sciences Institutional Review Board that the study is not considered human subjects research and does not require review (Institutional Review Board Protocol IRB24-0051).

Political diversity

Political diversity measures whether a user posted a tweet that referenced sources having an opposite political stance. Specifically, we determine the political stance of the referenced source by extracting its domain (e.g., cnn.com) and looking it up in the MediaBias/FactCheck database (MBFC; https://mediabiasfactcheck.com/)^5,47. MBFC provides a continuous score for 4,874 websites to indicate each source’s political stance, ranging from −1 (extreme left) to 1 (extreme right). Our additional analysis finds that political stance scores from MBFC have significant inter-rater reliability with AllSides.com, another database of the political stance of news media (see Supplementary Method 8).

We then calculate a user’s political stance by averaging the political stance scores of sources referenced in their historical tweets which span two months before posting tagged tweets and two months after misinformation tagging (see Supplementary Fig. 4). Users who predominantly cite left-leaning media are considered left, and those who cite right-leaning media are considered right. Specifically, users with negative average political stance scores are categorized as left, while those with positive scores are categorized as right. Finally, we assign a binary value to represent a user’s political diversity: 1 (diverse) if a user cited a source that has an opposite political stance from the user’s own political stance, 0 (not-diverse) if a user cited a source with the same political stance.
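
The scoring steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the `STANCE` table holds made-up scores on reserved `.example` domains in place of the MBFC data, the hostname handling only strips a leading `www.`, and the treatment of unscored sources or scores of exactly zero (returned as `None`) is an assumption that the text does not specify.

```python
from urllib.parse import urlparse

# Hypothetical stance scores in [-1, 1]; real values would come from MBFC.
STANCE = {"left.example": -0.6, "right.example": 0.7}


def domain_of(url):
    """Return the hostname of a URL without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def user_stance(urls, stance=STANCE):
    """Classify a user as 'left' or 'right' from the mean score of cited sources."""
    scores = [stance[d] for d in map(domain_of, urls) if d in stance]
    if not scores:
        return None  # no scored sources (assumption: user left unclassified)
    mean = sum(scores) / len(scores)
    if mean < 0:
        return "left"
    if mean > 0:
        return "right"
    return None  # exactly neutral average (assumption: unclassified)


def political_diversity(url, side, stance=STANCE):
    """1 if the cited source opposes the user's side, 0 if it matches."""
    score = stance.get(domain_of(url))
    if side is None or score is None or score == 0:
        return None  # unscored or neutral source (assumption: not scored)
    source_side = "left" if score < 0 else "right"
    return int(source_side != side)


history = [
    "https://www.left.example/a",
    "https://left.example/b",
    "https://right.example/c",
]
side = user_stance(history)
```

In this toy history the mean score is negative, so the user is categorized as left; a later tweet citing `right.example` would receive a political diversity of 1, and one citing `left.example` a value of 0.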

The mean political diversity score is 0.166, and the standard deviation is 0.372 (N = 712,948). Political diversity is negatively correlated with the number of tweets posted per day (r(712946) = −0.107, 95% CI = [−0.109, −0.105], p < 0.001) and the proportion of retweets (r(712946) = −0.046, 95% CI = [−0.048, −0.044], p < 0.001). This indicates that users are more active within information bubbles, actively posting tweets rather than passively retweeting other users’ tweets within these bubbles (see Supplementary Table 2).
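
For readers who want to check the reported intervals, the sketch below computes a confidence interval for a Pearson correlation using the Fisher z-transformation. The text does not state which interval method was used, so this choice is an assumption, and the function name `pearson_ci` is illustrative.

```python
import math
from statistics import NormalDist


def pearson_ci(r, n, level=0.95):
    """Confidence interval for a Pearson r via the Fisher z-transformation."""
    z = math.atanh(r)                # Fisher transform of r
    se = 1 / math.sqrt(n - 3)        # approximate standard error of z
    crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Correlation between political diversity and tweets per day, N = 712,948.
lo, hi = pearson_ci(-0.107, 712_948)
```

With r = −0.107 and N = 712,948 this gives an interval that rounds to [−0.109, −0.105], consistent with the values reported above.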

Content diversity

Content diversity measures whether a user posted a tweet with a topic that is rarely discussed in the user’s historical tweets. We apply the Twitter4SSE model, a transformer-based sentence embedding model (SentenceBERT) that was initialized from BERTweet (a RoBERTa model trained on 850 million tweets from 2012/1 to 2019/8 and 5 million tweets related to the COVID-19 pandemic), to encode the meaning of a tweet into a 768-dimensional vector^62,63. The model was further optimized based on recent data (75 million tweets from 2020/11 to 2020/12) using Multiple Negatives Ranking Loss (MNRL) to identify semantic similarity based on the principle that tweets quoting or replying to the same original tweet are likely discussing related ideas⁶². If a pair of tweets quoted or replied to the same tweet, the semantic similarity between them is assumed to be high.
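
To make the training objective concrete, here is a small pure-Python sketch of Multiple Negatives Ranking Loss as it is commonly formulated: within a batch of (anchor, positive) pairs, each anchor should score its own positive higher than every other positive in the batch, and the loss is the mean cross-entropy over scaled cosine similarities. The scale of 20 and the toy two-dimensional vectors are assumptions for illustration; the actual training setup of Twitter4SSE is not described in this section.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnrl(anchors, positives, scale=20.0):
    """Mean in-batch cross-entropy: anchor i should rank positive i first."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        top = max(logits)  # subtract the max for numerical stability
        log_norm = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_norm - logits[i])
    return sum(losses) / len(losses)


# Toy batch: each anchor paired with an identical positive vs. a mismatched one.
anchors = [[1.0, 0.0], [0.0, 1.0]]
matched = mnrl(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped = mnrl(anchors, [[0.0, 1.0], [1.0, 0.0]])
```

The loss is near zero when every anchor is closest to its own positive and grows large when the pairing is wrong, which is what pushes tweets responding to the same original tweet toward each other in the embedding space.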

To apply the Twitter4SSE model, we first clean the tweets with identical data preprocessing steps: removing URLs and mentions and converting the text to lowercase to reduce the presence of generic text⁶². Next, we represent each tweet with a 768-dimensional semantic embedding (Supplementary Fig. 5 shows the visualization). Finally, we measure the cosine distance between the user embedding and tweet embedding (see Fig. 1b) to represent the content diversity of the current tweet. The user embedding is the average embedding of the user’s historical tweets (see Fig. 1b). Estimating the distance in the embedding space has been frequently used to quantify the diversity of user activities in online platforms^48,64. The distance ranges from 0 to 0.835, with 0 representing homogeneous content and 0.835 representing extremely diverse content. The mean content diversity score is 0.357, and the standard deviation is 0.109 (N = 712,948). We find that political and content diversity are only weakly correlated (r(712946) = 0.020, 95% CI = [0.018, 0.022], p < 0.001), suggesting that they capture conceptually distinct aspects of diversity.
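
The measurement steps above can be sketched as follows, assuming embeddings are already available as lists of floats. Toy three-dimensional vectors stand in for the 768-dimensional Twitter4SSE embeddings, the regular expressions and whitespace normalization are illustrative choices rather than the authors' exact preprocessing, and whether the current tweet is included in the user average is not specified here (this sketch excludes it).

```python
import math
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def clean_tweet(text):
    """Remove URLs and mentions, lowercase, and collapse whitespace."""
    text = URL_RE.sub("", text)
    text = MENTION_RE.sub("", text)
    return " ".join(text.lower().split())


def mean_vector(vectors):
    """Element-wise average of equal-length vectors (the user embedding)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(u, v):
    """1 minus the cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def content_diversity(tweet_vec, history_vecs):
    """Cosine distance between a tweet embedding and the user embedding."""
    return cosine_distance(mean_vector(history_vecs), tweet_vec)


# Toy 3-d embeddings standing in for 768-d model outputs.
history = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
familiar = content_diversity([1.0, 0.0, 0.0], history)
novel = content_diversity([0.0, 1.0, 0.0], history)
cleaned = clean_tweet("@Alice Check THIS https://t.co/abc now")
```

A tweet whose embedding points in the same direction as the user's average scores 0, while one orthogonal to it scores 1; real scores in the study fall between 0 and 0.835.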

Table 1 shows an example of how content diversity scores are assigned. In this example, the user primarily shows interest in COVID-19 related misinformation. However, as the user explores diverse topics (tax, LGBTQ+, international issues, and labor), the content diversity score increases.

Content diversity is negatively correlated with the number of tweets posted per day (r(712946) = −0.052, 95% CI = [−0.054, −0.050], p < 0.001) but positively correlated with the proportion of retweets (r(712946) = 0.012, 95% CI = [0.010, 0.014], p < 0.001). In other words, users tend to retweet others’ tweets rather than posting their own tweets when increasing content diversity (see Supplementary Table 2).