Abstract
Keywords
Introduction
Twitter is used by individuals, grassroots movements, and political and social elites to directly communicate to the public and influence opinion (see, e.g., Rogers, 2013). The platform appears relatively accessible to researchers because the majority of accounts post publicly and its application programming interface (API) remains rather open in comparison with other online social networks (OSNs). However, after Twitter started introducing increasingly restrictive rate limits and enforcing stricter terms of service regarding the sharing of data in 2012 (see, for example, Puschmann & Burgess, 2014), the global follow network stopped being accessible to the majority of researchers (for exceptions, see Myers et al., 2014).
This development resulted in a lack of independent research on a key mechanism for information diffusion and a global infrastructure for influence. After all, despite sponsored content and algorithmic sampling, follow or subscription networks are for many OSNs still a main predictor for content exposure. While it is a widespread research practice to address this lack by using proxies for networks of attention on Twitter, such as mention, co-hashtag, or retweet networks (e.g., Himelboim et al., 2017), all of these rely on active communication. Therefore, widespread practices such as silent listening may be underrepresented.
This situation is aggravated by a recent change in how Twitter assigns account identification numbers (IDs), which allow the easy retrieval of information from its API. Its consecutive numbering scheme allowed a few independent, technically and monetarily costly projects such as the Australian Tracking Infrastructure for Social Media Analysis (TrISMA) (Bruns et al., 2016) to collect details of public accounts globally. Based on this, national follow networks could be captured and analyzed (Bruns & Enli, 2018; Bruns et al., 2017). However, Twitter closed this possibility by assigning account IDs at random, undermining further data mining efforts following this collection strategy.
These technical restrictions place considerable limitations on research that aims to assess Twitter’s relevance for issues such as news framing, opinion leadership, and intermedia agenda setting processes (Iyengar et al., 2010; Scheufele & Tewksbury, 2007; Watts & Dodds, 2007). For example, Barberá (2015) derives a measure of individual ideological scaling based on the network position of Twitter users, demonstrating that the latter reliably predicts the former. Conway et al. (2015) are able to show that intermedia agenda setting takes place in US politics, with the Twitter messages of political candidates in 2012 both predicting and echoing mainstream media messages, while Colleoni et al. (2014) investigate political homophily in US politics in networks of reciprocated and non-reciprocated ties among the followers of the two main parties. In contradiction to widespread beliefs regarding online echo chambers as largely self-contained and insular environments, Valenzuela et al. (2017) find a greater effect of Twitter on television news than vice versa. However, all such studies have in common that in those cases where they describe processes
Therefore, we present a large-scale test of a new Twitter follow network mining technique, building on the so-called rank degree method (Salamanos et al., 2017a, 2017b, 2017c; Voudigari et al., 2016), which we describe in the “Methods and Analysis” section below. As this walk-based technique only requires local information to sample a graph, we were able to adapt it as a data mining method for the follow networks of influential Twitter users, using the cost-free Twitter standard APIs. This approach has been tested by using the method to draw a sample of the German-speaking Twittersphere.
Our analysis shows that the sample exhibits influence and activity measures orders of magnitude higher than a random sample of the same size from a near-complete collection of German-using Twitter accounts from the 2016 TrISMA dataset (Bruns et al., 2016). This is evidence that our sample represents an influential backbone, an approximation of the proverbial, highly influential “top 10 percent” of this language-based Twittersphere. A test study employing community detection and keyword extraction shows that this network sample is suitable for investigating large-scale topical communication structures, corresponding with issue publics or communities in a language-based Twittersphere.
In light of the continuously tightening restrictions on the Twitter standard APIs, our adaptation of the rank degree method for gathering network samples is a valuable alternative to more brute-force approaches, as it can be implemented by small teams at low costs in terms of time and budget. Even though Twitter has made it almost impossible for independent researchers to capture comprehensive, large-scale follow networks, our method works around these restrictions to produce an overview of the overall structure of such networks, in this case, for the German Twittersphere.
Opportunities Opened Up by Follow Network Samples
Nation-level data about follow networks enable a multitude of avenues of enquiry. Among others, they make further data collection possible, such as constant monitoring of the most influential accounts, without being restricted to single topics. For example, this allows for a platform-independent assessment of trending topics and public opinion on a platform, of behavioral changes that are the result of adjustments made to recommendation algorithms, or of the positions, roles, and influence of automated accounts.
Combined, these data support the further development of media and communication theory as well as social theory regarding networked public spheres on an empirical basis. In the case of Australia, this has already been shown by Bruns and Highfield (2016): they can base their reappraisal of public sphere theory by Habermas (1962)—in the form of a more up-to-date concept of a networked public sphere—on a complete collection of the Australian Twitter follow network, detectable topical communities, and the localized spread of hashtags within it.
Also, research methods and paradigms that have their roots in more qualitative practices can benefit from this kind of large-scale data. For example, Dehghan (2018) grounds his discourse analysis of polarizing discussions on Twitter about the Australian Racial Discrimination Act on the same dataset.
Beyond academic research, large-scale social media data of this kind are clearly useful in political and commercial contexts. In public relations and (influencer) marketing, the benefit of being able to get an overview of the communication landscape of an OSN is obvious. Furthermore, finding answers to questions of (media) policy, for example, regarding the fragmentation of the media audience as described by McQuail (2010, pp. 444–445), can be supported with these data.
The hypothesized fragmentation of online news audiences represents another area of research in which a holistic view of national Twitterspheres can yield decisive results, in some cases allowing inferences about news consumption that extend beyond social media. For example, there is evidence for shared patterns of public attention that reach from social media to mass media on a transnational level and exhibit fragmentation that is accompanied by a high degree of audience duplication in select contexts (Fletcher & Nielsen, 2017; Tewksbury, 2005; Webster & Ksiazek, 2012). Such research can be augmented by relating the news sharing behavior on Twitter to the network structure to determine whether online news audiences are fragmented along network community structures or, conversely, no visible relationship between news preferences and network structure exists.
Problem: Restricted API Access for Researchers to Gather Follow Network Samples
This kind of data, however, has become less and less accessible in recent years. In part, this has resulted from increasingly strict privacy legislation (such as the European GDPR) taking effect and the subsequent arrangements that Twitter has made to comply with such regulation. However, the primary purpose of Twitter’s APIs has never been to support academic research, but to enable developers to build products that make use of Twitter data. Therefore, information that is not necessarily needed for such products may be withheld or not even stored, rather than being made accessible to the research community.
In fact, Twitter offers three different kinds of APIs: the standard, the premium, and the enterprise APIs (Tornes, 2017). The free standard APIs are the most restrictive. They allow their users to retrieve account information and to query tweets. For both functions, API calls are limited and have a cooldown time. For example, when looking up the friends 1 of a user, a single API user may execute a maximum of 15 calls every 15 min, retrieving up to 5,000 friends per call, so at most 75,000 friends, belonging to between 1 and 15 accounts, can be retrieved within 15 min.
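The rate-limit arithmetic above can be made explicit with a minimal sketch. The constants are taken from the figures stated in the text; the function names and the `api_keys` parameter are hypothetical and only serve the illustration.

```python
import math

CALLS_PER_WINDOW = 15      # friend-list calls per rate-limit window and API key
WINDOW_MINUTES = 15        # length of one rate-limit window in minutes
FRIENDS_PER_CALL = 5_000   # maximum number of friends returned by one call


def max_friends_per_window(api_keys: int = 1) -> int:
    """Upper bound on the number of friends retrievable in one window."""
    return api_keys * CALLS_PER_WINDOW * FRIENDS_PER_CALL


def calls_needed(friend_count: int) -> int:
    """Calls needed to page through the complete friend list of one account."""
    return max(1, math.ceil(friend_count / FRIENDS_PER_CALL))


def windows_needed(total_calls: int, api_keys: int = 1) -> int:
    """Number of rate-limit windows consumed by a given number of calls."""
    return math.ceil(total_calls / (api_keys * CALLS_PER_WINDOW))


print(max_friends_per_window())   # 75000
print(calls_needed(50_000))       # 10
```

Under these assumptions, a hypothetical account following 50,000 others would consume 10 of the 15 calls available in one window, which illustrates why complete friend lists are expensive to collect.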
The main advantages of the premium and enterprise APIs are higher rate limits for receiving tweets as well as access to a longer history of tweets and account activity-related features. 2 However, researchers face affordability and accessibility issues with both of these services, as access to the enterprise API is only sold through direct contact with a salesperson (prices are not mentioned online), and access to the premium API is also subject to case-by-case approval. This approval process seems to be more restrictive than the one for the Twitter standard API.
In the past, researchers were able to make use of Twitter’s cost-free standard API to gather large collections of Twitter accounts, culminating in a collection by Bruns et al. (2016). Their collection method exploited the fact that Twitter assigned consecutive account IDs. It queried the free API for every consecutive possible account ID, gathering data for almost every global account in 2016. Twitter has since changed its policy of assigning account IDs to a random system. This involves much higher numbers than there are Twitter accounts, and thus renders more recent collections using this method impossible.
The APIs are also undergoing continuous changes, as certain user account properties are becoming protected and are thus no longer accessible to researchers via the API. So far, these properties include geotags, user time zone, and the interface language used on Twitter, all of which are inaccessible by now.
Objectives: Test of a Sampling Method and Data Mining of the German 3 Twittersphere
Given those restrictions, this project’s purpose was two-fold: first, we tested an adaptation of a sampling method for influential nodes in a network that has shown promising “lab” results (see below) as a data mining method “in the wild,” using only the cost-free Twitter standard APIs. Our focus here was especially to test the practical feasibility of the method for a small research team with limited resources and to explore which adaptations have to be made for it to work under this objective. Furthermore, we also probed the usefulness of the resulting sample for some of the research opportunities mentioned above, by identifying topical communities among the most influential accounts in the German-speaking Twittersphere.
Second, this project’s objective was to open up further avenues of enquiry, either by providing data on the German Twittersphere for other projects (especially when this is not possible while staying compliant with Twitter’s terms of service) or by providing other researchers with the method and its implementation, that is, the code for a prototype, for sampling either the same data or other language-based Twitterspheres. The code can be found under an open-source license in the Supplementary Materials of this article as well as in an online git repository. 4 We invite other interested parties to support us with its further development there.
Background
While Twitter does not represent the general population of a country or language domain, it is an interesting population in itself. This has motivated a number of successful attempts to sample or completely collect other national Twitterspheres. However, all these attempts were either not based on the Twitter follow network or relied on properties of the Twitter standard APIs that are no longer available.
Representativeness of Twitter Data and Representativity of Social Network Samples in General
A common criticism of Twitter research, and social media research in general, is that its users are not representative of the general population of a country or language domain at large (e.g., Blank, 2017; Mellon & Prosser, 2017). We do not claim this kind of representativity. However, Twitter users by themselves represent a population of interest in many countries worldwide. Even in Germany, where Twitter plays a comparatively smaller role (it is ranked 16th in Germany according to Alexa, 5 compared to rank 8 in the USA 6 and rank 11 worldwide 7), 4% of the German-speaking population over 14 are using Twitter at least once per week and this value appears to be stable over the last few years according to representative surveys (Frees & Koch, 2018). This makes Twitter a niche network, compared to competitors like Facebook and Instagram. However, its reach is comparable with the 3% of weekly users of audio podcasts in Germany in 2018 (Frees & Koch, 2018), or 4.5% who had a subscription to a national newspaper in Germany in 2017 (Pasquay, 2018). Moreover, the results of a study by von Nordheim et al. (2018) of broadsheet newspapers in the United Kingdom (
These top 10% being responsible for 80% of the tweets within the dataset analyzed by Wojcik and Hughes (2019) points to another issue with representativity in OSNs. It lies in the fact that OSNs usually exhibit heavily skewed distributions regarding activity and connectivity. This means that a traditional understanding of representativity based on common statistical approaches, which often assume normal distributions, will not be useful. Consequently, in this article, we move our focus in terms of sample quality from typicality (i.e., traditional representativeness) to another goal: getting the most influential accounts and a backbone structure of the network at the smallest possible cost.
Related Research: Location- or Language-Based Twittersphere Collections
This project is preceded by and builds on previous successful endeavors to collect other language- or nation-based Twitterspheres. To our knowledge, the first project mapping a national Twittersphere was conducted by Ausserhofer and Maireder (2013). However, while their methods were innovative at the time and their results impressive given the exploratory nature of their study, they analyzed @-mention networks of only a few hundred accounts whose collection was based on keywords related to Austria. Geenen et al. (2016) followed a similar approach for the Netherlands, also analyzing @-mention networks based on a keyword search for specifically Dutch terms. Ausserhofer and Maireder (2013) argue that @-mentions would be a “better” measure of influence than follower numbers. However, this claim may be an overinterpretation of a study by Cha et al. (2010), which showed that for the top 10 percent of Twitter accounts in terms of follower numbers the number of retweets and replies was only weakly rank-correlated with the follower numbers. Closer inspection of this study also shows that retweet- and mention-based influence fluctuates over time and by topic. In any case, while @-mention networks are easier to collect, because their collection is less restricted by the Twitter standard APIs, their explanatory power, if used alone, depends on the definition of influence and neglects the influence of highly followed accounts on silent listeners on Twitter. Especially because a reaction to a tweet with an @-mention or retweet depends on having seen this tweet first, they should be analyzed in combination with follow networks.
Bruns et al. (2017) presented an analysis of a follow network based on a comprehensive collection of Australian Twitter accounts in 2016 by TrISMA (Bruns et al., 2016). As explained above, the global dataset it was based on comprises all Twitter users of 2016, and its collection was possible because Twitter assigned user IDs consecutively. This dataset could then be filtered down to accounts that are likely to be located in Australia, and all their connections could be collected, albeit with great effort. While the same method allowed the analysis of the Norwegian Twittersphere (Bruns & Enli, 2018), this has been rendered impossible by now.
Methods and Analysis
In this section, we will first describe our sampling method, which we based on the so-called rank degree method (Voudigari et al., 2016), adapted for directed networks, and optimized for practicality and efficiency with the cost-free Twitter standard APIs. After this, we describe our assessment of the sample’s quality in accordance with our sampling goals: the collected accounts’ influence as measured by coverage, reach, activity, and follower numbers. Finally, we present methods and results of a test study using our sample, which yields promising results: by means of community detection, keyword extraction from tweets, and a manual inspection of account descriptions, we are able to detect topical communities within this highly influential sample of the German Twittersphere. Our results showcase an intuitive overview of the structure of this part of the German networked public sphere that resembles preceding analyses of the Australian Twittersphere.
Sampling
Our sample was collected from mid-December 2018 until the end of May 2019 (Figure 1). While we almost reached our initial goal of collecting 1 million accounts (937,809 accounts were collected; the goal was mostly determined by the time and resources available for the overall project and represents about 5% of the number of German-using accounts as determined by TrISMA), our collection was stopped because the Twitter standard API was changed and the interface language account property, on which our method relied, was made private. However, our collection tool, as it is available today, has been updated so that it infers an account’s language by identifying the tweet language(s).

Figure 1. Sample size as measured by the total number of edges over time (December 17, 2018, to May 28, 2019).
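How an account’s language could be inferred from its tweets can be illustrated with a minimal sketch. The exact rule used by the collection tool is not described in the text above, so the majority rule, the `min_share` threshold, and the function name are assumptions made for illustration only. The sketch expects the per-tweet language codes that Twitter attaches to tweets, where `und` marks an undetermined language.

```python
from collections import Counter
from typing import Iterable, Optional


def infer_account_language(
    tweet_langs: Iterable[Optional[str]], min_share: float = 0.5
) -> Optional[str]:
    """Return the dominant tweet language of an account, or None if unclear.

    Assumption: an account counts as using a language if at least
    `min_share` of its language-tagged tweets carry that language code.
    """
    # Ignore tweets without a usable language tag.
    counts = Counter(lang for lang in tweet_langs if lang and lang != "und")
    if not counts:
        return None
    lang, n = counts.most_common(1)[0]
    return lang if n / sum(counts.values()) >= min_share else None


print(infer_account_language(["de", "de", "en", "und"]))  # de
```

A stricter or looser threshold changes which multilingual accounts are admitted to the sample, so this parameter would need to be chosen with the sampling goal in mind.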
The Rank Degree Method
To draw a sample of the most influential accounts within the German Twittersphere, we chose to develop and test a modification of the rank degree method.
The rank degree method (Voudigari et al., 2016) is a mostly deterministic, walk-based algorithm that only requires local information to sample a graph. Assume an undirected network (or a directed network in which all edges are reciprocal), given initial nodes as seeds and a desired sample size:
1. For each seed, rank its neighbors by degree and select the top connected node;
2. Update the sample with the selected edges (each connecting a seed to its selected node);
3. Update the source graph by removing the edges selected in Step 2;
4. Update seeds so that only the newly selected nodes are seeds;
5. If all newly selected nodes are dead ends, choose new initial seeds;
6. Repeat Steps 1–5 until the sample has reached the desired size.
From the steps outlined above, it becomes evident that the process is deterministic in that it completely depends on the initial seeds chosen (and Step 5 is executed on the new initial seeds). Moreover, any selected node can be visited multiple times, but each time from a different node. If any two walkers visit the same node at the same time, those two walkers will collapse and further proceed as one. By updating the graph every time, the process ultimately alters the nodes’ degree and ranking, resulting in a dynamic sampling process.
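The walk described above can be sketched in a few lines of Python. This is an illustrative reading of the first (top-1) version, not the reference implementation by Voudigari et al. (2016): the adjacency-dictionary input, the measurement of sample size in edges, the tie-breaking rule, and the random choice of replacement seeds are all assumptions of this sketch.

```python
import random
from typing import Dict, FrozenSet, Hashable, Iterable, Optional, Set

Node = Hashable


def rank_degree_sample(
    graph: Dict[Node, Iterable[Node]],
    seeds: Iterable[Node],
    sample_size: int,
    rng: Optional[random.Random] = None,
) -> Set[FrozenSet[Node]]:
    """Return a set of undirected edges sampled by a top-1 rank degree walk."""
    rng = rng or random.Random(0)
    # Work on a symmetric copy so that the caller's graph stays untouched.
    g: Dict[Node, Set[Node]] = {}
    for node, neighbors in graph.items():
        g.setdefault(node, set())
        for other in neighbors:
            if other != node:  # ignore self-loops
                g[node].add(other)
                g.setdefault(other, set()).add(node)
    current = set(seeds)
    n_seeds = max(1, len(current))
    sample: Set[FrozenSet[Node]] = set()
    while len(sample) < sample_size and any(g.values()):
        # Step 1: every seed selects its highest-degree remaining neighbor
        # (ties are broken by the node's repr, an assumption of this sketch).
        selected = []
        for w in sorted(current, key=repr):
            neighbors = g.get(w)
            if neighbors:
                v = max(neighbors, key=lambda n: (len(g[n]), repr(n)))
                selected.append((w, v))
        # Steps 2 and 3: add the selected edges and remove them from the graph.
        for w, v in selected:
            if len(sample) >= sample_size:
                break
            sample.add(frozenset((w, v)))
            g[w].discard(v)
            g[v].discard(w)
        # Step 4: selected nodes become seeds; a set collapses duplicate walkers.
        current = {v for _, v in selected}
        # Step 5: if no seed can proceed, jump to new random seeds.
        if not any(g[s] for s in current):
            candidates = sorted((n for n, nb in g.items() if nb), key=repr)
            if not candidates:
                break
            current = set(rng.sample(candidates, min(n_seeds, len(candidates))))
    return sample
```

Because removed edges lower the degree of their endpoints, the ranking in Step 1 changes from iteration to iteration, which is the dynamic behavior noted above.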
Note that there exist two versions of this algorithm. The first version works as outlined above, while with the second version, instead of only the top connected node in Step 1, the top
Our Adaptation and Implementation of the Rank Degree Method
With any
In contrast to Voudigari et al. (2016) and Salamanos et al. (2017a, 2017b, 2017c), the full network is not available to us; therefore, we had to access the required information directly from the Twitter standard API. While it was possible to increase the collection speed by using 12 API keys provided by personal accounts of the authors and other project contributors at our institutions, all API endpoints have a quota on calls per time window and per API key.
Before starting the sampling algorithm, we had to define a
From the
Look up the last 5,000 friends of
Choose the friend
If
Update the sample with the selected follow connection (
“Burn”/save the follow connection so that it cannot be walked again;
Repeat Steps 2–7 with
To use the full capacity of the API calls available to us, we used 200 parallel walkers. In contrast to the original rank degree method, and for time-efficiency reasons, we do not let the walkers collapse when they land on the same node but let them execute consecutively. While close to the feasible maximum with the API restrictions in place, 200 is a massively lower number of walkers than what has been used in the tests of the original rank degree method, which was 1% of the total number of nodes. In our case this would equal over 25,000 walkers. 8
This is not the only adaptation that we decided to make for practicality reasons. Instead of choosing the friend with the highest degree, that is, the one with the most connections, our process (1) only looks up the last 5,000 friends and then (2) chooses the one with the most followers if (3) it has its interface language set to German.
Adaptation (1) was made for three reasons. First, it takes exactly one API call to access up to 5,000 of the most recent friends of an account. By default, only 15 API calls every 15 min are allowed for this endpoint per API key. Since there exist Twitter accounts with 50,000 friends and more, the time needed would have increased unjustifiably for those accounts. Second, as every account can pay only limited attention to its friends, the more friends an account has, the less each of them counts individually. Therefore, including all friends of an account that follows several thousand accounts would give those connections an importance which they most likely do not possess. This renders spending precious API calls beyond a 5,000 friends limit unjustifiable from our perspective. Third, while the original rank degree method for undirected networks uses the degree, we use the follower number only as the (approximate) in-degree, because the number of followed accounts, or out-degree, is no indicator of an account’s influence (Cha et al., 2010).
Adaptation (2) is a simplicity trade-off, acknowledging the fact that our sampling method can only be an approximation of the rank degree method. We do not possess knowledge about the whole network, and the usage of the follower number is only a heuristic for assessing influence, an approximation of the in-degree in our network of interest, the German Twittersphere. The actual in-degree within the German network is, in some cases, lower, due to non-German followers, and subject to change during the months of our collection. Therefore, dynamically adapting the degree was neglected in favor of an easier parallelization of the walkers and a cache database of fixed account details, such as the friend connections and follower numbers, that led to a significant speedup of our collection.
Finally, since our aim was to collect a sample of the German Twittersphere, we used the accounts’ interface language as a filter criterion, hence (3).
Another significant change to the original algorithm is made in Step 5. Voudigari et al. (2016) and Salamanos et al. (2017a, 2017b, 2017c) only apply their method to undirected networks, so the symmetric connection is always added to the sample in their case. In the directed case, there is not necessarily a symmetric connection, but, as in the original rank degree method, we add it, if there is one. This is done because we expect this to ensure that accounts with only a few but high in-degree followers will receive more representative scores regarding centrality measures such as Page Rank, Betweenness, or Eigenvector centrality. Note, however, that, as in the original rank degree method, even though both edges are added to the sample, one of them, (

Figure 2. Our adaptation of the rank degree algorithm. The top panel represents the sample after every iteration, and the bottom panel represents the underlying network without the removed edges. The example network is based on a student interaction network (Heidler et al., 2014), filtered for in-degree > 3, as available from https://github.com/gephi/gephi/wiki/Datasets.
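A single step of one walker in the adapted method, as described in this section, might look like the following sketch. It is not the code of the published prototype: `get_recent_friends` and `get_profile` are hypothetical callables standing in for cached API lookups, the field names `lang` and `followers_count` are illustrative, and the decision to leave the reciprocal edge walkable is an assumption of this sketch.

```python
from typing import Any, Callable, Dict, List, Optional, Set, Tuple

Edge = Tuple[int, int]  # (follower, friend)


def walk_step(
    account: int,
    get_recent_friends: Callable[[int], List[int]],
    get_profile: Callable[[int], Dict[str, Any]],
    burned: Set[Edge],
    sample: Set[Edge],
    language: str = "de",
    max_friends: int = 5_000,
) -> Optional[int]:
    """Move one walker from `account` to its most-followed eligible friend.

    Returns the next account, or None if the walker is stuck and has to be
    restarted from a new seed by the caller.
    """
    # Adaptation (1): only the most recent friends are considered.
    friends = get_recent_friends(account)[:max_friends]
    best: Optional[int] = None
    best_followers = -1
    for friend in friends:
        if (account, friend) in burned:
            continue  # this connection has already been walked
        profile = get_profile(friend)
        # Adaptation (3): keep only accounts with the target interface language.
        if profile.get("lang") != language:
            continue
        # Adaptation (2): rank by follower number instead of degree.
        followers = int(profile.get("followers_count", 0))
        if followers > best_followers:
            best, best_followers = friend, followers
    if best is None:
        return None
    sample.add((account, best))
    burned.add((account, best))
    # Add the reciprocal connection to the sample if it exists.
    if account in get_recent_friends(best)[:max_friends]:
        sample.add((best, account))
    return best
```

Passing the lookups in as callables keeps the sketch independent of any particular API client and makes it straightforward to back them with a cache, in line with the cache database mentioned above.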
Evaluating the Sample Quality
As we do not possess knowledge about the whole network, assessing the sample quality, as done for the original rank degree method, was not possible for the German follow network. Nor could we expect, or were we aiming for, any kind of typicality of the sampled accounts’ properties in comparison with a population, as would be necessary for a representative sample in a traditional sense. Our aim was to approximate the proverbial 1% to 10% at the top of the German Twitter population, which are characterized by activity, coverage, reach, and follower numbers that are orders of magnitude higher than those of the remaining 99% to 90%.
Activity and Centrality
As the seeds are just randomly selected in order to find influential accounts, we excluded those seeds from the sample that do not have an incoming edge from another node for assessing the activity and follower numbers of our sample. This filtering left us with about 197,000 unprotected accounts, whose activities and follower numbers we could access. As can be seen in Figure 3, even in this subsample, there is a large number of accounts who have not been active for years. Whether this is due to actual inactivity or simply silent usage of the platform cannot be determined here. However, over 42% of our sample have posted at least one tweet from the beginning of 2019 until the end of our network collection in May.

Figure 3. Distribution of the date of the last status by accounts in our sample at the end of the network collection timeframe (May 2019).
Nevertheless, as depicted in Figure 4, where we compare the distributions of the number of tweets per day since the account creation day by accounts in this sample with the same data in the subset of German-using accounts from the TrISMA collection, we can see that there is a pronounced qualitative difference in activity. The accounts in our sample are orders of magnitude more active in terms of tweets per day than this benchmark.

Figure 4. Comparison of our sample (“Sample”) with all accounts collected in 2016 by TrISMA (“Benchmark 2016”) regarding the distribution of the statuses per day since account creation.
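The activity measure compared here, statuses per day since account creation, can be computed as in the following minimal sketch. The function name and the clamping of very young accounts to an age of one day are assumptions made to keep the example well defined.

```python
from datetime import datetime, timezone


def statuses_per_day(
    statuses_count: int, created_at: datetime, observed_at: datetime
) -> float:
    """Average number of statuses per day between creation and observation."""
    age_days = (observed_at - created_at).total_seconds() / 86_400
    # Assumption: accounts younger than one day are treated as one day old.
    return statuses_count / max(age_days, 1.0)


created = datetime(2019, 1, 1, tzinfo=timezone.utc)
observed = datetime(2019, 1, 11, tzinfo=timezone.utc)
print(statuses_per_day(100, created, observed))  # 10.0
```

Because this measure averages over the whole lifetime of an account, it does not distinguish accounts that were very active in the past from those that are active now, which is why the date of the last status is reported separately.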
A similar picture is drawn if we inspect follower numbers: Again, Figure 5 shows a comparison between our sample (without seeds that remained leaves) and the entirety of Twitter accounts in 2016 that had set their interface language to German. Here too, the distributions of follower numbers show a substantial qualitative difference, with the typical follower numbers in our sample being multiple times higher than in the benchmark.

Figure 5. Comparison of our sample with all accounts collected in 2016 by TrISMA regarding the distribution of the follower count at the time of the sample collection. The spike between 100 and 1,000 accounts is caused by a fully connected bot-net.
In summary, this sample indeed exhibits a high-influence profile in terms of activity and in-degree centrality (as measured by the follower numbers reported by the Twitter API). We will therefore refer to it as the influencer sample from now on.
Coverage and Reach
However, activity alone does not translate to influence in terms of content exposure. Therefore, we tested what we call the “coverage” of our influencer sample: the typical percentage of a German-using Twitter account’s friends that are in our sample. Again and for the same reasons as above, we filtered out seeds that have no incoming edges in our sample. However, as we did not require protected information this time, the influencer sample size remained slightly higher at about 199,000 accounts for the following tests.
For this purpose, we drew a random sample of 1,000 accounts from the German TrISMA collection and retrieved their actual friends (including those with an interface language other than German) from the Twitter API. From here on, this sample will be called the test sample. Of course, the final size of this test sample was reduced due to deleted and protected accounts. Furthermore, we excluded accounts with fewer than two friends to avoid misleading coverage values of 100% and 0%. 9
As a baseline, we drew a random sample from the German-using accounts in the TrISMA collection with the size of our influencer sample (ca. 199,000 accounts). Then, we evaluated the coverage of the influencer and the baseline sample for accounts in the test sample.
As can be seen in Table 1, the mean and median coverage of our influencer sample are at 40%, compared to a mean of 0.5% and a median of almost 0% for the baseline sample. However, as distributions are often heavily skewed in networks, mean and median do not tell the whole story. As can be seen in the distribution plots in Figures 6 to 9, our influencer sample differs extremely from the baseline sample in terms of coverage distribution, so it is evident that we observe a different class of accounts here. When ignoring accounts with 0% coverage, for the influencer sample, Figure 6 shows a distribution of coverage resembling a normal distribution around the mean/median of 40%. In other words, on average, 4 out of 10 friends of a German-using Twitter account are in our sample.
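The coverage measure defined above can be written down compactly. The following is a minimal sketch under the definitions given in the text; the function names and the dictionary layout of the test sample are hypothetical.

```python
from statistics import mean, median
from typing import Dict, Iterable, List, Optional, Set


def coverage(friends: Iterable[int], sample: Set[int]) -> Optional[float]:
    """Percentage of an account's friends that are contained in `sample`.

    Accounts with fewer than two friends are skipped (None), mirroring the
    exclusion described in the text.
    """
    friend_set = set(friends)
    if len(friend_set) < 2:
        return None
    return 100.0 * len(friend_set & sample) / len(friend_set)


def coverage_summary(
    test_friends: Dict[int, List[int]], sample: Set[int]
) -> Dict[str, float]:
    """Count, mean, and median coverage over all eligible test accounts."""
    values = [
        c for c in (coverage(f, sample) for f in test_friends.values())
        if c is not None
    ]
    if not values:
        raise ValueError("no test account has at least two friends")
    return {"count": len(values), "mean": mean(values), "median": median(values)}
```

Running `coverage_summary` once with the influencer sample and once with the random baseline sample would yield the kind of comparison reported here.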
Table 1. Count, mean, standard deviation, minimum, quartiles, and maximum of the number of friends and the percentages of friends in the influencer and baseline sample for public accounts in the test sample with at least two friends.

Figure 6. Distribution of accounts in the test sample over the percentage of their friends that can be found in the influencer sample (filtered for in-degree ⩾1, leaving 199,180 accounts).

Figure 7. Distribution of accounts in the test sample over the percentage of their friends that can be found in the baseline sample (199,180 accounts drawn randomly from German-using accounts in TrISMA collection).

Figure 8. Rank-coverage distribution of accounts in the test sample with at least two friends for the influencer sample (filtered for in-degree ⩾1, leaving 199,180 accounts).

Figure 9. Rank-coverage distribution of accounts in the test sample with at least two friends for the baseline sample (199,180 accounts drawn randomly from German-using accounts in TrISMA collection).
The categorical difference between our influencer sample and the baseline sample becomes even clearer when examining the rank-distributions of coverage: while Figure 8 shows a linear decline of coverage with rank for the influencer sample, Figure 9 illustrates that in the random baseline Twitter sample, coverage follows a seemingly exponential decline. The same holds true for the more intuitive concept of reach, that is, the percentage of accounts in the test sample reached by accounts in the influencer and baseline sample, respectively. Here, Figures 10 and 11 show that while the top 10 accounts in the influencer sample each reach 8% to 10% of the test sample, not even the top account in the baseline sample reaches 2% of the test sample. In total, the influencer sample reaches 85% of the test sample accounts with more than one friend.
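Reach, as used above, can be sketched analogously to coverage. In this minimal illustration, an account in the test sample counts as reached by a sampled account if it follows that account; the function names and the data layout are assumptions.

```python
from collections import Counter
from typing import Dict, List, Set


def reach_per_account(
    test_friends: Dict[str, List[int]], sample: Set[int]
) -> Dict[int, float]:
    """Percentage of test accounts that follow each account in `sample`."""
    if not test_friends:
        raise ValueError("the test sample is empty")
    counts: Counter = Counter()
    for friends in test_friends.values():
        counts.update(set(friends) & sample)
    n = len(test_friends)
    return {account: 100.0 * c / n for account, c in counts.items()}


def total_reach(test_friends: Dict[str, List[int]], sample: Set[int]) -> float:
    """Percentage of test accounts following at least one account in `sample`."""
    if not test_friends:
        raise ValueError("the test sample is empty")
    reached = sum(1 for friends in test_friends.values() if set(friends) & sample)
    return 100.0 * reached / len(test_friends)
```

Sorting the values returned by `reach_per_account` in descending order gives a rank-reach distribution of the kind compared between the influencer and the baseline sample.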

Figure 10. Rank-reach distribution of accounts in the influencer sample (filtered for in-degree ⩾1, leaving 199,180 accounts).

Rank-reach distribution of accounts in the baseline sample (199,180 accounts drawn randomly from German-using accounts in TrISMA collection).
In summary, our influencer sample shows not only a class difference in activity and follower numbers compared with the average, but it also contains on average 40% of the friends of a German-using Twitter account and reaches 85% of accounts in the test sample with more than one friend. If we use 2.5 million weekly active accounts in Germany (Frees & Koch, 2018) as a conservative population estimate (instead of 15 million based on the TrISMA collection), our sample still represents less than 10% of this population. Taking this and everything above into account, we conclude that the influencer sample is a good approximation of the most influential core of the German-using Twittersphere.
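The two measures used in this section, coverage and reach, can be illustrated with a minimal Python sketch. It is not the code used for the study: the function names, the data layout (a mapping from each test account to the set of its friends), and the toy data are assumptions made for illustration only.

```python
from statistics import mean, median


def coverage(friends_by_account, sample, min_friends=2):
    """Share of each test account's friends that are part of `sample`.

    Test accounts with fewer than `min_friends` friends are skipped,
    mirroring the restriction to accounts with at least two friends.
    """
    sample = set(sample)
    return {
        account: len(friends & sample) / len(friends)
        for account, friends in friends_by_account.items()
        if len(friends) >= min_friends
    }


def reach(friends_by_account, sample, min_friends=2):
    """Reach of every sample account and of the sample as a whole.

    Returns (per_account, total): the share of eligible test accounts
    following each sample account, and the share following at least one.
    """
    sample = set(sample)
    eligible = [f for f in friends_by_account.values() if len(f) >= min_friends]
    if not eligible:
        raise ValueError("no test account has enough friends")
    followers = {account: 0 for account in sample}
    reached = 0
    for friends in eligible:
        hits = friends & sample
        for account in hits:
            followers[account] += 1
        if hits:
            reached += 1
    per_account = {a: n / len(eligible) for a, n in followers.items()}
    return per_account, reached / len(eligible)


# Hypothetical toy data: test accounts a-d and the accounts they follow.
test_friends = {
    "a": {"x", "y", "z", "q"},
    "b": {"x", "q"},
    "c": {"q", "r"},
    "d": {"x"},  # skipped: fewer than two friends
}
influencer_sample = {"x", "y"}

cov = coverage(test_friends, influencer_sample)
per_account_reach, total_reach = reach(test_friends, influencer_sample)
print(mean(cov.values()), median(cov.values()), total_reach)
```

Sorting the values of `cov` or `per_account_reach` in descending order gives rank distributions of the kind plotted in Figures 8 to 11.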
Test Case: Topical Communities in the German Twittersphere
To test the suitability of our adaptation of the rank degree method to investigate the overall structure of a language-based Twittersphere, we replicated an analysis of the full Australian Twitter follow network by Münch (2019, Chapter 6) with the 3-core of the full sample. In the Australian case, this analysis combined community detection within the follow network and keyword extraction from the tweets of the respective communities to detect Twitter accounts with common topical interests and reveal the overall structure of the Australian Twittersphere. We filtered for the 3-core to avoid trivial star-shaped follow-back communities, which seem to be an artifact of the sampling method and hampered the detection of useful communities. This filtering left us with a network of about 66,000 nodes and ca. 655,000 edges, that is, less than 10% of the full network’s accounts but over 40% of its edges. Consequently, it has to be noted again that this analysis focuses on the central core of influential accounts in the German Twittersphere and not on average German Twitter accounts.
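The k-core filtering can be sketched as an iterative peeling procedure. The following Python sketch is an illustration under stated assumptions, not the implementation used for the study: the function name `k_core` is hypothetical, and the sketch treats follow edges as undirected so that a mutual follow counts as one connection. The article does not specify which degree definition was applied to the directed network.

```python
from collections import defaultdict


def k_core(edges, k):
    """Return the nodes of the k-core of a follow network.

    Assumption: direction is ignored and self-loops are dropped, so the
    degree of a node is its number of distinct neighbours.
    """
    neighbours = defaultdict(set)
    for source, target in edges:
        if source != target:
            neighbours[source].add(target)
            neighbours[target].add(source)

    alive = set(neighbours)
    queue = [node for node in alive if len(neighbours[node]) < k]
    while queue:
        node = queue.pop()
        if node not in alive:
            continue  # already removed via an earlier queue entry
        alive.discard(node)
        for other in neighbours[node]:
            if other in alive:
                neighbours[other].discard(node)
                if len(neighbours[other]) < k:
                    queue.append(other)
    return alive


# Hypothetical toy network: a densely connected group a-d plus a hub "h"
# whose followers l1-l3 are only connected to the hub (a star shape).
edges = [
    ("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
    ("h", "l1"), ("l1", "h"), ("h", "l2"), ("l2", "h"), ("h", "l3"), ("h", "a"),
]
print(sorted(k_core(edges, 3)))
```

In this toy network the star around `h` is peeled away completely: once its leaves are removed, the hub itself falls below the degree threshold.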
Community Detection
Instead of the Parallel Louvain Method (PLM; Staudt et al., 2016), which was used by Münch (2019) and is based on modularity maximization and thus on a density-based understanding of community (Coscia et al., 2011), we used the non-hierarchical, non-overlapping version of the Infomap algorithm (Rosvall & Bergstrom, 2008; Rosvall et al., 2009). This entropy-based algorithm groups nodes together so as to shorten the theoretical description length of the path of a random walker through the network. As a result, areas in which a random walker would likely spend a longer uninterrupted time are grouped together. In our case, if a tweet were shared randomly along the network, it would, on average, stay within those communities for a longer time before leaving them. This intuitive interpretation and the fact that it allows for a directed interpretation of the network, as well as other statistical advantages of the Infomap method, led to the decision to present its results instead of the results of the modularity-maximizing algorithm.
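To make the description length concrete, the following Python sketch evaluates the two-level map equation (Rosvall & Bergstrom, 2008) for a given partition. It is a simplified illustration, not the Infomap search itself and not the setup of this study: it assumes an undirected, unweighted network, for which the visit rate of a node is proportional to its degree, whereas directed networks such as the follow network require a random-surfer model. The function names are hypothetical.

```python
from collections import defaultdict
from math import log2


def plogp(p):
    """p * log2(p), with the usual convention that 0 * log2(0) = 0."""
    return p * log2(p) if p > 0 else 0.0


def map_equation(edges, module_of):
    """Description length in bits per step of a random walk.

    `edges` is a list of undirected links, `module_of` maps node -> module.
    """
    degree = defaultdict(int)
    exit_links = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if module_of[u] != module_of[v]:
            exit_links[module_of[u]] += 1
            exit_links[module_of[v]] += 1

    total = sum(degree.values())  # twice the number of links
    p_node = {node: d / total for node, d in degree.items()}
    p_module = defaultdict(float)
    for node, p in p_node.items():
        p_module[module_of[node]] += p
    modules = set(p_module)
    q_exit = {m: exit_links[m] / total for m in modules}

    return (
        plogp(sum(q_exit.values()))
        - 2 * sum(plogp(q) for q in q_exit.values())
        - sum(plogp(p) for p in p_node.values())
        + sum(plogp(q_exit[m] + p_module[m]) for m in modules)
    )


# Two triangles joined by a single link.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
one_module = map_equation(edges, {n: "all" for n in range(6)})
two_modules = map_equation(edges, {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"})
print(one_module, two_modules)
```

For this toy network, splitting the two triangles into separate modules yields the shorter description, which is the kind of partition an Infomap-style search would prefer.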
Keyword Detection
To determine topical keywords for the detected communities, we retrieved the last 200 tweets for every unprotected account in the 3-core of our sample. This dataset was filtered for tweets by accounts in the 93 communities with more than 100 accounts. Then, we filtered those for tweets posted in the last 7 days. Within this dataset, only 4.4% of the active accounts had posted more than 200 tweets. The collection took about 2 days. Therefore, to avoid having more tweets from accounts that were collected later in the collection period, we cut off the last 2 days of these tweets. This left us with about 455,000 tweets by ca. 20,000 accounts over a period of 5 days (June 9–14, 2019).
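The trimming of the observation window can be sketched as follows. This is a minimal illustration, not the study's code: the function names and the tweet layout (dictionaries with a `created_at` datetime) are hypothetical, and the sketch assumes that both the 7-day lookback and the 2-day cut-off are counted back from the end of the collection. The end date in the example is chosen only so that the resulting window matches the reported period.

```python
from datetime import datetime, timedelta, timezone


def analysis_window(collection_end, lookback_days=7, collection_days=2):
    """Window that excludes the days during which the collection was running.

    Tweets posted while the collection ran would be over-represented for
    accounts that were collected late, so that part is cut off.
    """
    start = collection_end - timedelta(days=lookback_days)
    end = collection_end - timedelta(days=collection_days)
    return start, end


def tweets_in_window(tweets, start, end):
    """Keep tweets with start <= created_at < end."""
    return [tweet for tweet in tweets if start <= tweet["created_at"] < end]


# Hypothetical end of the collection (an assumption, not stated in the article).
collection_end = datetime(2019, 6, 16, tzinfo=timezone.utc)
start, end = analysis_window(collection_end)

tweets = [
    {"id": 1, "created_at": datetime(2019, 6, 8, 12, tzinfo=timezone.utc)},
    {"id": 2, "created_at": datetime(2019, 6, 10, 9, tzinfo=timezone.utc)},
    {"id": 3, "created_at": datetime(2019, 6, 15, 18, tzinfo=timezone.utc)},
]
print(start.date(), end.date(), [t["id"] for t in tweets_in_window(tweets, start, end)])
```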
The keyword detection process followed the same procedure as described by Münch (2019, p. 227), except for the use of German stop-words from the python-stop-words project and the use of the unfiltered communities instead of their k-cores, due to the already filtered nature of the sample’s 3-core. The keyword detection is based on the chi-square statistic, an approach that is common in corpus linguistics (Rayson et al., 2004). The process returns a list of keywords ranked by how significantly being assigned to a group is correlated with the use of these keywords. We keep the top 50 keywords and filter out keywords that have been used by fewer than 5% of the respective community.
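The ranking step can be illustrated with a chi-square test on a 2×2 contingency table per word. The following Python sketch is an assumption-laden illustration rather than the procedure of Münch (2019): it counts accounts, not word occurrences, keeps only words that are over-represented in the community, and uses hypothetical function names and toy data. The stop-word list is passed in as a plain set.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator


def community_keywords(words_by_account, community_of, community,
                       top_n=50, min_share=0.05, stop_words=frozenset()):
    """Rank words by their association with `community`.

    a/b: community accounts using / not using the word,
    c/d: all other accounts using / not using the word.
    """
    inside = [w for acc, w in words_by_account.items()
              if community_of[acc] == community]
    outside = [w for acc, w in words_by_account.items()
               if community_of[acc] != community]
    vocabulary = set().union(*inside) - set(stop_words)

    scored = []
    for word in vocabulary:
        a = sum(word in words for words in inside)
        b = len(inside) - a
        c = sum(word in words for words in outside)
        d = len(outside) - c
        if a * d <= b * c:
            continue  # not over-represented in the community
        scored.append((chi_square_2x2(a, b, c, d), word, a / len(inside)))

    scored.sort(key=lambda item: (-item[0], item[1]))
    # Keep the top-ranked words first, then drop rarely used ones.
    return [word for _, word, share in scored[:top_n] if share >= min_share]


# Hypothetical toy data: the set of words used by each account.
words_by_account = {
    "p1": {"wahl", "bundestag", "heute"},
    "p2": {"wahl", "heute"},
    "p3": {"bundestag", "wahl"},
    "g1": {"stream", "heute"},
    "g2": {"stream", "gaming"},
    "g3": {"gaming", "heute"},
}
community_of = {"p1": "politics", "p2": "politics", "p3": "politics",
                "g1": "gaming", "g2": "gaming", "g3": "gaming"}
print(community_keywords(words_by_account, community_of, "politics",
                         stop_words={"heute"}))
```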
Results
Figure 12 shows the result of the community detection. At first glance, it becomes clear that the Infomap algorithm in most cases still finds communities that align with the force-directed layout (done with Force Atlas 2; Bastian et al., 2009), as would a modularity-maximizing algorithm. Even a mere inspection of the account names (the top 10 accounts by degree of communities with more than 100 active accounts within the 3-core are available in Supplemental Appendix A) revealed that most of the largest communities have a topical focus.

Figure 12. Central communities in the 3-core of our sample network; colored by largest communities detected with the Infomap community detection algorithm (Rosvall & Bergstrom, 2008; Rosvall et al., 2009); node size represents PageRank (Brin & Page, 1998); layout done with Force Atlas 2 in Gephi (Bastian et al., 2009); (colored version available online).
This was confirmed by a close reading of the Twitter profiles of the top 10 accounts by degree in the 3-core and triangulated with an interpretation of the keyword analysis. The account names, the keywords, and the tags reflecting our interpretation of the communities with more than 100 active accounts in the analyzed time period can be found in Supplemental Appendix A. A selection in Table 2 demonstrates the topical clarity that enabled us to summarize the keywords and accounts under a topical tag. However, as the “Hard Right” community demonstrates, it is important to stress that belonging to a community in this analysis does not necessarily mean endorsement of its majority’s activities: While most of the top 10 accounts in this community can be identified as members of the German right-wing party AfD or accounts obviously supporting this party, we also find “krone_at,” the account of an Austrian tabloid, and “MSF_austria,” the account of Doctors Without Borders in Austria. For the latter, we could determine that this is likely because prominent accounts in the “Hard Right” cluster follow “MSF_austria,” while “MSF_austria” does not follow them back.
Table 2. Selected communities’ keywords (translated), top accounts by in-degree in the 3-core of our sample and our summarizing tags.
Finally, the summary of this test study is depicted in the community graph in Figure 13, which contains the tags summarizing our interpretation of the keywords and top accounts. If we could not find a clear interpretation, the community is tagged as “Group of” the account with the highest degree in the 3-core. While many communities and their connections are filtered out for clarity and only the largest, most active communities and strongest connections between them remain, the graph gives a useful bird’s-eye view of the structure of the analyzed network, which, according to our results above, represents the influential core of the German Twittersphere. As such, it exhibits intuitively sensible patterns at first sight: Swiss Politics is strongly connected with Swiss Sports and vice versa; Hard Right and Digital Rights Culture appear as satellites of the dominant German Politics community; Porn is remote from most communities and follows more than it is followed; and YouTubers & Gamers are connected to the rest of the network mostly through Entertainment. In short, this result resembles the results for the Australian Twittersphere by Münch (2019, Chapter 6) and provides a good overview of the influential core of the German-speaking Twittersphere.

Figure 13. Community graph of communities in the 3-core of our sample with over 300 accounts, at least 80 active accounts during the examined timeframe, and edges with a weight of at least 150; edge width represents weight; edge direction follows clockwise curvature; edges colored by source node; node size represents the number of accounts in each community; node colors correspond with Figure 12; node labels based on interpretation of keywords and top accounts (see Supplemental Material); (colored version available online).
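A community graph of this kind can be derived by aggregating follow edges between communities and applying size and weight thresholds. The following Python sketch illustrates the idea under assumptions that the article does not spell out: the weight of an edge between two communities is taken to be the number of follow edges from accounts in the source community to accounts in the target community, the filter on active accounts is omitted, and the function name and toy data are hypothetical.

```python
from collections import Counter


def community_graph(edges, community_of, min_size=300, min_weight=150):
    """Aggregate a follow network into a weighted, directed community graph.

    Returns {(source_community, target_community): weight} for communities
    with more than `min_size` accounts and weights of at least `min_weight`.
    """
    sizes = Counter(community_of.values())
    kept = {community for community, size in sizes.items() if size > min_size}

    weights = Counter()
    for source, target in edges:
        source_community = community_of.get(source)
        target_community = community_of.get(target)
        if (source_community in kept and target_community in kept
                and source_community != target_community):
            weights[(source_community, target_community)] += 1

    return {pair: w for pair, w in weights.items() if w >= min_weight}


# Hypothetical toy data with small thresholds.
community_of = {"a": "A", "b": "A", "c": "B", "d": "B", "e": "C"}
edges = [("a", "c"), ("b", "c"), ("b", "d"), ("c", "a"), ("e", "a"), ("a", "b")]
print(community_graph(edges, community_of, min_size=1, min_weight=2))
```

Because the aggregation keeps direction, asymmetries such as a community that follows more than it is followed remain visible in the resulting graph.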
Conclusion
Summarizing the methods and results described in detail in the section above:
We have adapted the rank degree method in a way that makes it practically useful for a small team to gather Twitter follow networks of the most influential accounts that use a certain interface language, using the cost-free Twitter standard APIs;
We provide evidence that a network sample collected with this method exhibits activity and follower numbers that are orders of magnitude higher than the average of this language domain;
We provide evidence that the influencer sample, that is, a subsample of accounts with at least one incoming edge within our sample, represents less than 10% (likely much less) of the whole population, but reaches 85% of accounts in a random test sample of German-using accounts with more than one friend;
We show that accounts within the influencer sample exhibit substantially higher reach than accounts in a random baseline sample of the same size;
We provide evidence that for an average German account 40% of its friends are in our influencer sample; therefore, given the higher activity of the sample, likely more than 40% of an average German Twitter account’s timeline is produced by our sample.
Altogether, this lets us conclude that the adapted sampling technique is able to approximate the set of most influential accounts, based on the follow network, within a language-based Twittersphere. In comparison with the original rank degree method, our adaptation of the method is optimized to be parallelizable and efficient concerning API calls and can therefore be used by small research teams and social media professionals. Furthermore, we are confident that with some adaptations it is suitable for mining not only Twitter follow networks but also comparable platforms that have a subscription network.
As our test study demonstrates, the data retrieved with this method enable a researcher to conduct research projects that hitherto relied on much larger datasets and data collections. Bruns et al. (2017), Bruns and Enli (2018), and Münch (2019) all relied on the full follow network that was collected based on the global Twitter account collection by TrISMA. While they filtered a complete dataset down to a manageable or useful set of likely influential accounts, our method restricts the collection to those accounts in the first place. It produces a comparatively small sample of connections in the follow network based on a random sample of accounts, leading to a backbone network of the most central and thus likely the most influential accounts. As a positive and important side effect, this also leads to fewer ethical issues, as it restricts the data collection mostly to accounts that are already popular on Twitter and therefore more likely to be aware of the public availability of their data.
Despite the smaller scale of the produced dataset, we were able to retrieve meaningful, comparable results to the three studies above, using and triangulating their methods. While the focus of this article is on the sampling method and we do not dive deeper into the theoretical implications of the test study, it is clear that the test study presents a fertile ground for theory development comparable to Bruns and Highfield (2016), for example, or discourse analysis as done by Dehghan (2018). We want to especially highlight the observed similarities between our representation of the overall structure of the German Twittersphere and the Australian Twittersphere as drawn by Bruns et al. (2017) and Münch (2019, pp. 237–238), which hints at an untapped potential for international comparative communication and (social) media studies.
Outlook
This study provides evidence that our adaptation of the rank degree method enables drawing a representative language-based sample of influential Twitter accounts. However, while we are confident that the method collects the most influential accounts despite our adaptations, properties such as the exact ranking of these accounts by different centrality measures, as well as community structures, might be preserved differently than in the original method. Therefore, this method still has to be tested with known networks, such as the Australian or Norwegian Twittersphere, to ensure that the centrality-preserving qualities of the original rank degree method do not suffer from our adaptations. In particular, the application to a directed network, as well as the non-dynamic handling of degree and ranking in the sampling process, might lead to significant differences in the sample quality.
Nevertheless, now that the practical feasibility of the method is proven, the quality of the sample can be assessed more thoroughly, especially regarding the question of how coverage and reach change with the sample size, a most important question for researchers and social media professionals.
Moreover, for the presented version of the method, we still required a high number of initial random seeds for the algorithm to work, and such a seed collection is generally not available. Thus, a further development of our prototype includes the generation and growth of the seed pool. Ideas include sampling via a keyword search for common words in the language or regarding the topic of interest and a “snowballing” approach, where the latest 5,000 connections of the collected nodes are stored as seeds. Such an implementation needs to be tested for representativeness and comparability with the current approach, as the seed pool itself is not random anymore (Münch & Rossi, 2020).
As Twitter made the interface language of an account a private property at the end of this project, the method has been adapted to work with the tweet language instead. Whether the language detection provided by Twitter suffices for this remains to be tested.
Finally, further avenues of enquiry regarding this form of sampling include the collection and comparison of language-based Twitterspheres other than German, and its development into social media mining approaches that are based on topical instead of language-based criteria (Münch & Thies, 2020).
Supplemental Material
sj-pdf-1-sms-10.1177_2056305120984475 – Supplemental material for Walking Through Twitter: Sampling a Language-Based Follow Network of Influential Twitter Accounts
Supplemental material, sj-pdf-1-sms-10.1177_2056305120984475 for Walking Through Twitter: Sampling a Language-Based Follow Network of Influential Twitter Accounts by Felix Victor Münch, Ben Thies, Cornelius Puschmann and Axel Bruns in Social Media + Society