Introduction
Malware is a constant threat to the Internet, and malware detection remains an active research topic. Recently, researchers1,2 have noted and analyzed the activity of malware in wireless sensor networks (WSNs). Malware poses a security risk to WSNs because of their role in real-time monitoring and reporting of sensor data.
To defend against malware, several surveys3–5 have summarized existing detection methods. The data sources used by detection systems are generally NetFlow, honeypots, domain name system (DNS) traffic, and address assignment information (border gateway protocol, autonomous system, dynamic host configuration protocol (DHCP), etc.); some detection systems also require deep packet inspection to identify application-layer characteristics.
At present, most malware detection research is based on the network communication characteristics of malware, especially C&C and attack traffic. In this article, we study malware whose hard-coded C&C domains have expired. When these C&C domain names cannot be resolved, the malware typically retries repeatedly, attempting to use the same domain name to establish communication with the C&C; this differs from the exploratory lookups produced by a domain generation algorithm (DGA).6,7
Beyond repeated attempts, requests to expired C&C domain names usually exhibit a certain periodicity, since much malware does not randomize the interval between retries. The literature 8 likewise shows that malware C&C connections generate DNS requests and TCP connections at fixed intervals.
To illustrate these characteristics of repeatability and periodicity, we select two representative domain names from DNS traffic, one queried by a program and one entered manually, and plot them in Figure 1. The horizontal axis in Figure 1 is time; the vertical axis is the client IP address, sorted by time of first appearance. Each point represents a client that initiates a DNS query at the corresponding time.

DNS querying pattern: (a) tracker.sjtu.edu.cn and (b) www.cnbeta.com.
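The scatter data described above can be derived from raw (timestamp, client IP) query records. A minimal sketch, assuming the records come from a parsed DNS log; client IPs are mapped to vertical positions in order of first appearance, as in Figure 1:

```python
def query_scatter_points(queries):
    """Map (timestamp, client_ip) DNS query records to scatter-plot
    points: x = timestamp, y = client index ordered by first appearance."""
    index = {}                                # client_ip -> y position
    points = []
    for ts, ip in sorted(queries):            # iterate in time order
        if ip not in index:
            index[ip] = len(index)            # assign next row on first sight
        points.append((ts, index[ip]))
    return points

# Two clients; the first one queries repeatedly over time.
pts = query_scatter_points([(0, "10.0.0.1"), (60, "10.0.0.1"),
                            (30, "10.0.0.2"), (120, "10.0.0.1")])
# → [(0, 0), (30, 1), (60, 0), (120, 0)]
```

A periodic program domain then shows up as evenly spaced points along a horizontal line, while a manually queried domain scatters irregularly.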
Figure 1(a) shows the query pattern of the program-queried domain name tracker.sjtu.edu.cn, while Figure 1(b) shows the manually queried domain name www.cnbeta.com. The program domain clearly exhibits a periodic requesting pattern. In fact, it is common for malware to request its C&C domain periodically even after the domain has been taken down, following the pattern shown in Figure 1(a).
In this article, we propose a method to detect such malware based on this traffic pattern. Our approach consists of four steps, as shown in Figure 2. The first step is collecting DNS failure traffic; the dataset we focus on is collected from real-time campus network traffic. The second step is filtering out irrelevant traffic to reduce false positives and speed up detection. The third step is extracting the time sequences of domain requests and characterizing the DNS clients' retrying behavior. Finally, we compute sequence scores and detect C&C domains with a threshold.

Steps to perform detecting.
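The four-step pipeline can be sketched as chained functions. This is an illustrative skeleton only; the record fields ("client", "qname", "rcode", "ts") and the function names are assumptions, and the scoring function is left as a parameter:

```python
# Hypothetical record format: dicts with "client", "qname", "rcode", "ts".
def collect_failures(dns_records):
    """Step 1: keep only failed (NXDOMAIN) queries."""
    return [r for r in dns_records if r["rcode"] == "NXDOMAIN"]

def filter_irrelevant(failures, irrelevant_domains):
    """Step 2: drop failures known to be unrelated to malware
    (failed trackers, expired software domains, etc.)."""
    return [r for r in failures if r["qname"] not in irrelevant_domains]

def extract_sequences(failures):
    """Step 3: group query timestamps per (client, domain) pair."""
    seqs = {}
    for r in failures:
        seqs.setdefault((r["client"], r["qname"]), []).append(r["ts"])
    return seqs

def detect_cc(seqs, score, threshold):
    """Step 4: score each request sequence; flag domains over threshold."""
    return {dom for (cli, dom), ts in seqs.items() if score(ts) >= threshold}
```

The later sections refine step 2 (filtering rules) and step 4 (the actual scoring and behavioral conditions); a trivial `score=len` already captures the repeated-retry intuition.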
The contributions of this article are as follows: (1) we analyze the malware C&C failure problem and find that it gives rise to repeated and periodic request behavior in DNS traffic, which we verify on a convincing real-world dataset. (2) We analyze the characteristics of expired malware C&C domains, identify properties that distinguish them from legitimate domains, and propose an effective method to detect them. (3) On the campus network, the algorithm successfully detected 3027 malware C&C domains on 249 affected client hosts with a precision of 92.0%.
In the past, most research has focused on detecting active malware. This article focuses on detecting abandoned malware that is still running on users' machines. Although C&C failure has received little prior research attention, it remains a potentially large risk to network security (see section “Evaluation”).
This article is organized as follows. In section “DNS traffic filtering,” we first apply a pre-filtering step to failed queries to reduce the amount of data needed for sequence analysis. Section “C&C detection” describes our method of detecting C&C domains based on query sequence analysis and the similarity of client behavior. In section “Evaluation,” we experimentally assess and analyze the effectiveness of the detection. Finally, we conclude the article.
DNS traffic filtering
Request sequence analysis examines the timestamps of all query messages, grouping requests by domain and query type within the DNS traffic of each client. Beforehand, to reduce the amount of data entering sequence analysis, we filter out the bulk of failed traffic as follows.
Failed BitTorrent tracker
By counting the repeated failed requests a single client generates toward a single domain name, we discovered that failed BitTorrent trackers cause certain clients to send DNS queries for those domain names frequently.
Since there is no way to accurately identify all BitTorrent tracker domain names from DNS traffic alone, our study focuses on repeated attempts from a single client and on domain names with a particularly large number of requesting clients. For domain names that account for a large share of the failed DNS traffic, we use a search engine to verify whether each is a BitTorrent tracker domain name.
When judging with a search engine, a domain name is classified as a BitTorrent tracker only if it meets both of the following conditions:
Have no known websites built on that domain name directly.
Have been used by a BitTorrent tracker as their address.
Using search engine results is an easy way to judge whether a domain name is used as a BitTorrent tracker, because most BitTorrent tracker URLs follow a similar (or fixed) format, such as
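The "similar (or fixed) format" can be checked mechanically. A hedged sketch, using the BitTorrent convention (an assumption drawn from the protocol, not stated in the article) that HTTP tracker URLs end in "/announce", "/announce.php", or "/scrape":

```python
import re
from urllib.parse import urlparse

# Assumed heuristic: BitTorrent tracker URLs conventionally end in
# "/announce" (sometimes "/announce.php" or "/scrape").
TRACKER_PATH = re.compile(r"/(announce(\.php)?|scrape)$")

def looks_like_tracker_url(url):
    """Rough check of whether a URL found via the search engine follows
    the typical BitTorrent tracker address format."""
    return bool(TRACKER_PATH.search(urlparse(url).path))
```

Such a check would complement, not replace, the two manual search-engine conditions above.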
In the experiment on campus network traffic, we identified 342 failed BitTorrent tracker domain names in total. Failed trackers are widespread in BitTorrent seed files created years earlier. Some failed trackers are even copied and reused in newly created seed files, because BitTorrent users often add more trackers in the hope of finding as many other downloaders (peers) holding the same resource as possible. Failed tracker addresses are also carried in magnet URIs (magnet links).
The volume of failed DNS queries caused by failed BitTorrent trackers is astonishing. The 342 failed tracker domain names alone account for 42.2% of all DNS failures in the campus network; nearly half of the DNS failures thus result from invalid BitTorrent tracker addresses. We also found a client that attempted to resolve a failed tracker domain name more than 100,000 times in 1 day. Such frequent retries may arise because the client has downloaded many seed files containing the failed tracker, but a more important reason is a probable defect in the implementation of the BitTorrent client. We identify the failed BitTorrent tracker domain name queries and summarize them in Table 1.
Top 20 failed BitTorrent trackers.
Expired domain names of legitimate software
In this section, we inspect expired domain names that were once used by legitimate software. Such domains expire when the software is no longer maintained or when an updated version no longer uses the previous domain name.
Among the DNS requests sent by a client machine, a large share is not initiated by user operations (e.g. web browsing, email checking). A variety of software installed on the client system accesses the network automatically and generates DNS queries, usually for automatic updates or in-software advertising. The failure of a domain used by software has caused severe incidents in the past: on 19 May 2009, after Storm Player's authoritative DNS provider DNSPod went offline, a massive number of Storm video clients frequently retried to reach the service in the background, overloading recursive DNS servers at several major ISPs. 10
During the analysis of the campus network traffic, we located a lot of expired software domain names, many of which are still under frequent attempts to be resolved by the clients. For example, within 7 days, there are 68,936 clients attempting to resolve “stun01.sipphone.com,” which is an expired Session Traversal Utilities for NAT (STUN) server name. The total number of requests is 1,268,955. “nccpr.p2p.baofeng.net,” once used as Storm video advertisement server domain name, has 1,203,016 requests from 15,154 clients, showing a huge installation base and a widespread impact. The download software, Thunder, is also an important source of failed DNS requests. Once used by Thunder Assistant, “bibei.sandai.net” receives 1,181,038 attempts from 428 clients, while the expired domain names, “btrouter.sandai.net” and “ui.pmap.sandai.net,” have 9307 and 11,134 requesting clients, respectively, and 332,341 and 303,380 failed requests, respectively.
We filter out the domain names of common software that fail to resolve. In total, we identified 333 expired domain names of common software, with 10,706,250 requests, accounting for 3.05% of all DNS failures.
Other irrelevant traffic
Besides the traffic discussed in sections “Failed BitTorrent tracker” and “Expired domain names of legitimate software,” some other failed traffic that is irrelevant to malware behavior should also be filtered out, as follows.
(1) Invalid top-level domains (TLDs). An invalid TLD is one that is not registered (e.g. localdomain, home), so every query for it fails.
(2) Reverse DNS resolution. A global survey in the literature 11 showed that the success rate of PTR queries is only 30.4%, while 44.5% receive a negative answer and 25.1% receive no response. Reverse lookups thus contribute a large proportion of failure traffic that we do not care about.
(3) DNS-based blacklists (DNSBL). A DNSBL serves a blacklist over DNS; for names not on the blacklist, the server returns NXDomain. DNSBL traffic therefore makes up a certain share of the failure traffic and should be filtered out.
(4) Intermittent failures. Owing to authoritative server instability, configuration errors, network failures, or other causes, some domains occasionally fail to resolve. Filtering intermittently failing domains effectively reduces the interference caused by DNS system instability.
(5) Internationalized domain names (IDN). IDN domains account for only 0.012% of the failed traffic, mainly from typing errors or broken hyperlinks during web browsing.
(6) Campus network domains. In the failed traffic, the school's domain sjtu.edu.cn accounts for 5.33%, mainly for two reasons. First, the DNS servers of Shanghai Jiao Tong University serve not only as recursive resolvers for campus users but also as authoritative servers for sjtu.edu.cn, so some failed queries come from external recursive resolution. Second, campus wireless users receive a default domain suffix of sjtu.edu.cn from the DHCP service.
(7) Special symbols. DNS labels may normally contain only letters, digits, and hyphens (“-”), but domains with unsupported special symbols also appear in the traffic, contributing 0.027% of failures.
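Most of these rules are simple string checks on the queried name. A minimal sketch, covering the rules that need no query history (the DNSBL and intermittent-failure rules require state over time and are omitted); the exact TLD list is an assumption based on the examples in the text:

```python
import re

# RFC-style label: letters/digits, optional inner hyphens.
VALID_LABEL = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?$")
INVALID_TLDS = {"localdomain", "home"}   # examples named in the text

def is_irrelevant_failure(qname, qtype="A"):
    """Implements rules (1), (2), (5), (6), and (7) from the text."""
    labels = qname.rstrip(".").split(".")
    if labels[-1].lower() in INVALID_TLDS:                 # (1) invalid TLD
        return True
    if qtype == "PTR" or qname.endswith(".in-addr.arpa"):  # (2) reverse DNS
        return True
    if any(l.lower().startswith("xn--") for l in labels):  # (5) IDN
        return True
    if qname == "sjtu.edu.cn" or qname.endswith(".sjtu.edu.cn"):
        return True                                        # (6) campus domain
    if not all(VALID_LABEL.match(l) for l in labels):      # (7) bad symbols
        return True
    return False
```

Anything surviving these checks (plus the tracker, software-domain, DNSBL, and intermittent-failure filters) proceeds to request sequence analysis.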
We summarize the proportion of each type of DNS failure in Figure 3. After this preliminary filtering, the DNS failures entering request sequence analysis account for only 4.01% of all DNS failures, greatly reducing the data required for sequence analysis while also reducing interference from legitimate applications' requests in C&C domain detection.

Classification of DNS failures.
C&C detection
Request sequences analysis
Request sequence analysis studies the time sequence of repeated requests a single client sends to an expired domain. Since domain requests issued by programs are characterized by repeated attempts at fixed intervals, we discard domains requested fewer than a certain number of times per day; in our implementation, we only consider domain names requested more than eight times a day (explained in section “C&C domain detection”). Client request time sequences are split by day rather than by week, because clients shut down at night, causing large pauses in the request sequence. We use 1 week of campus traffic for our study. After discarding requests below the minimum amount, a total of 361,281 valid request sequences entered the following analysis.
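The grouping and daily thresholding can be sketched as follows. The record tuple layout is an assumption; days are bucketed by epoch day for simplicity rather than by local calendar date:

```python
from collections import defaultdict

MIN_DAILY_REQUESTS = 8        # per-day threshold from the text
SECONDS_PER_DAY = 86400

def daily_sequences(records, min_daily=MIN_DAILY_REQUESTS):
    """Group query timestamps by (client, domain, qtype) and by day,
    keeping only day-sequences with at least `min_daily` requests.
    Each record is assumed to be (client, domain, qtype, unix_ts)."""
    seqs = defaultdict(list)
    for client, domain, qtype, ts in records:
        day = int(ts // SECONDS_PER_DAY)      # daily split (epoch days)
        seqs[(client, domain, qtype, day)].append(ts)
    return {k: sorted(v) for k, v in seqs.items() if len(v) >= min_daily}
```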
The first step of request sequence analysis is to account for the operating system's DNS resolution timeout and retry behavior. Because DNS resolution usually runs over the unreliable UDP protocol, packet loss can occur. If no server response is received within a certain period after the client sends a query (timeout), the same query is sent again (retry). To reduce resolution delay for applications, the timeout in a client's DNS library is usually small, far less than the DNS server's timeout, so while the server is still performing iterative resolution, the client may already consider the query timed out and resend it. Most client applications resolve domain names through the operating system's API; therefore, to study the timeout retry behavior of clients, we need to examine different operating systems.
The documentation12,13 explains the behavior of the Windows DNS client. Timeouts and retries in Windows DNS clients are determined by the registry value

Timeout and retry behavior of DNS clients: (a) Windows server 2008 and (b) Ubuntu Linux.
The literature 15 experimentally analyzed the failed DNS request behavior of clients under Windows, Linux, and Mac OS X with multiple browsers. DNS timeout retries matter greatly for our time sequence analysis: in a series of DNS queries sent by a client, we must first identify the repeated requests sent by the DNS client due to short-term timeouts and merge them into one logical request; only then can periodic requests be judged accurately. After analyzing the timeout retry features of multiple operating systems' DNS clients, we set a safe threshold: only repeated requests for the same domain name within 18 s of the first query are treated as timeout retries of that first query.
It is worth noting that when Linux is configured with multiple DNS servers, the 18 s threshold may not suffice, because several non-responding servers can stretch the retry period beyond half a minute. We ignore this situation, since the threshold we set is close to the DNS server timeout and Linux is not the target of most malware attacks; this simplification therefore does not affect the final detection results.
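The 18 s merging rule can be sketched directly: queries arriving within the window after the first query of a burst are collapsed into that query.

```python
MERGE_WINDOW = 18.0   # seconds, the safe threshold chosen in the text

def merge_timeout_retries(timestamps, window=MERGE_WINDOW):
    """Collapse queries sent within `window` seconds of the first query
    of a burst into one logical request (OS resolver timeout retries)."""
    merged = []
    for t in sorted(timestamps):
        # A timestamp starts a new logical request only if it falls
        # outside the window anchored at the previous burst's first query.
        if not merged or t - merged[-1] > window:
            merged.append(t)
    return merged

merge_timeout_retries([0, 5, 10, 40, 41, 100])   # → [0, 40, 100]
```

Only the merged timestamps enter the periodicity analysis below.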
To determine whether a client's requests to a domain are periodic, we calculate the sequence of request time intervals between consecutive (merged) requests and measure its regularity with the variation coefficient, the ratio of the standard deviation of the intervals to their mean.
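Assuming the variation coefficient is the standard coefficient of variation (population standard deviation over mean) of the inter-request intervals, a sketch of the computation:

```python
import statistics

def variation_coefficient(timestamps):
    """Coefficient of variation (population std / mean) of the intervals
    between consecutive requests; values near 0 indicate periodicity."""
    ts = sorted(timestamps)
    intervals = [b - a for a, b in zip(ts, ts[1:])]
    if len(intervals) < 2:
        return float("inf")        # too few requests to judge
    mean = statistics.mean(intervals)
    if mean == 0:
        return float("inf")
    return statistics.pstdev(intervals) / mean
```

A perfectly periodic retry sequence (e.g. every 60 s) yields 0.0; irregular, human-driven queries yield much larger values.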
C&C domain detection
The algorithm for detecting C&C domains can be split into two parts: first, detect which domains are program domains; second, check whether a program domain is a true C&C domain. The procedure is presented in Figure 5. We highlight the key points of the algorithm as follows.

C&C domain detection algorithm.
Detect_ProgramDomain
In the program domain detection stage, we extract the domain request sequences from each host and analyze whether each sequence meets the programmatic requesting features. The variation coefficient describes the periodicity of a query sequence, and the repetitive nature of a program domain is reflected by the number of retries for the same failed domain. Thus, to determine whether a request sequence matches the characteristics of a program domain, an evaluation function is defined on the basis of the variation coefficient of the request intervals and the retry count.

CDFs of program/non-program domain score.
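The article's exact evaluation function is not reproduced in this excerpt; for illustration only, here is one plausible shape that rewards many retries and a low variation coefficient, consistent with the two quantities the text names. The formula itself is an assumption, not the authors' definition:

```python
import math

def program_domain_score(retry_count, cv):
    """Illustrative score only (NOT the article's exact function):
    grows with the retry count and shrinks as the intervals become
    irregular (large coefficient of variation `cv`)."""
    return math.log1p(retry_count) / (1.0 + cv)
```

Any monotone function with these properties would produce the kind of separation between program and non-program domains shown in Figure 6, with a threshold chosen from the CDFs.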
C&C_RequestCond
1. Most of a domain's requesters exhibit programmatic request behavior.
When a domain is used only as a malware C&C, all requests for it are produced by malicious programs on the hosts, so almost all requesters should show program-like request characteristics. If only a small proportion of requesters do, the domain is more likely a normal one. For example, for the domain of a news site, users browsing the Web show no repeated periodic request behavior, while users running an RSS client do, because the RSS client fetches content on a schedule. In other words, occasional repeated periodic requests are not sufficient evidence of a C&C domain. We define the IPRate of a domain as the number of periodically requesting client IPs divided by the total number of client IPs requesting that domain. We manually selected 100 non-C&C domains and 50 C&C domains at random as a test dataset, computed the IPRate distribution, and plotted the CDF curves in Figure 7. The figure shows an obvious separation between C&C and non-C&C domains at IPRate = 0.5, so in this article we require IPRate > 0.5 as one necessary condition for C&C domain requesting behavior.

CDFs of C&C/non-C&C domain IPRate.
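The IPRate condition is a simple ratio over the sets of client IPs. A minimal sketch, assuming the periodic clients have already been identified by the sequence analysis above:

```python
IPRATE_THRESHOLD = 0.5   # separation point read off Figure 7

def ip_rate(periodic_clients, all_clients):
    """IPRate: periodically requesting client IPs over all client IPs
    that queried the domain."""
    return len(periodic_clients) / len(all_clients)

def meets_request_cond(periodic_clients, all_clients):
    """C&C_RequestCond: more than half of the requesters are periodic."""
    return ip_rate(periodic_clients, all_clients) > IPRATE_THRESHOLD
```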
C&C_PeriodCond
2. All requesters share the same request interval.
The previous condition already requires that most requesters make repeated, periodic requests to the domain. Moreover, each malware domain is typically controlled by a single attacker, and if many hosts are infected by the same malware (or the same family), they communicate with the C&C at an identical time interval. For legitimate domain names, by contrast, such as a POP3 mail server's domain, the requests all come from email clients, but different users configure different fetch intervals. We therefore use the average variation coefficient to evaluate whether the average retry intervals of different requesters are consistent. We manually selected 100 non-C&C domains and 50 C&C domains at random as a test dataset, computed the average variation coefficient, and plotted the CDF curves in Figure 8. Figure 8 shows a clear separation between C&C and non-C&C domains at an average variation coefficient of 0.2; thus, we require the average variation coefficient to be at most 0.2 as another necessary condition for C&C domain requesting behavior.

CDFs of C&C/non-C&C average variation coefficient.
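A sketch of this condition, under the assumption (consistent with the text's description) that the "average variation coefficient" is the coefficient of variation computed across the per-client average retry intervals:

```python
import statistics

AVG_CV_THRESHOLD = 0.2   # separation point read off Figure 8

def average_variation_coefficient(per_client_mean_intervals):
    """Variation coefficient over the per-client mean retry intervals.
    A small value means different hosts retry at (nearly) the same
    interval, as expected when one malware family drives the requests."""
    mean = statistics.mean(per_client_mean_intervals)
    if mean == 0:
        return float("inf")
    return statistics.pstdev(per_client_mean_intervals) / mean
```

Hosts all retrying at, say, 900 s give a value of 0.0, well under the 0.2 threshold; mail clients fetching at user-chosen intervals give a much larger value.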
Filtering with KnownBenignDomainSet
Besides these two characteristics, we also use the Alexa ranking 17 as a whitelist. The top 1 million domains are considered benign, except domains providing dynamic DNS service such as DynDNS and 3322.org; security agencies have reported that these dynamic domains are abused and widely exploited by malware. In addition, we define extra rules to remove the domain names of Intra-Site Automatic Tunnel Addressing Protocol (ISATAP) and Web Proxy Auto-Discovery (WPAD) servers.
During the experiment, we also found some request sequences with very short periods, for example, under 30 s. After checking these domains manually, we found them to be benign, produced by faulty or abnormal programs. In fact, when a repeated request interval is less than 18 s, the measured interval would lie in the range between 18 and 36 s, because of the 18 s merging window set in section “Request sequences analysis.” Moreover, it would be unwise for malware to use a very short retry interval, since it would put excessive pressure on the C&C server, possibly even causing a denial of service. Considering that malware with such a short request interval is very rare, we ignore sequences whose average interval is shorter than 30 s to avoid misjudgment.
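The whitelist and short-interval filters can be sketched together. The Alexa membership flag is assumed to come from a lookup against the top-1M list, and the dynamic-DNS suffix list here is a partial example from the text:

```python
MIN_AVG_INTERVAL = 30.0                # seconds, per the text
DYNAMIC_DNS_SUFFIXES = (".3322.org",)  # example from the text; extend as needed

def passes_final_filters(domain, in_alexa_top1m, avg_interval):
    """Filters applied after the two behavioral conditions:
    drop ISATAP/WPAD auto-discovery names, implausibly fast retry
    sequences, and Alexa top-1M domains unless they are dynamic DNS."""
    first_label = domain.split(".")[0].lower()
    if first_label in ("isatap", "wpad"):
        return False
    if avg_interval < MIN_AVG_INTERVAL:
        return False
    if in_alexa_top1m and not domain.lower().endswith(DYNAMIC_DNS_SUFFIXES):
        return False                   # popular domain, considered benign
    return True
```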
Evaluation
In section “Request sequences analysis,” we obtained 361,281 domain request sequences in which each domain was requested at least eight times a day. After merging the DNS clients' timeout retries, 56,133 request sequences were discarded because their daily request counts fell below eight after merging, leaving 305,148 valid request sequences. Our detection algorithm identified 12,757 client request behaviors exhibiting the repetitive, periodic nature of program domain requests, involving 6079 domain names. Finally, after filtering on behavioral characteristics and the domain whitelist, 3290 domain names were judged to be malware C&C domains.
Among the 3290 detected domain names, we found that 2200 have the DGA characteristic, with the structure <
For the other 1090 domain names detected by the algorithm, we need an efficient way to judge whether they belong to malware. In this article, we use Google to help recognize the malware nature of a domain name. Using Google as a security tool is a common method; past studies20,21 have also used Google to detect phishing websites.
We use each detected domain name as a keyword to search on Google and take the first 100 results. Notably, we wrap the domain name in quotation marks so that Google returns only results containing an exact match of the whole string. We then record the title, linked URL, and content snippet of each of the first 100 results.
Among the first 100 results returned by the search engine, if any URLs point to anti-virus vendors, malware analysis sites, or blacklist websites, the domain is likely a known malware domain. Figure 9 illustrates a typical example. The domain name

Google validation result.
The matching list in this article collects domains from 17 anti-virus vendors, 8 virus analysis tools (such as VirusTotal16 and ThreatExpert18), 19 malware blacklist and monitoring sites, and 4 anti-phishing blacklists. In addition, we match the titles and content fragments of the search results against keywords such as Trojan, Worm, Backdoor, Rootkit, and Virus to find connections between domain names and malware.
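The result-matching step can be sketched as a classifier over the recorded (title, URL, snippet) tuples. The site list below is a tiny illustrative subset (the article's full matching list covers dozens of vendors and blacklists), and the three labels mirror the categories used in the evaluation:

```python
MALWARE_KEYWORDS = ("trojan", "worm", "backdoor", "rootkit", "virus")
SECURITY_SITES = ("virustotal.com", "threatexpert.com")  # partial list

def classify_from_search(results):
    """Label a flagged domain from its search results, given as
    (title, url, snippet) tuples: 'known_malware' if a result links to
    a security site or mentions a malware keyword, 'unknown_malware'
    if there are no results at all, otherwise 'legal'."""
    if not results:
        return "unknown_malware"
    for title, url, snippet in results:
        text = (title + " " + snippet).lower()
        if any(site in url.lower() for site in SECURITY_SITES) or \
           any(word in text for word in MALWARE_KEYWORDS):
            return "known_malware"
    return "legal"
```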
Many domain names flagged by our algorithm return no results from Google. We also consider these to be possible malware domain names, since they have no connection with publicly known software; flagged domains whose Google results associate them only with legitimate software are considered legal. Note that Google serves as a tool to verify and assess the output of our algorithm, not as part of the detection algorithm itself.
Figure 10 shows the Google search results: 191 of the 1090 flagged domain names have connections with known malware, 636 have no search results and are thus considered domain names of possible unknown malware, and 263 are considered legitimate domain names flagged by mistake. Counting the 2200 DGA-group domain names as malicious, 3027 of the 3290 flagged domain names are C&C domains of known or suspected malware, giving a detection precision of 92.0%.

Google search result of a known malware domain.
We counted the clients that made requests to the 3027 detected malware domain names. Since the algorithm is designed specifically for malware C&C domains, these clients can be considered infected by malware. In our analysis of the campus network DNS traffic, 249 client IP addresses were affected by malware, of which nearly 110 clients on average were online each day (Figure 11).

Number of infected clients.
Although the detected malware all have expired C&C domains, the malicious code still active on client machines remains seriously harmful: (1) although command and control may have failed, worm components may continue to spread between hosts; (2) the malware may switch to another channel through a fail-over mechanism when the C&C fails; 22 (3) a failed C&C domain may be reactivated, or taken over23,24 by other attackers. Using our C&C failure detection algorithm to find infected hosts that have temporarily lost their controller is therefore of great significance for network security, and it effectively complements C&C protocol analysis and signature-based detection, which have difficulty finding expired malware.
To further analyze the request features of expired C&C domains, Figure 12 shows the distribution of average request intervals to the detected C&C domains, avoiding recounting the huge number of similar domains in the DGA group. About 50% of the malware has a request interval of 900 s; fewer than 20% have an interval longer than 15 min, and more than 10% have an interval longer than 1 h.

Cumulative distribution of query interval.
It is not surprising that a large share of requests concentrates around 900 s; this is related to the operating system. The default negative-answer cache time of the Windows DNSCache service is 900 s, controlled by the registry value HKLM\SYSTEM\CurrentControlSet\Services\Dnscache\Parameters\MaxNegativeCacheTtl. 25 Therefore, even if a malicious program's retry timeout is less than 900 s, the failed C&C domain resolutions are answered by Windows from the cache, so new queries reach the network only about every 900 s.
Conclusion
This article presents an approach to detecting malware C&C from DNS failures. The method focuses on the periodic failure-retry property of program-initiated domain name queries. Our approach first filters irrelevant queries out of the DNS traffic, then extracts client-domain request sequences while merging duplicate packets caused by the operating system's DNS library. After that, a series of procedures identifies the request sequences belonging to malware C&C. Finally, Google search results are used to verify whether a domain name is malicious. In the campus network traffic, the algorithm successfully detected 3027 malware C&C domains affecting 249 client hosts, with a detection precision of 92.0%. Taken-down malware has not been extensively researched in the past, but it is of great significance to network security.
