Abstract
1. Introduction
Wireless Sensor Networks (WSNs) consist of a set of distributed wireless devices known as sensors. Sensors monitor physical conditions and pass their data to a central location. In smart environments, sensors take input from the physical environment and operate in scenarios where wired networks are not usable. WSNs are used for several applications including greenhouse monitoring, forest fire detection, military applications, landslide detection, air pollution monitoring, and industrial monitoring. The characteristics of WSNs include low energy usage, limited computational capability, dynamic and independent network operation, easy installation and maintenance, low memory, and susceptibility to attacks due to ad hoc communication. Due to these characteristics, it is necessary to develop a protection mechanism for WSNs that is lightweight, reliable, and computationally inexpensive. Moreover, it is important to detect whether the data transferred from source nodes reach the gateway without interruption. Sensors are vulnerable to attacks, and their security is highly important because they communicate very sensitive data. Interruptions that disrupt the normal flow of sensor data are known as anomalies. These anomalies disturb the normal network flow in many ways, including delayed packets, destroyed packets, and wormhole attacks. Therefore, it is highly desirable to detect such anomalies in order to make sensor network communication more reliable and consistent [1, 2].
Over the past few years, researchers and scientists have shown great interest in developing biologically inspired algorithms and techniques for solving various real-world problems. Several techniques such as Artificial Immune System (AIS), Ant Colony Optimization (ACO), Artificial Bee Colony (ABC) algorithm, Genetic Algorithm (GA), and Particle Swarm Optimization (PSO) have been developed and successfully used for such problems. AIS is a well-known bioinspired technique, which is inspired by the principles and processes of Human Immune System (HIS), and takes advantage of the characteristics of the HIS such as memory and learning for solving problems. AIS basically abstracts the functions and structure of HIS into computational systems. It considers the application of these systems for solving different information technology, mathematical, and engineering problems. One of the most important benefits of AIS-based systems is that these systems are adaptive. AIS has been successfully used for solving anomaly detection problems, both in wired and in wireless networks.
This paper proposes an AIS-based solution for anomaly detection in WSNs. Our solution uses the Negative Selection Algorithm (NSA) of the AIS, with some modifications, and applies it to the anomaly detection problem. We train the system on a large dataset and generate a detector set. After this step, we introduce an injection feature for the detector set, called vaccination. Through this feature, the detector set can be updated at any stage. In our experiments, we use AIS to detect nodes that negatively affect the operation of the sensor network. We address three anomalies: dropped packets, delayed packets, and wormholes. The results of our experiments are encouraging and show that NSA can work efficiently for anomaly detection in WSNs and helps make the sensor network flow more reliable. We also compare the proposed NSA with the Clonal Selection Algorithm (CSA) of the AIS on the same dataset. The results show that NSA performs better than CSA in most cases.
The remainder of this paper is organized as follows: Section 2 introduces HIS and AIS in detail. Section 3 presents the related work in the area of anomaly detection in WSNs. Section 4 discusses the proposed NSA for anomaly detection. Section 5 shows experimental results as well as comparison. Section 6 concludes the paper with possible future directions.
2. Human and Artificial Immune Systems
The Human Immune System (HIS) protects the body from harmful attacks by pathogens such as bacteria, parasites, and viruses [15]. This complex biological system comprises molecules, organs, and cells. It is an adaptive system with a detection mechanism that is able to distinguish abnormalities from the body's own cells and fight them. It classifies cells within a body as self-cells and nonself-cells. HIS has two main categories: the innate immune system and the adaptive immune system. These two types of immune systems are shown in Figure 1.

Figure 1: Adaptive and innate immune systems.
The static system that identifies and eliminates specific harmful organisms is known as the innate immune system, whereas the system that remembers unknown foreign cells and reacts to them is known as the adaptive immune system. The adaptive immune system builds a response to unknown foreign cells that reside in the body for a long time [16].
2.1. Purpose of HIS
HIS protects organisms against external microorganisms known as pathogens, such as bacteria, viruses, and fungi, that might cause infections. The HIS has to guarantee the detection of each potentially harmful molecule or substance. It discriminates between the organism's own cells and unknown dangerous cells. Moreover, it removes harmful foreign cells and infected cells to prevent disease.
2.2. Immune System Entities
Different entities of HIS are given below:
2.3. Immune System Properties
An immune system has the following major properties:
2.4. Immune System Process
The immune system has a multilayered protection architecture that includes an adaptive immune system. This system has the ability to recognize specific types of pathogens and memorize them for accelerated future responses. This is the major motivation for AIS. The adaptive immune system is a combination of a range of cells and molecules spread all over the body. Two lymphocyte types among its cells, T-cells and B-cells, collaborate to discriminate between self and nonself (antigens). T-cells mature in the thymus, and B-cells mature in the bone marrow.
T-cells and B-cells pass through a negative selection phase, in which lymphocytes that match self-cells are destroyed. In this way, autoimmunity can be avoided. Before that, T-cells go through positive selection, and T-cells having weak bonds are removed. T-cells and B-cells that survive negative selection mature and enter the bloodstream for the purpose of detection. These mature lymphocytes have not yet interacted with antigens and are known as naive. There are several subpopulations of T-cells, including T helper cells, T memory cells, and T suppressor cells. T-cells follow the same procedure for defense as B-cells, but they also scan fragments of antigens that are present at the surface [15].
2.5. Artificial Immune System (AIS)
Computer security systems such as Intrusion Detection Systems (IDSs) have drawn considerable inspiration from the HIS. Features borrowed from HIS fulfill the requirements for designing efficient IDSs [17]. AIS is inspired by the processes and principles of the HIS and takes advantage of its characteristics, such as memory and learning, for solving various kinds of problems. AIS abstracts the structure and functions of HIS into computational systems. AIS-based systems do not create patterns for normal data but instead produce anomalous patterns by using normal data. These anomalous patterns are known as nonself. Hence, they perform only anomaly-based intrusion detection. Patterns that match the nonself-patterns are declared as anomalies.
Different entities of AIS are given below:
AIS takes the concepts of HIS and applies them to computational problems. AIS has been widely used for anomaly detection because it discriminates between self- and nonself-data. AIS is efficient for small and medium domains of anomaly detection, and in many cases it is superior to other anomaly detection methods. AIS gives much flexibility in rules and principles due to the large number of constraints used for tuning its performance. However, it has difficulty handling large-scale problems because their multidimensional parameter space is infeasible to search with any rigor [18]. AIS is composed of antigens and antibodies. An external intruder that attacks a system is known as an antigen. Antibodies are the part of the system that is used to detect and remove antigens. Antibodies use a partial matching process to identify antigens. An AIS-based system develops and keeps comparatively few antibodies, which are capable of reliably detecting a large number of antigens, including antigens that have never been seen before [19].
AIS algorithm has many variations including NSA, CSA, and danger theory [20]. Two of the most popular AIS algorithms, that is, NSA and CSA, are discussed as follows:
A detailed discussion on different AIS algorithms with their immunological aspects, computational problems, and typical applications is available in [23].

Figure 2: The two main steps in the Negative Selection Algorithm: (a) censoring, (b) detection [4].
3. Related Work
AIS can be considered as a strong candidate for anomaly detection as it discriminates between self and nonself-data. AIS was used in WSNs for detecting anomalies in [4]. It was a direct one-to-one mapping between a thymus and a node. Wälchli and Braun [10] proposed a system for office monitoring with WSN by using node level decision component of a self-learning anomaly detection system. A neural network approach of Adaptive Resonance Theory (ART) was used for creating node level decision unit. A fuzzy ART neural network was used as it can accumulate a fixed number of prototypes and also for receiving analog inputs. Each node would be able to respond to suspicious activity in the neighborhood by easy computation and later final decision was carried out at the base station. This system performed efficiently and with less time and memory consumption.
Livani and Abadi [7] proposed an energy-efficient distributed solution for detecting anomalies in sensed data in WSNs. Faulty or broken nodes can cause anomalies in the sensed data. A combination of fixed-width clustering (FWC) and Distributed Principal Component Analysis (DPCA) was used for detecting anomalies and for creating a global profile name. The global profile name was updated periodically and distributed to all nodes. The authors showed that this approach reduced energy consumption and communication overhead while achieving accuracy similar to that of the centralized approach. Dereszynski and Dietterich [3] proposed a real-time automated data quality control method that used temporal and spatial correlations in the data to distinguish defective sensor observations from valid ones. Adaptability was obtained with a Bayesian network structure that captures spatial relationships between neighboring sensors. Moreover, the structure was extended to a dynamic Bayesian network to handle temporal correlations. This model accurately estimated the values of corrupt or missing readings and also detected defective observations. Data samples from the SensorScope Project were used to evaluate the performance of this model. Experiments showed that using both temporal and spatial observations gave better results for identifying defective observations than using only spatial or only temporal observations.
Schaust and Szczerbicka [12] proposed an algorithm to alleviate detected faults by generating parallel responses in WSNs. The system was able to adapt with the changing situation by using costimulatory feedback and was able to respond accordingly. The concept was taken from degenerate receptor behavior and low-level response mechanism of T-cells in the biological immune system. Authors had explained its usability for WSNs by running a simulation model in OMNet++. Fu et al. [11] proposed an anomaly detection framework by combining the advantages of fuzzy theory and AIS to cope with DoS/DDoS attacks on WSNs. WSN was considered susceptible to this attack due to resource limitation of sensor nodes. The framework was based on three components that included global identification, local danger sensing, and costimulation. This method was found more adaptable and flexible. Authors proved with simulations that the proposed framework performed better than the watchdog method in detecting with low false positives.
Lim et al. [13] proposed an immune-inspired self-healing mechanism to tackle recurring WSN problems such as unreachable nodes and link failures due to interference. In this system, an individual node was able to identify network performance degradation and carry out diagnostic tests. The node was able to respond automatically and immediately, recovering the network to a secure and stable state. The authors tested this interference detection and recovery system and compared its performance with that of others in a test bed environment. The system was found to be adaptable to its changing environment. Xie et al. [5] examined the lazy learning issue of
An IDS framework for WSNs based on HIS was presented by Salmon et al. [8]. Authors enhanced dendritic cell algorithm that detected intrusions by observing and collaborating with neighboring nodes. This customized algorithm was tested in real sensor scenarios and found efficient in energy utilization and in identifying denial of sleep attack. Abduvaliyev et al. [24] presented a detailed classification of different IDS techniques for WSNs according to their underlying mechanisms. Three major classes were discovered that included misuse detection, specification-based protocol, and anomalies. Authors explored the work with network structure of WSN and highlighted various critical areas that were currently underdeveloped. Moreover, the details of security attacks and their related proposed IDS protocols to cope with those attacks were also given. The paper was focused on a thorough survey on IDSs in WSNs with the explanation of different critical limitations in the current IDSs and gave a future track for researchers in this area.
Kumar and Reddy [6] observed that wireless networks face intrusions at both the packet and signal levels and that, unlike in IP networks, these intrusions can range from simple to highly complex. Conventional techniques may be unsuccessful in wireless networks due to the complexity of identifying intrusions at different levels and the variation in credentials at different nodes. The authors proposed a technique based on agents. These agents collected information from different nodes, detected intrusions by applying an evolutionary AIS algorithm to this information, and presented the invasive communication path. Experiments showed that their system worked well for prevention and detection of intrusions in a wireless network and remained consistent under topological changes. Rajasegarar et al. [9] proposed a distributed hyperspherical cluster algorithm for detecting anomalies in WSN measurements. The authors implemented their algorithm on a real WSN test bed and reduced the communication cost of the network by clustering sensor measurements and then merging clusters before sending them to the next nodes. A central node was responsible for processing all the sensor node measurements. Evaluations on different datasets showed accuracy similar to that of the centralized system, along with a decrease in communication cost.
Lim et al. [14] proposed an immune-inspired detection and recovery system (IDRS) for irregular and unreliable communication of WSNs because the nodes used the same frequency range as other radio devices were using. It was a self-adaptive fault tolerant network that was able to retain service level in the presence of faults as well. Nodes were able to examine and update their routing protocols in a dependable and energy-efficient way due to the restricted resources. The system was composed of a combination of self-detection, self-diagnosis, and self-recovery. The reliability of the protocol was checked with Systematic Protocol Evaluation Technique (SPET) and the scalability and robustness of the IDRS were checked with the traces of simulation. The accuracy of the proposed system was validated and the system was found adaptable to the operating environment and was highly dependable.
4. Proposed NSA for Anomaly Detection in WSNs
NSA is used for detecting change based on the principles of self-nonself discrimination by T-cell receptors in the immune system. The system is able to detect antigens. NSA was originally developed by Forrest et al. [25]; it is a conceptually simple algorithm and has been widely used by the AIS community. NSA is popular due to its simplicity and its support for different affinity-matching functions. In one of the commonly used affinity-matching functions, adjacent attributes of an antigen vector and an Artificial Lymphocyte (ALC) detector are compared to check whether a particular region activates lymphocytes. Moreover, these affinity-matching rules are also used for detecting unknown strings or holes.
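One widely used rule of this kind is r-contiguous matching, in which a detector matches an antigen when the two strings agree in at least r adjacent positions. The following is a minimal sketch of that rule, offered as an illustration only; the function name, the string representation, and the value of r are assumptions and are not taken from the proposed system.

```python
def r_contiguous_match(detector: str, antigen: str, r: int) -> bool:
    """Return True if the two equal-length strings agree in at least
    r adjacent (contiguous) positions."""
    if len(detector) != len(antigen):
        raise ValueError("detector and antigen must have the same length")
    if r < 1:
        raise ValueError("r must be at least 1")
    run = 0  # length of the current run of agreeing positions
    for d, a in zip(detector, antigen):
        run = run + 1 if d == a else 0
        if run >= r:
            return True
    return False


# Hypothetical 5-character strings: they agree in positions 2 to 4 only.
print(r_contiguous_match("10110", "00111", 3))
print(r_contiguous_match("10110", "00111", 4))
```

A smaller r makes each detector cover more of the nonself space but also makes it more likely that a candidate detector matches a self-string and is rejected during censoring.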
4.1. Significant Features of NSA
The most significant features of NSA are as follows: (i) Information is represented negatively. This is different from other learning systems, and its strengths, usage, and applicability in different scenarios are being widely explored. (ii) NSA uses some form of detector set as the detection system. This feature makes it possible to extend the method to a distributed environment [26], especially with respect to distributed detector generation. (iii) There is only a single classification. The purpose of NSA is to distinguish between two classes, but training is done from the samples of one class only. Work is being done on generalizing it to multiple classes. (iv) NSA includes space representation, matching rule, detector generation, and detector representation.
4.2. Components of NSA
NSA includes the following components:
(i) Detector (it is an antibody).
(ii) Self-samples (self-set, training set).
(iii) Arriving data occurrences (data item, new simple data).
(iv) Measurement of distance (affinity measure).
(v) Matching rule (match rule).
4.3. Proposed Enhanced NSA
NSA has been used for detecting anomalies in different ways. We use NSA with some modifications. The system is trained on a large dataset and a detector set is generated. After this step, we introduce an injection feature for the detector set. Through this feature, the detector set can be updated at any stage. This injection step is called vaccination.
The proposed framework has learning and testing phases as shown in Figures 3 and 4, respectively. We first implement the basic NSA that is capable of doing a single classification. It detects anomalies from the dataset. At this stage, we have two classes, namely, self-set and nonself. Later, the rest of the processing is done on detected nonself and three different anomalies are classified; that is, sensor network packets delayed, packets dropped, and wormholes are detected.

Figure 3: Proposed NSA learning for anomaly detection.

Figure 4: Proposed NSA testing for anomaly detection.
In Figure 3, self-strings are matched with randomly generated strings using character-by-character matching. The strings that get matched are rejected. The strings that do not get matched are moved to the detector set. Detector set can be updated anytime using vaccination, which makes it more efficient. Vaccination allows a user to enter any nonself-pattern directly into the detector set, which makes the working of the detector set more competent [27].
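The learning phase described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: strings are taken to be fixed-length and binary, a "match" is taken to mean that every character agrees, and the names `generate_detectors` and `vaccinate`, the seed, and the retry limit are hypothetical.

```python
import random


def char_match(a: str, b: str) -> bool:
    """Character-by-character comparison; assumed here to require
    agreement at every position."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


def generate_detectors(self_set, n_detectors, length,
                       alphabet="01", seed=0, max_tries=100_000):
    """Censoring: random candidates that match a self-string are
    rejected; the rest are moved to the detector set."""
    rng = random.Random(seed)
    detectors = set()
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        if any(char_match(candidate, s) for s in self_set):
            continue  # matched self, so the candidate is rejected
        detectors.add(candidate)
    return detectors


def vaccinate(detectors, nonself_pattern, self_set):
    """Vaccination: insert a known nonself-pattern directly into the
    detector set. The self check is a sanity guard added for this
    sketch; it is not described in the text."""
    if any(char_match(nonself_pattern, s) for s in self_set):
        raise ValueError("pattern matches a self-string")
    detectors.add(nonself_pattern)
    return detectors


self_set = {"0000", "1111"}          # hypothetical self-strings
detectors = generate_detectors(self_set, n_detectors=5, length=4)
vaccinate(detectors, "1010", self_set)
print(sorted(detectors))
```

Because vaccination only adds to an existing set, it can be applied at any stage without repeating the censoring step.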
In Figure 4, randomly generated strings are matched with the detector set using character-by-character matching. The strings that get matched are declared as nonself. Source and destination matching are performed on nonself and they are further classified as sensor network packets delayed, packets dropped, and wormholes.
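A minimal sketch of the detection step in the testing phase is given below, assuming fixed-length strings and a match that requires agreement at every character position. The function names and the sample strings are hypothetical. The subsequent classification into delayed packets, dropped packets, and wormholes is omitted, because the source and destination matching rules are not specified in enough detail here to reproduce.

```python
def char_match(a: str, b: str) -> bool:
    """Character-by-character comparison; assumed here to require
    agreement at every position."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


def detect_nonself(samples, detectors):
    """Return the incoming strings that match any detector; these are
    declared nonself. Input order is preserved."""
    return [s for s in samples if any(char_match(s, d) for d in detectors)]


detectors = {"1010", "0110"}                 # hypothetical detector set
samples = ["0000", "1010", "1111", "0110"]   # hypothetical incoming strings
print(detect_nonself(samples, detectors))
```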
4.3.1. Protocols and Assumptions
In a sensor network,
In WSNs, there is a promiscuous mode in which a node is able to listen to the data communication in the neighboring nodes. It saves information of overheard packets in the neighborhood. This mode is costly, as it has to analyze all overheard packets. Moreover, this mode is energy inefficient, as it will not allow the network to operate in sleep mode. Wireless interface will operate either in idle or receiving mode and power consumption will be 12–20% higher as compared to the sleep mode [29].
4.3.2. Classification and Anomaly Detection
Anomalies can result from intrusions or from software or hardware failures. In our experiments, we check three different kinds of anomalies. Data packet dropping is a qualitative misbehavior, data packet delaying is a quantitative misbehavior, and wormholes are a topological misbehavior:
5. Experiments and Results
5.1. Performance Metrics
Different performance measures are used for our experiments using the proposed algorithm. Here we consider only the performance measures specific to NSA. The most popular measures for analyzing the performance of NSA and other AIS algorithms are false positives, true positives, false negatives, and true negatives. These measures are defined as follows:
False positives (FPs) are found when self-patterns are mistakenly identified as nonself.
True positives (TPs) are found when self-patterns are correctly classified as self.
True negatives (TNs) are found when nonself-patterns are correctly identified as nonself.
False negatives (FNs) are found when nonself-patterns are identified as self.
These measures can be used to calculate the detection rate (DR), false positive rate (FPR), and accuracy [21], which are defined as follows:
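As an illustration, the sketch below computes the three measures from the four counts, assuming the conventional definitions DR = TP / (TP + FN), FPR = FP / (FP + TN), and accuracy = (TP + TN) / (TP + TN + FP + FN). These formulas are an assumption based on common usage and may differ from the exact expressions in [21]; the function name and the counts in the example are hypothetical and are not results from the experiments.

```python
def detection_metrics(tp: int, fn: int, fp: int, tn: int):
    """Return (detection rate, false positive rate, accuracy) using the
    conventional definitions, which are assumed in this sketch."""
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("each class needs at least one sample")
    dr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return dr, fpr, accuracy


# Hypothetical counts, for illustration only.
dr, fpr, acc = detection_metrics(tp=90, fn=10, fp=5, tn=95)
print(f"DR={dr:.3f} FPR={fpr:.3f} accuracy={acc:.3f}")
```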
5.2. Experiment 1
In our first experiment, we implemented NSA for a small dataset having normal packets only. We inserted anomalies at runtime and then detected them. A total of 10 anomalies were inserted. Simulations were executed in Matlab 2009 and took 8–10 seconds to run. A screenshot of the basic NSA simulation with random anomalies is given in Figure 5, and the average results calculated for this simulation are shown in Table 1.
Table 1: Results of Experiment 1.

Figure 5: Basic NSA simulation with random anomalies.
5.3. Experiment 2
In the second experiment, we used the sensor network dataset provided by Drozda [30]. We implemented the enhanced NSA, and self and nonself-network packets were identified. First, incoming network strings are matched with self-strings. Those strings that get matched are rejected and the others are moved to the detector set. In the next step, random strings are matched with the detector set and those strings that get matched are identified as nonself. Next, nonself-patterns are examined to identify specific anomalies. Wormholes, packets delayed, and packets dropped are found as shown in Figure 6. Average results calculated for this simulation are shown in Table 2. All the values are in 10³. These results are then compared with the original dataset and the values for TP, FN, FP, and TN are calculated. The detection rate for this experiment is 97.3%, whereas the FPR is ±2.6%. Here, DR is the intermediate result, which includes false positives and true negatives as well. Accuracy of the system is found to be 89.1%.
Table 2: Results of Experiment 2 (all values are in 10³).

Figure 6: Anomalies detected.
5.4. Experiment 3
For our third experiment, CSA is implemented. It models the production of antibodies, which then bind to specific antigens. A key-lock mechanism can be used in some cases for the binding process. The underlying idea is that those antibodies that recognize the antigen are selected for matching. After matching, a detector set is generated. The CSA works as follows:
(i) Generate an initial population of antibodies.
(ii) Perform clonal selection for high affinity matches (threshold taken is 76%).
(iii) Only those antibodies are selected that match the threshold and detector set is generated.
(iv) Randomly generated antibodies are then introduced in a system.
(v) Clonal selection generated detector set is then used for identifying self and nonself.
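The affinity-threshold selection step in the procedure above can be sketched as follows. This is a simplified illustration and not a full CSA: cloning and mutation are not modeled, affinity is assumed to be the fraction of positions at which an antibody and an antigen agree, and all names, the binary alphabet, and the example strings are hypothetical. Only the 76% threshold comes from the text.

```python
import random


def affinity(antibody: str, antigen: str) -> float:
    """Fraction of positions where the two equal-length strings agree
    (an assumed affinity measure)."""
    if len(antibody) != len(antigen) or not antigen:
        raise ValueError("strings must be non-empty and of equal length")
    equal = sum(1 for x, y in zip(antibody, antigen) if x == y)
    return equal / len(antigen)


def random_population(size, length, alphabet="01", seed=0):
    """Generate an initial population of random antibodies."""
    rng = random.Random(seed)
    return ["".join(rng.choice(alphabet) for _ in range(length))
            for _ in range(size)]


def select_detectors(antibodies, antigens, threshold=0.76):
    """Keep the antibodies whose affinity to at least one antigen
    reaches the threshold; these form the detector set."""
    return [ab for ab in antibodies
            if any(affinity(ab, ag) >= threshold for ag in antigens)]


antigens = ["11111110"]                                # hypothetical antigen
population = ["11111111", "00000000", "11111100"]      # hypothetical antibodies
print(select_detectors(population, antigens))
```

In practice the initial population would come from `random_population`; fixed strings are used in the example so that the selection outcome is easy to follow.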
5.5. Comparison of NSA with CSA
The experiments are performed to compare the performance of NSA with CSA on different dataset subsets and the results of both for anomaly detection and false positives are compared, as shown in Table 3. In Table 3, the number of anomalies detected and false positive ratio shows the performance of both the algorithms on particular dataset parts. It is clear from the table that, for dataset parts 1, 2, 4, 5, 7, 9, and 10, NSA gives better results. For dataset parts 3, 6, and 8, CSA performs better.
Table 3: Comparison of NSA and CSA.
Comparison is also performed for the whole dataset. In the first case, only some of the files of the same dataset are used for comparison. In the second comparison, results of both the algorithms for the whole dataset are produced, as shown in Figure 7. Normal and anomalous sensor network packets, undetected packets, and false positive rate for both the algorithms are shown. The results of the experiments show that the detection rate of NSA is 97.3% and the false positive rate is ±2.6%, whereas, for CSA, the detection rate is 88% and false positive rate is 3.4%. This clearly shows that NSA performs better than CSA in terms of both detection rate and false positive rate.

Figure 7: Comparison of NSA and CSA for complete dataset.
5.6. Comparison with Other State-of-the-Art Techniques
A theoretical comparison of the proposed technique is also performed with other state-of-the-art techniques available in the literature for anomaly detection in WSNs. The comparison is based on the algorithm used, characteristics, and usability of these techniques, which is presented in Table 4.
Table 4: Comparison of techniques for anomaly detection in WSNs.
6. Conclusion
Artificial Immune System (AIS) is an active research area, and researchers have been using AIS for network intrusion detection as well as other optimization problems. In this paper, we presented an enhanced Negative Selection Algorithm (NSA) for anomaly detection in Wireless Sensor Networks (WSNs). We first implemented the simple NSA, tested it on a dataset having random anomalies, and calculated the results. Then, the enhanced NSA was implemented for a large dataset having normal and anomalous packets. Experiments were performed to check the accuracy, detection rate (DR), and false positive rate (FPR) of the proposed algorithm. According to the results of our experiments, the accuracy of the proposed NSA is 89.1% with a DR of 97.3% and an FPR of ±2.6%. For comparison, another immune-based technique known as the Clonal Selection Algorithm (CSA) was used. CSA was implemented for detector generation, and anomalies were identified using the detector set. Simulations of both algorithms were performed on different parts of the dataset, and the results for anomaly detection and false positive rates were calculated and compared. The experiments showed that the proposed NSA gave better results than CSA in most cases. Comparison on the complete dataset also showed that NSA had a lower FPR and a higher DR than CSA. NSA is, therefore, a good candidate for real-world problems such as anomaly detection in WSNs. The advantage of this technique is that it detects anomalies more effectively and is easy to implement. Future work includes accommodating other anomalies for detection in WSNs. The FPR could be further reduced by incorporating other variations of the AIS algorithm.
