Abstract
Introduction
Wireless sensor network (WSN) is a distributed network for sensing the environment. Because of the flexibility, WSNs have been widely used in several applications, such as smart homes, 1 medical care, 2 environmental monitoring, 3 disaster prediction, 4 and military field 5 . A WSN consists of a set of sensor nodes deployed in a certain area. Each sensor has limited battery capacity and data buffer of a fixed size, which leads to the issue of energy efficiency. Thus, several protocols are used to achieve the energy efficiency in WSN. Those protocols are mainly categorized into four classes: ratio module, data reduction, sleep/awake schemes, and battery charging. 6 The ratio module considers four states for each sensor node: idle, sleep, transmit, and receive. Most of the time, sensors are in idle state which causes depletion of the sensor node. The ratio module minimizes the energy consumption by modulation of optimization parameters. However, the ratio module has to manage four states which introduce additional energy consumption. The next category, data reduction, minimizes the quantity of transmitted data packets to achieve energy efficiency and reduce the rate of traffic and latency, too. Nevertheless, it inevitably losses the accuracy and precision of original data. The sleep/awake scheme employs the duty cycling mechanism to diminish the idle state and support the sleep mode. However, the duty cycling leads to sleepy latency and wastes energy for redundant wakeups. The battery charging harvests energy from the wind, solar, and other new energy. It takes nodes with the uneven residual energy into consideration for designing the protocol.
In this article, we focus on the energy efficiency based on the sleep/awake scheme. The sensor nodes periodically generate sensing data and transfer data to the base station through multiple-hop forwarding. Due to the nature of multiple-hop forwarding, the nodes near the base station will receive and transfer more data packets so that those nodes will consume more energy. In this case, the nodes closest to the base station are prone to deplete the energy first. When those nodes are dead, the data of other nodes cannot arrive at the base station, and the network cannot work until the dead nodes are charged. The problem of unbalanced energy consumption will significantly shorten the lifetime of the network. 7 Recently, to alleviate the problem of non-uniform energy consumption, several studies have considered a novel data collection strategy with a mobile sink. The mobile sink plays the role of a data transporter that moves around the WSN and collects the data packets from sensor nodes. There are two major categories of those studies: no-data-forwarding and hybrid approaches. 8 The no-data-forwarding methods apply the mobile sink to visit all sensor nodes to gather data packets, where no communication between sensor nodes is needed and the energy consumption of whole network will significantly reduce. But the touring path of mobile sink is too long, which leads to longer delay of network. The hybrid approaches use some data collection points (CPs) to communicate with some part of sensor nodes and gather their data packets. The mobile sink only visits the CPs to gather data packets. The key is to tradeoff the delay of data collection and energy consumption of communication between nodes. Most of those methods select CPs and plan a touring path for the mobile sink to visit all selected CPs separately.
However, the two problems of CPs selection and path planning interrelate with each other. The selected CPs determine a corresponding touring path while maximum touring length influences the process of selecting CPs. Independently solving the two problems will result in the low utilization of the given maximum touring length, and the full use of maximum length can effectively prolong the network lifetime. As shown in Figure 1, due to the separation of CPs selection and path planning and the local optimization of each stage, it is not uncommon to obtain a result with lower path utility.

The relation between CPs selection and path planning: (a) independently solving the CPs selection and path planning and (b) fully use of maximum length.
To further prolong the network lifetime, we investigate whether we can solve both CPs selection and path planning simultaneously. The following summarize the contributions of this article:
We proposed an end-to-end data collection strategy, which selects CPs and plans the touring path heuristically and simultaneously. We first construct a data-forwarding tree (DF-Tree) and then apply an ant colony optimal (ACO)-based algorithm to select CPs in order.
We introduced a path estimation method modified from the path construction of EAPC 8 to improve the ACO-based algorithm. Instead of the path found by the best ant, a re-estimated shortest path is used to update the pheromone of ACO.
The remaining parts of this article are structured as follows. We review the related works in section “Related works,” formulate the problem in section “Problem formulation,” detail the proposed end-to-end method in section “End-to-end data collection strategy,” show the performance of proposed method in section “Performance evaluation,” and conclude this article in section “Conclusion and future works.”
Related works
In the past few years, several energy-efficient routing protocols for WSN have been proposed. In Nabavi et al., 9 the multi-objective greedy method is designed to find the optimal routing path, where each node selects another node as next hop based on several factors for transmitting information. To balance the energy consumption among all sensor nodes and further prolong the network lifetime, most approaches10–14 employ a hierarchical routing protocol, where all nodes are first divided into several clusters and then a cluster head is elected to maintain other nodes in each cluster. Although those routing protocols help to reduce the energy consumption, the performance still remains inadequate. In those methods, the nodes closed to the base station will deplete energy more quickly and become the bottleneck of the whole network, which lowers the upper bound of the performance. To alleviate this issue, several mobile sink-based data collection protocols have been proposed, which adopt the mobile sink to gather data packets.
In those methods, a number of data CPs are selected and responsible for receiving data packets from other sensor nodes. The mobile sink moves periodically within the WSN to visit CPs and gather all data packets. According to whether or not the mobile sink passes through the center of the CPs, those methods can be classified into two mainly categories: stop and collect (SC) and collect while moving (CWM).
SC
The SC policy requires that the mobile sink should stop at the location of the sensor for communication. The mobile sink will passes through the center of sensors to collect data. Those methods can be further divided into two classes: no-data-forwarding and hybrid methods.
No-data-forwarding
In the class of no-data-forwarding, most methods have adopted one or more mobile sinks to visit all sensors, which can avoid the multiple-hop communication between sensor nodes. In Ma and collegues,15,16 the problem of data collection is formulated as the single-hop data gathering problem (SHDGP), which aims to find the shortest touring path for the mobile sink to visits all sensors. A heuristic tour-planning algorithm for a single mobile sink is proposed to solve the SHDGP. However, the speed of moving the mobile sink is much lower than that of data-forwarding through the network, which will result in a higher latency of data update. In most applications of the WSN, the delay is an important performance metric. A WSN with low latency should be designed. One possible solution is to apply multiple mobile sinks to collect data together. Some studies17–19 have proposed several data collection strategies with multiple mobile sinks. Although those studies reduce the latency of data collection, the high cost of mobile sink makes them impractical. To alleviate the problem of the high latency and the high cost caused by mobile sink, some hybrid methods are proposed.
Hybrid
Those hybrid methods20–26 combines multiple-hop forwarding with the use of mobile sink, where some nodes are selected as CPs. All sensor nodes send their data to the nearest CP through multiple-hop. And the mobile sink periodically visits all CPs to collect data packets through a determined touring path constrained by delay. Since the mobile sink has a fixed moving speed, the time delay constraint can be transformed into a maximum length constraint of the touring path of the mobile sink. Most hybrid methods transform the problem of data collection into the problem of CPs selection and path planning, whose objective is to minimize the energy consumption under a maximum touring length constraint. They first cluster all sensor nodes into several classes, and select a cluster head as the CP. Then, the path planning is solved as the travel salesman problem (TSP) using some heuristic algorithms. In Krishnan and Lim, 27 the cluster heads are chosen dynamically based on the remaining energy and then a model free Q-learning approach is introduced to find the shortest routing path. In those methods, selecting CPs and planning the tour are treated as two independent problems, which lead to the sub-optimal solution for the data collection strategy.
CWM
More recently, a new data collection protocol has been proposed, where the mobile can communicate with a sensor when it passes through the communication range of the sensor. The study by Wen et al. 8 selects CPs accounting for the path cost from current CP to next point and the forwarding load of each sensor, and then constructs a data collection path. The data can be collected by mobile sink if the path passes through the communication range of these sensors. The study by Chang et al. 28 dynamically adopts the proper transmission rate based on the data amount of each sensor to construct the shortest and energy-efficient touring path but guarantees the completeness of data collection. In Donta et al., 29 an extended ant colony optimization (ACO)-based method is first designed to select the best set of rendezvous points and the efficient touring path of mobile sink and then a virtual rendezvous points selection is performed to increase the data gathering speed. Although the protocol can reduce the constructed path length, the design of the protocol is more complicated. Sensors should have the ability to sense the location of the mobile sink and the duration that the mobile sink locates in the communication range should be enough to transfer all data packets.
In this article, we focus on the SC policy and propose a heuristic end-to-end data collection strategy based on ACO, where CPs selection and tour planning perform at the same time. The proposed ACO-based algorithm can be extended to combining other methods which solve the CP selection and path planning independently, to fully use the maximum path length.
Problem formulation
In this article, we focus on a WSN in which sensor nodes periodically generate data packets. All the data packets must be transmitted to the base station within a given delay. There are a set of CPs that responsible for collecting and storing the data packets generated by a certain group of sensor nodes. And a mobile sink can move around the WSN to collect data from the CPs. The objective is to select the set of CPs and plan a tour that passes through all CPs for mobile sink within the given delay. We first detail the problem description, outline some assumptions, and then formulate the problem definition.
Problem description
A WSN consists of a set of sensor nodes, a communication topology network, and a base station. Each node is responsible for sensing environmental status and generating data packets periodically. Those data packets must be transmitted to the base station within a delay
Transmitting and receiving one data packet will consume several energies. The energy consumption is proportional to the packet size. In this article, we assume the sensor nodes generate one data packet with fixed size in a round. We can formulate the energy consumption of sensor
where
where
Assumption
We make some assumptions as following:
The mobile sink moves at a fixed speed.
The mobile sink collects data packets from a CP only when it arrives at the position of the CP.
Comparing to the touring time of mobile sink, the communication time between the mobile sink and a CP is small enough to be neglected. Similarly, the time of transmission and reception between sensor nodes is negligible compared to the touring time.
Each CP has enough capacity to store all sensed data.
The sensor nodes have a fixed communication range and initial energy.
There are no isolated nodes, and each sensor node can communicate with other nodes with multiple-hop.
Each sensor node produces on data packet each round.
The energy of mobile sink is sufficient to visit all CPs and gather data packets in a round within the given delay.
Problem definition
The topology of WSN can be formulated as an undirected graph
where
End-to-end data collection strategy
This section details our proposed algorithm for solving the problem described in section “Problem formulation.” The proposed algorithm is based on the key observation that selecting CPs and planning a tour are not independent. The proposed data collection strategy constructs a DF-Tree first and then uses the ACO-based algorithm to jointly optimize the CPs selection and path planning. The following details the proposed method and describes how the method can be extended.
Initial phase
Given a graph
Motivated by Wen et al., 8 we construct a minimum spanning tree as the DF-Tree which has the minimum total forwarding cost. The Prim algorithm is applied to construct the DF-Tree.
In addition, a complete graph
End-to-end joint optimization framework
Based on the constructed DF-Tree, we aim to select CPs and plan a path. The essence of path planning is to determine the order of the CPs. In fact, the process of selecting CPs also can be viewed as a kind of access order of CPs. Based on this insight, we propose an end-to-end framework to jointly optimize the two problems using improved ACO methods.
ACO for DC
ACO is a heuristic algorithm, which has been proved to be effective for solving combinatorial optimization problem. The basic idea of ACO is motivated by practical ant colony system. When an ant walks on the road, it will release some pheromone. The ants will prefer to move toward the direction which contains more pheromone. While the path with the better solution always contains more pheromone. In the end, most ants will likely choose the best path which has the most pheromone. The key of ACO is how to set up and update the pheromone. The goal of the problem is to minimize
We iteratively update the pheromone using the best result of the elitist ant at each iteration. In each iteration, several ants search the feasible solutions based on a probability function independently. And then, the best solution with minimum
where
where
where
Let
where
The proposed ACO-based algorithm details in Algorithm 1.
Path estimation
Because of the randomness of ACO-based algorithm, it is easy to get a path in which there are several intersecting segments. With the same CPs, a touring path with intersecting segments means a longer touring length. The problem of intersecting segments also leads to the algorithm get a sub-optimal solution. To alleviate the problem, we utilize an efficient path estimation algorithm to compute the touring length
The algorithm consists of three steps: the construction of the convex polygon, the connection of internal points, and the estimation of touring length.
Construction of the convex polygon
This step aims to construct a bounding box of all CPs where the box is a convex polygon and has the minimum area. First, the point at the bottom-left is selected as the start point
Connection of the internal points
After the convex polygon is constructed, there may remain some unconnected points. Obviously, those points are all located at the inside of the convex polygon. We iteratively joint the remaining points into the polygon. Let
The
where
Estimation of touring length
When
Complexity analysis
The proposed data collection strategy consists of an initial phase and an ACO-based CPs selection and tour planning. In the initial phase, the Prim algorithm is employed to construct the DF-Tree, whose time complexity is
Combining with other methods
Due to the flexibility of the parameters setting of ACO, the proposed end-to-end strategy can be easily extended to other methods. Given a data collection algorithm
Performance evaluation
To validate the effectiveness of the proposed method, we have conducted extensive experiments. The Python simulator is used to simulate the network environment. We consider a network area that measures
The simulation settings.
The deployment of sensor nodes is dependent on its certain application. Different deployment characteristics will result in different performance evaluations. We conduct extensive experiment on two scenarios:
Unbalanced deployment (UD) scenario: all sensor nodes are uniformly deployed in the network area.
Balanced deployment (BD) scenario: the network area is divided into a
To benchmark our method, we compare it to the version of “stop and collect” of Energy-Aware Path Construction (EAPC-SC) algorithm
8
in terms of network lifetime, total energy consumption, FI, efficient index (EI), and running time. The EAPC-SC algorithm has more efficient performance than existing data collection mechanisms, which is applied to solve the same problem as the considered in this article. The EAPC-SC first constructs a minimal spanning tree (MST), selects a set of CPs, and then constructs a data collection path. The selection of CPs is based on a heuristic function, which is the same as the
In the experiments, the distributions of sensors are randomly initialized and the proposed algorithm is based on the ACO algorithm which is stochastic. Therefore, we conducted 10 repeated experiments to alleviate the effect from randomness.
Network lifetime
The network lifetime describes the maximum working time of the whole WSN, which is an important metric of a well-designed WSN. And it uses round as a unit of measure. The higher value denotes better performance. In each round, a mobile sink visits all CPs based on the arranged routes and collects all data packets from CPs. The lifetime is defined as the time until one node runs out of energy. And very simplistically, we assume the sensors consuming energy only for transmitting and receiving data packets. The lifetime of network depends on the number of total sensor nodes and the touring path of mobile sink. We compare the lifetime under various numbers of sensor nodes for two scenarios, where the number of sensors ranges from 10 to 300. The results are shown in Tables 2 and 3, where the ACO denotes our proposed strategy. We conduct experiments in both scenarios under different time delay constraints. It is obvious that our proposed end-to-end strategy can further prolong the network lifetime. To better compare with other methods, we also visualize the results in Figure 2.
Network’s lifetime (rounds) under BD scenarios.
BD: balanced deployment; ACO: ant colony optimal; EAPC-SC: Energy-Aware Path Construction; DDR: dynamic directional routing.
Network’s lifetime (rounds) under UD scenarios.
UD: unbalanced deployment; ACO: ant colony optimal; EAPC-SC: Energy-Aware Path Construction; DDR: dynamic directional routing.

Network lifetime (a) BD-
Total energy consumption
Total energy consumption is defined as the sum of the energy consumption of all sensors in the WSN per round. Figure 3 compares the total energy consumption of WSN for different algorithms under two scenarios. The result shows that our proposed method results in comparable energy consumption per round with other protocols.

Total energy consumption (a) BD-
FI
Since the mobile sink-based data collection strategy is introduced to alleviate the problem of non-uniform energy consumption, the degree of uniformity should be quantified. A higher uniformity of energy consumption will lead to lower total energy consumption and higher network lifetime. The FI reflects the degree of the balance between nodes, which is defined as following
where

Fairness index (a) BD-
EI
The given delay constraint is correspondent to a maximum touring length for mobile sink. The full use of the maximum length can effectively improve the network lifetime and FI. The EI describes the utilization of the given maximum length, which is defined as following
where

Efficient index (a) BD-
Running time
The running time of different methods with various numbers of nodes is compared in Figure 6. The time complexity of EAPC and DDR methods is of the same magnitude, and the running time of the two methods drew in the figure is nearly overlapped. To better distinguish, we change the scale of the vertical axis to logarithmic. Our proposed method is more time-consuming and has minor responsiveness, which makes it hard to be applied in online situations. However, there are many applications running algorithms to generate data collection strategies offline, where the structure of WSN is almost invariant and the proposed method is performed only when the structure is changed. Thus, the running time is acceptable in the offline cases.

Running time (a) BD-
Conclusion and future works
Data collection is an important topic in WSNs because it is energy-consumed and affects the network lifetime. The use of mobile sink can significantly reduce the energy consumption and prolong the network lifetime. In this article, we proposed an end-to-end data collection strategy, which selects appropriate sensor nodes as CPs and plans a path of visiting all CPs for mobile sink simultaneously. The proposed strategy is based on ACO and can be extended to combining with other data collection approaches. The experimental results reveal that the proposed end-to-end strategy can achieve a better performance in terms of network lifetime, total energy consumption, and FI, especially in the UD scenario.
In the future, we will extend our end-to-end strategy to combining with other data collection methods to demonstrate the generalization capacity of our method. Currently, we do not pay much attention to the runtime performance issues, it is promising for further performance improvements on other methods.
