Sage Journals: Discover world-class research

Abstract

The nonexistence of end-to-end path between the sender and the receiver poses great challenges to the successful message transmission in delay tolerant networks. Probabilistic routing provides an efficient scheme to route messages, but most existing probabilistic routing protocols do not consider whether a message has enough time-to-live to reach its destination. In this paper, we propose an improved probabilistic routing algorithm that fully takes into account message's time-to-live when predicting the delivery probability. Based on statistical analysis, we compute and update the expected intermeeting times between nodes. And then the probability for a message to be delivered within its time-to-live is computed based on the assumed exponential distribution. We further propose an optimal message schedule policy, by modeling the buffer management problem as 0-1 knapsack, of which the maximum delivery probability sum can be achieved by resorting to the back track technique. Extensive simulations are conducted and the results show that the proposed algorithm can greatly enhance routing performance in terms of message delivery probability, overhead ratio, and average hop count.

1. Introduction

As a new emerging store and forward networking architecture, delay tolerant networks (DTNs) have been widely studied and applied. In recent years, DTNs have achieved great successes in some challenging networks deployed in extreme environment, such as interplanetary Internet, habitat monitoring networks, underwater sensor networks [1, 2], vehicular ad hoc networks [3], pocket switched networks [4, 5], and mobile social networks [6, 7]. However, different from the traditional Internet, DTNs are characterized by frequent topology partitions [8], sparse node density, limited network resources (e.g., storage, bandwidth, etc.), extremely high end-to-end latency, asymmetric data rate, high bit error rate, heterogeneous interconnection, and so forth. So in DTNs, there may never be a complete end-to-end path between the sender and the receiver. Consequently, the successful message transmission in DTNs faces great challenges.

In order to cope with the intermittent connectivity problem, DTN architecture [9, 10] introduces a bundle layer between the application layer and the transport layer to implement store-carry-and-forward routing strategy. Furthermore, with the help of the bundle layer, DTN architecture is able to shield heterogeneous networks and communicate across the multiple regions that have different types of network architectures and protocols. So based on the bundle layer, DTNs can relay messages hop by hop until encountering the destination nodes. But due to the extremely limited network resource and bandwidth, current node should make a clever next hop routing selection to control the number of message copies.

So far, a large number of probabilistic routing strategies have been proposed to optimize the next hop routing selection in the absence of global topology knowledge. Most of them (e.g., Prophet [11]) make attempts to predict the encounter probabilities between nodes and then make routing decisions based on the computed probability values. There is no denying that these routing strategies can improve message delivery ratio in opportunistic routing. We also should notice that the time-to-live (TTL) of message is gradually depleted as time progresses. Most traditional probabilistic routing protocols always try to replicate message to the node with a higher delivery probability, without taking into consideration message's TTL. However, although a selected intermediate node has a higher probability to encounter destination node, the message delivery will still fail if the message's TTL is exhausted before meeting the destination node. In this case, considering the limited buffer resource, the message copy should not be delivered to the intermediate node. From the above case, we can see that a higher encounter probability can only indicate that the two nodes are more likely to encounter each other. But the two nodes may still need a period of time to encounter each other. If message's TTL is exhausted during this period of time, the message delivery will still fail. From this point of view, the node is not a good choice although it has a higher probability to encounter the destination. Consequently, an efficient probabilistic routing algorithm that fully takes into account message's TTL is expected to be employed in DTNs.

In this paper, from a new perspective, we propose a statistical analysis based probabilistic routing (SAPR) algorithm, which predicts the probability that a message can be successfully delivered to its destination based on its remaining time-to-live. Firstly, for each pair of nodes, we use statistical analysis methods to compute the mathematical expectation of the intermeeting times (IMTs) between them. Secondly, in the case of fully taking into account message's remaining time-to-live, we predict the possibility that a message can be successfully delivered. Finally, we make probabilistic routing selections according to the computed probabilities. In other words, we replicate message to the intermediate node that can make message get a higher delivery probability. In addition, we also introduce buffer management policy in the proposed routing algorithm. The message management is modeled as a 0-1 knapsack problem when node's buffer overflows. By solving the knapsack problem, we can make sure that each node always keeps the messages that can maximize the delivery probability sum.

The rest of this paper is organized as follows. In Section 2, we make the routing assumptions and explain the mathematical notations used in this paper. Section 3 gives the detailed descriptions of the proposed algorithm. The performance evaluations and comparisons are presented in Section 4. Section 5 discusses some related works. Finally in Section 6, we summarize this paper.

2. Assumptions and Preliminary

In order to analyze and implement the proposed probabilistic routing algorithm, we make the following assumptions. (i)

The intermeeting time (IMT) between nodes is exponentially distributed or has at least an exponential tail.

(ii)

Nodes move independently and their mobility is heterogeneous. In other words, different node pairs have different exponential distribution parameter λ.

(iii)

The network resource (e.g., storage, bandwidth, energy, etc.) is limited.

Regarding the first assumption, it has been shown that many simple synthetic mobility models (e.g., Random Walk, Random Waypoint, and Random Direction [12, 13]) have such a property. Furthermore, it is a known result in the theory of random walks on graphs that hitting times on subsets of vertices usually have an exponential tail [14]. And [15] has derived the fact that the expected intermeeting time in Random Walk model also follows an exponential distribution. Besides, the Exponential Correlated Random Mobility model can also be used to support the first assumption. So the assumption that most nodes exhibit random mobility is reasonable in the opportunistic networks. However, in addition to exponential distribution, some recent researches have suggested that intermeeting time also follows power law distribution in some human mobility traces. But recently, by using a diverse set of measured human mobility traces, Karagiannis et al. [16] have argued that the intermeeting time still exhibits an exponential tail. They find as an invariant property that there is a characteristic time, order of half a day, beyond which intermeeting time follows an exponential distribution. Within the characteristic time, intermeeting time follows a power law distribution. This is to say, in many human traces, although intermeeting time follows a power law distribution within a period of time, it still exhibits an exponential tail. Taking the Content datasets of Cambridge haggle for example, Cambridge spends about two months (far greater than half a day) to collect the trace data. In this case, according to the above conclusion of Thomas, the time period in which intermeeting time follows exponential distribution is much longer than the time period following power law distribution. Thus the entire cumulative distribution can be approximately seen as a certain exponential distribution. In addition, in the MIT trace using Bluetooth devices, up to 60% of intermeeting times observed are above one day (greater than half a day), and these large intermeeting times can also be found in the traces collected by UCSD and Dartmouth. So in some sense, the distributions of intermeeting times in these traces can also be seen as exponential. In this paper, our routing is not specifically designed for human mobility model, but in some extent of generality. So we tend to assume the exponential distribution taking into account the above factors. And the results in our simulations also show the reasonableness of the assumption. Regarding the second assumption, it is clear that nodes follow different moving trajectories and different node pairs usually have different encounter rates in the real world. Some nodes may encounter each other frequently, but other nodes may never meet each other.

The mathematical notations used in this paper are listed and explained in Notations section.

3. Statistical Analysis Based Probabilistic Routing (SAPR)

Before presenting our SAPR algorithm, we first introduce some analysis works and routing models based on the above assumptions.

3.1. Estimating Exponential Distribution Parameter

Here we assume that intermeeting time X follows the exponential distribution with parameter λ, that is, $\begin{matrix} X ~ Exponential (λ) . \end{matrix}$ (1) With the second assumption, we know that different node pairs have different parameter λ. For the sake of simplicity, we temporarily use λ to uniformly represent these exponential distribution parameters in this section. Then we have the probability density function $f^{\exp} (x)$ : $\begin{matrix} f^{\exp} (x) = {\begin{cases} λ e^{- λ x} & x > 0 \\ 0 & x \leq 0 . \end{cases} \end{matrix}$ (2) In order to find the functional relationship between $E (X)$ and λ, we compute the mathematical expectation of the intermeeting times based on the exponential distribution: $\begin{matrix} E (X) = \int_{- \infty}^{\infty} x f^{\exp} (x) d x = \frac{1}{λ} . \end{matrix}$ (3)Now, we can get (4) to estimate the parameter λ for each node pair: $\begin{matrix} λ_{a, b} = \frac{1}{E (X_{a, b})} . \end{matrix}$ (4)

3.2. Computing IMT's Mathematical Expectation

In order to compute the value of $λ_{a, b}$ , we still need to get the mathematical expectation of $X_{a, b}$ . In this section, for each node pair a and b, we use the statistical analysis methods to compute $E (X_{a, b})$ . For this purpose, each node needs to maintain two matrixes, $M^{imt}$ and $M^{count}$ , where $x_{k} (N_{j})$ denotes the kth sample of the intermeeting times between current node and node $N_{j}$ and correspondingly $c_{k} (N_{j})$ denotes the count that the kth sample appears. As shown in (5), for each of the other $N - 1$ nodes, $M^{imt}$ records the current n samples by random sampling. Then based on these samples, we can use statistical analysis methods to compute $E (X_{a, b})$ for nodes a and b: $\begin{matrix} M^{imt} = \begin{matrix} \begin{matrix} \begin{matrix} N_{1} \\ ⋮ \\ \begin{matrix} N_{j} \\ ⋮ \\ N_{| N - 1 |} \end{matrix} \end{matrix} \end{matrix} & \begin{matrix} \begin{matrix} 1 & \dots & \begin{matrix} k & \dots & n \end{matrix} \end{matrix} \\ [\begin{bmatrix} x_{1} (N_{1}) & \dots & x_{k} (N_{1}) & \dots & x_{n} (N_{1}) \\ ⋮ & ⋱ & ⋮ & ⋱ & ⋮ \\ x_{1} (N_{j}) & \dots & x_{k} (N_{j}) & \dots & x_{n} (N_{j}) \\ ⋮ & ⋱ & ⋮ & ⋱ & ⋮ \\ x_{1} (N_{| N - 1 |}) & \dots & x_{k} (N_{| N - 1 |}) & \dots & x_{n} (N_{| N - 1 |}) \end{bmatrix}] \end{matrix} \end{matrix} \end{matrix}$ (5) $\begin{matrix} M^{count} = [\begin{bmatrix} c_{1} (N_{1}) & \dots & c_{k} (N_{1}) & \dots & c_{n} (N_{1}) \\ ⋮ & ⋱ & ⋮ & ⋱ & ⋮ \\ c_{1} (N_{j}) & \dots & c_{k} (N_{j}) & \dots & c_{n} (N_{j}) \\ ⋮ & ⋱ & ⋮ & ⋱ & ⋮ \\ c_{1} (N_{| N - 1 |}) & \dots & c_{k} (N_{| N - 1 |}) & \dots & c_{n} (N_{| N - 1 |}) \end{bmatrix}] . \end{matrix}$ (6)Assuming current node is node a, then node a can use (7) to compute the final value of $E (X_{a, b})$ : $\begin{array}{l} P (x_{k} (b)) = \frac{C_{k} (b)}{\sum_{i = 1}^{n} C_{i} (b)}, \\ E (X_{a, b}) = \sum_{k = 1}^{n} x_{k} (b) \times P (x_{k} (b)) \\ = \frac{\sum_{k = 1}^{n} x_{k} (b) C_{k} (b)}{\sum_{i = 1}^{n} C_{i} (b)} . \end{array}$ (7)

Note that we use random sampling in this paper. For a node pair, it is easy to compute the interval between two encounters. By repeating this operation in a random way, we can finally get the sample data. The value of n (i.e., the size of sample data) is not an invariant variable, which can be flexibly set to an appropriate value according to the specific scenario. For the scenario with limited resources and computing capacity, we can appropriately reduce the value of n. For the scenario with sufficient resources and better computing capacity, we can increase the value so as to make the algorithm more accurate. In this paper, we set the value of n to the number of nodes in the networks. When a node encounters a new node that recently joined the networks, it needs to add the node to the above matrixes. Due to the lack of enough sample data about the new node, we first use continuous sampling to quickly get n sample data. After that, we continue to collect data by using random sampling. Note that we have to use the sample data we have collected to calculate the mathematical expectation before getting enough sample data. In order to accurately estimate the current expectation of intermeeting times, we always use the latest sample data in this paper. That is to say, we are constantly replacing the oldest sample data with the latest one after having got n sample data.

In order to more accurately compute the $E (X)$ for a node pair, we also build an update process to update $E (X)$ when the two nodes have not encountered each other for a long period of time.

Theorem 1.

For the intermeeting times between any two nodes, assuming their expectation is $E (X)$ and their variance is $D (X)$ , then one can predict the probability that $| X - E (X) | < ∆ X$ when setting $∆ X$ to a meaningful value; that is, $\begin{matrix} P {| X - E (X) | < ∆ X} \geq 1 - \frac{D (X)}{{∆ X}^{2}} . \end{matrix}$ (8)

Proof.

Firstly, we assume the probability density function of X is $f (x)$ . Then we can get $\begin{array}{l} P {| X - E (X) | \geq ∆ X} = \int_{| x - E (X) | \geq ∆ X}^{} f (x) d x \\ \leq \int_{| x - E (X) | \geq ∆ X}^{} \frac{{| x - E (X) |}^{2}}{{∆ X}^{2}} f (x) d x \\ \leq \frac{1}{{∆ X}^{2}} \int_{- \infty}^{+ \infty} {(x - E (X))}^{2} f (x) d x . \end{array}$ (9)For the variable X, we have $\begin{matrix} D (X) = \int_{- \infty}^{+ \infty} {(x - E (X))}^{2} f (x) d x . \end{matrix}$ (10) Then with (9), we get $\begin{matrix} P {| X - E (X) | \geq ∆ X} \leq \frac{D (X)}{{∆ X}^{2}} . \end{matrix}$ (11) Finally, we can get $\begin{array}{l} P {| X - E (X) | < ∆ X} = 1 - P {| X - E (X) | \geq ∆ X} \\ \geq 1 - \frac{D (X)}{{∆ X}^{2}} . \end{array}$ (12)

To some extent, Theorem 1 shows the central tendency of the intermeeting times. That is to say, we can get the interval that most intermeeting times are clustered together in.

Corollary 2.

For the intermeeting times between any two nodes, at least ρ of them are clustered together in the interval $(E (X) - \sqrt{D (X) / (1 - ρ)}, E (X) + \sqrt{D (X) / (1 - ρ)})$ ; that is, $\begin{matrix} P {E (X) - \sqrt{\frac{D (X)}{1 - ρ}} < X < E (X) + \sqrt{\frac{D (X)}{1 - ρ}}} \geq ρ . \end{matrix}$ (13)

Proof.

If we set $\begin{matrix} ρ = 1 - \frac{D (X)}{{∆ X}^{2}}, \end{matrix}$ (14) then we can get the meaningful value of $∆ X$ : $\begin{matrix} ∆ X = \sqrt{\frac{D (X)}{1 - ρ}} . \end{matrix}$ (15) With (8), we can finally get (13). Then setting ρ to an appropriate value, we can get the interval that most intermeeting times appear in.

If two nodes have not encountered each other for a long period of time, they should update $E (X)$ so as to more accurately estimate the parameter λ. Now the issue is how and when to update $E (X)$ . According to Corollary 2, if we set ρ to a value close to 1.0, we can find the interval that most intermeeting times are distributed in. For simplicity, let x denote the time that has elapsed since the last encounter. If x is in the interval $(E (X) - \sqrt{D (X) / (1 - ρ)}, E (X) + \sqrt{D (X) / (1 - ρ)})$ , then we consider it a common case and it is not necessary to update $E (X)$ in this case. And if x is in the interval $(0, E (X) - \sqrt{D (X) / (1 - ρ)}]$ , we still do not update $E (X)$ . This is because this case will not increase the value of $E (X)$ . But if x is in the range $[E (X) + \sqrt{D (X) / (1 - ρ)}, + \infty)$ , we consider it an abnormal case and it will increase the value of $E (X)$ . Therefore in this case, we need to update $E (X)$ . For this purpose, we define the update cycle as $\sqrt{D (X) / (1 - ρ)}$ ; that is, $E (X)$ is updated once every $\sqrt{D (X) / (1 - ρ)}$ time units when $x > E (X)$ . With (11), we have $\begin{array}{l} P {| X - E (X) | \geq m \sqrt{\frac{D (X)}{1 - ρ}}} \leq \frac{D (X)}{{(m \sqrt{D (X) / (1 - ρ)})}^{2}} \\ \leq \frac{1 - ρ}{m^{2}} . \end{array}$ (16) Then, from a statistical point of view, we further assume $\begin{array}{l} P {X - E (X) \geq m \sqrt{\frac{D (X)}{1 - ρ}}} \\ = P {X - E (X) \leq - m \sqrt{\frac{D (X)}{1 - ρ}}} . \end{array}$ (17) Now, we can have $\begin{matrix} P {X - E (X) \geq m \sqrt{\frac{D (X)}{1 - ρ}}} \leq \frac{1}{2} \times \frac{1 - ρ}{m^{2}} . \end{matrix}$ (18) Here, we use its upper bound; that is, $\begin{matrix} P^{upper} {X - E (X) \geq m \sqrt{\frac{D (X)}{1 - ρ}}} = \frac{1 - ρ}{2 m^{2}} . \end{matrix}$ (19) Finally we use (20) to update $E (X)$ in the mth update cycle when $x > E (X)$ : $\begin{array}{l} m = round down (\frac{x - E (X)}{\sqrt{D (X) / (1 - ρ)}}) \\ = ⌊ \frac{x - E (X)}{\sqrt{D (X) / (1 - ρ)}} ⌋, \\ E (X) = E (X) + m \sqrt{\frac{D (X)}{1 - ρ}} \\ \times P^{upper} {X - E (X) \geq m \sqrt{\frac{D (X)}{1 - ρ}}} \\ = E (X) + m \sqrt{\frac{D (X)}{1 - ρ}} \times \frac{1 - ρ}{2 m^{2}} \\ = E (X) + \frac{\sqrt{D (X) (1 - ρ)}}{2 m} . \end{array}$ (20) Now, we only need to get the variance $D (X)$ for updating the $E (X)$ . With the above matrixes $M^{imt}$ and $M^{count}$ , we can easily compute the $D (X)$ for current node a and node b by using $\begin{array}{l} D (X_{a, b}) = \sum_{k = 1}^{n} {(x_{k} (b) - E (X))}^{2} P (x_{k} (b)) \\ = \frac{\sum_{k = 1}^{n} {(x_{k} (b) - E (X))}^{2} c_{k} (b)}{\sum_{i = 1}^{n} C_{i} (b)} . \end{array}$ (21)

3.3. Predicting Message's Delivery Probability

After computing and updating $E (X)$ , we can estimate the parameter λ for each node pair. Then we can predict message's delivery probability according to the message's remaining TTL.

Theorem 3.

Assuming the remaining time-to-live of $m_{i}$ is $R_{i}$ , the mathematical expectation of the intermeeting times between current node and the destination is $E (X)$ , the time that has elapsed since the last encounter with destination node is τ, and then the delivery probability of $m_{i}$ on current node is $P_{m_{i}}$ : $\begin{matrix} P_{m_{i}} = 1 - e^{- ({τ + R}_{i}) / E (X)} . \end{matrix}$ (22)

Proof.

The probability that a message can be successfully delivered by a node is equal to the probability that the next intermeeting time between the node and the destination is not greater than the sum of message's remaining TTL and the time that has elapsed; that is, $\begin{array}{l} P_{m_{i}} = P {X \leq τ + R_{i}} \\ = \int_{- \infty}^{τ + R_{i}} f^{\exp} (x) d x \\ = \int_{- \infty}^{0} 0 d x + \int_{0}^{τ + R_{i}} λ e^{- λ x} d x \\ = - {e^{- λ x} |}_{0}^{τ + R_{i}} \\ = - e^{- λ (τ + R_{i})} + 1 \\ = 1 - e^{- (τ + R_{i}) / E (X)} . \end{array}$ (23)

3.4. Next Hop Selection Strategy

Now we focus on the next hop selection strategy. When communication opportunity arises, the message should be delivered to the relay node with a higher delivery probability. The detailed process is shown in Algorithm 1.

Algorithm 1: Next hop routing selection on node $N_{a}$ .

Triggering condition:

when node $N_{a}$ encounters node $N_{b}$

$N_{a}$ Executes:

(1) For each $m_{i}$ in $N_{a}$ do

(2) $R_{i} \leftarrow$ the remaining TTL of $m_{i}$

(3) $dest \leftarrow m_{i} . d e s t i n a t i o n$

(4) update $E (X_{N_{a}, dest})$

(5) $P_{N_{a}, dest}^{m_{i}} \leftarrow$ compute delivery probability using (22)

(6) update $E (X_{N_{b}, dest})$

(7) $P_{N_{b}, dest}^{m_{i}} \leftarrow$ compute delivery probability using (22)

(8) If $P_{N_{b}, dest}^{m_{i}} > P_{N_{a}, dest}^{m_{i}}$ then

(9) send $m_{i}$ to $N_{b}$

(10) End if

(11) End for

3.5. Buffer Management Policy

In DTNs, the buffer resources of nodes are usually limited. So when node's buffer overflows, the resource allocation problem arises. In this case, current node needs to determine whether to receive the incoming message and which message to drop. To this end, we first need to define the optimal objective and then make the optimal decisions based on the objective. In this paper, our objective is to use the limited buffer to maximize the sum of the delivery probabilities of messages that can be stored by current node. We formalize the optimal buffer management as a 0-1 knapsack problem and further solve it by using the back track technique.

Theorem 4.

Optimal buffer management is a 0-1 knapsack problem.

Proof.

If we view the buffer size of a node, the messages' delivery probabilities, and the sizes of messages as the maximum weight of a knapsack, the values of goods, and the weights of goods, then the objective of selecting and storing the messages that can maximize the sum of delivery probabilities can be viewed as filling the knapsack with the goods that can get the maximum value. Consequently, buffer management can be modeled as a 0-1 knapsack problem, and then we can further use the technique of solving the knapsack problem to solve the optimal buffer management problem.

Definition 5.

The formalization of the optimal buffer management is as follows, where $B_{i}$ is used to mark whether to store the message $m_{i}$ : $\begin{array}{l} Max \sum_{i = 1}^{n} P_{m_{i}} B_{i} \\ s . t . \sum_{i = 1}^{n} size (m_{i}) B_{i} \leq buffer size \\ B_{i} = {0,1}, 1 \leq i \leq n . \end{array}$ (24)

The above formalization can make sure that each node always keeps the messages that can maximize the delivery probability sum. Now the issue is how to solve the optimal problem.

The common way to solve knapsack problem is dynamic programming algorithm. But considering the huge cost of dynamic programming in this problem, we use the back track technique to solve the knapsack problem in this paper. The detailed process is shown in Algorithm 2, which also needs to call Algorithms 3 and 4. Algorithm 4 is to compute the upper bound of the optimal value of right subtree in the search process, which is called by Algorithm 3 to determine whether to continue searching right subtree. That is to say, we cut off the subtree if its upper bound is less than the current best value. Algorithm 3 is the back track algorithm, which searches the entire solution space tree and records the current optimal solution. In Algorithm 2, line 1 first sorts messages in a descending order according to the unit value computed by (25). Lines 2–9 are to initialize the global variables that will be sued in Algorithms 3-4. Note that these global variables can also be modified by Algorithms 3-4. Finally, lines 11–13 delete or reject those messages that are not included in the optimal solution computed by Algorithm 3. In order to determine which message to drop and which message to receive, the $m s g L i s t$ in Algorithm 2 should contain both the stored messages and the incoming messages: $\begin{matrix} P_{m_{i}}^{unit} = \frac{P_{m_{i}}}{size (m_{i})} . \end{matrix}$ (25)

Algorithm 2: Buffer management strategy.

Input:

$m s g L i s t = [m_{1}, m_{2}, \dots, m_{n}]$

(1) sort $m s g L i s t$ in a descending order according to (25)

(2) for each $m_{i}$ in $m s g L i st$ do

(3) $s [i] \leftarrow m_{i} . s i z e$

(4) $p [i] \leftarrow$ the delivery probability of $m_{i}$

(5) end for

(6) $c u r r e n t S \leftarrow 0$ , $c u r r e n t P \leftarrow 0$ , $b e s t P \leftarrow 0$

(7) $BS \leftarrow$ the buffer size of current node

(8) $c u r r e n t P a t h [1,2, \dots, n] \leftarrow [1,1, \dots, 1]$

(9) $b e s t P a t h [1,2 \dots n] \leftarrow [1,1, \dots, 1]$

(10) call Back Track ( $1$ )

(11) for each element in $b e s t P a t h$ do

(12) if $b e s t P a t h [i] = 0$ then

(13) delete or reject $m_{i}$

(14) end if

(15) end for

Algorithm 3: Back track process.

Input:

starting index: $i n d e x$

(1) if $i n d e x > n$ do

(2) $b e s t P \leftarrow c u r r e n t P$

(3) $b e s t P a t h \leftarrow c u r r e n t P a t h$

(4) return

(5) end if

(6) if $c u r r e n t S + s [i n d e x] \leq BS$ do

(7) $c u r r e n t S \leftarrow c u r r e n t S + s [i n d e x]$

(8) $c u r r e n t P \leftarrow c u r r e n t P + p [i n d e x]$

(9) $c u r r e n t P a t h [i n d e x] \leftarrow 1$

(10) call Back Track ( $i n d e x + 1$ )

(11) $c u r r e n t S \leftarrow c u r r e n t S - s [i n d e x]$

(12) $c u r r e n t P \leftarrow c u r r e n t P - p [index]$

(13) $c u r r e n t P a t h [i n d e x] \leftarrow 0$

(14) else

(15) $c u r r e n t P a t h [i n d e x] \leftarrow 0$

(16) end if

(17) if $B o u n d (i n d e x + 1) > b e s t P$ then

(18) call Back Track ( $i n d e x + 1$ )

(19) end if

Algorithm 4: Bound process.

Input:

starting index: $i n d e x$

Output:

the upper bound of $i n d e x$ : $b o u n d$

(1) $l e f t S \leftarrow BS - c u r r e n t S$

(2) $b o u n d \leftarrow c u r r e n t P$

(3) while $i n d e x \leq n$ and $s [i n d e x] \leq l e f t S$ do

(4) $l e f t S \leftarrow l e f t S - s [i n d e x]$

(5) $b o u n d \leftarrow b o u n d + p [i n d e x]$

(6) $i n d e x \leftarrow i n d e x + 1$

(7) end while

(8) if $i n d e x \leq n$ do

(9) $b o u n d \leftarrow b o u n d + p [i n d e x] \times c l e f t S / s [i n d e x]$

(10) end if

(11) return bound

4. Simulation

In this section, we use the ONE [17] simulator to conduct extensive simulations for evaluating the performance of SAPR under various settings. The simulation settings, evaluation metrics, and results are described as follows.

4.1. Simulation Settings

To well evaluate the routing performance of SAPR based on our assumptions, we first conduct simulations based on the synthetic traces generated by Random Walk model. This is because Random Walk is a typical movement model, in which the intermeeting times between nodes follow exponential distributions. So it is very helpful to evaluate our proposed routing algorithm. The detailed simulation settings are shown in Table 1. We introduce Epidemic, Prophet, and Source Spray and Wait into the simulations and comparisons. The reason is that Epidemic is typical multicopy routing based on flooding, which can be used to verify the performance improvements of SAPR. Prophet is typical probabilistic routing based on the encounter probabilities between nodes, which differs from SAPR and can be used to evaluate SAPR from the perspective of probabilistic routing. Source Spray and Wait is typical opportunistic routing that strictly limits the number of message copies, which can be used to evaluate the overhead ratio of SAPR. In addition, we also introduce the drop-front buffer management policy when implementing the above three algorithms in order to evaluate our proposed buffer management policy.

Table 1

Simulation settings in Random Walk.

Parameter	Default value (range)
Area size	$1000 m \times 1000 m$
Number of nodes	126
Message size	500 K–1 M
Transmission speed	250 KBps
Transmission radius	100 m
Moving speed	0.5–1.5 m/s
Message interval	40 s (10–50 s)
Node buffer size	50 M (5–50 M)
Time-to-live (TTL)	3 h (2–5 h)
Simulation time	24 h
ρ	0.95

Taking into consideration the shortcomings of Random Walk, we also conduct simulations based on the synthetic traces generated by Helsinki City model. Helsinki City model is a more realistic mobility scenario, which is based on real map data and adds realism, so it is very helpful to evaluate routing protocols. Moreover, we can modify the model parameters as needed, so that it can reproduce various empirical mobility properties, which is beneficial to the routing performance evaluations. This is also why we use the Helsinki City model instead of the real traces.

We use 126 nodes in the Helsinki City whose area is $4500 \times 3400 m^{2}$ . These nodes are divided into 6 groups. Group 1 and Group 3 are pedestrian groups (each group contains 40 nodes); Group 2 consists of 40 car nodes. Group 4, Group 5, and Group 6 are tram groups and they, respectively, consist of two nodes. Pedestrians move with speeds of 0.5–1.5 m/s, cars move with speeds of 2.7–13.9 m/s, and trams move with speeds of 7–10 m/s. Two types of devices are introduced in the simulations. One is Bluetooth device with transmission speed of 250 KBps and transmission range of 20 m. The other is High Speed device with transmission speed of 10 MBps and transmission range of 1000 m. Group 1, Group 2, and Group 3, respectively, have 4–20 MB buffers and they are based on the Shortest Path Map Based movement model. Group 4, Group 5, and Group 6 have 50 MB buffers, respectively, and they are based on the Route Map Based movement model. Group 1, Group 2, Group 3, Group 5, and Group 6 use Bluetooth devices; Group 4 uses both Bluetooth devices and High Speed devices. The other simulation settings are shown in Table 2. In the simulations, we also introduce Epidemic and Prophet for the same reason. Besides taking into consideration the mobility of nodes in this scenario, we add First Contact to the simulations to evaluate the overhead ratio of SAPR. These routing algorithms also implement the drop-front buffer management policy.

Table 2

Simulation settings in Helsinki City.

Parameter	Default value (range)
Message size	500 K–1 M
Message interval	40 s (30–90 s)
Buffer (Groups 1–3)	20 M (4–20 M)
Time-to-live (TTL)	3 h (2–5 h)
Simulation time	24 h
ρ	0.95

4.2. Evaluation Metrics

In this paper, the simulations are grouped into the three categories: varying buffer size, varying message's time-to-live, and varying message generation interval. Under the same guideline, we evaluate all routing algorithms based on the following metrics. (1)

Delivery ratio: this metric is to measure the delivery capability of each algorithm.

(2)

Overhead ratio: it reflects the efficiency of message transmission and it is desirable to achieve a low overhead ratio.

(3)

Average latency: the lower average latency means better routing performance.

(4)

Average hop count: it is another routing goal to reduce transmission cost, such as bandwidth and energy.

(5)

Dropped messages: it is desirable to achieve a fewer number of dropped messages so as to improve the utilization efficiency of storage.

4.3. Simulation Results

4.3.1. Performance Evaluations in Random Walk Model

Figure 1 shows the different routing performance by varying buffer size from 5 MB to 50 MB. Figure 2 shows the different simulation results by varying message's TTL from 2 hours to 5 hours. Figure 3 shows the performance comparisons by varying message's generation interval from 10 s to 50 s.

Figure 1

Delivery ratio, overhead ratio, average latency, and average hop count versus buffer size when setting TTL and message interval to 3 hours and 40 seconds in Random Walk.

Figure 2

Delivery ratio, overhead ratio, average latency, and average hop count versus TTL when setting buffer size and message interval to 50 MB and 40 seconds in Random Walk.

Figure 3

Delivery ratio, overhead ratio, average latency, and average hop count versus message interval when setting buffer size and TTL to 50 MB and 3 hours in Random Walk.

Regarding Figure 1, we can see that SAPR gets the highest delivery ratio, the lowest overhead ratio, and the fewest average hop count compared to Epidemic and Prophet. This can show the accuracy and efficiency of the routing selections of SAPR. Besides, it can also verify the improvements of SAPR on predicting message's delivery probability. By limiting the number of message copies, Source S and W gets a slightly higher delivery ratio than that of SAPR when buffer is insufficient (i.e., less than 10 MB). However, our SAPR gets the highest message delivery ratio when buffer size is greater than 10 MB. For the same reason, Source S and W also gets the lowest overhead ratio and the fewest hop count. But SAPR's performance on network overhead and average hop count is very close to Source S and W.

From Figure 2, we can find that SAPR still outperforms Epidemic and Prophet in terms of overhead ratio and average hop count. And SAPR still gets the highest delivery ratio when message's TTL is greater than 3 hours. This shows that SAPR is adapted to the scenario with a longer message TTL.

Figure 3 shows similar simulation results to Figure 1. Compared to Epidemic and Prophet, SAPR achieves advantages in message delivery ratio, overhead ratio, and average hop count. Moreover, the overhead ratio and average hop count of SAPR are also close to those of S and W, but SAPR gets a higher message delivery ratio.

Finally from Figures 1–3, we can see that SAPR gets a very low overhead ratio and greatly controls average hop count. In addition, SAPR also achieves a satisfying message delivery ratio. Unfortunately, SAPR does not get advantages in message's delivery latency in this scenario.

4.3.2. Performance Evaluations in Helsinki City Model

Figure 4 shows the different routing performance by varying buffer size from 4 MB to 20 MB. Figure 5 shows the different simulation results by varying message's TTL from 2 hours to 5 hours. Figure 6 shows the performance comparisons by varying message's generation interval from 30 s to 90 s.

Figure 4

Delivery ratio, overhead ratio, average latency, and average hop count versus buffer size when setting TTL and message interval to 3 hours and 40 seconds in Helsinki model.

Figure 5

Delivery ratio, overhead ratio, average latency, and average hop count versus TTL when setting buffer size and message interval to 20 MB and 40 seconds in Helsinki model.

Figure 6

Delivery ratio, overhead ratio, average latency, and average hop count versus message interval when setting buffer size and TTL to 20 MB and 3 hours in Helsinki model.

Regarding Figure 4, we can see that SAPR can achieve the highest message delivery ratio, the lowest overhead ratio, the shortest average latency, and the fewest average hop count. It can show once again the accuracy and efficiency of SAPR's selections.

In Figure 5, SAPR can outperform the other three routing protocols in terms of delivery latency and average hop count. When message's TTL is greater than 3 hours, SAPR achieves the highest message delivery ratio. Moreover, SAPR still gets advantages in network overhead ratio compared to Epidemic and Prophet.

Figure 6 shows that SAPR still achieves some advantages in message delivery ratio, network overhead ratio, delivery latency, and average hop count compared to the other three algorithms.

Finally, from Figures 4–6, we can draw the conclusion that SAPR can enhance routing performance in terms of message delivery ratio, network overhead ratio, average delivery latency, and average hop count compared to Epidemic, Prophet, and First Contact.

4.3.3. Performance Evaluations of Dropped Messages

Figures 7(a)–7(c) compare the performance of dropped messages in Random Walk model by changing buffer size, TTL, and message interval. Figures 7(d)–7(f) show the evaluation results of dropped messages under different settings.

Figure 7

Dropped message versus buffer size, TTL, and message interval in Random Walk model and Helsinki model.

From Figure 7 we can see that the number of dropped messages of Epidemic is the largest. This is because Epidemic uses flooding strategy to distribute message to every encountered node. In this case, it will spread a large number of message copies to the whole network. When storage resource is not sufficient, these message copies will be frequently dropped by nodes. In this case, it is hard to spread message to farther regions, which is not conducive to a better distribution of messages. On the contrary, the other algorithms all control message redundancy by different schemes, thus dropping fewer messages. In this case, Epidemic is more sensitive to the amount of network resources compared to other algorithms, and its better performance greatly relied on more cache resources. This can explain the reason why the performance of Epidemic is not better than that of other algorithms.

The number of dropped messages of Prophet is fewer than that of Epidemic; thus Prophet can get better performance. Compared to Epidemic and Prophet, SAPR drops fewer messages, and its number of dropped messages is slightly higher than that of Source Spray and Wait. This can indicate that SAPR can greatly control message redundancy, thus improving the utilization efficiency of storage resource. From this point of view, it can explain the better routing performance of SAPR.

5. Related Works

Prophet is a typical probabilistic routing protocol which makes routing selections based on the encounter probabilities between nodes. For example, current node will deliver message to an intermediate node if the node is more likely to meet the final destination. In addition to using the transitivity property to update the encounter probability, Prophet also proposes an aging function for the outdated information as time progresses.

For message dropping problem, many traditional policies (e.g., drop-tail, drop-front, random drop, etc.) have been proposed, which can play a role in opportunistic routing. In [18], Zhang et al. analyze the buffer constrained Epidemic routing and make the conclusion that drop-front can outperform drop-tail in DTN context. In [19], a node first deletes the message that has the largest number of copies in order to mitigate the impact on routing performance. Based on a specific community detection algorithm, [20] proposes an efficient buffer management policy for social delay tolerant networks, which utilizes social relation and centrality to avoid dropping meaningful messages.

In [21], Li et al. propose an optimal routing strategy by exploiting the heterogeneous features of nodes to enhance the routing performance. It takes into consideration nodes' heterogeneous contact rates and delivery costs when selecting intermediate nodes to minimize the delivery cost. For mobile sensor networks, [22] provides a reliable routing scheme with an enhanced delaying technique, which estimates connectivity based on the ratio of past and present connections. When the connectivity is unreliable, nodes will delay message transmission.

With a home-aware model, CAOR [23] turns mobile social networks into a network that only includes community homes. Then, in the network of community homes, it computes the minimum expected delivery delay by a reverse Dijkstra algorithm. In [24], by introducing a metric to accurately detect the quality of friendship, each node defines its friendship community as the set of nodes having close friendship with itself either directly or indirectly. Then temporally differentiated friendships are used to make the forwarding decisions of messages.

6. Conclusion

In this paper, we try to improve the probabilistic routing performance by taking into account the message's remaining TTL so as to avoid the shortcomings of routing messages directly based on the encounter probabilities between nodes. Our motivation is that the higher encounter probability can only indicate that the two nodes can meet each other frequently. But they may still need a period of time to encounter each other again. However, the message transmission will still fail if the message's TTL is exhausted during this period of time. In this case, an effective scheme that fully takes into account the message's remaining TTL when computing message's delivery probability can get a better performance in probabilistic routing. To this end, by using statistical analysis methods, we propose an efficient scheme to compute and update the expectation of the intermeeting times between nodes. And then, based on exponential distribution, we predict the probability that a message can be successfully delivered before its TTL is exhausted.

In addition, we also improve buffer management policy by modeling message dropping problem as a 0-1 knapsack problem. Then, solving the problem by the back track technique, each node always keeps the messages that can maximize the delivery probability sum. Extensive simulations are conducted based on Random Walk model and Helsinki City model. The results show that the proposed SAPR can greatly enhance the routing performance in DTN context.

Footnotes

Notations

Conflict of Interests

All authors do not have any possible conflict of interests.

Acknowledgments

This research is supported in part by Natural Science Foundation of Shandong Province under Grant no. ZR2013FQ022,Science and Technology Plan Project for Colleges and Universities of Shandong Province under Grant no. J14LN85,and Foundation Research Project of Qingdao Science and Technology Plan under Grant no. 12-1-4-2-(14)-jch.

References

Partan

Kurose

Levine

B. N.

A survey of practical issues in underwater networks

SIGMOBILE Mobile Computing Communications Review 2007 11 4 23 33

Guo

Wang

Cui

J.-H.

Generic prediction assisted single-copy routing in underwater delay tolerant sensor networks

Ad Hoc Networks 2013 11 3 1136 1149

10.1016/j.adhoc.2012.11.012

2-s2.0-84875696135

Soares

V. N. G. J.

Rodrigues

J. J. P. C.

Farahmand

GeoSpray: a geographic routing protocol for vehicular delay-tolerant networks

Information Fusion 2014 15 1 102 113

10.1016/j.inffus.2011.11.003

2-s2.0-84885959091

Erramilli

Chaintreau

Crovella

Diot

Diversity of forwarding paths in pocket switched networks

Proceeding of the 7th ACM SIGCOMM Conference on Internet Measurement (IMC ’07)

October 2007

New York, NY, USA

161 174

10.1145/1298306.1298330

2-s2.0-42149167886

Hui

Chaintreau

Scott

Gass

Crowcroft

Diot

Pocket switched networks and human mobility in conference environments

Proceedings of the ACM SIGCOMM Workshop on Delay-Tolerant Networking (WDTN ’05)

August 2005

244 251

10.1145/1080139.1080142

2-s2.0-84885799779

Kayastha

Niyato

Wang

Hossain

Applications, architectures, and protocol design issues for mobile social networks: a survey

Proceedings of the IEEE 2011 99 12 2130 2158

10.1109/JPROC.2011.2169033

2-s2.0-81855199790

Vastardis

Yang

Mobile social networks: architectures, social properties, and key research challenges

IEEE Communications Surveys and Tutorials 2013 15 3 1355 1371

10.1109/SURV.2012.060912.00108

2-s2.0-84881315820

Chuah

M. C.

Cheng

Davison

B. D.

Enhanced disruption and fault tolerant network architecture for bundle delivery (EDIFY)

Proceedings of the IEEE Global Telecommunications Conference (GLOBECOM ’05)

December 2005

St. Louis, Mo, USA

807 812

10.1109/GLOCOM.2005.1577751

2-s2.0-33846601871

Fall

Farrell

DTN: an architectural retrospective

IEEE Journal on Selected Areas in Communications 2008 26 5 828 836

10.1109/JSAC.2008.080609

2-s2.0-44649172478

10.

Fan

X.-M.

Shan

Z.-G.

Zhang

B.-X.

Chen

State-of-the-art of the architecture and techniques for delay-tolerant networks

Acta Electronica Sinica 2008 36 1 161 170

2-s2.0-41649099522

11.

Lindgren

Doria

Schelén

Probabilistic routing in intermittently connected networks

SIGMOBILE Mobile Computing Communications Review 2003 7 3 19 20

12.

Groenevelt

Koole

Nain

Message delay in MANET

Proceedings of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS ’05)

June 2005

412 413

10.1145/1064212.1064280

2-s2.0-33244486793

13.

Spyropoulos

Psounis

Raghavendra

C. S.

Performance analysis of mobility-assisted routing

Proceedings of the 7th ACM/IEEE International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc ’06)

2006

14.

Aldous

Fill

J. A.

Reversible Markov Chains and Random Walks on Graphs 2002

15.

Small

Haas

Z. J.

Resource and performance tradeoffs in delay-tolerant wireless networks

Proceedings of the ACM SIGCOMM Workshop on Delay-Tolerant Networking (WDTN ’05)

August 2005

260 267

10.1145/1080139.1080144

2-s2.0-84885755298

16.

Karagiannis

Le Boudec

J.-Y.

Vojnović

Power law and exponential decay of inter contact times between mobile devices

Proceedings of the 13th Annual ACM International Conference on Mobile Computing and Networking (MobiCom ’07)

September 2007

183 194

10.1145/1287853.1287875

2-s2.0-37749052702

17.

Keranen

Ott

Karkkainen

The ONE simulator for DTN protocol evaluation

Proceeding of the 2nd International Conference on Simulation Tools and Techniques

2009

1 10

18.

Zhang

Neglia

Kurose

Towsley

Performance modeling of epidemic routing

Computer Networks 2007 51 10 2867 2891

10.1016/j.comnet.2006.11.028

2-s2.0-34247602959

19.

Dohyung

Hanjin

Ikjun

Minimizing the impact of buffer overflow in DTN

Proceedings of the International Conference on Future Internet Technologies (CFI ’08)

June 2008

Seoul, Republic of Korea

20.

Settawatcharawanit

Yamada

Haque

M. E.

Rojviboonchai

Message dropping policy in congested social Delay Tolerant Networks

Proceedings of the 10th International Joint Conference on Computer Science and Software Engineering (JCSSE ’13)

May 2013

Maha Sarakham, Thailand

116 120

10.1109/JCSSE.2013.6567330

2-s2.0-84883383513

21.

Wang

Jin

Chen

An optimal relaying scheme for delay-tolerant networks with heterogeneous mobile nodes

IEEE Transactions on Vehicular Technology 2013 62 5 2239 2252

10.1109/TVT.2012.2237046

2-s2.0-84879134248

22.

Cha

Talipov

Cha

Data delivery scheme for intermittently connected mobile sensor networks

Computer Communications 2013 36 5 504 519

10.1016/j.comcom.2012.12.001

2-s2.0-84873998781

23.

Xiao

M. J.

Huang

L. S.

Community-aware opportunistic routing in mobile social networks

IEEE Transactions on Computers 2014 63 7 1682 1695

10.1109/TC.2013.55

24.

Bulut

Szymanski

B. K.

Friendship based routing in delay tolerant mobile social networks

Proceedings of the 53rd IEEE Global Communications Conference (GLOBECOM ’10)

December 2010

1 5

10.1109/GLOCOM.2010.5683082

2-s2.0-79551643833

A Statistical Analysis Based Probabilistic Routing for Resource-Constrained Delay Tolerant Networks

Abstract

1. Introduction

2. Assumptions and Preliminary

3. Statistical Analysis Based Probabilistic Routing (SAPR)

3.1. Estimating Exponential Distribution Parameter

3.2. Computing IMT's Mathematical Expectation

Theorem 1.

Proof.

Corollary 2.

Proof.

3.3. Predicting Message's Delivery Probability

Theorem 3.

Proof.

3.4. Next Hop Selection Strategy

Algorithm 1: Next hop routing selection on node N a .

3.5. Buffer Management Policy

Theorem 4.

Proof.

Definition 5.

Algorithm 2: Buffer management strategy.

Algorithm 3: Back track process.

Algorithm 4: Bound process.

4. Simulation

4.1. Simulation Settings

4.2. Evaluation Metrics

4.3. Simulation Results

4.3.1. Performance Evaluations in Random Walk Model

4.3.2. Performance Evaluations in Helsinki City Model

4.3.3. Performance Evaluations of Dropped Messages

5. Related Works

6. Conclusion

Footnotes

Notations

Conflict of Interests

Acknowledgments

References

Algorithm 1: Next hop routing selection on node $N_{a}$ .