Abstract

Traditional incremental localization methods consider only the heteroscedasticity caused by error accumulation. To address this one-sidedness, an incremental localization algorithm based on multivariate analysis is proposed. The algorithm combines feasible weighted least squares (FWLS) with canonical correlation regression (CCR): FWLS handles the heteroscedasticity caused by error accumulation during incremental localization, while CCR is used in the estimation step to handle the topology problems between the original beacon nodes and the new beacon nodes. The simulation results show that the method not only effectively restrains the influence of accumulated errors but also adapts to different node topologies, thereby improving the positioning accuracy of the nodes.

1. Introduction

In many sensor network applications, the location information of nodes is of great importance to the monitoring activity of the whole network [1]. Monitoring data without the nodes' location information is often of no use; about 80% of the information that sensor nodes provide to users about the monitored area is related to location [2]. In a wireless sensor network, location information can be acquired by equipping nodes with GPS. However, this is applicable only in outdoor, open environments. Besides, GPS receivers are large, costly, and energy hungry, and GPS also needs stable infrastructure. These facts make it difficult to meet the requirements of sensor networks, namely, “low price, low cost and low energy consumption” [3]. Therefore, only some of the nodes in the deployment area can be equipped with GPS; the locations of the remaining nodes can only be calculated by a localization algorithm. However, because the communication radius of sensor nodes is limited by energy, and because of the randomness of node distribution, barriers between nodes, and so on, some unknown nodes that are not within the communication radius of beacon nodes cannot be estimated in a single pass by an ordinary concurrent localization algorithm. This leads to deficient coverage of the monitoring area; as a result, the quality of service of the sensor network decreases rapidly and the deployment area cannot be monitored effectively. The most frequently used solution is to increase node coverage through mobile beacon nodes [4], but path planning, hard-to-reach regions, and the relatively large power consumption of mobile nodes limit the application of this method.
The incremental estimation algorithm [5, 6] is another method to enhance node coverage. It has several advantages: it does not need to consider path problems, it is not limited by the movement of mobile nodes, and it consumes much less power than mobile nodes. An incremental algorithm first estimates the unknown nodes near the beacon nodes; once their locations are determined, these nodes act as new beacon nodes. The locations of the remaining unknown nodes are then estimated from the newly added beacon nodes together with the original beacon nodes, and so on, until the locations of all nodes have been estimated.

Incremental localization is a low-energy positioning method that can effectively improve the coverage rate of the monitoring area; by extending outward in turn, each node is localized successively. Because localization is successive, earlier estimation errors are bound to affect the accuracy of later estimates. Such error accumulation inevitably makes the variance of the earlier error terms differ from that of the later ones; this phenomenon is known as heteroscedasticity [7, 8]. If heteroscedasticity is present and the traditional ordinary least squares (OLS) method is still used to estimate the locations of unknown nodes, the estimated node coordinates may not be an efficient estimator, or even an asymptotically efficient one. To correct the adverse effect of heteroscedasticity, Meesookho et al. [9] proposed weighted least squares (WLS), using the reciprocal of the error variance as the weight to restrain error propagation. WLS can be regarded as an improved OLS method: like OLS, it forms the residual sum of squares and then minimizes it; unlike OLS, it accounts for heteroscedasticity when forming the residual sum of squares. By assigning different weights to different data, WLS-based location estimation improves the location accuracy. Subsequently, Xiong et al. [10] proposed an incremental node localization method with optimal weighted least squares on the basis of WLS; the optimal weights are those obtained when the estimated error variance matrix is minimal.
Ji and Liu [5] proposed another strategy, the improved incremental localization approach (IILA), under the hypothesis that each localization step is more accurate than the next one during incremental localization; that is, location estimates nearer to the original beacon nodes are more accurate. On the basis of this assumption, with the estimated distance of the previous location as a constraint condition, the localization problem is converted into a sequence of trust-region problems that can be solved by the sequential quadratic programming (SQP) method. However, IILA did not consider that a sensor network is a multihop network in which there are many paths to a given node, and it neglected the complexity of the deployment environment; it simply assumed that errors in the localization process definitely increase with the hop count, that is, that the heteroscedasticity of the localization process is monotonically increasing. In a complex monitoring environment, the variation tendency of heteroscedasticity is difficult to predict: it is not necessarily monotonically increasing but may also be decreasing, or increasing in some places and decreasing in others. For example, as shown in Figure 1, there are many paths in the monitoring area from node A to unknown node D, such as A → B → C → D and A → E → D. Owing to the complexity of the monitoring area, barriers or interference sources exist between node A and node E, so the measurement accuracy from node A to node E is far lower than that between other nodes; as a result, it is not appropriate to estimate the distance ED with the distance AE as the constraint condition.

Figure 1

Localization in complex environment.

In addition, the errors of estimated locations are directional. For example, in Figure 1, errors between node A and node E may lie along the direction of $\vec{AE}$ or along $\overset{\leftarrow}{AE}$ ; similarly, errors between node E and node D are directional. If the errors between node A and node E point in the opposite direction to the errors between node E and node D, the errors between node E and node D may be smaller than those between node A and node E; in that case, the assumption of IILA no longer holds.

Most previous incremental localization algorithms adjust for heteroscedasticity during the localization process but assume that it is monotonically increasing, and they fail to take the deployment environment and networking features of sensor networks into account. A sensor network is a multihop network, often deployed in a harsh environment, so the pattern of heteroscedasticity in incremental localization is complicated and diverse. Furthermore, as with concurrent localization algorithms, the accuracy of incremental localization is also influenced by the topology quality of the beacon nodes, and the multicollinearity problem [11, 12] caused by topological relations among beacon nodes is not considered by previous incremental algorithms. For these reasons, we propose a feasible incremental localization algorithm, LE-FWLS-CCR (Location Estimation-FWLS-CCR), which uses fewer beacon nodes and considers the multihop features of sensor networks, error accumulation, heteroscedasticity, multicollinearity, and other problems. The algorithm resolves the heteroscedasticity problem through the iterative computation of feasible weighted least squares (FWLS) [13]; the iterative process suits the multihop features. In the estimation process, we adopt canonical correlation regression (CCR) [12, 14–16], a multivariate analysis method, to handle the errors in newly added beacon nodes as well as the topological relations between newly added and original beacon nodes.

2. Relevant Knowledge Review

2.1. Feasible Weighted Least Squares

In a concurrent localization process, the distance-coordinates formula is generally transformed into the form $A x = b + ξ$ [11], in which ξ is the distance-measuring error term. Deducing the locations of unknown nodes from the distance-coordinates formula assumes that the variance of the distance-measuring errors is $Var (ξ) = σ^{2} I_{n}$ ( $σ^{2}$ is a constant and $I_{n}$ is the identity matrix); in estimation theory, constant error variance is known as homoscedasticity [7, 8]. In location estimation by an incremental algorithm, however, distance-measuring error is inevitable and accumulates, so the variance of the distance-measuring errors in the step-by-step localization process is not constant ( $Var (ξ) = σ^{2} Ω \neq σ^{2} I_{n}$ ); this is called heteroscedasticity. Because the incremental localization process is complicated, heteroscedasticity problems abound in it. When heteroscedasticity exists, the results of typical location estimation models are neither accurate nor efficient.
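The text does not spell out how $A$ and $b$ are built from the distance-coordinates formula. As a minimal sketch, assuming the common linearization that subtracts the squared-range equation of a reference beacon from the others, the system can be formed as follows. The function name `linearize_ranges` and the choice of the last beacon as reference are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def linearize_ranges(beacons, dists):
    """Build A and b so that A @ x ~= b for an unknown 2-D position x.

    Assumption (not from the paper): the last beacon is the reference, and
    its squared-range equation is subtracted from every other one.
    """
    beacons = np.asarray(beacons, dtype=float)   # shape (m, 2)
    dists = np.asarray(dists, dtype=float)       # shape (m,)
    ref, d_ref = beacons[-1], dists[-1]
    A = 2.0 * (beacons[:-1] - ref)
    b = (np.sum(beacons[:-1] ** 2, axis=1) - np.sum(ref ** 2)
         - dists[:-1] ** 2 + d_ref ** 2)
    return A, b

# Noise-free example: four beacons and a node at (3, 4).
beacons = [[0, 0], [10, 0], [0, 10], [10, 10]]
node = np.array([3.0, 4.0])
dists = np.linalg.norm(np.asarray(beacons, dtype=float) - node, axis=1)
A, b = linearize_ranges(beacons, dists)
x_ols = np.linalg.lstsq(A, b, rcond=None)[0]     # OLS estimate of the position
```

With exact distances the OLS solution recovers the node position; with noisy distances the mismatch between $A x$ and $b$ plays the role of the error term ξ.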

In the presence of heteroscedasticity, the data used in location estimation do not all carry the same weight: a smaller error variance means a higher confidence level for the corresponding residual, while a larger error variance means a lower confidence level. Therefore, when heteroscedasticity exists, weighted least squares is usually used to treat different residuals differently [17]; namely, larger weights are assigned to data terms with relatively smaller residuals and smaller weights to data terms with larger residuals, in order to adjust the effect of the various data items on the estimate and thereby obtain an efficient estimate in the localization process.

Thus, the variance of the error term of the distance-coordinates formula exhibits heteroscedasticity, which is expressed as

\begin{matrix} Var (ξ) = σ^{2} Ω, \end{matrix}

(1)

in which $σ^{2}$ is a constant and Ω is an n-order symmetric positive definite matrix. Hence there must exist an n-order invertible matrix D such that the following holds:

\begin{matrix} Ω = D D^{T} ⟹ D^{- 1} Ω {(D^{T})}^{- 1} = I_{n} . \end{matrix}

(2)

Left-multiplying both sides of the equation $A x = b + ξ$ by $D^{- 1}$ gives

\begin{matrix} D^{- 1} A x = D^{- 1} b + D^{- 1} ξ . \end{matrix}

(3)

Let $b^{*} = D^{- 1} b$ , $A^{*} = D^{- 1} A$ , and $ξ^{*} = D^{- 1} ξ$ ; then (3) can be written as

\begin{matrix} b^{*} + ξ^{*} = A^{*} x \end{matrix}

(4)

Then the variance of the error term is

\begin{array}{l} Var (ξ^{*}) = E [ξ^{*} {(ξ^{*})}^{T}] = E [D^{- 1} ξ {(D^{- 1} ξ)}^{T}] \\ = E [D^{- 1} ξ ξ^{T} {(D^{- 1})}^{T}] \\ = D^{- 1} E [ξ ξ^{T}] {(D^{- 1})}^{T} \\ = D^{- 1} σ^{2} Ω {(D^{- 1})}^{T} \\ = σ^{2} D^{- 1} Ω {(D^{- 1})}^{T} \\ = σ^{2} I_{n} . \end{array}

(5)

Thus the heteroscedasticity of the error term is eliminated, and it is easy to see that $E (ξ^{*}) = 0$ . The error term $ξ^{*}$ , with variance given by (5), therefore meets the assumptions of the least squares model, and the loss function $S (x)$ is

\begin{array}{l} S (x) = {(ξ^{*})}^{T} ξ^{*} \\ = {(b^{*} - A^{*} x)}^{T} (b^{*} - A^{*} x) \\ = {(b - A x)}^{T} Ω^{- 1} (b - A x) . \end{array}

(6)

In order to obtain the optimal solution, we must make

\begin{matrix} \min {(b - A x)}^{T} Ω^{- 1} (b - A x) . \end{matrix}

(7)

Let ${\hat{x}}_{WLS}$ be the minimizer of (7). Then ${\hat{x}}_{WLS}$ satisfies the normal equation

\begin{matrix} (A^{T} Ω^{- 1} A) {\hat{x}}_{WLS} = A^{T} Ω^{- 1} b . \end{matrix}

(8)

If the column vectors of A are linearly independent, then so are the column vectors of $A^{*} = D^{- 1} A$ . Hence $(A^{*})^{T} A^{*} = A^{T} Ω^{- 1} A$ is invertible, and the optimal solution of (8) is expressed as

\begin{matrix} {\hat{x}}_{WLS} = {(A^{T} Ω^{- 1} A)}^{- 1} A^{T} Ω^{- 1} b . \end{matrix}

(9)

By the Schwarz inequality [17, 18], it can be proved that, when the ratio of range error to distance is an independent Gaussian random variable, the error variance of WLS is minimal if the weight matrix is the reciprocal of the variance matrix of the range error. In reality, however, the variance of the error term is unknown; therefore, to apply WLS, the weights have to be chosen according to the actual situation.
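The derivation in (2) to (9) can be checked numerically. The sketch below, which assumes a known diagonal Ω with hypothetical variances, compares the closed form (9) with OLS applied to the whitened system (4); the function names are illustrative.

```python
import numpy as np

def wls_closed_form(A, b, Omega):
    """Equation (9): x = (A^T Omega^-1 A)^-1 A^T Omega^-1 b."""
    Oi_A = np.linalg.solve(Omega, A)             # Omega^-1 A
    # Omega is symmetric, so (Omega^-1 A)^T b equals A^T Omega^-1 b.
    return np.linalg.solve(A.T @ Oi_A, Oi_A.T @ b)

def wls_whitened(A, b, Omega):
    """Equations (2)-(4): whiten with D (Omega = D D^T), then apply OLS."""
    D = np.linalg.cholesky(Omega)                # lower-triangular factor D
    A_star = np.linalg.solve(D, A)               # A* = D^-1 A
    b_star = np.linalg.solve(D, b)               # b* = D^-1 b
    return np.linalg.lstsq(A_star, b_star, rcond=None)[0]

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 2))
variances = rng.uniform(0.5, 4.0, size=8)        # hypothetical error variances
Omega = np.diag(variances)
b = A @ np.array([3.0, 4.0]) + rng.normal(size=8) * np.sqrt(variances)

x_closed = wls_closed_form(A, b, Omega)
x_white = wls_whitened(A, b, Omega)
```

Both routes give the same estimate up to rounding, which is the content of the step from (6) to (9).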

FWLS is a feasible method that overcomes the problem that WLS cannot be implemented when the weights are unavailable. FWLS uses the residuals obtained at each computation to build the weight matrix; therefore, actual weight values can be acquired during the computation. The procedure of the FWLS algorithm is shown in Algorithm 1.

Algorithm 1: FWLS algorithm.

(1) Estimate the model by the OLS method to obtain the estimated value $\hat{x}$ ; then substitute it into the equation to obtain the residual error ${\hat{u}}_{0} = b - A \hat{x}$ .

(2) Use the squares of the residual terms as the Ω matrix, that is, ${\hat{Ω}}_{i} = diag ({\hat{u}}_{i, 0}^{2}, {\hat{u}}_{i, 1}^{2}, \dots, {\hat{u}}_{i, n - 1}^{2})$ .

(3) Obtain the next estimated value and residual by WLS:

${\hat{x}}_{i + 1} = (A^{T} {\hat{Ω}}_{i}^{- 1} A)^{- 1} A^{T} {\hat{Ω}}_{i}^{- 1} b$

${\hat{u}}_{i + 1} = b - A {\hat{x}}_{i + 1}$

(4) Go back to Step 2 until the number of iterations reaches the number required by the algorithm.
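The steps of Algorithm 1 can be sketched in a few lines of Python. This is an illustrative reading, not the authors' implementation: the function name `fwls`, the default iteration count, and the floor `eps` on the squared residuals (added so that a zero residual does not produce an infinite weight) are assumptions.

```python
import numpy as np

def fwls(A, b, n_iter=5, eps=1e-12):
    """Feasible weighted least squares following Algorithm 1.

    eps is an assumed floor on the squared residuals; it is not part of
    the algorithm as stated in the text.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.linalg.lstsq(A, b, rcond=None)[0]     # Step 1: OLS estimate
    for _ in range(n_iter):
        u = b - A @ x                            # residual of the current estimate
        w = 1.0 / np.maximum(u ** 2, eps)        # Step 2: diagonal of Omega_hat^-1
        Aw = A * w[:, None]                      # Omega_hat^-1 A
        x = np.linalg.solve(A.T @ Aw, Aw.T @ b)  # Step 3: WLS update
    return x
```

Calling `fwls(A, b, n_iter=0)` returns the plain OLS estimate, which gives a convenient baseline when judging the effect of the reweighting.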

Note that the FWLS algorithm is iterative; the derivation of the optimal estimate ${\hat{x}}_{i}$ in each step assumes that there is no multicollinearity problem in $A^{T} {\hat{Ω}}_{i}^{- 1} A$ . Unfortunately, FWLS can eliminate the interference of heteroscedasticity, but it cannot be guaranteed to eliminate the interference of multicollinearity. Therefore, corresponding strategies are needed during the iteration to prevent the algorithm from becoming unsolvable because of multicollinearity.

2.2. Canonical Correlation Regression

In concurrent localization algorithms, the beacon nodes have a very large influence on the final location estimate and may cause significant errors when they are collinear or approximately collinear [11]. Principal component analysis (PCA), a multivariate analysis method, removes part of the information by recombining the coordinate information of the beacon nodes in order to reduce noise and the effects of multicollinearity.

An incremental localization algorithm can use the PCA method [19] in the first estimation of the coordinates of unknown nodes to avoid problems caused by multicollinearity. In an incremental algorithm, however, measuring errors cannot be completely avoided or eliminated, which means that the coordinates of newly added beacon nodes always contain errors; therefore, the error information in the output needs to be processed as well. PCA processes only the input variables; in an incremental localization algorithm, the output variables serve as the locations of newly added beacon nodes, so the errors they contain should also be preprocessed to a certain extent.

Canonical correlation analysis (CCA) is a feasible and powerful multivariate analysis method that is especially appropriate for processing and analyzing two correlated data sets. Like PCA, it is a dimension reduction method and can remove some noise, including collinear information, by recombining the data. Because CCA focuses on the relationship between the two correlated data sets, it is better suited to regression and achieves higher regression accuracy than PCA.

For the equation $A x = b + ξ$ , the solution procedure of CCA is as follows.

Suppose that there are two groups of centered data, A and b, with $A \in ℝ^{p}$ and $b \in ℝ^{q}$ . CCA seeks linear combinations of A and b, $w_{A}^{T} A$ and $w_{b}^{T} b$ , respectively, that are maximally correlated with each other; that is, it maximizes the following expression:

\begin{array}{l} ρ = \frac{E [w_{A}^{T} A b^{T} w_{b}]}{\sqrt{E [w_{A}^{T} A A^{T} w_{A}] E [w_{b}^{T} b b^{T} w_{b}]}} \\ = \frac{w_{A}^{T} C_{A b} w_{b}}{\sqrt{w_{A}^{T} C_{A A} w_{A} w_{b}^{T} C_{b b} w_{b}}}, \end{array}

(10)

in which $C_{A A} \in ℝ^{p \times p}$ and $C_{b b} \in ℝ^{q \times q}$ are the within-set covariance matrices of A and b, respectively, $C_{A b} \in ℝ^{p \times q}$ is the between-set covariance matrix, and moreover $C_{b A} = C_{A b}^{T} \in ℝ^{q \times p}$ .

The correlation ρ is independent of the scales of $w_{A}$ and $w_{b}$ ; by constraining the within-set variances $w_{A}^{T} C_{A A} w_{A}$ and $w_{b}^{T} C_{b b} w_{b}$ to unity, CCA can be formulated as the following optimization problem:

\begin{matrix} \max_{w_{A}, w_{b}} w_{A}^{T} C_{A b} w_{b} \\ \text{s.t.} \quad w_{A}^{T} C_{A A} w_{A} = 1, \quad w_{b}^{T} C_{b b} w_{b} = 1 . \end{matrix}

(11)

To solve this optimization problem of (11), we can build a Lagrange equation to obtain the optimal solution; that is,

\begin{array}{l} L (w_{A}, w_{b}, λ_{1}, λ_{2}) = w_{A}^{T} C_{A b} w_{b} \\ + \frac{1}{2} λ_{1} (1 - w_{A}^{T} C_{A A} w_{A}) \\ + \frac{1}{2} λ_{2} (1 - w_{b}^{T} C_{b b} w_{b}) . \end{array}

(12)

Differentiating (12) with respect to $w_{A}$ and $w_{b}$ gives the partial derivatives

\begin{matrix} \frac{\partial L}{\partial w_{A}} = C_{A b} w_{b} - λ_{1} C_{A A} w_{A} \end{matrix}

(13a)

\begin{matrix} \frac{\partial L}{\partial w_{b}} = C_{b A} w_{A} - λ_{2} C_{b b} w_{b} . \end{matrix}

(13b)

To obtain the optimal solution, set (13a) and (13b) equal to zero; then

\begin{matrix} C_{A b} w_{b} = λ_{1} C_{A A} w_{A}, \end{matrix}

(14a)

\begin{matrix} C_{b A} w_{A} = λ_{2} C_{b b} w_{b} . \end{matrix}

(14b)

Left-multiplying (14a) and (14b) by $w_{A}^{T}$ and $w_{b}^{T}$ , respectively, and using the constraints in (11), it is easy to obtain $λ_{1} = λ_{2}$ . Taking $λ_{1} = λ_{2} = λ$ , the above formulas simplify to

\begin{matrix} C_{A b} w_{b} = λ C_{A A} w_{A}, \end{matrix}

(15a)

\begin{matrix} C_{b A} w_{A} = λ C_{b b} w_{b} . \end{matrix}

(15b)

Given that $C_{b b}$ is invertible, (15b) yields $w_{b} = (1 / λ) C_{b b}^{- 1} C_{b A} w_{A}$ ; substituting this into (15a), and treating (15a) analogously with $C_{A A}$ invertible, gives

\begin{matrix} C_{A b} C_{b b}^{- 1} C_{b A} w_{A} = λ^{2} C_{A A} w_{A}, \end{matrix}

(16a)

\begin{matrix} C_{b A} C_{A A}^{- 1} C_{A b} w_{b} = λ^{2} C_{b b} w_{b} . \end{matrix}

(16b)

Here, the solution of CCA is translated into generalized eigenvalue-eigenvector problems of two matrices whose sizes are $p \times p$ and $q \times q$ , respectively. The CCA problem can equivalently be described as the generalized eigenvalue problem of formula (17):

\begin{matrix} \begin{pmatrix} 0 & A b^{T} \\ b A^{T} & 0 \end{pmatrix} \begin{pmatrix} w_{A} \\ w_{b} \end{pmatrix} = λ \begin{pmatrix} A A^{T} & 0 \\ 0 & b b^{T} \end{pmatrix} \begin{pmatrix} w_{A} \\ w_{b} \end{pmatrix} . \end{matrix}

(17)

Formula (17) can be abbreviated as $X w = λ Y w$ , in which X and Y correspond to the left and right matrices of the previous formula, respectively, and $w = [w_{A}^{T}, w_{b}^{T}]^{T}$ ; therefore, $w_{A}$ and $w_{b}$ are eigenvectors of $(A^{T} A)^{- 1} A^{T} b (b^{T} b)^{- 1} b^{T} A$ and $(b^{T} b)^{- 1} b^{T} A (A^{T} A)^{- 1} A^{T} b$ , respectively.

The literature [12, 15] proposed a regression method based on canonical correlation analysis, namely, canonical correlation regression (CCR). CCR combines least squares with canonical correlation analysis in order to optimize the regression coefficients in the sense of maximum correlation. By regressing on the extracted components, the CCA-based regression method avoids, to some extent, the interference of multicollinearity among samples; in addition, CCA considers the correlation between the output and input variables, so it can be regarded as an advanced regression between two multivariate sets, an extension of multiple linear regression (MLR) known as a “multi-to-multi” regression method. The regression coefficients of canonical correlation regression can be computed by the following equation:

\begin{matrix} {\hat{x}}_{CCR} = (W_{k} W_{k}^{T}) A^{T} b, \end{matrix}

(18)

in which $W_{k} = [w_{A}^{1}, w_{A}^{2}, \dots, w_{A}^{k}]$ is composed of the eigenvectors corresponding to the k largest eigenvalues.
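A minimal numerical sketch of (16a) and (18) follows. It assumes that the rows of `A` (n × p) and `b` (n × q) are centered samples, so the covariance matrices are estimated as $A^{T} A$ , $A^{T} b$ , and $b^{T} b$ up to a common scale, and that the eigenvectors are normalized as in the constraint of (11). The function name `ccr` and the Cholesky-based reduction to a symmetric eigenproblem are illustrative choices, not the authors' implementation.

```python
import numpy as np

def ccr(A, b, k):
    """Canonical correlation regression sketch for centered A (n x p), b (n x q).

    Returns (x_hat, W_k): x_hat = W_k W_k^T A^T b with shape (p, q), and the
    k leading canonical weight vectors W_k with W_k^T C_AA W_k = I.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    if b.ndim == 1:
        b = b[:, None]
    C_AA, C_Ab, C_bb = A.T @ A, A.T @ b, b.T @ b
    L = np.linalg.cholesky(C_AA)                 # C_AA = L L^T
    T = np.linalg.solve(L, C_Ab)                 # L^-1 C_Ab
    M = T @ np.linalg.solve(C_bb, T.T)           # L^-1 C_Ab C_bb^-1 C_bA L^-T
    M = (M + M.T) / 2.0                          # remove rounding asymmetry
    evals, V = np.linalg.eigh(M)                 # eigenvalues are lambda^2 in (16a)
    order = np.argsort(evals)[::-1][:k]          # indices of the k largest
    W_k = np.linalg.solve(L.T, V[:, order])      # w_A = L^-T v
    x_hat = W_k @ (W_k.T @ C_Ab)
    return x_hat, W_k
```

With `k = p` no direction is discarded and the estimate coincides with OLS; choosing `k < p` drops the least correlated directions, which is where the protection against multicollinearity comes from.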

3. Methodology

3.1. The LE-FWLS-CCR Algorithm

Multicollinearity usually has seriously adverse effects on model estimation, testing, and prediction. In location estimation, multicollinearity exists not only in concurrent localization but also in incremental localization. For this reason, we add the canonical correlation regression method to the FWLS algorithm: correlation analysis of the input and output variables and dimension reduction yield the optimal prediction direction, and the FWLS method then resolves the problems caused by heteroscedasticity. Because the procedure of the FWLS-CCR localization algorithm is similar to that of the FWLS algorithm, its solution is carried out through iteration; the algorithm is shown in Algorithm 2.

Algorithm 2: LE-FWLS-CCR algorithm.

Input: Beacon node coordinates: $\{x_{1}, x_{2}, \dots, x_{m}\}, m \geq 3$ ; distances from beacon nodes to unknown nodes

Output: Estimated coordinates of unknown nodes: $\{{\hat{x}}_{m + 1}, {\hat{x}}_{m + 2}, \dots, {\hat{x}}_{n}\}$

(1) Beacon nodes deliver their location information outwards through controlled flooding. If an unknown node hears at least 3 beacon nodes, CCR is first used to estimate its location. Stop if no unknown nodes remain; otherwise, carry out the next step.

(2) Use the estimated location to compute the residual vector by the estimation formula ${\hat{u}}_{0} = b - A {\hat{x}}_{CCR}$ ;

(3) Construct the covariance matrix from the residual vector as in the FWLS method, ${\hat{Ω}}_{i} = diag ({\hat{u}}_{i, 0}^{2}, {\hat{u}}_{i, 1}^{2}, \dots, {\hat{u}}_{i, n - 1}^{2})$ ;

(4) Use the newly constructed covariance matrix to rewrite the location-distance equation $A x = b + ξ$ ;

(5) From the new equation, use the CCR method to estimate the locations of the secondary beacon nodes.

(6) If there are still nodes in the deployment area whose locations have not been estimated, go to Step 2.

(7) The algorithm finishes when no node remains to be estimated in the deployment area, and it outputs the coordinates of the unknown nodes.

The node localization process based on FWLS-CCR is shown in Figure 2. Assume that several sensor nodes are deployed in the monitoring area and that $L_{1}$ , $L_{2}$ , $L_{3}$ , and $L_{4}$ are the original beacon nodes, which are set to be zero-level beacon nodes. Nodes A, B, and C are the nodes to be localized. Node A is directly connected with the three original beacon nodes $L_{1}$ , $L_{2}$ , and $L_{3}$ . Node B is connected with $L_{3}$ and $L_{4}$ , while node C is connected only with $L_{1}$ .

Figure 2

Example and phases of LE-FWLS-CCR.

Obviously, the coordinates of node A can be estimated from the beacon nodes $L_{1}$ , $L_{2}$ , and $L_{3}$ . In the CCR incremental localization algorithm, once the CCR algorithm has been used to calculate its estimated coordinates, node A is promoted to a new beacon node and set as a first-level beacon node. By calculating the residual, the matrix Ω can be obtained. With $b^{*} = D^{- 1} b$ and $A^{*} = D^{- 1} A$ , the location equation is transformed into the form $b^{*} + ξ^{*} = A^{*} x$ . Taking the original beacon nodes $L_{3}$ and $L_{4}$ and the newly added beacon node A as the reference nodes, the CCR algorithm is applied again to compute the estimated coordinates of node B. After the calculation, node B is added as a new, second-level beacon node. Similarly, the location of node C can be estimated from beacon $L_{1}$ , node A, and node B; node C is defined to be a third-level beacon node.
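The order in which nodes are promoted to beacons can be illustrated with a small scheduling sketch. It covers only the level assignment, not the coordinate estimation, and it assumes that a node becomes localizable once it has at least three localized neighbors. The function name `incremental_levels` and the links A–B, A–C, and B–C (implied by the reference nodes named in the text) are assumptions.

```python
def incremental_levels(neighbors, beacons, min_refs=3):
    """Assign a beacon level to every node that can be localized.

    neighbors: dict mapping each unknown node to the set of nodes it hears.
    beacons:   iterable of original (zero-level) beacon nodes.
    """
    level = {b: 0 for b in beacons}
    current = 0
    while True:
        # Unknown nodes that currently hear enough localized nodes.
        ready = [n for n in neighbors
                 if n not in level
                 and sum(m in level for m in neighbors[n]) >= min_refs]
        if not ready:
            break
        current += 1
        for n in ready:
            level[n] = current                   # promote to a new beacon
    return level

# Connectivity of the Figure 2 example as described in the text.
neighbors = {
    "A": {"L1", "L2", "L3", "B", "C"},
    "B": {"L3", "L4", "A", "C"},
    "C": {"L1", "A", "B"},
}
levels = incremental_levels(neighbors, ["L1", "L2", "L3", "L4"])
```

For this connectivity the sketch assigns level 1 to A, level 2 to B, and level 3 to C, matching the description above.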

In the incremental localization approach, nodes are localized in batches. Owing to distance measurement error, there is a certain difference between the estimated and the actual positions of first-level beacon nodes. The estimates of second-level beacon nodes are influenced both by measurement error and by the intrinsic error of the first-level beacon nodes. In addition, Figure 2 shows that, when node C is estimated, the reference nodes $L_{1}$ , A, and B are approximately collinear, which may lead to an even larger error in the estimate of node C. For this reason, the FWLS-CCR method uses CCR to analyze the data and reduce the dimensionality of the input and output data, with the objective of partially eliminating error and multicollinearity. Meanwhile, the attainable residuals are adopted as the weights, which makes the algorithm feasible and applicable to practical environments. Thus, compared with previous algorithms, the localization algorithm based on FWLS-CCR is more feasible and more adaptable.

3.2. Time Complexity Analysis

In this Section, we compare the time complexities for localization as required by LE-FWLS-CCR and other popular incremental location estimation algorithms, namely, WLS-based location estimation (LE-WLS) proposed in the literature [10] as well as SQP-based location estimation (LE-IILA) proposed in the literature [5]. These algorithms will also be compared experimentally in Section 4.

LE-FWLS-CCR: the complexity of our algorithm is dominated by two parts: canonical correlation regression and feasible weighted least squares. The complexity of computing CCR is mainly determined by its core algorithm, CCA, whose complexity is $O (n^{3} \log n)$ . The FWLS method is an improvement on OLS that uses the residuals obtained at each computation to build the weight matrix; its complexity is $O (n^{4})$ . Thus, the total complexity of FWLS-CCR is $O (n^{4})$ .

LE-WLS: the WLS-based location estimation method uses WLS as its core algorithm, and the complexity of WLS is $O (n^{3})$ ; thus, the computational complexity of the LE-WLS method is $O (n^{3})$ .

LE-IILA: the complexity of LE-IILA is dominated by the SQP method. If the number of iterations is limited, the computational complexity of SQP is $O (k^{2} + k n)$ , where k is the number of iterations. If $n ≫ k$ , the complexity is $O (k n)$ ; otherwise, it is $O (k^{2})$ .

The literature [20] has shown that a moderate increase in the amount of computation does not affect the performance of a sensor network. For this reason, it is worthwhile to improve the localization accuracy with FWLS-CCR at the cost of some additional computation.

4. Simulation and Experiments

This section analyzes and evaluates the LE-FWLS-CCR localization algorithm on the Matlab platform. In the simulation experiments, nodes are assumed to be deployed in a two-dimensional monitoring area, and the matrix of distances among nodes is obtained by converting RSSI signals to distances. To make the comparison of experimental results fair, this section adopts the signal model proposed in the literature [21] to simulate the signal strength among nodes; that is,

\begin{matrix} P_{i j} \sim N ({\bar{P}}_{i j}, σ_{d B}^{2}), \\ {\bar{P}}_{i j} = P_{0} - 10 n_{p} \lg (\frac{d_{i j}}{d_{0}}), \end{matrix}

(19)

in which $P_{i j}$ is the power (in dBm) of the signal transmitted by node j as received by node i; ${\bar{P}}_{i j}$ is the mean received power (in dBm) at the distance $d_{i j}$ between node i and node j; $P_{0}$ is the received signal power at the reference range $d_{0}$ ; $n_{p}$ is the attenuation coefficient of the wireless transmission, which depends on the environment; and $σ_{d B}^{2}$ is the shadowing variance. $n_{p}$ uses data fitted from the real measurements in the literature [20]; as for $σ_{d B}^{2}$ , let $σ_{d B}^{2} / n_{p} = 1.2$ in this experiment.
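The signal model (19) and the conversion from RSSI back to distance can be sketched as follows. The inversion simply solves the mean-power equation for $d_{i j}$ , which is an assumed way of performing the RSSI-to-distance conversion mentioned above; the function names and the default `d0 = 1.0` are illustrative, and `lg` is taken to be the base-10 logarithm.

```python
import numpy as np

def mean_rssi(d, P0, n_p, d0=1.0):
    """Mean received power (dBm) at distance d, second line of (19)."""
    return P0 - 10.0 * n_p * np.log10(np.asarray(d, dtype=float) / d0)

def sample_rssi(d, P0, n_p, sigma_db, d0=1.0, rng=None):
    """Draw a received power from N(mean_rssi, sigma_db^2), first line of (19)."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.normal(mean_rssi(d, P0, n_p, d0), sigma_db)

def rssi_to_distance(P, P0, n_p, d0=1.0):
    """Invert the mean-power equation to get a distance estimate."""
    return d0 * 10.0 ** ((P0 - np.asarray(P, dtype=float)) / (10.0 * n_p))
```

Applying `rssi_to_distance` to a noisy sample yields a noisy distance; these are the distance-measuring errors that the localization step has to absorb.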

Because the incremental algorithm already achieves high coverage, the experiments in this section mainly examine the localization accuracy of nodes, with the average localization error (ALE) as the evaluation criterion. ALE is defined as follows:

\begin{matrix} ALE = \frac{\sum_{i = 1}^{n} \sqrt{{({\hat{x}}_{i} - x_{i})}^{2} + {({\hat{y}}_{i} - y_{i})}^{2}}}{n \times R} \times 100 \% . \end{matrix}

(20)

In the formula, $({\hat{x}}_{i}, {\hat{y}}_{i})$ is the estimated coordinate location of the ith node, $(x_{i}, y_{i})$ is the actual coordinate location of the ith node, n is the number of unknown nodes, and R is the communication radius. It can be seen from the formula that ALE is the ratio of the average Euclidean distance between the estimated and real locations of all nodes in the area to the communication radius. ALE reflects both the stability of the localization algorithm and its positioning accuracy: for a given communication radius, the smaller the average localization error, the higher the positioning accuracy of the algorithm, and vice versa.
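Formula (20) translates directly into a short helper. The function name `ale` and the array layout (one row per node, columns x and y) are assumptions made for illustration.

```python
import numpy as np

def ale(estimated, actual, R):
    """Average localization error of (20), as a percentage of the radius R."""
    estimated = np.asarray(estimated, dtype=float)   # shape (n, 2)
    actual = np.asarray(actual, dtype=float)         # shape (n, 2)
    errors = np.linalg.norm(estimated - actual, axis=1)
    return 100.0 * errors.mean() / R

# Two nodes with errors of 5 m and 0 m and a radius of 10 m.
example_ale = ale([[3.0, 4.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], R=10.0)
```

In this example the mean error is 2.5 m, so the ALE is 25% of the communication radius.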

This experiment also compares the method proposed in this paper, the localization algorithm based on FWLS-CCR (LE-FWLS-CCR), with WLS-based location estimation (LE-WLS) proposed in the literature [10], as well as IILA-based location estimation (LE-IILA) proposed in the literature [5]. In addition, the experiment carries out comparison by use of data collected from actual scenes provided in the literature [20].

4.1. Simulation Experiments Based on Distance-Measuring Model

The experiments based on the distance-measuring model use four experimental scenes: randomly deployed nodes in a square area, regularly deployed nodes in a square area, randomly deployed nodes in a C-shape area, and regularly deployed nodes in a C-shape area. The C-shape area is formed by a large barrier and is mainly used to evaluate localization performance in the presence of a large barrier, that is, in the non-line-of-sight case. To reduce the effect of any single experiment, each group of experiments is repeated 50 times in each scene, and the average of the 50 runs is reported. The experiments examine the accuracy of the final localization results of unknown nodes as the number of beacon nodes increases. In these experiments, the valid communication radius of nodes is assumed to be 60 m.

4.1.1. Regular Deployment

Regular deployment of nodes in the monitoring area principally aims to explore the effects of beacon-node collinearity on localization accuracy, while regular deployment in the C-shape area is used to observe the effects of non-line-of-sight conditions caused by barriers in the monitoring area on localization accuracy.

In this group of experiments, nodes are regularly deployed on a grid with a side length of $30\,\mathrm{m}$ within a $300\,\mathrm{m} \times 300\,\mathrm{m}$ area. Without a barrier there are 121 nodes in total; in the C-shape area, a $150\,\mathrm{m} \times 90\,\mathrm{m}$ barrier is placed and the number of nodes becomes 106. In these experiments, 5–15 nodes are selected as beacon nodes, whose location information is assumed to be known. Figure 3 shows the final localization result for one deployment in the square area with 10 beacon nodes, in which a circle denotes an unknown node, a box denotes a beacon node, and a straight line connects the actual coordinates of an unknown node with its estimated coordinates. Figure 3(a) shows the deployment of nodes; Figure 3(b) shows the localization results of LE-WLS, based on weighted least squares with weights equal to the reciprocals of the variance of the error term, which is theoretically optimal, with $ALE = 70\%$; Figure 3(c) shows the LE-IILA method proposed by Ji, with $ALE = 43.2\%$; Figure 3(d) shows the LE-FWLS-CCR method proposed in this paper, with $ALE = 18.2\%$.
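The node counts quoted above can be reproduced with a short sketch. The grid follows directly from the stated 30 m spacing in a 300 m × 300 m area; the exact position of the barrier is not given in this section, so the rectangle below (attached to the right edge, vertically centered) is an assumption chosen so that the C-shape area keeps 106 nodes.

```python
def grid_nodes(side=300, spacing=30):
    """Nodes on a regular grid covering [0, side] x [0, side]."""
    coords = range(0, side + 1, spacing)
    return [(x, y) for x in coords for y in coords]


def in_barrier(node, x_min=150, x_max=300, y_min=105, y_max=195):
    # Assumed placement of the 150 m x 90 m barrier (hypothetical coordinates).
    x, y = node
    return x_min < x <= x_max and y_min < y < y_max


square = grid_nodes()
c_shape = [node for node in square if not in_barrier(node)]
print(len(square), len(c_shape))  # 121 106
```

With this placement the barrier covers 5 × 3 = 15 grid points, which accounts for the drop from 121 to 106 nodes.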

Figure 3

Localization results of regular deployment in square area.

Figure 4 describes the case in which there is a barrier in the regular deployment area; this experiment is mainly used to study the effect of non-line-of-sight on the localization results. Figure 4(a) shows the deployment of nodes; Figure 4(b) shows the localization results of LE-WLS, with $ALE = 46.8\%$; Figure 4(c) shows the LE-IILA method proposed by Ji, with $ALE = 42.2\%$; Figure 4(d) shows the LE-FWLS-CCR method proposed in this paper, with $ALE = 16.1\%$.

Figure 4

Localization results of regular deployment in C-shape area.

It can be seen from Figures 3 and 4 that LE-WLS and LE-IILA consider only heteroscedasticity and do not take multicollinearity into account; therefore, only nodes in part of the incremental area obtain satisfactory results, and the localization errors in some areas remain large. The LE-FWLS-CCR method proposed in this paper jointly considers heteroscedasticity, error escalation, and multicollinearity, so its localization results are more accurate than those of LE-WLS and LE-IILA.
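To illustrate why collinear beacon nodes are harmful, the sketch below uses the standard linearization of the range equations (subtracting the last beacon's equation from the others), which is an assumption here and not necessarily the exact formulation used in this paper. It computes the condition number of the resulting design matrix; all names are hypothetical.

```python
import math


def design_matrix(beacons):
    """Rows 2 * (x_i - x_ref, y_i - y_ref), with the last beacon as reference."""
    x_ref, y_ref = beacons[-1]
    return [(2.0 * (x - x_ref), 2.0 * (y - y_ref)) for x, y in beacons[:-1]]


def condition_number(rows):
    """2-norm condition number of an m x 2 matrix via the eigenvalues of A^T A."""
    a = sum(r[0] * r[0] for r in rows)
    b = sum(r[0] * r[1] for r in rows)
    c = sum(r[1] * r[1] for r in rows)
    # Closed-form eigenvalues of the symmetric 2 x 2 matrix [[a, b], [b, c]].
    mean = (a + c) / 2.0
    spread = math.hypot((a - c) / 2.0, b)
    lam_max, lam_min = mean + spread, mean - spread
    if lam_min <= 1e-12 * lam_max:
        return math.inf  # rank deficient: the least-squares solution is not unique
    return math.sqrt(lam_max / lam_min)


collinear = [(0.0, 0.0), (30.0, 30.0), (60.0, 60.0), (90.0, 90.0)]
spread_out = [(0.0, 0.0), (90.0, 0.0), (0.0, 90.0), (90.0, 90.0)]
print(condition_number(design_matrix(collinear)))   # inf
print(condition_number(design_matrix(spread_out)))  # sqrt(3), about 1.73
```

When the beacons lie on one line, the two columns of the design matrix are proportional, so an ordinary or weighted least-squares estimate is ill-defined along one direction regardless of how the weights are chosen.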

Because the incremental localization method locates nodes step by step, as shown in Figure 3, the barrier does not reduce localization coverage. However, because the barrier “squeezes” the deployment area, the proportion of beacon nodes is higher than in a barrier-free area of the same size when the number of beacon nodes is the same. With a higher proportion of beacon nodes, the localization accuracy in most of the area is relatively higher, as shown in Figure 4. As Figures 3 and 4 also show, neither LE-WLS nor LE-IILA considers the multicollinearity between original and newly added beacon nodes; as a result, the localization errors of nodes in some areas are large, which degrades the overall localization performance.

Figure 5 plots the average ALE over the repeated experiments for the three localization methods against the number of beacon nodes in the regular deployment scenes. The ALE of LE-WLS fluctuates the most and its accuracy is the lowest; LE-IILA is second, while LE-FWLS-CCR proposed in this paper is the most accurate. The reason is that, in regular deployment, the original and newly added beacon nodes are very likely to be collinear, which neither LE-WLS nor LE-IILA considers, so the noise is not completely removed. LE-WLS considers only heteroscedasticity and not noise escalation; consequently, its ALE curve fluctuates strongly and sometimes approaches 180%. LE-IILA considers noise escalation, although only in an idealized way, and does not take multicollinearity into account; it therefore obtains better results than LE-WLS, but the results are still unstable and its ALE sometimes exceeds 100%, so LE-IILA can hardly meet practical needs. The method proposed in this paper considers the multiple factors that affect accuracy in the incremental localization process and obtains fairly stable localization results with significantly higher accuracy than the other incremental localization methods.

Figure 5

Average localization error of regular deployment.

4.1.2. Random Deployment

Random deployment is closer to the actual situation. The experiments in this scene are mainly used to examine whether the algorithm is suitable for various practical situations. As before, the random deployment experiments are divided into two groups: a C-shape area with a barrier and a square area without a barrier. In this group of experiments, 200 nodes are randomly deployed in a $300\,\mathrm{m} \times 300\,\mathrm{m}$ monitoring area and, as in the regular deployment, LE-FWLS-CCR is compared with LE-WLS and LE-IILA to evaluate how the ALE of each algorithm changes with the number of beacon nodes. In the scene with a barrier, a $150\,\mathrm{m} \times 90\,\mathrm{m}$ object is placed in the deployment area to artificially make communication between nodes ineffective in this area. In these experiments, 5–15 nodes are selected as beacon nodes, whose location information is assumed to be known.
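A random deployment of this kind could be generated as in the sketch below. It assumes uniform sampling, assumes that no node is placed inside the barrier (rejection sampling keeps the total at 200), and uses a hypothetical barrier position; none of these details are specified in this section.

```python
import random


def random_deployment(n_nodes=200, side=300.0, barrier=None, seed=0):
    """Draw n_nodes uniformly in a square, skipping points inside the barrier."""
    rng = random.Random(seed)
    nodes = []
    while len(nodes) < n_nodes:
        x, y = rng.uniform(0.0, side), rng.uniform(0.0, side)
        if barrier is not None:
            x_min, y_min, x_max, y_max = barrier
            if x_min <= x <= x_max and y_min <= y <= y_max:
                continue  # rejected: the point falls inside the barrier
        nodes.append((x, y))
    return nodes


# Hypothetical 150 m x 90 m barrier attached to the right edge of the area.
BARRIER = (150.0, 105.0, 300.0, 195.0)
nodes = random_deployment(barrier=BARRIER)

# Pick 10 of the deployed nodes as beacon nodes (indices into `nodes`).
beacons = random.Random(1).sample(range(len(nodes)), 10)
print(len(nodes), sorted(beacons))
```

Calling `random_deployment()` without a barrier gives the square-area scene with the same number of nodes.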

Figure 6 shows the final localization result for one deployment in the square area with 10 beacon nodes. Figure 6(b) shows the localization results of the LE-WLS method, with weights equal to the reciprocals of the variance of the error term, which is theoretically optimal, $ALE = 41\%$; Figure 6(c) shows the LE-IILA method proposed by Ji, $ALE = 18.5\%$; Figure 6(d) shows the LE-FWLS-CCR method proposed in this paper, $ALE = 15.2\%$. In this figure, the localization results of unknown nodes near the area dense with original beacon nodes are better, but as the incremental level increases, the localization by LE-WLS becomes worse. LE-IILA is better than LE-WLS because it considers error escalation, but in some areas the estimates of unknown nodes are still far from the actual values because it does not consider multicollinearity between newly added and original beacon nodes. The results of LE-FWLS-CCR proposed in this paper remain stable and are more accurate than those of the other two methods.

Figure 6

Localization results of random deployment in square area.

Figure 7 shows the localization result for one random deployment in the C-shape area with a barrier. Figure 7(a) shows the node distribution for this deployment. Figures 7(b), 7(c), and 7(d) show the localization results of LE-WLS, LE-IILA, and LE-FWLS-CCR, respectively, with final ALE values of 38.6%, 15.6%, and 13.1%. The figure shows a clear trace of incremental localization, namely, an anticlockwise, step-by-step location estimation. Because of the barrier, the same number of beacon nodes accounts for a higher proportion per unit area than in the random deployment in the square area; therefore, at low incremental levels, all three methods estimate the unknown nodes accurately, but as the incremental level increases, the differences among the three methods appear. LE-WLS considers neither error control nor multicollinearity, so its localization error at higher levels is large. LE-IILA considers error escalation to a certain extent but not multicollinearity, so its localization results are improved but still poor in some areas. The algorithm proposed in this paper obtains localization results similar to those in the previous three scenes, and its localization accuracy remains stable and high.

Figure 7

Localization results of random deployment in C-shape area.

Figure 8 plots the average ALE over the repeated experiments for the three localization methods against the number of beacon nodes in the random deployment scenes. The curves of LE-WLS and LE-IILA still fluctuate, because random deployment is far more irregular than regular deployment; as a result, the maximum ALE of LE-WLS and LE-IILA even approaches 180%. The method proposed in this paper, however, still obtains stable results. Because it fully accounts for the adverse factors in the localization process, the characteristics of random deployment do not greatly reduce its localization accuracy but instead improve it.

Figure 8

Average localization error of random deployment in square area.

4.2. Simulation Experiment Based on Actually Measured Data

This paper uses the measured data set provided by Neal Patwari of the University of Utah. The experiment was arranged in a standard office area, a $12\,\mathrm{m} \times 14\,\mathrm{m}$ rectangle, in which 44 nodes (4 of them acting as beacon nodes) were deployed. Communication among nodes uses direct sequence spread spectrum (DS-SS), and the center frequency of the deployed nodes is 2.4 GHz. This paper uses these data and enlarges the effective communication radius of the nodes to compare LE-WLS, LE-IILA, and the LE-FWLS-CCR method proposed in this paper. Table 1 shows that the localization results of LE-FWLS-CCR for four different communication radii are better than those of the other two localization algorithms.

Table 1

Comparisons of average localization errors based on actual RSSI measurement data.

Wireless communication radius (m)	LE-WLS ALE	LE-IILA ALE	LE-FWLS-CCR ALE
6.5	99.1%	46.6%	20.8%
7	81.8%	39.6%	19.6%
7.5	70.23%	41.3%	18.4%
8	83.51%	49.12%	15.51%
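As a rough illustration of how the rows of Table 1 compare, the sketch below computes the relative reduction in ALE achieved by LE-FWLS-CCR with respect to each baseline. The values are copied from Table 1; the helper name is a hypothetical choice.

```python
# ALE in percent from Table 1: radius (m) -> (LE-WLS, LE-IILA, LE-FWLS-CCR)
TABLE_1 = {
    6.5: (99.1, 46.6, 20.8),
    7.0: (81.8, 39.6, 19.6),
    7.5: (70.23, 41.3, 18.4),
    8.0: (83.51, 49.12, 15.51),
}


def relative_reduction(baseline, proposed):
    """Fraction by which `proposed` lowers the error of `baseline`."""
    return 1.0 - proposed / baseline


reductions = {
    radius: (relative_reduction(wls, fwls), relative_reduction(iila, fwls))
    for radius, (wls, iila, fwls) in TABLE_1.items()
}
for radius, (vs_wls, vs_iila) in reductions.items():
    print(f"R = {radius} m: {vs_wls:.0%} lower than LE-WLS, {vs_iila:.0%} lower than LE-IILA")
```

At a radius of 7 m, for example, the ALE of LE-FWLS-CCR is roughly 76% lower than that of LE-WLS and about half that of LE-IILA.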

Figure 9 shows the localization results of the three algorithms for a communication radius of 7 m. Figure 9(a) shows the node deployment; Figure 9(b) shows the localization result of the LE-WLS method, with $ALE = 81.8\%$; Figure 9(c) shows the localization result of the LE-IILA method, with $ALE = 39.6\%$; Figure 9(d) shows the localization result of the LE-FWLS-CCR method proposed in this paper, with $ALE = 19.6\%$. In Figure 9(b), only the unknown nodes near the original beacon nodes are localized relatively well; in Figure 9(c), the unknown nodes far from the original beacon nodes are localized better than in Figure 9(b); the localization result in Figure 9(d) is the best of the three.

Figure 9

Localization results of actually measured data.

5. Conclusion

This paper combines the FWLS method and the CCR method in one localization process: FWLS is used to solve the problems in location estimation caused by heteroscedasticity, and CCR, a dimensionality reduction algorithm from multivariate analysis, is used to deal with the topology between original and newly added beacon nodes and with error accumulation. The results of several groups of experiments indicate that the proposed method can effectively resolve the heteroscedasticity, accumulative error, and multicollinearity problems, and that its localization results are stable and more accurate than those of previous incremental localization methods.

Footnotes

Acknowledgments

This paper is sponsored by the Natural Science Foundation of China (61005008); the Provincial University Natural Science Research Foundation of Jiangsu Education Department (11KJD510002, 12KJD510006); and the Natural Science Foundation of Jiangsu (BK2012082).

References

1. Yick, Mukherjee, Ghosal. Wireless sensor network survey. Computer Networks 2008; 52(12): 2292–2330. Scopus 2-s2.0-46449122114. doi:10.1016/j.comnet.2008.04.002

2. Cabri, Leonardi, Mamei, Zambonelli. Location-dependent services for mobile users. IEEE Transactions on Systems, Man, and Cybernetics A 2003; 33(6): 667–681. Scopus 2-s2.0-0346076753. doi:10.1109/TSMCA.2003.819496

3. Zheng, Jamalipour. Wireless Sensor Networks: A Networking Perspective. New York, NY, USA: John Wiley and Sons; 2009.

4. Priyantha N. B., Balakrishnan, Demaine E. D., Teller. Mobile-assisted localization in wireless sensor networks. Proceedings of the IEEE 24th Annual Joint Conference of the IEEE Computer and Communications Societies (IEEE INFOCOM '05); March 2005; Miami, Fla, USA. 172–183. Scopus 2-s2.0-25844450264.

5. Liu. Accumulative error analysis of incremental node localization approach and its improvement in wireless sensor network. Journal of Nanjing University of Science and Technology 2008; 32(4): 496–501. Scopus 2-s2.0-51649126226.

6. Wang, Shi, Ren. Self-localization systems and algorithms for wireless sensor networks. Journal of Software 2005; 16(5): 857–868. Scopus 2-s2.0-21144448647. doi:10.1360/jos160857

7. Lim, Sen P. K., Peddada S. D. Accounting for uncertainty in heteroscedasticity in nonlinear regression. Journal of Statistical Planning and Inference 2012; 142(5): 1047–1062. Scopus 2-s2.0-84856021736. doi:10.1016/j.jspi.2011.11.003

8. Cribari-Neto, da Silva W. B. A new heteroskedasticity-consistent covariance matrix estimator for the linear regression model. AStA Advances in Statistical Analysis 2011; 95(2): 129–146. Scopus 2-s2.0-79955482035. doi:10.1007/s10182-010-0141-2

9. Meesookho, Mitra, Narayanan. On energy-based acoustic source localization for sensor networks. IEEE Transactions on Signal Processing 2008; 56(1): 365–377. Scopus 2-s2.0-37749025998. doi:10.1109/TSP.2007.900757

10. Xiong, Tang. Incremental node localization approach and its improvement in wireless sensor network. Chinese Journal of Sensors and Actuators 2011; 24(4): 576–580. Scopus 2-s2.0-79961061060. doi:10.3969/j.issn.1004-1699.2011.04.021

11. Yan, Qian. A localization method based on principal component analysis. Journal of Computational Information Systems 2012; 8(22): 9425–9432.

12. Hyötyniemi. Multivariate Regression Techniques and Tools. http://autsys.aalto.fi/en/Publications/25041, 2013.

13. Heij, de Boer, Franses P. H., Kloek, van Dijk H. K. Econometric Methods with Applications in Business and Economics. Oxford, UK: Oxford University Press; 2004.

14. Abraham, Merola. Dimensionality reduction approach to multivariate prediction. Computational Statistics and Data Analysis 2005; 48(1): 5–16. Scopus 2-s2.0-10144261309. doi:10.1016/j.csda.2003.11.021

15. Reiter. Enhanced Multiple Output Regression based on Canonical Correlation Analysis with Applications in Computer Vision. 2010.

16. Sun. Canonical correlation analysis for multilabel classification: a least-squares formulation, extensions, and analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 2011; 33(1): 194–200. Scopus 2-s2.0-78649325096. doi:10.1109/TPAMI.2010.160

17. Schick. Weighted least squares estimation with missing responses: an empirical likelihood approach. Electronic Journal of Statistics 2013; 7: 932–945. doi:10.1214/13-EJS793

18. Bidwell, Hassell M. E., Westphal C. R. A weighted least-squares finite element method for elliptic problems with degenerate and singular coefficients. Mathematics of Computation 2013; 82(282): 673–688.

19. Jolliffe. Principal Component Analysis. 2nd edition. 2002.