Abstract
1. Introduction
In many application problems related with sensor network, location information of nodes is of great importance to the monitoring activity of the whole network, which plays a critical role in many applications [1]. Monitoring data without nodes’ location information is often of no use. 80% of information provided by sensor nodes to users related with the monitored area is connected with location [2]. In the application of wireless sensor network, nodes’ location information can be acquired by adding GPS to nodes. However, this is only applicable with outdoor and open-sided circumstances. Besides, GPS is large in volume and high in cost and energy consumption. Moreover, GPS also needs stable base installations. These facts have made it difficult to realize the requirements of sensor network, which are “low price, low cost and low energy consumption” [3]. As for this, in the deploying area, only some of the nodes can be installed with GPS. For the rest nodes, their location information can only be calculated via a certain algorithm. However, in monitoring area, some unknown nodes not within communication radius of beacon nodes cannot be estimated out though ordinary localization algorithm concurrently at once for the reasons that communication radiuses of sensor nodes are limited by energy, randomness of nodes distribution, barrier between nodes, and so on, which will cause deficient monitoring area coverage; as a result, the quality of sensor network service decreases rapidly and is not able to effectively monitor the deployment area. The most frequently used solution is to increase the coverage of nodes through mobile beacon nodes [4], but the path, difficult reachability, and relatively larger power consumption of mobile nodes and others limit the application of this method. Incremental estimation algorithm [5, 6] is another method to enhance nodes coverage that has some advantages, such as, it does not need to consider path problems, and is not limited by actions of mobile nodes, and moreover consumes much less power compared with mobile nodes. Incremental algorithm will estimate unknown nodes near beacon nodes at first; these unknown nodes will act as new beacon nodes once their locations are determined, then locations of the rest unknown nodes will be estimated by newly added beacon nodes together with original beacon nodes and so on in a similar way, locations of all nodes were estimated out.
Incremental localization is a kind of low energy consumption positioning method which can effectively solve the coverage rate of the monitoring area; through outward extension in turn, each node is localized successively. Due to the successive localization, the previous estimation error will be bound to affect the following estimated accuracy. Such error accumulation inevitably leads to the inconsistency of the variance between the previous error term and the following localized error term; such phenomenon is also known as heteroscedasticity [7, 8]. In the process of location estimation, the heteroscedasticity appears, then the traditional ordinary least squares (OLS) is used to estimate the location of unknown nodes; the estimated value of the obtained node coordinate may not be the efficient estimator, even not the asymptotically efficient estimator. In order to correct the adverse effect caused by the heteroscedasticity, Meesookho et al. [9] proposed the weighted least squares (WLS); they used the reciprocal of error variance as the weighting of weight to restrain the error propagation. WLS is considered to be the improved OLS method; similar to OLS, the residual sum of squares is solved firstly by WLS and followed by the minimum value. However, in the process of finding the residual sum of squares, unlike OLS, WLS considers the influence of heteroscedasticity. In view of considering the heteroscedasticity based on location estimation by WLS, the location accuracy is improved through corresponding different weights with different data. Subsequently, Xiong et al. [10] proposed a kind of incremental node localization method with the optimal weighted least squares on the basis of WLS; this method is based on the obtained optimal weighted least squares when the error variance matrix is estimated as the minimal. Ji and Liu [5] proposed another strategic improved incremental localization approach (IILA) with the hypothesis that previous localization accuracy is greater than next one during incremental localization, this means that the estimation of location nearer to original beacon nodes is more accurate; on the basis of this assumption, with estimated distance of previous location as a constraint condition, the localization problem is converted into trust region sequences that can be solved by sequential quadratic programming (SQP) method. However, IILA did not consider sensor network as a kind of multihop network that there are many paths to certain node, and neglected complexity of deployment environment but only assumed errors during localization process that definitely increase with increasing hop counts; that is, heteroscedasticity of localization process is only monotonically increasing. In complex monitoring environment, variation tendency of heteroscedasticity is difficult to predict, is not necessarily monotonically increasing, but also is possibly decreasing or concurrently increasing and decreasing. For example, as shown in Figure 1, in monitoring area, there are many paths from node A to unknown node D, such as A → B → C → D and A → E → D due to environment complexity of monitoring area, barrier or interference sources exist between node A and node E, and accuracy of measurement from node A to Node E is far less than that of other nodes; as a result, it is not appropriate to estimate ED distance if AE distance acts as the constraint condition.

Localization in complex environment.
In addition, there is a problem that error of locations through estimation has directivity, for example, in Figure 1, errors between node A and node E could along direction of
In pervious incremental algorithms, most incremental localization algorithms are used to adjust heteroscedasticity during localization process and assume that heteroscedasticity is only monotonically increasing but fail to take deployment environment and networking features of sensor network into account. Sensor network is a kind of multi-hope network with relatively worse deployment environment, and incremental pattern of its heteroscedasticity is complicated and diversified for incremental localization algorithm. Furthermore, as same as concurrent localization algorithm, the accuracy of incremental localization is influenced by topology quality of beacon nodes also, and multicollinearity problem [11, 12] caused by topological relations among beacon nodes is not considered by previous incremental algorithms. For these reasons, we will propose a feasible incremental localization algorithm, (Location Estimation-FWLS-CCR) LE-FWLS-CCR, which uses less beacon nodes and considers multi-hops features of sensor network, error accumulation, heteroscedasticity and multicollinearity, and other problems. This algorithm will resolve heteroscedasticity problem through feasible weighted least squares (FWLS) [13] iterative computation mode. The iterative process is more proper for multi-hops features, while in estimation process, we adopts Canonical Correlation Regression (CCR) [12, 14–16] multivariate analysis to solve errors exist in newly added beacon nodes as well as topological relations between newly added and original beacon nodes.
2. Relevant Knowledge Review
2.1. Feasible Weighed Least Squares
In a concurrent localization process, distance-coordinates formula is generally transformed into a form of
In the presence of heteroscedasticity, the positions of localization data in location estimation are different; smaller variance of error term of data means higher confidence level of residuals, while bigger variance of error term means lower confidence level of residuals. Therefore, as to the estimation of location in coordinates under the circumstance that heteroscedasticity exists, it usually uses weighted least squares method to discriminate different residuals [17], namely, to pay adequate attention to data terms with relatively smaller residuals and so assign larger weights on them and smaller weights on data term with larger residual in order to adjust the effect of various data items on estimation computation, and therefore to obtain an effective estimation value in localization process.
So, variance of error term of distance-coordinates formula exhibits heteroscedasticity which is expressed as
among which
Assume,
Then the variance of the error term is
Then, the heteroscedasticity of the error term is eliminated, and it is easy to learn that
In order to obtain the optimal solution, we must make
It is assumed that
Obviously, if the row vector of
Through Schwarz inequality [17, 18], it is proved that when the matrix Ω is the reciprocal of the variance matrix of the range error under the condition that the ratio of range error to the distance is independent Gaussian random variables, the error variance by WLS is minimal. But in reality, the variance of the error term is unknown; therefore, if WLS is solved, the weight needs to be taken according to the actual situation.
FWLS is a feasible method which is able to overcome the problem that cannot be implemented by WLS due to the unavailable weight. FWLS uses residuals attained at each computation as weight matrix; therefore, real weight values can be acquired in computation process, and procedure of FWLS algorithm is shown in Algorithm 1.
(1) Firstly, it is essential to estimate the model through OLS method and obtain the estimated value error (2) Utilize the square of the residual term as the (3) Obtain the next-order estimated value and residual value by WLS (4) Go back to Step 2, until the number of iterative times meet the number of times according to the algorithm requirements.
It can be noted that the FWLS algorithm is in marching iteration; the derivation of the optimal estimated value
2.2. Canonical Correlation Regression
In concurrent localization algorithm, beacon nodes have a very large influence on final location estimation and possibly cause significant errors when beacon nodes relations are collinear or approximately collinear [11]. Principal component analysis (PCA) in multivariate analysis will remove partial information through recombination of coordinate information of beacon nodes in order to reduce noise and effects of multicollinearity.
As to incremental localization algorithm, it can use PCA [19] method in the first estimation of coordinates of unknown nodes to avoid problems caused by multicollinearity in location estimation, but in incremental algorithm, measuring errors cannot be absolutely avoided or eliminated, which indicates that there would be always errors in locations of newly added beacon nodes in coordinates; therefore, it requires processing error information in output message. PCA only processes input variables; for incremental localization algorithm, its output variables act as locations of newly added beacon nodes, and errors contained in them shall be preprocessed to a certain extent.
Canonical correlation analysis (CCA) is a kind feasible and powerful multivariate analysis method especially appropriate for processing and analysis of two correlated data. At the same time, it is a kind of descending dimension method similar to PCA and is also able to remove some noise information that contains collinear information through recombination of data like PCA. CCA pays more attention to data processing and analysis of correlated data; for this reason, it is more proper for regression algorithm and has higher regression accuracy than PCA.
For the equation
Suppose that there are two groups of data,
in which
The correlation function ρ is independent of scales of
To solve this optimization problem of (11), we can build a Lagrange equation to obtain the optimal solution; that is,
Differentiating (12) with
To obtain the optimal solution, let (13a), (13b), (14a),and (14b) equal to zero, then
Multiply both sides of formula (14a) and (14b) by
Given
Here, the solution of CCA was translated into generalized eigenvalue-eigenvector problem of two matrixes whose scales are
Formula (17) can be abridged into
The literature [12, 15] proposed a regression method based on canonical correlation analysis, namely, canonical correlation regression (CCR). CCR combines least squares with canonical correlation analysis with the purposes of optimizating solution of regression coefficient under most relevant significance. CCA-based regression method, to some extent, avoids interferences of multicollinearity of samples through utilization of components that have been features-extracted for regression; in addition, CCA considers correlation between output and input invariables and so can be regarded as an advanced regression between two multivariates, an extension of multiple linear regression (MLR), and known as a “multi-to-multi” regression method. Regression coefficient of canonical correlation regression can be computed out by following equation:
in which,
3. Methodology
3.1. The LE-FWLS-CCR Algorithm
The existence of multicollinearity usually brings seriously adverse effects on model estimation, testing, and prediction. During localization estimation, multicollinearity not only exists in concurrent localization but also exists in incremental location estimation process. For this reason, we add canonical correlation regression method into FWSL algorithm to acquire optimal prediction direction through correlation analysis of input and output variables and dimension reduction processing then use FWLS method to resolve problems caused by heteroscedasticity. Because the procedure of FWLS-CCR localization algorithm is similar to that of FWLS algorithm, its solution is carried out through iteration, and the algorithm process is shown in Algorithm 2.
Input: Beacon nodes coordinates: Distance from beacon nodes to unknown nodes Output: Estimated coordinates of unknown nodes: (1) Beacon nodes deliver their location information outwards through controllable flooding, if unknown nodes acquire more than 3 beacon nodes, it firstly uses CCR to estimate locations of unknown nodes, and will stop if there is no rest unknown nodes, otherwise, will carry out next step. (2) Uses estimated location to estimate residual vector by the estimation formula: (3) Constructs covariance matrix through FWLS method residual vectors, (4) Uses newly-constructed covariance matrix to rewrite location distance equation, (5) According to new equation, uses CCR method to estimate locations of secondary beacon nodes. (6) If there still are some nodes which locations have not be estimate out in deployment area, skip to Step 2. (7) The algorithm will finish if there is no node to be estimated in deployment area, and it will output coordinates of unknown nodes.
Node localization process based on FWLS-CCR is shown in Figure 2. Assume that the monitoring area is deployed with several sensor nodes and

Example and phases of LE FWLS-CCR.
Obviously, the coordinates of node A can be calculated and estimated according to the beacon nodes of
In incremental localization approach, localization of nodes is implemented in batches. Owing to distance measurement error, there is a certain difference between the estimated value and the practical value of first level beacon node. As for second level beacon nodes, their estimated values are as well influenced by measurement error and the intrinsic error of first level beacon nodes. In addition, it is shown in Figure 2 that, when estimating node C, the positions of referential nodes
3.2. Time Complexity Analysis
In this Section, we compare the time complexities for localization as required by LE-FWLS-CCR and other popular incremental location estimation algorithms, namely, WLS-based location estimation (LE-WLS) proposed in the literature [10] as well as SQP-based location estimation (LE-IILA) proposed in the literature [5]. These algorithms will also be compared experimentally in Section 4.
LE-FWLS-CCR: basically, the complexity of our algorithm is dominated by two parts: canonical correlation regression and feasible weighted least squares. The complexity of computing CCR is mainly determined by the core algorithm CCA, whose complexity is
LE-WLS: WLS-based location estimation method using the WLS method as the core algorithm, and the complexity of WLS is
LE-IILA: the complexity of SQP method is the main reason in LE-IILA algorithm. If the limited number of iterations, the computational complexity of SQP is
The literature [20] has proved that properly increasing the calculation amount of algorithm will not affect the performance of sensor network. For this reason, it is worthy to improve the localization accuracy based on FWLS-CCR by sacrificing partial calculation volume.
4. Simulation and Experiments
This section will analyze and evaluate LE-FWLS-CCR localization algorithm on Matlab platform. In simulation experiment, it is supposed that nodes are deployed in a two-dimensional monitoring area and adopt transformation of RSSI signals to distance for matrix of distance among nodes. In order to compare impartiality of experimental results, this section adopts signal model proposed in the literature [21] to simulate signal strength among nodes; that is,
among which
Due to higher coverage of incremental algorithm, the experiment in this section mainly examines the accuracy of localization of nodes with ALE as evaluation basis, and the definition of ALE is as follows:
In the formula,
This experiment also compares the method proposed in this paper, the localization algorithm based on FWLS-CCR (LE-FWLS-CCR), with WLS-based location estimation (LE-WLS) proposed in the literature [10], as well as IILA-based location estimation (LE-IILA) proposed in the literature [5]. In addition, the experiment carries out comparison by use of data collected from actual scenes provided in the literature [20].
4.1. Simulation Experiments Based on Distance-Measuring Model
The experiments based on distance-measuring model have set four experimental scenes: random deployment nodes in square area, regular deployment nodes in square area, random deployment nodes in C-shape area and regular deployment nodes in C-shape area, in which C-shape area, is formed because of a bigger barrier, mainly used to evaluate localization performance with lager barrier, that is, in case of non-line-of-sight. In order to decrease the effect of single one experiment, each group of experiments will be repeated for 50 times in each scene, finally the average indicators of the 50 experiments will be reported. The experiments will examine accuracy of final localization results of unknown nodes with incremental quantity of beacon nodes. In these experiments, the valid communication radius of nodes is supposed to be 60 m.
4.1.1. Rules Deployment
Regular deployment of nodes in monitoring area principally aims to explore effects of collineation of beacon nodes on localization accuracy, while regular deployment in C-shape area is used to observe effects of non-line-of-sight caused by barriers in monitoring area on localization accuracy.
In this group of experiments, regular deployment of nodes is within a

Localization results of regular deployment in square area.
The circumstance that there is a barrier in regular deployment area was described in Figure 4(b) to Figure 4(d), and this experiment mainly is used to study effects of non-line-of-sight on localization results. Figure 4(a) shows deployment of nodes; Figure 4(b) shows localization results of LE-WLS,

Localization results of regular deployment in C-shape area.
It can be seen from Figures 3 and 4 that LE-WLS and LE-IILA only considered heteroscedasticity but didn't take multicollinearity into account; therefore, only nodes in partially incremental area obtained satisfactory results in experiments, and localization errors in some area are still large. LE-FWLS-CCR method proposed in this paper comprehensively considered heteroscedasticity, error escalation as well as multicollinearity, and other problems, and so localization results are more effective than those of LE-WLS and LE-IILA methods.
Because incremental localization method is used to locate nodes through gradually increment, as shown in Figure 3, barrier did not cause localization coverage reduction. While because of “extrusion” of barrier, the ratio of beacon nodes in such deployment area is higher than the same size of deployment area without barrier under the circumstance that the quantity of beacon nodes is equivalent. Ratio of beacon nodes increases; hence, localization accuracy in most of the area is relatively higher as shown in Figure 4. As shown in Figures 3 and 4, LE-WLS and LE-IILA similarly did not consider multicollinearity between original and newly added beacon nodes; as a result, localization errors of nodes in some areas are large and then affect the whole performance of localization.
Figure 5 describes curve of average ALE of repeated experiments by three localization methods varying with the quantity of beacon nodes in regular deployment scenes. It is easily to find ALE of LE-WLS fluctuating as the strongest, and accuracy is the lowest; LE-IILA is in second place, while LE-FWLS-CCR proposed in this paper is most accurate. This is because original and newly added beacon nodes are being most possibly collinear, furthermore, LE-WLS and LE-IILA algorithms did not consider this and noise was not removed completely, especially LE-WLS method that only considered heteroscedasticity but did not take noise escalation into account; consequently ALE curve waves greatly, sometimes; ALE is approximate to 180%; LE-IILA method only ideally considered noise escalation but did not take multicollinearity into account and so obtain better localization results than LE-WLS method, but the localization results are still not stable, and sometimes ALE is greater than 100%; therefore, the effectiveness of localization by LE-IILA is hard to meet actual needs; the method proposed in this paper considered multiple factors that affect accuracy in incremental localization process and obtained fairly stable localization results and significantly higher accuracy than other incremental localization methods.

Average localization error of regular deployment.
4.1.2. Random Deployment
Random deployment is more close to actual situation. The experiments in this scene are used mainly is to discuss whether this algorithm is proper for various actual situations or not. In the same way, experiments about random deployment are classified into two groups, in C-shape area with barrier and in square area without barrier. In this group of experiments, there are 200 nodes randomly deployed in a
Figure 6 shows final localization result of nodes of certain deployment in square area under the circumstance that there are 10 beacon nodes. Figure 6(b) shows localization results of LE-WLS method, with weights being reciprocals of variance of error term that is optimal theoretically,

Localization results of random deployment in square area.
Figure 7 shows localization result of certain random deployment in C-shape area with barrier. Figure 7(a) shows nodes distribution diagram of certain localization. Figures (b),(c), and (d) show localization result of LE-WLS, LE-IILA, and LE-FWLS-CCR, respectively, and final ALE of various localization in Figure 7 is 38.6%, 15.6%, and 13.1%, respectively. The figure shows very obvious trace of incremental localization, that is, anticlockwise step-by-step location estimation. Due to existence of barrier, same quantity of beacon nodes accounts for a higher proportion in unit area than random deployment in square area; therefore, if the incremental series is lower, three methods’ estimation of accuracy for unknown nodes is high, but with incremental series increasing, advantages and disadvantages of three methods will appear. LE-WLS did not consider error control and multicollinearity, so localization error for higher series is large; although LE-IILA considered error escalation to a certain extent, it did not consider multicollinearity, the localization results were improved but are still poor in some areas. The algorithm proposed in this paper obtained similar localization results with the previous three scenes, and its localization accuracy is still stable and excellent.

Localization results of random deployment in C-shape area.
Figure 8 describes curve of average ALE of repeated experiments by three localization methods varying with quantity of beacon nodes in random deployment scenes. Curves of LE-WLS and LE-IILA methods still wave ups and downs, because randomness of random deployment is far greater than that of regular deployment, and as a result, maximum ALE of LE-WLS and LE-IILA methods is even approximate to 180%; however, the method proposed in this paper still obtains stable results. Owing to full considerations on adverse factors in localization process, the characteristics of random deployment did not reduce localization accuracy greatly but improved it.

Average localization error of random deployment in square area.
4.2. Simulation Experiment Based on Actually Measured Data
This paper uses actually measured data set provided by Neal Patwari of Utah State University. The experiment was arranged in a standard office area that is a
Comparisons of average localization errors based on actual RSSI measurement data.
Figure 9 shows the localization results of three algorithms under the circumstance that communication radius is 7 m, in which Figure 9(a) shows nodes deployment; Figure 9(b) is localization result of LE-WLS method,

Localization results of actually measured data.
5. Conclusion
This paper combines FWLS method and CCR method in a localization process that uses FLWS method to solve problems in location estimation caused by heteroscedasticity and uses dimensionality reduction algorithm CCR in multivariate analysis to deal with topology between original and newly added beacon nodes and error accumulation problems. The results of many groups of experiments indicate that the method proposed in this paper can effectively resolve heteroscedasticity, accumulative error, and multicollinearity problems, and its localization results are stable and have higher accuracy than previous incremental localization methods.
