
Abstract

The development of effective indoor localization is one of the most significant research topics in location-based services. In this work, we propose a novel fingerprint localization model, which divides the localization area into subareas by fuzzy C-means and calculates the location via relative distance fuzzy localization. In the offline training stage, the fuzzy C-means algorithm divides the localization area into subareas, and the useful access points in each subarea are then selected to reduce the dimension of the fingerprint. In the online location stage, we use the nearest neighbor algorithm to select the subarea and calculate the coordinate of the target point with the relative distance fuzzy localization algorithm, which converts the traditional fingerprints of reference points into distance fingerprints and calculates the coordinate of the target point by fuzzy C-means algorithm. The relative distance fuzzy localization algorithm takes the noise and the non-linear attenuation between wireless signal strength and distance into full consideration, which eliminates the random environmental noise. Experiments show that the proposed model reduces the calculation time and improves the localization accuracy.

Keywords

Fingerprint, fuzzy C-means clustering, subarea dividing, access point selection, relative distance

Introduction

Today, the Internet of Things (IoT) has received growing attention and has been investigated in widespread fields such as mobile communication, wireless network, and intelligent devices. One of the focuses of IoT application is location-based service (LBS). Currently, global positioning system (GPS)^1,2 is largely available and widely applied in outdoor location. However, the multipath effect is a downside of GPS technology that limits its usage in the indoor environment. Therefore, indoor localization techniques that are independent from satellite and station signals have been extensively studied, such as wireless local area network (WLAN), radio frequency identification (RFID), Bluetooth, and infrared technology. Among the aforementioned localization technologies, WLAN is the most promising approach because of its low cost, non-line-of-sight propagation, and ubiquitous coverage.³

Indoor localization technologies can be classified into two categories: those based on distance and time, and those based on the characteristics analysis of localization area.⁴ The former includes angle of arrival (AOA),⁵ time of arrival (TOA),⁶ and time difference of arrival (TDOA),⁷ which require multiple stations for signal measurement. Of these, TOA and TDOA need hardware that can achieve highly accurate time synchronization between access points (APs) and reference points.⁸ The latter, such as RFID,⁹ requires the deployment of additional tags to estimate the object location based on the received signal strength indication (RSSI). WLAN, by contrast, not only has a low cost but also achieves a high accuracy in object location estimation.

Currently, one of the most esteemed solutions for WLAN-based indoor localization is the fingerprint technology.¹⁰ In the offline stage, “fingerprints” are obtained from the signal strength (SS) of APs and stored in a database. In the online stage, object locations are calculated by comparing the current fingerprint with the stored ones. Traditional fingerprint localization technology requires that all the fingerprint information be compared and calculated, which results in high complexity and poor real-time performance of the system.^11,12 Virtual Reference Elimination (VIRE)¹³ requires only a small amount of fingerprint information to create the fingerprint database through D-value and sparse reconstruction. However, it is easily influenced by the differences between heterogeneous devices, which leads to larger errors.

These traditional WLAN-based indoor localization technologies demand the calculation of a large amount of fingerprint vectors. To mitigate the complexity, a four-phase localization technology has been proposed to calculate the target location. During the offline stage, localization area is divided into different subareas utilizing fuzzy C-means (FCM) algorithm. Then, binary clustering for APs is calculated via k-means++ algorithm. After that, the useful APs in subareas are selected. During the online stage, target subarea is selected utilizing nearest neighbor (NN) algorithm based on useful APs, after which the target location is calculated by relative distance fuzzy localization (RDFL) algorithms proposed in this article.

This article is organized as follows: In section “Related works,” some basic algorithms are introduced and the key factors affecting the localization accuracy are analyzed. Then, a novel model of indoor location, used to reduce the computation complexity of fingerprint during the online stage, is shown in section “Methodology.” In order to evaluate the performance of the proposed model, several experiments have been conducted, which is shown in section “Experimental result.” Section “Conclusion” concludes the article.

Related works

Although terminal localization technologies based on cellular mobile communication networks and assisted global positioning system (A-GPS) have been used in wide-ranging applications, they fail to achieve a high accuracy in indoor localization. Several technologies are used in indoor localization, such as optical tracking,¹⁴ ZigBee,¹⁵ infrared,^4,16 Bluetooth,¹⁷ WLAN,¹⁸ and RFID.¹⁹ Optical tracking technology requires a line of sight between the tracking target and the detector. Although the ZigBee technology has high localization accuracy, it requires extensive hardware support and time–space cost. Infrared only applies to areas without obstacles because short-range infrared signals cannot traverse them. Active Badge^20,21 is an example of an infrared technology-based indoor localization system. Bluetooth is easy to integrate, yet it is unstable and susceptible to complicated environments. RFID exploits received signal strength (RSS) to calculate locations, but it fails to make full use of the existing WLAN infrastructure in the localization area.

WLAN-based localization technology can be divided into two categories: propagation based and training based. The former requires strict time synchronization and high sensitivity of the hardware to obtain arrival time information. Compared to the latter, those localization techniques under the first category only obtain low localization accuracy and are generally more time–space cost-effective.

Training-based localization technology includes two models: propagation and location fingerprint. The former utilizes an empirical model to describe the relationship between SS and distance.²² However, the impact of indoor noises and the complex relationship between SS and distance lead to lower accuracy in the former model. Therefore, the primary research direction of WLAN localization technology is on the latter, which is the foremost research content of this article. Fingerprint-based localization model creates location fingerprint database in offline stage using the stored location and the mapping relation of signal value and then calculates the location of the object point by comparing the fingerprint information received by an AP via signal matching algorithm in online stage.

A typical fingerprint-based indoor localization system is radio detection and ranging (RADAR), which uses the received SS by APs to match the aggregated signal layout and calculates the location by NN algorithm. Chen²³ has conducted intensive research on the NN algorithm and proposed that the location in WLAN environment could be estimated by robust pattern match and vector matching. Roos et al.²⁴ proposed a fingerprint localization technology based on probability, which utilized kernel function to fit probability density function (PDF) of SS in offline and matched bias functions to calculate location in online. Youssef and Agrawala²⁵ use location-clustering techniques to reduce the computational requirements of the localization algorithm. M Zhou et al.²⁶ proposed a novel information-based approach to derive the fundamental limit of localization and explore the relationship between the localization precision and AP placement, with the use of Fisher information matrix (FIM). Chen and Wang²⁷ proposed a novel indoor subarea localization scheme based on fingerprint passive crowdsourcing and unsupervised clustering, which first classifies unlabeled RSS measurements into several clusters and then relates to indoor subareas to generate subarea fingerprints. Wang et al.²⁸ applied the surface fitting technique to construct SS spatial distribution functions and proposed two location search methods to find the target location. Liu et al.²⁹ obtained accurate acoustic ranging estimates among peer phones and then mapped their locations jointly against Wi-Fi signature map subjecting to ranging constraints. As shown in Figure 1, the fingerprint vector is m dimension representing SS of APs, under a condition that there are m APs in the localization area.

Figure 1.

Model of fingerprint localization.

In the localization area, the APs’ locations and SS generate a radio map, which we name the “fingerprint map (FM)”; the location is then calculated through the Euclidean distance between the FM and the real-time SS in the location stage. To obtain a highly accurate FM, a large number of samples must be collected during the offline stage. The complexity of fingerprint location comprises two parts: the large number of fingerprint vectors for samples and the high dimension of the fingerprint vector for APs.^30–33 Collecting this amount of fingerprint data becomes increasingly difficult as the considered area becomes larger and more complicated. The current fingerprint technology has failed to estimate exactly how many APs are affected and where to place them in a given workspace.

A large amount of higher dimension fingerprint vectors may take up vast space–time resources, rather than improving the location accuracy. During the experiment, SS showed a large attenuation when the wireless signal passed through obstacles such as wall and table. As shown in Figure 2, SS is stable in a certain area, but it has noteworthy differences in disparate areas.

Figure 2.

Change of SS indoor.

In response to above problems, k-nearest neighbor (KNN)-based two-step FCM weight (KTFW)³⁴ algorithm has been proposed, and k APs have been selected by KNN and FCM based on received SS and location of APs. Reweighted iterative combinational residual minimum (RICIM) algorithm divides localization area and creates a directed graph with the weight of subarea by determining the junction of the nearest subarea and then utilizes shortest path to calculate location. The technology that collects a handful of SS to create a FM through graph theory is not completely available, especially in a complex indoor environment. KNN mixed with the FCM³⁵ technology has been proposed, which builds a relationship between SS and distance in different subarea, and utilizes support vector machine (SVM) for positioning. Wang et al.³⁶ proposed curve fitting for SS and distance and utilized breadth search or depth search to locate.

The above algorithms reduce the complexity of calculation and provide an available localization area to calculate location in the offline stage. For algorithms,^34,35 the KNN-FCM hybrid algorithm is utilized in the online stage, which can increase the computational load in the location stage, because a large number of APs need to be selected by KNN and clustered by FCM. Therefore, a method which divides subareas by FCM in the offline stage has been proposed.

For example, A Aksu and P Krishnamurthy³⁷ proposed a method based on subarea divisions, and Xu et al.³⁸ proposed a method based on the generated distance loss model through subarea divisions. However, these two methods inadequately considered the influence of different APs and failed to reduce the complexity of location calculation in the online stage. Therefore, AP selection has been proposed.

According to experiments and theoretical analysis, there is often a part of SS of APs which does not change with the fingerprint localization environment. Therefore, AP selection can remove useless APs whose received signal SS is weak and reduce the dimension of fingerprint vectors.

Kushki et al.³⁹ discussed the importance of AP selection in locating stage and described the basic principles of correlation minimization among APs selection. Chen et al.⁴⁰ proposed a power-efficient technique known as CaDet which calculates the location through intelligent selection of APs. Zhu and Deng⁴¹ proposed an AP selection method, known as neighborhood rough sets (NRS). However, this method is prone to omitting some valuable APs and has certain limitations as it considers only the neighborhood area. Zou et al.⁴² proposed an AP selection method, which combines information gain and mutual information entropy to decrease the computation cost. Residual-ranking achieves higher accuracy of coordinate calculation through selecting APs of smaller difference between reference points and object points. K-strongest-RSSI (KSR) selects K APs of the strongest signal by Fisher norm.

After AP selection, some localization technologies utilize KNN or fuzzy set to calculate location. The core idea of KNN is the calculation of Euclidean distance between the SS of APs in offline and the SS of APs in online, as well as the selection of k-nearest APs. It then calculates these location average or gives different weights according to distance to estimate the location of object. KNN can obtain higher accuracy of localization when the environment remains unchanged in both offline and online stages. However, its factors are usually diversified in localization environment, such as temperature, humidity, and weather, which leads to a change in the SS of APs and a reduction in the accuracy of localization. The fuzzy set and KNN are somewhat similar. However, KNN describes the far or near position of the SS of APs in offline and online through Euclidean distance, while fuzzy set describes it through the similarity of signals. First, fingerprint transform model (FTM) transforms the fingerprint information of APs into distance fingerprint information. Second, the location is calculated based on fuzzy set.
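The KNN estimate described above can be sketched in a few lines. This is a minimal illustration, not the implementation of any cited work: the function name `knn_locate`, the inverse-distance weighting, and the small constant that avoids division by zero are all assumptions made for the example, and the neighbors are taken to be reference fingerprints with known coordinates.

```python
import math

def knn_locate(ref_fps, ref_xy, sample, k=3, weighted=True):
    """Estimate a position from the k stored fingerprints nearest to `sample`.

    ref_fps: offline fingerprints (one SS vector per reference point)
    ref_xy:  (x, y) coordinate of each reference point
    sample:  SS vector measured online
    """
    # Rank reference points by Euclidean distance in signal space.
    order = sorted(range(len(ref_fps)),
                   key=lambda i: math.dist(ref_fps[i], sample))[:k]
    if weighted:
        # Assumed weighting: closer fingerprints count more (inverse distance).
        w = [1.0 / (math.dist(ref_fps[i], sample) + 1e-9) for i in order]
    else:
        # Plain average of the k nearest locations.
        w = [1.0] * len(order)
    total = sum(w)
    x = sum(wi * ref_xy[i][0] for wi, i in zip(w, order)) / total
    y = sum(wi * ref_xy[i][1] for wi, i in zip(w, order)) / total
    return (x, y)
```

With `weighted=False` the function returns the plain average of the k nearest locations; with `weighted=True` it gives nearer fingerprints a larger share, which corresponds to the "different weights according to distance" variant mentioned in the text.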

Methodology

Algorithm

As shown in Figure 3, the fingerprint localization model is divided into two stages: offline and online. During the offline stage, the localization area is divided into different clusters by FCM, using the technique proposed in this article, and the APs in each subarea are split into two classes by the k-means++ algorithm to select the available APs. During the online stage, the subareas where the object points are located are selected using the NN algorithm. Then, the coordinate of the target point is calculated according to the RDFL algorithm.

Figure 3.

Model of fingerprint localization.

FCM algorithm

Among the cluster algorithms based on objective function, the theory of the FCM algorithm is the most mature. It was derived from the optimization in hard c-means clustering algorithm and its general form is described in the following.

Suppose that in a sample set $X = \{x_{1}, x_{2}, \dots, x_{n}\}$, $x_{k} = (x_{k1}, x_{k2}, \dots, x_{km})$ is the feature vector of sample $x_{k}$, where $x_{kj}$ represents the value of the jth dimension of $x_{k}$. The sample set is divided into c clusters, and $u_{ij}$ is defined as the degree to which sample $x_{j}$ belongs to the ith cluster $c_{i}$

$\begin{cases} u_{ij} \in [0, 1] \\ \sum_{i = 1}^{c} u_{ij} = 1, \forall j \\ 0 < \sum_{j = 1}^{n} u_{ij} < n, \forall i \end{cases}$ (1)

$J_{m} (u, p) = \sum_{i = 1}^{c} \sum_{j = 1}^{n} u_{ij}^{m} (d_{ij})^{2}, m \in [1, \infty)$ (2)

In formula (2), p represents the cluster center matrix and m represents the weight index of fuzziness. (d_ij)² represents the distance between ith cluster center and the jth sample point. Its definition is as follows

$(d_{ij})^{2} = \| x_{j} - p_{i} \|_{A}^{2} = (x_{j} - p_{i})^{T} A (x_{j} - p_{i})$ (3)

In formula (3), A represents a positive definite matrix. Formula (3) gives the squared Euclidean distance when A is the identity matrix. The aim of clustering is to calculate $\min \{J_{m} (u, p)\}$ by formula (4)

$\begin{cases} u_{ij} = {\left[\sum_{k = 1}^{c} {\left(\frac{d_{ij}}{d_{kj}}\right)}^{2 / (m - 1)}\right]}^{- 1}, & 1 \leq i \leq c, 1 \leq j \leq n \\ p_{i} = {\left[\sum_{k = 1}^{n} {(u_{ik})}^{m}\right]}^{- 1} \sum_{k = 1}^{n} {(u_{ik})}^{m} x_{k}, & 1 \leq i \leq c \end{cases}$ (4)

The sample set is divided into c clusters through membership degree matrix u and cluster center matrix p.
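The alternating updates of formula (4) can be sketched as follows. This is a minimal illustration under stated assumptions: A is the identity matrix (plain Euclidean distance), the memberships are initialized at random, and the loop stops when no membership changes by more than `tol`. The function name `fcm` and its parameters are hypothetical, and m must be greater than 1 for the exponent 2/(m − 1) to be defined.

```python
import math
import random

def fcm(samples, c, m=2.0, tol=1e-5, max_iter=100, seed=0):
    """Fuzzy C-means with Euclidean distance; returns (u, centers).

    u[i][j] is the membership of sample j in cluster i.
    """
    rng = random.Random(seed)
    n, dim = len(samples), len(samples[0])
    # Random initial memberships, scaled so each sample's memberships sum to 1.
    u = [[rng.random() for _ in range(n)] for _ in range(c)]
    for j in range(n):
        col = sum(u[i][j] for i in range(c))
        for i in range(c):
            u[i][j] /= col
    centers = []
    for _ in range(max_iter):
        # Center update: second line of formula (4).
        centers = []
        for i in range(c):
            w = [u[i][k] ** m for k in range(n)]
            total = sum(w)
            centers.append([sum(w[k] * samples[k][d] for k in range(n)) / total
                            for d in range(dim)])
        # Membership update: first line of formula (4).
        shift = 0.0
        for j in range(n):
            # Clamp distances away from zero to avoid division by zero.
            d = [max(math.dist(samples[j], centers[i]), 1e-12) for i in range(c)]
            for i in range(c):
                new = 1.0 / sum((d[i] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                shift = max(shift, abs(new - u[i][j]))
                u[i][j] = new
        if shift < tol:
            break
    return u, centers
```

Because every membership column is renormalized by construction, the constraint in formula (1) that the memberships of each sample sum to 1 holds after each update.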

K-means++ algorithm

The k-means++ algorithm improves the k-means algorithm and addresses the well-known problem of initial cluster center selection. It divides a given data set into a certain number of clusters. The algorithm aims to find the center point of each cluster by minimizing the distance between the cluster center and the members of the same cluster. The algorithm comprises the steps listed in Table 1.

Table 1.

K-means++ clustering algorithm.

1. Initialize the first centroid u₁, which represents the initial center point of the sample set; k represents the number of clusters

2. Calculate the distance D(s_t) between each sample point s_t and its nearest centroid, where Sum(D(s_t)) represents the summation of these distances

3. Select a random value R that is smaller than Sum(D(s_t)) and subtract D(s_i) from it sample by sample; s_i is the next centroid when R − D(s_i) ≤ 0

4. Repeat steps (2) and (3) until k cluster centers are obtained. Suppose the set of cluster centers is C = {u₁, u₂,…, u_k}

5. Calculate the Euclidean distance between each centroid and each sample s_t (t = 1, 2,…, n) and then assign each sample to its nearest centroid

6. Update each cluster center to the mean of its samples, where S_i,t represents the tth sample in the ith cluster, and n_i represents the number of samples in the ith cluster

7. Determine whether to stop the algorithm. If |ERROR_i − ERROR_i−1| < δ, then the algorithm is stopped. Otherwise, return to step (5). ERROR_i represents the average variance of the clusters in the ith iteration
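Steps 1 to 4 of Table 1 (the seeding that distinguishes k-means++ from plain k-means) can be sketched as below. The sketch follows the table in weighting candidates by the distance D itself; the commonly published k-means++ variant weights by the squared distance instead. The function name `kmeanspp_seeds` and the fixed random seed are assumptions made for the example.

```python
import math
import random

def kmeanspp_seeds(samples, k, seed=0):
    """Pick k initial centroids: the first at random, the rest by a
    distance-weighted random draw (steps 1-4 of Table 1)."""
    rng = random.Random(seed)
    centers = [rng.choice(samples)]                    # step 1
    while len(centers) < k:                            # step 4
        # Step 2: distance from each sample to its nearest chosen centroid.
        d = [min(math.dist(s, c) for c in centers) for s in samples]
        # Step 3: draw R in [0, Sum(D)) and walk the samples until R is used up.
        r = rng.random() * sum(d)
        for s, ds in zip(samples, d):
            r -= ds
            if r <= 0:
                centers.append(s)
                break
        else:
            centers.append(samples[-1])                # guard against rounding
    return centers
```

Samples far from every existing centroid own a larger share of the interval [0, Sum(D)), so they are more likely to be drawn, which spreads the initial centers across the data.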

Fuzzy set theory

By the classic set theory, X is a collection of points denoted generically by x. Thus, X = {x}. A fuzzy set A in X is characterized by a degree of membership function. When A is a set in the classic set theory whose membership values could be either 0 or 1, its value depends on whether or not x belongs to A. It contains the value “1” if x belongs to A, and “0” otherwise.

LA Zadeh⁴³ proposed a concept of fuzzy set that describes the fuzziness of object by a degree of membership function whose value interval is [0, 1].

The degree of membership close to 1 indicates that the object is close to set A, whereas the degree of membership close to 0 indicates that it is further away from set A. As shown in formula (5), any element x corresponds to a unique degree of membership function $u_{\tilde{A}} (x) \in [0, 1]$

$u_{\tilde{A}} : X \to [0, 1]$ (5)

Suppose X is a finite set, $\tilde{A}$ is a fuzzy set of $X = \{x_{1}, x_{2}, \dots, x_{n}\}$, and $u_{\tilde{A}} (x_{i}), i = 1, 2, \dots, n$ is the degree of membership function. Thus, the definition of the fuzzy set $\tilde{A}$ is as follows

$\tilde{A} = \frac{u_{\tilde{A}} (x_{1})}{x_{1}} + \frac{u_{\tilde{A}} (x_{2})}{x_{2}} + \dots + \frac{u_{\tilde{A}} (x_{n})}{x_{n}} = \sum_{i = 1}^{n} \frac{u_{\tilde{A}} (x_{i})}{x_{i}}$ (6)

Let X be an infinite set; the definition of the fuzzy set $\tilde{A}$ is then as follows

$\tilde{A} = \int_{x \in X} \frac{u_{\tilde{A}} (x)}{x}$ (7)

Subarea division

Suppose that in this location area, n is the number of reference points and m is the number of available APs. Each AP is viewed as a coordinate information source, and a group of RSSI samples received at reference point i provides the fingerprint $F_{i}$. The FM is determined by equation (8)

$FMAP = \{F_{i} \mid i = 1, 2, \dots, n\}$ (8)

With our prior knowledge of the signal information between reference points and APs, we record the coordinate information $p_{i} = (x_{i}, y_{i})$ of the reference points, and the fingerprint database is then determined by equation (9)

$FSKEP = \{p_{i}, F_{i} \mid i = 1, 2, \dots, n\}$ (9)

The “FSKEP” database is divided into clusters by a cluster algorithm, and each cluster is treated as a subarea. In this article, we utilize FCM to divide the fingerprint database into c clusters, where the centroid of each cluster represents the characteristics of that cluster. According to the characteristics of FCM, the subarea division becomes more accurate as the cluster characteristics become clearer.

The FCM cluster algorithm based on the fingerprint database consists of two main steps: fingerprint normalization and subarea division. Fingerprint normalization maps the fingerprint characteristics into the interval [0, 1]. After fingerprint normalization, FCM is utilized to divide the fingerprint database into different clusters. The specific steps are as follows.

Fingerprint normalization

In general, the SS of a wireless signal is between −100 and 0 dBm in indoor environments. If SS and location are imported directly as cluster characteristics, dimensions with different units contribute unequally to the clustering. Suppose two reference points have the fingerprints $FingerPrint_{1} = \{x_{1}, y_{1}, rssi_{1}\}$ and $FingerPrint_{2} = \{x_{2}, y_{2}, rssi_{2}\}$; the Euclidean distance between these points is then calculated as follows

$DistanceFinger = \sqrt{{(x_{1} - x_{2})}^{2} + {(y_{1} - y_{2})}^{2} + {(rssi_{1} - rssi_{2})}^{2}}$ (10)

In formula (10), x, y, and rssi are measured in different units, so a characteristic whose values change only slightly may be ignored when the distance between two points is calculated.

Suppose the fingerprint of reference point i is $FSKEP_{i} = \{FSKEP_{ij} \mid 1 \leq j \leq m\}$, its maximal fingerprint characteristic is $MAXFSKEP_{i}$, and its minimal fingerprint characteristic is $MINFSKEP_{i}$

$FSKEP_{ij}^{*} = (MaxR - MinR) \times \frac{FSKEP_{ij} - MINFSKEP_{i}}{MAXFSKEP_{i} - MINFSKEP_{i}} + MinR$ (11)

In formula (11), MaxR and MinR represent the ceiling and floor of the normalization region, which are 1 and 0, respectively

$FSKEP_{i}^{*} = \{FSKEP_{ij}^{*} \mid 1 \leq j \leq m\}$ (12)

In formula (12), $FSKEP_{i}^{*}$ represents the normalized fingerprint of reference point i, and the normalized fingerprint database $FSKEP^{*}$ is

$FSKEP^{*} = \{FSKEP_{i}^{*} \mid 1 \leq i \leq n\}$ (13)
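The min-max normalization of formula (11) can be sketched for a single reference point as follows. The function name `normalize_fingerprint` is hypothetical, and the handling of a fingerprint whose entries are all equal (where the denominator of formula (11) would be zero) is an assumption added for the example.

```python
def normalize_fingerprint(fp, min_r=0.0, max_r=1.0):
    """Rescale one fingerprint vector into [min_r, max_r] as in formula (11)."""
    lo, hi = min(fp), max(fp)
    if hi == lo:
        # Assumption: a constant fingerprint maps to the floor of the region.
        return [min_r for _ in fp]
    return [(max_r - min_r) * (v - lo) / (hi - lo) + min_r for v in fp]
```

For example, a fingerprint of −90, −60, and −30 maps to 0, 0.5, and 1, so its entries become comparable with coordinates that have been rescaled in the same way.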

Subarea division

The FCM cluster algorithm is improved based on its objective function. The data set $FSKEP^{*}$ of formula (13), with $1 \leq i \leq n$ and $1 \leq j \leq m$, is to be divided, where n is the number of reference points and m is the number of available APs. The specific steps of the algorithm are as follows:

1. Initialize the number of clusters c, weighted index m = 2 (the fuzziness index of formula (2), not the number of APs), iterative stopping threshold s, cluster center $v_{0}$, iterative counter a = 0, and membership degree matrix U.

2. U is updated by

$U_{ij} = {\left[\sum_{k = 1}^{c} {\left(\frac{\| FSKEP_{j}^{*} - v_{i} \|}{\| FSKEP_{j}^{*} - v_{k} \|}\right)}^{\frac{2}{m - 1}}\right]}^{- 1}, \quad 1 \leq i \leq c, 1 \leq j \leq n$ (14)

In formula (14), $\| FSKEP_{j}^{*} - v_{i} \|$ represents the Euclidean distance between the cluster center $v_{i}$ and the normalized fingerprint $FSKEP_{j}^{*}$.

3. $v_{i}$ is updated by

$v_{i} = \frac{\sum_{j = 1}^{n} {(U_{ij})}^{m} \times FSKEP_{j}^{*}}{\sum_{j = 1}^{n} {(U_{ij})}^{m}}, \quad 1 \leq i \leq c$ (15)

4. Determine whether to stop the algorithm. If $\| v_{a} - v_{a + 1} \| \leq s$, the algorithm stops, and U and v are exported. Otherwise, assign a = a + 1 and return to step (2).

The cluster with the maximal degree of membership is selected as the subclass of each reference point

$Class_{i} = \{FSKEP_{j}^{*} \mid 1 \leq j \leq n, U_{ij} = \max_{1 \leq l \leq c} (U_{lj})\}$ (16)

In formula (16), $Class_{i}$ contains between 1 and n − c + 1 reference points. Reference point j belongs to cluster $Class_{i}$ only when its degree of membership in $Class_{i}$ is greater than its degree of membership in every other cluster. According to formula (16), each reference point determines the cluster it belongs to
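The hard assignment of formula (16) amounts to taking, for each reference point, the cluster with the largest membership. A minimal sketch, assuming the membership matrix is stored as a list of rows `u[i][j]` (cluster i, reference point j) and using the hypothetical function name `assign_subareas`:

```python
def assign_subareas(u):
    """Return, for each cluster, the indices of the reference points whose
    largest membership falls in that cluster (formula (16))."""
    c, n = len(u), len(u[0])
    classes = [[] for _ in range(c)]
    for j in range(n):
        # Cluster with the maximal degree of membership for reference point j.
        best = max(range(c), key=lambda i: u[i][j])
        classes[best].append(j)
    return classes
```

Ties are resolved in favor of the lowest cluster index here; the text does not say how ties should be handled, so that choice is an assumption.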

$v = \{v_{i} \mid 1 \leq i \leq c\}$ (17)

In formula (17), $v_{i}$ represents

$v_{i} = \{FSKEP'_{j} \mid 1 \leq j \leq n\}$ (18)

In formula (18), $FSKEP'_{j}$ represents the new characteristic of cluster center i. Reference point j exports a value between 0 and 1. Through the FCM clustering algorithm, the reference points in different subareas form different fingerprint databases. The available APs are then optimized by the k-means++ algorithm as follows.

AP selection

The fingerprint technique requires a fingerprint vector that combines the geographical coordinate of the reference point with the RSSI values received from different APs. In the online stage, the location is estimated by comparing the measured SS values with the previously measured values. As previously mentioned, the larger the number of APs is, the higher the dimension of the fingerprint vector becomes. Nevertheless, not all of the APs improve the localization accuracy; on the contrary, faulty data may be imported. Figure 4 shows the SS values between 5 reference points and 65 APs, and most of the SS values of the APs are weak.

Figure 4.

RSSI between AP.

In Figure 4, APs can be divided into two sub-clusters: the APs with strong signals, which contribute to the location accuracy, and the APs with weak signals, which are abandoned. In this article, the k-means++ algorithm is utilized to divide the APs into these two sub-clusters. We select the APs that fall into the former sub-cluster as useful APs

$FR_{i} = \{RSSI_{j} \mid j = 1, 2, 3, \dots, m\}$ (19)

$FR = \{FR_{i} \mid i = 1, 2, 3, \dots, n\}$ (20)

$FR^{*} = FR^{T}$ (21)

In formula (19), $FR_{i}$ represents the fingerprint of the ith reference point. In formula (20), FR, an $n \times m$ matrix, represents the fingerprint database. In formula (21), $FR^{*}$ represents the transposed matrix of FR (Table 2).

Table 2.

Pseudo-code of k-means++ algorithm.

Input: $FR^{*}$
Output: Cluster
Begin:
  Dim V; //V holds the cluster centers; the first one is selected randomly
  Dim b = 0; //b is the number of iterations
  V(1) = Random( $FR^{*}$ );
  Dim SumDist = 0;
  For i = 1 to m
    SumDist += Dist(V(1), $FR^{*}$ (i));
  End For
  rd = Random(SumDist); //random value in [0, SumDist)
  For i = 1 to m
    rd -= Dist(V(1), $FR^{*}$ (i));
    If rd <= 0
      Then V(2) = $FR^{*}$ (i); Break;
    End If
  End For
  Do{
    Cluster(1) = {}; Cluster(2) = {}; //reassign all APs in every pass
    For i = 1 to m
      Dist1 = Dist(V(1), $FR^{*}$ (i));
      Dist2 = Dist(V(2), $FR^{*}$ (i));
      If Dist1 < Dist2
        Then Cluster(1).Add( $FR^{*}$ (i));
      Else
        Cluster(2).Add( $FR^{*}$ (i));
      End If
    End For
    For i = 1 to 2
      V(i) = Sum(Cluster(i))/Cluster(i).size;
    End For
    b++;
    Error1(b) = Sum((Cluster(1)-V(1))²);
    Error2(b) = Sum((Cluster(2)-V(2))²);
  }while(b < 2 || |Error1(b)-Error1(b-1)| >= δ || |Error2(b)-Error2(b-1)| >= ε);
End
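The binary AP split of Table 2 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the function name `select_strong_aps` is hypothetical, the loop stops when the assignment no longer changes instead of using the thresholds δ and ε, and the "strong" cluster is taken to be the one whose center has the larger summed RSSI.

```python
import math
import random

def select_strong_aps(fr_t, iters=50, seed=0):
    """fr_t[i] is the RSSI vector of AP i over all reference points
    (one row of FR*). Returns the indices of the APs in the strong cluster."""
    rng = random.Random(seed)
    first = rng.choice(fr_t)
    # Second center: distance-weighted random draw, as in the seeding of Table 2.
    dists = [math.dist(first, row) for row in fr_t]
    r = rng.random() * sum(dists)
    second = fr_t[-1]
    for row, d in zip(fr_t, dists):
        r -= d
        if r <= 0:
            second = row
            break
    centers = [list(first), list(second)]
    labels = []
    for _ in range(iters):
        # Assign every AP to the nearer of the two centers.
        new_labels = [0 if math.dist(row, centers[0]) <= math.dist(row, centers[1]) else 1
                      for row in fr_t]
        if new_labels == labels:
            break                      # assignment is stable
        labels = new_labels
        # Move each center to the mean of its members.
        for c in (0, 1):
            members = [row for row, lab in zip(fr_t, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    # Assumption: the cluster with the larger summed RSSI is the strong one.
    strong = 0 if sum(centers[0]) >= sum(centers[1]) else 1
    return [i for i, lab in enumerate(labels) if lab == strong]
```

The returned indices identify the columns of the fingerprint database to keep, which is how the selection reduces the dimension of the fingerprint vectors in each subarea.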

Subarea selection

In the online location stage, we first match the measured RSSI sample with the cluster centers and then select the subarea that is most similar to the measured RSSI sample as the area of the sample. According to the NN algorithm, the distances between the measured object point and the cluster centers are compared, and the subarea whose center is closest to the object is selected.

Suppose that the measured sample fingerprint is $T = \{RT_{1}, RT_{2}, \dots, RT_{m}\}$, and the distance between cluster center $v_{i}$ and the measured sample is represented by

$DT_{i} = \| v_{i} - T \| = \sqrt{\sum_{j = 1}^{m} {(v_{ij} - RT_{j})}^{2}}$ (22)

In formula (22), $v_{i}$ represents the ith cluster center. The object belongs to the cluster $C_{i}$ whose center $v_{i}$ has the smallest distance $DT_{i}$ to the measured sample.
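The subarea selection of formula (22) is a nearest-center lookup. A minimal sketch, with the hypothetical function name `select_subarea` and the assumption that the cluster centers and the measured sample use the same AP ordering:

```python
import math

def select_subarea(centers, sample):
    """Return the index i of the cluster center v_i with the smallest
    Euclidean distance DT_i to the measured fingerprint (formula (22))."""
    dists = [math.dist(v, sample) for v in centers]
    return min(range(len(centers)), key=dists.__getitem__)
```

Only c distances are computed per query, one for each subarea, instead of one for each reference point, which is where the online stage saves computation.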

Fuzzy set localization

Fingerprint transform model

The indoor wireless signal transmission accords with logarithmic path loss model

$RSSI (d)_{dB} = RSSI (d_{0}) - 10 β \lg (\frac{d}{d_{0}}) + ε$ (23)

In formula (23), $RSSI (d_{0})$ represents the SS at the reference distance $d_{0}$ between the AP and the signal source, and $ε$ is a random variable that obeys a normal distribution ( $ε \sim N (0, σ_{dB}^{2})$ ). $β$ represents the path loss factor, which is usually set to 3 or 4 in indoor environments.

Suppose the distances between two APs and the signal source are $d_{1}$ and $d_{2}$, and the SS at $d_{1}$ exceeds the SS at $d_{2}$ by ΔdB

$RSSI (d_{1}) = RSSI (d_{0}) - 10 β \lg (\frac{d_{1}}{d_{0}}) + ε_{1}$ (24)

$RSSI (d_{1}) - Δ dB = RSSI (d_{0}) - 10 β \lg (\frac{d_{2}}{d_{0}}) + ε_{2}$ (25)

Since the two APs are in the same localization area, we suppose that ε₁ and ε₂ are equal. Subtracting formula (25) from formula (24) gives

$Δ dB = 10 β \lg (\frac{d_{2}}{d_{1}})$ (26)

Convert formula (26) to formula (27)

$d_{2} = 10^{\frac{Δ dB}{10 β}} d_{1}$ (27)
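Formula (27) can be checked numerically with a short sketch. The function name `relative_distance` and the default path loss factor β = 3 are assumptions chosen for the example; the text only says that β is usually 3 or 4 indoors.

```python
def relative_distance(delta_db, d1=1.0, beta=3.0):
    """Distance d2 implied by an SS difference of delta_db relative to a
    point at distance d1, following formula (27)."""
    return 10 ** (delta_db / (10.0 * beta)) * d1
```

With β = 3, a difference of 30 dB corresponds to a tenfold change in distance, and a difference of 0 dB leaves the distance unchanged. Because only the ratio of the two distances enters, the absolute value of d₁ does not need to be known.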

As shown in formula (27), $Δ dB$ describes the distance relationship between the two APs. In formula (28), $FingerPrint_{i}$ represents the SS between the ith AP and the m signal sources, and $MinFingerPrint_{i}$ represents the weakest SS among them

$\begin{matrix} FingerPrint_{i} = \{RSSI_{0}, RSSI_{1}, \dots, RSSI_{j}, \dots, RSSI_{m}\} \\ MinFingerPrint_{i} = \min (FingerPrint_{i}) \end{matrix}$ (28)

The difference between $FingerPrint_{i}$ and $MinFingerPrint_{i}$ is

$FingerPrint'_{i} = \{RSSI'_{0}, RSSI'_{1}, \dots, 0, \dots, RSSI'_{m}\}$ (29)

According to formulas (27) and (29), formula (30) is as follows

$FingerPrint''_{i} = \{10^{\frac{RSSI'_{0}}{10 β}} d_{1}, 10^{\frac{RSSI'_{1}}{10 β}} d_{1}, \dots, d_{1}, \dots, 10^{\frac{RSSI'_{m}}{10 β}} d_{1}\}$ (30)

The form of formula (30) after normalization is as follows

$FingerPrint'''_{i} = \{10^{\frac{RSSI'_{0}}{10 β}}, 10^{\frac{RSSI'_{1}}{10 β}}, \dots, 1, \dots, 10^{\frac{RSSI'_{m}}{10 β}}\}$ (31)

Formula (31) represents the distance fingerprint of the ith AP, and these distance fingerprints constitute the “distance fingerprint map (DFP)” of the location area.
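The transform from an SS fingerprint to a normalized distance fingerprint, formulas (28) to (31), can be sketched as below. The sign of the offset is an assumption: it is taken as the weakest SS minus each entry, so that under the path loss model of formula (23) a stronger signal maps to a shorter relative distance, while the weakest entry maps to 1 as in formula (31). The function name `distance_fingerprint` and the default β = 3 are likewise assumptions.

```python
def distance_fingerprint(rssi, beta=3.0):
    """Convert an SS fingerprint into relative distances, normalized so that
    the weakest signal has relative distance 1."""
    weakest = min(rssi)                       # MinFingerPrint of formula (28)
    # Assumed sign convention: offset = weakest - value (<= 0 for every entry).
    return [10 ** ((weakest - v) / (10.0 * beta)) for v in rssi]
```

After this transform the unknown distance d₁ of formula (30) has cancelled, so two fingerprints can be compared through their relative distance patterns alone. Reversing the sign of the offset would invert every entry (each value x becomes 1/x).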

Relative distance fuzzy localization

Suppose there are n APs in the localization area, and U represents all distance fingerprints

$U = \{FingerPrint'''_{i} \mid i = 1, 2, \dots, n\}$ (32)

According to the fingerprint transform model, the distance fingerprint $TFingerPrint'''_{i}$ of the measured sample is calculated as shown in formula (33)

$TFingerPrint'''_{i} = \{TRSSI'''_{0}, TRSSI'''_{1}, \dots, 1, \dots, TRSSI'''_{m}\}$ (33)

The degree of membership function of AP is defined as follows

$Similarity_{i} = \frac{FingerPrint'''_{i} \cdot TFingerPrint'''_{i}}{| TFingerPrint'''_{i} | | FingerPrint'''_{i} |}$ (34)

$Similarity_{i}$ represents the cosine of the angle between the two vectors. The smaller the angle is, the larger the cosine and the higher the similarity.

According to formula (34), we calculate the similarity between the APs and the object point, sort the APs in descending order of similarity, and select the top k APs. Suppose the locations of these APs are $(x_{j}, y_{j}), j = 1, 2, \dots, k$; the location of the object is then as follows

$(\hat{x}, \hat{y}) = \frac{\sum_{j = 1}^{k} [Similarity_{j} \cdot (x_{j}, y_{j})]}{\sum_{j = 1}^{k} Similarity_{j}}$ (35)
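Formulas (34) and (35) together form a similarity-weighted centroid. A minimal sketch follows; the entries that the text calls APs are treated here as stored distance fingerprints with known coordinates, which is an interpretation rather than something the text states, and the names `cosine` and `rdfl_locate` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine of the angle between vectors a and b (formula (34))."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def rdfl_locate(ref_fps, ref_xy, target_fp, k=3):
    """Weighted centroid of the k most similar stored fingerprints (formula (35)).

    ref_fps:   stored distance fingerprints
    ref_xy:    (x, y) coordinate associated with each stored fingerprint
    target_fp: distance fingerprint of the measured sample
    """
    sims = [cosine(fp, target_fp) for fp in ref_fps]
    # Sort by similarity in descending order and keep the top k.
    top = sorted(range(len(sims)), key=sims.__getitem__, reverse=True)[:k]
    total = sum(sims[i] for i in top)
    x = sum(sims[i] * ref_xy[i][0] for i in top) / total
    y = sum(sims[i] * ref_xy[i][1] for i in top) / total
    return (x, y)
```

Distance fingerprints have strictly positive entries, so every cosine lies in (0, 1] and the weights in formula (35) cannot cancel each other.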

Experimental result

The experimental region, which includes a corridor, four offices, and eight classrooms, is approximately 860 m², with five APs deployed.⁴⁴ In every classroom, a reference point can receive signals from at least three APs on the current floor and several APs on the neighboring floors. For instance, reference points (blue points) in a room are separated by 3.7 m, and the nearest distance between the reference points in the current room and an adjacent room is 2.6 m. According to Figure 5, between 3 and 13 APs can be perceived at each position, and the average number is 7.

Figure 5.

Layout of experimental context.

The experiment comprises three parts: subarea division, AP selection, and subarea selection. The specific procedure is described below.

Subarea division

According to the proposed method, the FM is divided into different subareas. Figures 6 and 7 show the divisions obtained when the number of clusters is 13 and 14, respectively. Comparing Figures 5–7, we find that Figure 6 has the best performance.

Figure 6.

Division of cluster (C = 13).

Figure 7.

Division of cluster (C = 14).

With the number of clusters set to 14, we randomly select 68 test points and calculate their locations with the KNN, fuzzy, and Bayesian algorithms. First, the locations of the 68 test points are calculated by the KNN (k = 3) algorithm. Figure 8 compares the probability of error of two localization processes, one with subarea division and one without.

Figure 8.

Probability of error of KNN algorithm.
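The KNN baseline is only named in this comparison. A common form of it averages the coordinates of the k reference points whose fingerprints are closest to the measured one in signal space. The sketch below shows that form with k = 3; the function name, the Euclidean metric, the unweighted average, and the sample data are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def knn_locate(target: Sequence[float],
               ref_fps: List[Sequence[float]],
               ref_xy: List[Tuple[float, float]],
               k: int = 3) -> Tuple[float, float]:
    """Average the coordinates of the k reference points nearest in signal space."""
    # Rank reference points by Euclidean distance between fingerprints.
    order = sorted(range(len(ref_fps)),
                   key=lambda i: math.dist(target, ref_fps[i]))[:k]
    x = sum(ref_xy[i][0] for i in order) / len(order)
    y = sum(ref_xy[i][1] for i in order) / len(order)
    return x, y


# Hypothetical fingerprints (dBm from two APs) and reference coordinates (m).
ref_fps = [[-40.0, -70.0], [-45.0, -65.0], [-50.0, -60.0], [-80.0, -30.0]]
ref_xy = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0), (9.0, 9.0)]
x_knn, y_knn = knn_locate([-44.0, -66.0], ref_fps, ref_xy, k=3)
print(x_knn, y_knn)
```

With subarea division, the same routine would simply be run on the reference points of the selected subarea instead of the whole fingerprint map.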

The distance of error is defined as the distance between the calculated location and the real location in the localization stage. The probability of error is calculated as follows

$P = \frac{\sum_{i = 1}^{n} me_{i}}{\sum_{i = 1}^{n} m_{i}}$ (36)

where $me_{i}$ is the number of error points, and $m_{i}$ is the number of test points in the ith experiment. As shown in Figure 8, when the distance of error is less than 1 m, the probability of error is 0.14 with subarea division and 0.08 without it; that is, the localization process that includes subarea division places more test points within 1 m of their real locations than the process that excludes it.
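Formula (36) pools the counts over all n experiments before dividing, rather than averaging the per-experiment ratios. A minimal sketch, with a hypothetical function name and made-up counts, is:

```python
from typing import Sequence


def probability_of_error(error_counts: Sequence[int],
                         test_counts: Sequence[int]) -> float:
    """Pooled ratio of formula (36): sum of me_i divided by sum of m_i."""
    if len(error_counts) != len(test_counts):
        raise ValueError("one error count is needed per experiment")
    total = sum(test_counts)
    if total == 0:
        raise ValueError("no test points")
    return sum(error_counts) / total


# Hypothetical counts for two experiments with 68 test points each.
p = probability_of_error([10, 9], [68, 68])
print(round(p, 4))
```

Pooling weights every test point equally, so an experiment with more test points has proportionally more influence on P.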

Second, the locations of the 68 test points are calculated by the fuzzy algorithm. Figure 9 compares the probability of error of the two localization processes, one with subarea division and one without.

Figure 9.

Probability of error of fuzzy algorithm.

Third, the locations of the 68 test points are calculated by the Bayesian algorithm. Figure 10 compares the probability of error of the two localization processes, one with subarea division and one without.

Figure 10.

Probability of error of Bayesian algorithm.

Fourth, the locations of the 68 test points are calculated by the KNN, fuzzy, and Bayesian algorithms with subarea division (the number of subareas is 14). Their probabilities of error are compared in Figure 11, which indicates that the fuzzy algorithm performs better than the others. The experimental results show that subarea division can improve the accuracy of localization.

Figure 11.

Probability of error of KNN, fuzzy, and Bayesian algorithm.

AP selection

There are 68 randomly selected test points and 65 APs in Figure 5, and the number of APs perceived differs from one test point to another. The experiment is divided into two parts. In the first, the area is not divided into subareas, and the sample set consists of all 68 test points. In the second, the area is divided into subareas, and the sample set of each subarea consists of the test points located in that subarea, so that the test points over all subareas total 68. After that, the probability of error based on the fuzzy algorithm is obtained.

First, four test points are selected randomly in the area without subarea division, and the distribution of the received SS is observed. As shown in Figure 12, the SS is dispersed along the horizontal axis, which indicates that the APs perceived at different positions differ considerably. Therefore, binary clustering of APs via the k-means++ algorithm would lead to larger errors.

Figure 12.

SS of four sampling points inner subarea.

As shown in Figure 13, when the location area is not divided, the APs are divided into two subclasses using the k-means++ algorithm. The number of selected APs is 7, and KNN is applied to calculate the location. The experimental results show that the localization accuracy with AP selection is lower than that without AP selection, because some APs are lost in the AP selection by the k-means++ algorithm.

Figure 13.

Probability of error with AP selection.

As shown in Figure 14, four test points are selected randomly in the area with subarea division, and the distribution of the received SS is observed. The SS is concentrated along the horizontal axis, which indicates that the APs perceived at different positions do not differ noticeably. As shown in Figure 15, after AP selection, the number of useful APs in each subarea ranges from 4 to 7, with an average of 5.6.

Figure 14.

SS of four sampling points within a subarea.

Figure 15.

Bar chart of AP selection within each subarea.

With the number of subareas set to 14, the locations of these 68 test points are calculated by the KNN, fuzzy, and Bayesian algorithms. First, the locations of the 68 test points are calculated by the KNN (k = 3) algorithm. Figure 16 compares the probability of error of two localization processes, one with AP selection and one without.

Figure 16.

Probability of error of KNN algorithm.

Second, the locations of the 68 test points are calculated by the fuzzy algorithm. Figure 17 compares the probability of error of the two localization processes, one with AP selection and one without.

Figure 17.

Probability of error of fuzzy algorithm.

Third, the locations of the 68 test points are calculated by the Bayesian algorithm. Figure 18 compares the probability of error of the two localization processes, one with AP selection and one without.

Figure 18.

Probability of error of Bayesian algorithm.

Fourth, the locations of the 68 test points are calculated by the KNN, fuzzy, and Bayesian algorithms with AP selection. Their probabilities of error are compared in Figure 19, which shows that the fuzzy algorithm performs better than the others.

Figure 19.

Probability of error of KNN, fuzzy, and Bayesian algorithm.

After subarea division and AP selection, the average number of useful APs is reduced from 65 to 5.6. Table 3 shows that the localization times of the KNN, fuzzy, and Bayesian algorithms are reduced by about 21%, 20%, and 13%, respectively.

Table 3.

Time statistics before and after AP selection.

	Without AP selection	With AP selection	Time reduction	Reduction rate (%)
KNN	0.2117 × 10⁻³	0.1672 × 10⁻³	0.445 × 10⁻⁴	21
Fuzzy	0.2606 × 10⁻³	0.2075 × 10⁻³	0.531 × 10⁻⁴	20
Bayesian	0.3461 × 10⁻³	0.2995 × 10⁻³	0.466 × 10⁻⁴	13

KNN: k-nearest neighbor; APs: access points.
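The reduction columns of Table 3 follow directly from the two timing columns: the absolute saving is the difference, and the rate is that difference divided by the time without AP selection. The short check below recomputes them from the table values; the variable and function names are illustrative.

```python
from typing import Dict, Tuple

# Localization times from Table 3 (without and with AP selection),
# in the same unit as reported in the table.
times: Dict[str, Tuple[float, float]] = {
    "KNN": (0.2117e-3, 0.1672e-3),
    "Fuzzy": (0.2606e-3, 0.2075e-3),
    "Bayesian": (0.3461e-3, 0.2995e-3),
}


def reduction(before: float, after: float) -> Tuple[float, float]:
    """Return the absolute time saved and the relative saving in percent."""
    saved = before - after
    return saved, 100.0 * saved / before


rates = {name: reduction(b, a) for name, (b, a) in times.items()}
for name, (saved, pct) in rates.items():
    print(f"{name}: saved {saved:.3e}, reduction {pct:.1f}%")
```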

Subarea selection

As mentioned above, the experimental area is divided into 14 subareas. The received SS of a test point is normalized, and the Euclidean distance between the normalized values and each subarea is computed. The subarea with the smallest distance is taken as the subarea in which the test point lies.
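The subarea selection step can be sketched as a nearest-neighbor search over one representative vector per subarea. The sketch assumes that each subarea is represented by a single center vector (for example, its cluster center) and that normalization is a min-max scaling of RSSI into [0, 1] with hypothetical bounds of -100 and -30 dBm; neither detail is specified in this section.

```python
import math
from typing import List, Sequence


def normalize(rssi: Sequence[float], lo: float = -100.0, hi: float = -30.0) -> List[float]:
    """Min-max scale RSSI values into [0, 1]; the bounds are assumed, not from the paper."""
    return [(min(max(r, lo), hi) - lo) / (hi - lo) for r in rssi]


def select_subarea(rssi: Sequence[float], centers: List[Sequence[float]]) -> int:
    """Return the index of the subarea whose center is nearest in Euclidean distance."""
    v = normalize(rssi)
    return min(range(len(centers)), key=lambda c: math.dist(v, centers[c]))


# Hypothetical normalized centers for two subareas and two APs.
centers = [[0.2, 0.8], [0.9, 0.1]]
print(select_subarea([-86.0, -44.0], centers))
print(select_subarea([-37.0, -93.0], centers))
```

Only the selected subarea's reference points and useful APs then need to be searched in the localization step, which is where the reported time saving comes from.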

In the experiment, 68 test points, distributed in different areas, are randomly selected, and the subareas in which they reside are calculated by the NN and fuzzy algorithms. As shown in Figure 20, each algorithm assigns 4 test points to the wrong subarea, so the accuracy of subarea selection is about 94% (64 of 68).

Figure 20.

Result of subarea selection.

RDFL localization

As shown in Figure 21, the probability of error of the fuzzy algorithm is similar to that of RDFL when the distance between the calculated location and the real location is within 2 m, and it is lower than that of RDFL when this distance is greater than 2 m.

Figure 21.

Probability of error of RDFL and fuzzy algorithms.

As shown in Table 4, RDFL reduces the average distance of error by 0.46 m compared with the fuzzy algorithm (from 3.04 m to 2.58 m).

Table 4.

Average distance of error and average probability of error.

	Average distance of error (m)	Average probability of error
Fuzzy	3.04	0.72
RDFL	2.58	0.77

RDFL: relative distance fuzzy localization.

As shown in Figure 22, the probability of error of FTM-based KNN is lower than that of RDFL.

Figure 22.

Probability of error of FTM-KNN and RDFL algorithm.

As shown in Table 5, RDFL improves the average distance of error by 0.12 m, while its average probability of error is about the same as that of FTM-based KNN.

Table 5.

Average distance of error and average probability of error.

	Average distance of error (m)	Average probability of error
FTM-KNN	2.70	0.76
RDFL	2.58	0.77

FTM: fingerprint transform model; KNN: k-nearest neighbor; RDFL: relative distance fuzzy localization.

As shown in Figure 23, the probability of error of RDFL is higher than that of the other methods. As shown in Table 6, the average distance of error of RDFL is 2.58 m, which is smaller than that of the other localization methods.

Figure 23.

Probability of error of different algorithms.

Table 6.

Average distance of error and average probability of error.

	Average distance of error (m)	Average probability of error
KNN	3.78	0.67
Fuzzy	3.04	0.72
FTM-KNN	2.70	0.76
RDFL	2.58	0.77
Bayesian	5.21	0.75
FTM-Bayesian	5.03	0.72

RDFL: relative distance fuzzy localization; FTM: fingerprint transform model; KNN: k-nearest neighbor.

Conclusion

An indoor positioning model based on fingerprints has been designed in this article. The proposed approach combines the FCM and k-means++ algorithms. During the offline stage, the localization area is divided into subareas using the FCM algorithm. Then, binary clustering of APs is performed via the k-means++ algorithm, after which the useful APs in each subarea are selected. During the online stage, the target subarea is selected using the NN algorithm based on the selected APs, and the target location is then calculated by the RDFL algorithm, which is also proposed in this article. Experimental results show that subarea division and AP selection reduce the localization time of the tested algorithms by about 13% to 21%, and that RDFL achieves the smallest average distance of error (2.58 m) among the compared methods.

Footnotes

The authors are grateful to the anonymous reviewers for their constructive comments.

Academic Editor: Shanshan Zhao

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by the Natural Science Foundation of China (No. 61172018), Scientific Research Program of Shaanxi Province Education Department (No. 15JS077), Project of Xi’an social science planning foundation (No. 16J133), and Xi’an BeiLin Science Research Plan (No. GX1623, GX1626).

References

1. Okamoto, Miyazaki. System for measuring position by using global positioning system and receiver for global position system. US Patent 5,434,787, 1995.

2. Cooperative positioning and tracking in disruption tolerant networks. IEEE T Parall Distr 2015; 26: 382–391.

3. Ivanov, Nett, Schemmer. Automatic WLAN localization for industrial automation. In: WFCS 2008 IEEE international workshop on factory communication systems, Dresden, 21–23 May 2008, pp.93–96. New York: IEEE.

4. Hightower, Borriello. Location systems for ubiquitous computing. Computer 2001; 34: 57–66.

5. Friedman, Charbiwala, Schmid. Angle-of-arrival assisted radio interferometry (ARI) target localization. In: MILCOM 2008, military communications conference, San Diego, CA, 16–19 November 2008, pp.1–7. New York: IEEE.

6. Chan Y-T, Tsui W-Y, H-C. Time-of-arrival based localization under NLOS conditions. IEEE T Veh Technol 2006; 55: 17–24.

7. Liu. Research on the TDOA measurement of active RFID real time location system. In: 2010 3rd IEEE international conference on computer science and information technology (ICCSIT), Chengdu, China, 9–11 July 2010, pp.410–412. New York: IEEE.

8. Laaraiedh, Avrillon. Comparison of hybrid localization schemes using RSSI, TOA, and TDOA. In: 11th European wireless conference 2011 - sustainable wireless technologies (European wireless), Vienna, Austria, 27–29 April 2011, pp.1–5. Berlin: VDE.

9. Bekkali, Sanson, Matsumoto. RFID indoor positioning based on probabilistic RFID map and Kalman filtering. In: Third IEEE international conference on wireless and mobile computing, networking and communications WiMOB, White Plains, NY, 8–10 October 2007, p.21. New York: IEEE.

10. Abusara AM. Indoor positioning techniques and approaches for Wi-Fi based systems. Sharjah, UAE: American University of Sharjah, 2015.

11. Takenga, Kyamakya. A low-cost fingerprint positioning system in cellular networks. In: CHINACOM’07. Second international conference on communications and networking in China, Shanghai, China, 22–24 August 2007, pp.915–920. New York: IEEE.

12. Yang, Liu. Locating in fingerprint space: wireless indoor localization with little human intervention. In: Proceedings of the 18th annual international conference on mobile computing and networking, Istanbul, 22–26 August 2012, pp.269–280. New York: ACM.

13. Yiyang, Yunhao, LM. VIRE: active RFID-based localization using virtual reference elimination. In: ICPP 2007. International conference on parallel processing, Xi’an, China, 10–14 September 2007, p.56. New York: IEEE.

14. Alsindi, Pahlavan. Performance of TOA estimation algorithms in different indoor multipath conditions. In: IEEE wireless communications and networking conference, Atlanta, GA, 21–25 March 2004, pp.495–500. New York: IEEE.

15. Sugano, Kawazoe, Ohta. Indoor localization system using RSSI measurement of wireless sensor network based on ZigBee standard. Target 2006; 538: 050.

16. Want, Hopper, Falcao. The active badge location system. ACM T Inform Syst 2002; 10: 91–102.

17. Bargh, De Groote. Indoor localization based on response rate of Bluetooth inquiries. In: Proceedings of the first ACM international workshop on mobile entity localization and tracking in GPS-less environments, San Francisco, CA, 14–19 September 2008, pp.49–54. New York: ACM.

18. Baier. Technology for WLAN localization and location based service supply. US Patent 8,644,275, 2014.

19. Wegener, Fross, Rossler. Localization of objects using passive RFID technology. In: 2014 11th international multi-conference on systems, signals & devices (SSD), Barcelona, 11–14 February 2014, pp.1–4. New York: IEEE.

20. Privantha, Balakrishna, Teller. The cricket compass for context-aware mobile applications. In: Proceedings of the 7th annual international conference on mobile computing and networking, Sydney, Australia, 6–9 December 2010, pp.1–14. Rome: ACM Press.

21. Hazas, Ward. A novel broadband ultrasonic location system. In: Proceedings of the 4th international conference ubiquitous computing, Göteborg, 29 September–1 October 2002, pp.264–280. Berlin: Springer.

22. Bandyopadhyay, Majumder, Ghosal. A circular layer location management scheme for mobile ad-hoc networks. In: 2006 IFIP international conference on wireless and optical communications networks, Bangalore, India, 11–13 April 2006, pp.774–779. New York: IEEE.

23. Chen. An application of robust template matching to user location on wireless infrastructure. In: ICPR 2004. Proceedings of the 17th international conference on pattern recognition, Cambridge, 23–26 August 2004, vol. 3, pp.687–690. New York: IEEE.

24. Roos, Myllymäki, Tirri. A probabilistic approach to WLAN user location estimation. Int J Wireless Inform Network 2002; 9: 155–164.

25. Youssef, Agrawala. The Horus location determination system. Wirel Netw 2008; 14: 357–374.

26. Zhou, Qiu, Tian. An information-based approach to precision analysis of indoor WLAN localization using location fingerprint. Entropy 2015; 17: 8031–8055.

27. Chen, Wang. FinCCM: fingerprint crowdsourcing, clustering and matching for indoor subarea localization. IEEE Wirel Commun Lett 2015; 4: 677–680.

28. Wang, Zhou, Yang. Indoor positioning via subarea fingerprinting and surface fitting with received signal strength. Perv Mob Comput 2015; 23: 43–58.

29. Liu, Gan, Yang. Push the limit of WiFi based localization for smartphones. In: Proceedings of the 18th annual international conference on mobile computing and networking, Istanbul, 22–26 August 2012, pp.305–316. New York: ACM.

30. Nilsson, Bigun. Localization of corresponding points in fingerprints by complex filtering. Pattern Recogn Lett 2003; 24: 2135–2144.

31. Elnahrawy, Martin. The limits of localization using signal strength: a comparative study. In: IEEE SECON 2004. 2004 first annual IEEE communications society conference on sensor and ad hoc communications and networks, Santa Clara, CA, 4–7 October 2004, pp.406–414. New York: IEEE.

32. Trappe, Zhang. Robust statistical technologies for securing wireless localization in sensor networks. In: IPSN 2005. Fourth international symposium on information processing in sensor networks, Boise, ID, 15 April 2005, pp.91–98. New York: IEEE.

33. Yang, Liu. WILL: wireless indoor localization without site survey. IEEE T Parall Distr 2013; 24: 839–848.

34. Yubin, Yongliang, Lin. A KNN-based two-step fuzzy clustering weighted algorithm for WLAN indoor positioning. High Tech Lett 2011; 17: 223–229.

35. Sun. KNN-FCM hybrid algorithm for indoor location in WLAN. In: 2009 2nd international conference on power electronics and intelligent transportation system (PEITS), Shenzhen, China, 19–20 December 2009, pp.251–254. New York: IEEE.

36. Wang, Zhou, Liu. Indoor localization based on curve fitting and location search using received signal strength. IEEE T Ind Electron 2015; 62: 572–582.

37. Aksu, Krishnamurthy. Sub-area localization: a simple calibration free approach. In: Proceedings of the 13th ACM international conference on modeling, analysis, and simulation of wireless and mobile systems, Bodrum, 17–21 October 2010, pp.63–72. New York: ACM.

38. F-Y, L-B, Wang Z-X. A new WLAN indoor localization system based on distance-loss model with area partition. J Electron Inform Technol 2008; 6: 032.

39. Kushki, Plataniotis, Venetsanopoulos AN. Kernel-based positioning in wireless local area networks. IEEE T Mobile Comput 2007; 6: 689–705.

40. Chen, Yang, Yin. Power-efficient access-point selection for indoor location estimation. IEEE T Knowl Data En 2006; 18: 877–888.

41. Zhu Y-J, Deng Z-L. AP selection for indoor localization based on neighborhood rough sets. In: Vehicular technology conference (VTC Fall), Quebec City, QC, Canada, 3–6 September 2012, pp.1–5. New York: IEEE.

42. Zou, Zhang. An indoor positioning algorithm using joint information entropy based on WLAN fingerprint. In: 2014 international conference on computing, communication and networking technologies (ICCCNT), Hefei, China, 11–13 July 2014, pp.1–6. New York: IEEE.

43. Zadeh LA. Fuzzy sets. Inform Control 1965; 8: 338–353.

44. Jekabsons, Zuravlyov. Refining Wi-Fi based indoor positioning. In: Proceedings of 4th international scientific conference applied information and communication technologies, http://www.cs.rtu.lv/jekabsons/Files/Jek_AICT2010.pdf