
Abstract

Due to GPS restrictions, an inertial sensor is usually used to estimate the location of indoor mobile robots. However, it is difficult to achieve high-accuracy localization and control by inertial sensors alone. In this paper, a new method is proposed to estimate an indoor mobile robot pose with six degrees of freedom based on an improved 3D-Normal Distributions Transform algorithm (3D-NDT). First, point cloud data are captured by a Kinect sensor and segmented according to the distance to the robot. After the segmentation, the input point cloud data are processed by the Approximate Voxel Grid Filter algorithm in different sized voxel grids. Second, the initial registration and precise registration are performed respectively according to the distance to the sensor. The most distant point cloud data use the 3D-Normal Distributions Transform algorithm (3D-NDT) with large-sized voxel grids for initial registration, based on the transformation matrix from the odometry method. The closest point cloud data use the 3D-NDT algorithm with small-sized voxel grids for precise registration. After the registrations above, a final transformation matrix is obtained and coordinated. Based on this transformation matrix, the pose estimation problem of the indoor mobile robot is solved. Test results show that this method can obtain accurate robot pose estimation and has better robustness.

Keywords

Pose Estimation, Point Cloud Registration, 3D-Normal Distributions Transform, Kinect

1. Introduction

Simultaneous Localization and Mapping (SLAM) has long been a research topic in the field of robotics, with pose estimation one of the key subjects in SLAM research [1]. Because the positioning accuracy of GPS is limited and its signals are blocked by obstacles, the pose estimation of an indoor mobile robot usually relies on inertial sensors. However, it is difficult to obtain the desired positioning accuracy because of the inherent systematic errors and the accumulated measurement error [2]. Therefore, technologies such as LASER point clouds and depth sensors have been applied to SLAM in recent years [3–6]. Point cloud registration is a technique for calculating the relative displacement between two point clouds. Its essence is to minimize the distance between point cloud data measured from two perspectives by means of a coordinate transformation. The transformation matrix obtained from point cloud registration is then used to calculate the current pose of the robot.

Currently, the most common algorithm for point cloud registration is the Iterative Closest Point (ICP) algorithm [7]. The ICP method provides the framework for subsequent iteration-based registration algorithms, and many efforts have been made to improve its speed and accuracy [8–10]. The Normal Distributions Transform (NDT) algorithm [11] is an improved point cloud registration method based on Probability Density Functions (PDFs). The NDT algorithm was combined with Monte Carlo Localization and used for mobile robot localization in reference [12]. The 3D-NDT algorithm was proposed and compared with the ICP algorithm in reference [13]. Another improved NDT algorithm, the Multi-Layered Normal Distributions Transform (ML-NDT) algorithm, was presented in reference [14]. The NDT algorithm and its improved variants have been applied in point cloud classification [15], mobile robotic mapping [16] and path planning [17]. For example, the Continuous Normal Distributions Transform (C-NDT) algorithm was proposed and applied to robotic mapping in reference [18], and in reference [19] the NDT algorithm was applied to an autonomous wheel loader. As the NDT algorithm has improved in speed and accuracy, it has become more widely used in robot SLAM. In all of the improved algorithms above, the size of the voxel grids directly affects the accuracy and speed of point cloud registration, so choosing an appropriate voxel grid size is essential for efficient registration.

A mobile robot pose estimation method based on the improved 3D-NDT point cloud registration algorithm is proposed in this paper. The method is based on the distribution of point cloud data from the surrounding environment and the point cloud data are divided by the distance to the sensor. The divided point cloud data registration is performed using the 3D-NDT algorithm based on voxel grids of different sizes and the voxel grid size can be adapted according to the point cloud data distribution; thus the efficiency and accuracy of point cloud registration is improved. Meanwhile, the distant point cloud data use the 3D-Normal Distributions Transform algorithm (3D-NDT) with large-sized voxel grids for initial registration based on the transformation matrix from the odometry method. This improved method could avoid any influence on the subsequent registration algorithm caused by the wrong transformation matrix and could improve the robustness of the point cloud registration. Based on these improvements, the new method can obtain the pose estimation of a mobile robot accurately in a short time.

2. 3D-NDT Algorithm

The 3D-NDT registration algorithm is used for precise registration. In contrast to other point cloud registration algorithms, the 3D-NDT algorithm does not directly use the point cloud data to describe the surface of the object or the environment [13]. If the point cloud data are used directly to describe the surface of the environment, a lot of redundant data is generated when the point cloud data are collected and the complexity of the algorithm increases, since a raw point cloud carries no explicit information about surface features (such as direction, smoothness, holes, etc.). The main objective of the 3D-NDT point cloud registration algorithm is to obtain the transformation matrix that maximizes the probability that the input point cloud overlaps with the target point cloud. The rotation matrix and translation vector in the transformation matrix should be optimal. The algorithm converts the point cloud data within a 3D voxel grid cell into a continuously differentiable PDF. First, the point cloud data are divided into uniformly distributed 3D voxel grid cells with a fixed size, and then each 3D voxel grid cell is described by a PDF, which models the generation process of the surface points within that cell. In other words, the method uses a normal distribution to describe the position of each point within a 3D voxel grid cell, giving a piecewise, continuously differentiable probability distribution function [21]:

$P (\vec{x}) = c \exp \left(- \frac{{(\vec{x} - \vec{u})}^{T} C^{- 1} (\vec{x} - \vec{u})}{2}\right)$ (1)

where ${\vec{x}}_{i}$ (i = 1, …, n) are the points within the 3D voxel grid cell; $\vec{u}$ is the mean vector of the points within the 3D voxel grid cell containing $\vec{x}$; c is a constant; and C is the covariance matrix of the points within the 3D voxel grid cell containing $\vec{x}$. Within each 3D voxel grid cell, $\vec{u}$ and C are defined as follows:

$\vec{u} = \frac{1}{n} \sum_{i = 1}^{n} \vec{x_{i}}$ (2)

$C = \frac{1}{n - 1} \sum_{i = 1}^{n} ({\vec{x}}_{i} - \vec{u}) {({\vec{x}}_{i} - \vec{u})}^{T}$ (3)
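As an illustration of Eqs. (2) and (3), the following minimal sketch groups points into cubic cells and computes the mean and sample covariance of each cell. The function name `voxel_statistics`, the `cell_size` parameter and the choice to skip cells with fewer than two points are assumptions made for this example; they are not taken from the paper or from PCL.

```python
import math
from collections import defaultdict


def voxel_statistics(points, cell_size):
    """Return {cell_index: (mean, covariance)} for points given as (x, y, z) tuples."""
    cells = defaultdict(list)
    for p in points:
        # Integer index of the cubic cell that contains the point.
        key = tuple(math.floor(c / cell_size) for c in p)
        cells[key].append(p)

    stats = {}
    for key, pts in cells.items():
        n = len(pts)
        if n < 2:
            # Eq. (3) divides by n - 1, so a single point has no covariance (assumption: skip it).
            continue
        mean = [sum(p[k] for p in pts) / n for k in range(3)]            # Eq. (2)
        cov = [[sum((p[a] - mean[a]) * (p[b] - mean[b]) for p in pts) / (n - 1)
                for b in range(3)] for a in range(3)]                    # Eq. (3)
        stats[key] = (mean, cov)
    return stats
```

In a full NDT implementation each cell would typically also need enough non-coplanar points for C to be invertible, since Eq. (1) uses the inverse of C.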

We assume that there are two point clouds X and Y, where X is the input point cloud and Y is the target point cloud. If the 3D-NDT algorithm is used for the registration of X and Y, the specific steps are as follows:

Initialize the point cloud: to represent the target point cloud Y as PDFs, distribute the points in Y to the corresponding voxel grid cells and compute $\vec{u}$ and C for each voxel grid cell;

Coordinate transformation: each point in X is transformed into the coordinate frame of the target point cloud Y by a coordinate transformation;

Solve the objective function: calculate the objective function $s (\vec{p})$ of the mapping point of X in Y, as shown in the following equation:

$s (\vec{p}) = - \sum_{i = 1}^{n} P (T (\vec{p}, \vec{x_{i}}))$ (4)

where $T (\vec{p}, {\vec{x}}_{i})$ is the spatial transformation applied to the point ${\vec{x}}_{i}$, defined as follows:

$T (\vec{p}, \vec{x_{i}}) = [\begin{matrix} t r_{x}^{2} + c & t r_{x} r_{y} - s r_{z} & t r_{x} r_{z} + s r_{y} \\ t r_{x} r_{y} + s r_{z} & t r_{y}^{2} + c & t r_{y} r_{z} - s r_{x} \\ t r_{x} r_{z} - s r_{y} & t r_{y} r_{z} + s r_{x} & t r_{z}^{2} + c \end{matrix}] \vec{x_{i}} + [\begin{matrix} t_{x} \\ t_{y} \\ t_{z} \end{matrix}]$ (5)

where

$\vec{p} = [\vec{t} \mid \vec{r} \mid \phi], \vec{t} = [\begin{matrix} t_{x} & t_{y} & t_{z} \end{matrix}], \vec{r} = [\begin{matrix} r_{x} & r_{y} & r_{z} \end{matrix}], s = \sin \phi, c = \cos \phi, t = 1 - \cos \phi .$

Here $\vec{r}$ is the unit rotation axis and $\phi$ is the rotation angle; the scalar t in the rotation matrix is distinct from the translation vector $\vec{t}$.

To minimize the objective function $s (\vec{p})$ , use the Newton iterative method and Hessian matrix to find the optimal solution $\vec{p}$ of $s (\vec{p})$ . The Hessian matrix can be used to solve the matching between the current scanned data and the other scanned data without solving the correspondence problem directly. Assuming $f = s (\vec{p})$ , in order to minimize the function f, we have to deal with the following equation in each iteration process:

$H Δ \vec{p} = - \vec{g}$ (6)

$\vec{p} \leftarrow \vec{p} + Δ \vec{p}$ (7)

Where $\vec{g}$ is the transpose gradient of f, which can be expressed as:

$g_{i} = \frac{\partial f}{\partial p_{i}}$ (8)

Each element H_ij in the Hessian matrix H can be expressed as:

$H_{i j} = \frac{\partial^{2} f}{\partial p_{i} \partial p_{j}}$ (9)

Return to Step 4 until the convergence conditions are met;

The 3D-NDT point cloud registration is completed and the transformation matrix is obtained.
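The steps above can be sketched in simplified form. The first function applies an axis-angle transformation of the kind used in Eq. (5), assuming the standard Rodrigues rotation matrix with t = 1 − cos φ and a unit-length axis. The second function is a deliberately reduced, one-dimensional toy version of the Newton update in Eqs. (6) and (7): the only parameter is a translation p, the target is a single cell with a given mean and variance, and the gradient and Hessian are therefore scalars. All function and parameter names are hypothetical, and this is not the authors' implementation.

```python
import math


def transform_point(p, x):
    """Apply T(p, x) with p = (tx, ty, tz, rx, ry, rz, phi); (rx, ry, rz) is assumed unit length."""
    tx, ty, tz, rx, ry, rz, phi = p
    s, c = math.sin(phi), math.cos(phi)
    t = 1.0 - c  # scalar used in the rotation matrix, not the translation vector
    rot = [
        [t * rx * rx + c,      t * rx * ry - s * rz, t * rx * rz + s * ry],
        [t * rx * ry + s * rz, t * ry * ry + c,      t * ry * rz - s * rx],
        [t * rx * rz - s * ry, t * ry * rz + s * rx, t * rz * rz + c],
    ]
    rotated = [sum(rot[i][j] * x[j] for j in range(3)) for i in range(3)]
    return [rotated[0] + tx, rotated[1] + ty, rotated[2] + tz]


def newton_translation_1d(points, mean, variance, p=0.0, epsilon=1e-9, max_iterations=50):
    """Minimise f(p) = -sum(exp(-(x + p - mean)^2 / (2 * variance))) with Newton steps."""
    for _ in range(max_iterations):
        g = 0.0  # first derivative of f, the 1D analogue of Eq. (8)
        h = 0.0  # second derivative of f, the 1D analogue of Eq. (9)
        for x in points:
            d = x + p - mean
            w = math.exp(-d * d / (2.0 * variance))
            g += (d / variance) * w
            h += (1.0 / variance - d * d / variance ** 2) * w
        if h <= 0.0:
            break  # Newton step is not a descent direction; a real solver would use a line search
        delta = -g / h          # solves H * delta = -g, Eq. (6)
        p += delta              # Eq. (7)
        if abs(delta) < epsilon:
            break               # convergence condition on the increment
    return p
```

For example, input points 0.85, 0.9 and 0.95 registered against a cell with mean 1.0 and variance 0.04 converge to a shift of about 0.1. The early exit when the second derivative is not positive hints at why practical implementations pair the Newton step with a line search.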

During 3D-NDT point cloud registration, the key step is selecting the size of the voxel grids. The algorithm is sensitive to this discretization because large spatial cells filter out relevant details, whereas small cells increase the computational cost [22]. If the size is large, each voxel grid covers a large amount of input point cloud data, so the fusion range increases and the algorithm runs faster. However, the non-overlapping part between the target point cloud data and the input point cloud data within a voxel grid can reduce the accuracy of the registration. If the size of the voxel grids is appropriate, they can accurately describe the surface details of the surrounding environment; within limits, the smaller the voxel grids, the higher the registration accuracy. However, if the voxel grids are too small, some input points may not find a corresponding voxel grid and therefore cannot contribute to correcting the registration, which increases the running time of the algorithm.

3. Pose Estimation of Indoor Mobile Robot Based on the Improved 3D-NDT Algorithm

In the 3D-NDT algorithm, the calculation of voxel data corresponding to the target point cloud uses the statistics of points in each voxel grid rather than a single point, so it is not required to process the target point cloud. However, the classic 3D-NDT algorithm does not process the input point cloud. In practice, the point cloud data collected by the Kinect sensor usually contain some invalid data due to the inherent system restriction. Therefore, the large scale of redundant data in the input point cloud greatly increases the complexity of the algorithm.

The improved 3D-NDT algorithm proposed in this paper uses the Approximate Voxel Grid Filter to process the input point cloud according to the distribution of point cloud data. In this way, the amount of input point cloud data can be reduced, which increases the computing speed of the algorithm. Meanwhile, the input point cloud data will be divided according to the distance between the point cloud data and the sensor. The divided point cloud data registration is performed using the 3D-NDT algorithm based on voxel grids of different sizes and the voxel grid size can be adapted according to the point cloud data distribution.

As shown in Figure 1, the main steps of the indoor mobile robot pose estimation method based on the improved 3D-NDT point cloud registration algorithm are the following:

Figure 1. Flow chart of the improved 3D-NDT point cloud registration algorithm

Obtain the Point Cloud Data: the indoor mobile robot uses a Kinect sensor, through the OpenNI Grabber interface in the Point Cloud Library (PCL), to obtain point cloud data of the surroundings: the input point cloud X and the target point cloud Y. The data are stored in PCD format, which includes the three-dimensional coordinates of each point, among other fields; invalid data must also be removed from the acquired point cloud data;

Segmentation of the Point Cloud: the input point cloud is divided according to the distance d_i between each point of the input point cloud X and the sensor. If d_i<=d₀, the point is near and is assigned to X₁. If d_i>d₀, the point is far and is assigned to X₂;

Point Cloud Filtering: X₁ and X₂ are each processed with a voxel grid filter of a different size;

Initial Registration: the registration of X₂ and Y uses large-sized voxel grids in the 3D-NDT algorithm, starting from the transformation matrix obtained by odometry. The epsilon parameters e₀ (initial registration) and e₁ (precise registration) define the minimal increment of the transformation vector [x, y, z, roll, pitch, yaw] in terms of length and angle, where e₀>e₁;

Precise Registration: starting from the transformation matrix of the initial registration, the registration of X₁ and Y uses small-sized voxel grids in the 3D-NDT algorithm;

Calculate Pose Estimation Matrix: the final transformation matrix obtained by precise registration undergoes a coordinate transformation to give the mobile robot pose estimation matrix.

3.1. The point cloud segmentation based on Euclidean distance

Due to the depth range limit of the Kinect sensor, the obtained point cloud data may be inaccurate when the distance to the sensor exceeds the limit range. At the same time, the selection of the voxel grid size in the 3D-NDT algorithm and the Approximate Voxel Grid Filter affects the pose estimation of the robot. Hence the input point cloud data are divided into two parts by distance, based on their 3D coordinate information, in order to select a suitable voxel grid size for the subsequent filtering and registration. Suppose that p_i (x_i, y_i, z_i) is one point of the input point cloud X, d_i is its distance to the Kinect sensor, p₀ (x₀, y₀, z₀) is the centre point of the input point cloud X, d₀ is its distance to the Kinect sensor, X₁ is the part of the input point cloud near the sensor and X₂ is the part far from the sensor. The distance between a point and the sensor, and the centre point, are calculated as follows:

$d_{i} = \sqrt{x_{i}^{2} + y_{i}^{2} + z_{i}^{2}}$ (10)

$x_{0} = \frac{1}{n} \sum_{i = 1}^{n} x_{i}, \quad y_{0} = \frac{1}{n} \sum_{i = 1}^{n} y_{i}, \quad z_{0} = \frac{1}{n} \sum_{i = 1}^{n} z_{i}$ (11)

The main steps of the point cloud segmentation based on the Euclidean distance are as follows:

Calculate the point p₀ and its corresponding d₀ in the input point cloud X;

Traverse each point p_i in the input point cloud; if d_i<= d₀, then it belongs to the input point cloud X₁; otherwise, it belongs to X₂.
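The two steps above can be sketched as follows, with the sensor assumed to sit at the origin of the point cloud coordinate system, as Eq. (10) implies. The function name `segment_by_distance` and the tuple-based point representation are choices made for this illustration only.

```python
import math


def segment_by_distance(points):
    """Split points into (near, far) relative to d0, the sensor distance of the centre point."""
    n = len(points)
    # Centre point p0 of the input cloud, Eq. (11).
    centre = tuple(sum(p[k] for p in points) / n for k in range(3))
    # Distance of the centre point to the sensor, Eq. (10) applied to p0.
    d0 = math.sqrt(sum(c * c for c in centre))

    near, far = [], []
    for p in points:
        d_i = math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)   # Eq. (10)
        if d_i <= d0:
            near.append(p)   # belongs to X1
        else:
            far.append(p)    # belongs to X2
    return near, far, d0
```

Because d₀ is derived from the cloud itself, the threshold adapts to each scan rather than being a fixed range.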

3.2. The approximate voxel grid filter

Given that the point cloud data collected by Kinect usually contain some invalid data, the goal of the approximate voxel filtering is therefore not only to reduce the amount of data, but also to remove invalid points. The main steps of the Approximate Voxel Grid Filter algorithm [21] are:

The point cloud data are divided into a certain volume of voxel grid.

The density of the point cloud data in the initial grid is compared with the preset density threshold. If the density of the point cloud data in the initial grid is no less than the preset density threshold, the point cloud data in the voxel grids are considered as valid and the voxel grid is reserved. Otherwise, it is considered invalid, thereby avoiding the impact of redundant data.

The centre of all points in each valid voxel is used as an approximate representation of all points in the voxel grid. The size of the voxel grid directly affects the quality of the filtered input data point cloud. Moreover, the size of the voxel grid is proportional to the amount of data compression. In other words, the bigger the size of voxel grid, the smaller the amount of filtered point cloud data, as shown in Figure 2.

Figure 2. The Approximate Voxel Grid Filter algorithm

During the approximate voxel filtering, the size of the voxel grids directly affects the quality of the input point cloud data and is proportional to the amount of data compression. In other words, the bigger the size of the voxel grids, the smaller the amount of point cloud data after filtering. Therefore, in the course of handling the near input point cloud, the small voxel grids should be used for filtering where possible, in order to maintain the distribution of the original point cloud. Conversely, the far input point cloud uses the large voxel grids for filtering. In this way the running speed of the algorithm is improved.
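A minimal sketch of the filtering steps described in this section is given below. It follows the text rather than any particular library: the density threshold is modelled as a minimum point count per voxel (`min_points`), and both that parameter and the function name are assumptions for illustration. PCL's own approximate voxel grid filter may differ in detail.

```python
import math
from collections import defaultdict


def approximate_voxel_filter(points, leaf_size, min_points=1):
    """Replace the points of each sufficiently dense voxel by their centre (centroid)."""
    voxels = defaultdict(list)
    for p in points:
        # Step 1: assign each point to a cubic voxel of edge length leaf_size.
        key = tuple(math.floor(c / leaf_size) for c in p)
        voxels[key].append(p)

    filtered = []
    for pts in voxels.values():
        # Step 2: voxels below the density threshold are treated as invalid and dropped.
        if len(pts) < min_points:
            continue
        # Step 3: the centre of the points represents the whole voxel.
        n = len(pts)
        filtered.append(tuple(sum(p[k] for p in pts) / n for k in range(3)))
    return filtered
```

Calling the function on the same cloud with a larger `leaf_size` returns fewer points, which is the compression behaviour described above; a small leaf size would be used for the near cloud and a large one for the far cloud.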

3.3. Odometry

Odometry is the most widely used localization method. The odometry data come from the relative movement of the robot (such as the data from an inertial sensor), and the method is simple, low-cost and real-time. No information from an external sensor is required to estimate the pose of the robot, and high positioning accuracy can be guaranteed in the short term. Odometry is based on a fundamental geometric operation, which is relatively simple. In this paper, based on the Markov hypothesis that the current state of the robot only relates to the previous state, the robot pose during the kth sampling period is set as $p_{k} = {(x_{k}, y_{k}, z_{k}, φ_{k}, θ_{k}, ψ_{k})}^{T}$, where $(x_{k}, y_{k}, z_{k})$ is the robot position and $(φ_{k}, θ_{k}, ψ_{k})$ is its orientation in the coordinate system. Since odometry only tracks motion in the horizontal plane, $z_{k} = 0, θ_{k} = 0, ψ_{k} = 0$. Assuming the previous pose $p_{k - 1}$ is known and the displacement increments of the left and right wheels from the encoders are $U_{l k}$ and $U_{r k}$ respectively, the pose $p_{k}$ can be determined by the following equation [23]:

$p_{k} = {(x_{k}, y_{k}, z_{k}, φ_{k}, θ_{k}, ψ_{k})}^{T} = [\begin{matrix} x_{k - 1} + Δ D_{k} \cos (φ_{k - 1} + Δ φ_{k} / 2) \\ y_{k - 1} + Δ D_{k} \sin (φ_{k - 1} + Δ φ_{k} / 2) \\ 0 \\ φ_{k - 1} + Δ φ_{k} \\ 0 \\ 0 \end{matrix}]$ (12)

where $Δ D_{k}$ and $Δ φ_{k}$ are the displacement increment and orientation increment of the robot from step k-1 to step k, and $B_{wheel}$ is the baseline (wheel separation) of the robot. They can be determined by the following equations:

$Δ D_{k} = (U_{l k} + U_{r k}) / 2$ (13)

$Δ φ_{k} = (U_{l k} - U_{r k}) / B_{wheel}$ (14)
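Equations (12) to (14) translate directly into a short update function. The sketch below keeps the sign convention of Eq. (14) as written; the function name `odometry_update` and the tuple layout of the pose are assumptions for this example.

```python
import math


def odometry_update(prev_pose, u_left, u_right, baseline):
    """Return p_k = (x, y, z, phi, theta, psi) from p_{k-1} and the wheel increments."""
    x_prev, y_prev, _, phi_prev, _, _ = prev_pose
    delta_d = (u_left + u_right) / 2.0            # Eq. (13)
    delta_phi = (u_left - u_right) / baseline     # Eq. (14)
    heading = phi_prev + delta_phi / 2.0          # mid-step heading used in Eq. (12)
    return (
        x_prev + delta_d * math.cos(heading),
        y_prev + delta_d * math.sin(heading),
        0.0,                                      # planar motion: z = 0
        phi_prev + delta_phi,
        0.0,                                      # theta = 0
        0.0,                                      # psi = 0
    )
```

With equal wheel increments the heading is unchanged and the robot advances in a straight line, so starting at the origin with increments of 0.1 on both wheels gives a pose of (0.1, 0, 0, 0, 0, 0).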

3.4. Initial registration and precise registration

In the actual registration, the 3D-NDT algorithm employs the voxel grid structure and the More-Thuente line search method [24], which require several parameters to be set to ensure the stability and accuracy of the registration. The first is the Epsilon parameter, defined as the minimum allowable increment of the transformation vector [x, y, z, roll, pitch, yaw] in terms of length and angle; once the increment falls below this threshold, the algorithm terminates. The More-Thuente line search adjusts the iteration step size according to the distance from the optimal solution during registration; its step size parameter is the maximum step size of the line search, which determines the optimal step size below that maximum. The size of the voxel grid is determined by the resolution parameter, which must be chosen so that each voxel grid can still express the distribution of the point cloud.

For better performance of the algorithm, the initial and precise registration is considered separately. In the process of initial registration, in order to achieve a faster convergence of the algorithm, the registration parameters and the resolution of the voxel units use larger values. In the precise registration process, in order to obtain an accurate pose estimate of the robot, its registration parameters and resolution of voxel units are smaller values, so that the registration algorithm can obtain high accuracy results.
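The coarse-to-fine relationship between the two stages can be written down as a small configuration sketch. The numeric values mirror the settings reported for the I-NDT experiments in Section 4.1; the class name `NdtStage`, its field names and the helper `is_coarse_to_fine` are hypothetical and only serve to make the ordering of the parameters explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NdtStage:
    filter_leaf_size: float   # voxel size of the Approximate Voxel Grid Filter
    ndt_resolution: float     # voxel size used by the 3D-NDT algorithm
    epsilon: float            # minimum transformation increment (termination threshold)
    max_step_size: float      # maximum More-Thuente line search step
    iterations: int           # number of iterations


# Values as reported in Section 4.1 for the I-NDT algorithm.
INITIAL = NdtStage(filter_leaf_size=0.2, ndt_resolution=1.5,
                   epsilon=0.02, max_step_size=0.2, iterations=135)
PRECISE = NdtStage(filter_leaf_size=0.035, ndt_resolution=0.45,
                   epsilon=0.01, max_step_size=0.1, iterations=135)


def is_coarse_to_fine(coarse, fine):
    """Check that every registration parameter of the coarse stage exceeds the fine stage."""
    return (coarse.filter_leaf_size > fine.filter_leaf_size
            and coarse.ndt_resolution > fine.ndt_resolution
            and coarse.epsilon > fine.epsilon
            and coarse.max_step_size > fine.max_step_size)
```

Keeping the two stages as separate immutable records makes it harder to mix a coarse resolution with a fine termination threshold by accident.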

3.5. The pose estimation of mobile robot

The pose estimation of the indoor mobile robot is derived from the optimal transformation matrix calculated in the registration process, which represents the coordinate transformation between the input point cloud and the target point cloud. The coordinate system of the point cloud data acquired through the Kinect sensor is shown in Figure 3, where the origin is the sensor location, the z-axis points forward from the sensor, the positive x-axis extends to the left and the positive y-axis extends upwards. To simplify the calculation, the local coordinate system of the robot is taken to coincide with the global coordinate system when the target point cloud is obtained, so that the robot pose at that moment is the origin of the global coordinate system; the local coordinate system of the robot coincides with the sensor coordinate system while the robot is moving.

Figure 3. Local coordinate system of the Kinect sensor

In this paper, the rotation matrix adopts the Euler angle representation, which is more intuitive. Let the six-degree-of-freedom pose estimate of the mobile robot be $\vec{P} = (x, y, z, \mathrm{roll}, \mathrm{pitch}, \mathrm{yaw})$, where (roll, pitch, yaw) denote the rotation angles about the x-axis, y-axis and z-axis, respectively. The pose of the robot can be estimated by the following method:

Assuming the optimal transformation matrix is:

$\vec{M_{u}} = (\begin{matrix} \vec{R_{u}} & \vec{T_{u}} \\ 0 & 1 \end{matrix})$ (15)

Where the rotation matrix is

$\begin{array}{l} \vec{R_{u}} = (\begin{matrix} 1 & 0 & 0 \\ 0 & \cos (\mathrm{roll}) & - \sin (\mathrm{roll}) \\ 0 & \sin (\mathrm{roll}) & \cos (\mathrm{roll}) \end{matrix}) (\begin{matrix} \cos (\mathrm{pitch}) & 0 & \sin (\mathrm{pitch}) \\ 0 & 1 & 0 \\ - \sin (\mathrm{pitch}) & 0 & \cos (\mathrm{pitch}) \end{matrix}) \\ (\begin{matrix} \cos (\mathrm{yaw}) & - \sin (\mathrm{yaw}) & 0 \\ \sin (\mathrm{yaw}) & \cos (\mathrm{yaw}) & 0 \\ 0 & 0 & 1 \end{matrix}) = (\begin{matrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{matrix}) \end{array}$ (16)

The translation vector is $\vec{T_{u}} = {[t_{1}, t_{2}, t_{3}]}^{T}$ .

The relationship between the position of the robot and the translation vector $\vec{T_{u}}$ is:

$(x, y, z) = (t_{1}, t_{2}, t_{3})$

The relationship between the pose of the robot and the rotation matrix $\vec{R_{u}}$ is:

$(\mathrm{roll}, \mathrm{pitch}, \mathrm{yaw}) = (\arctan (- r_{23} / r_{33}), \arcsin (r_{13}), \arctan (- r_{12} / r_{11}))$
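The extraction formulas can be checked with a round trip: build a rotation matrix from known angles, then recover them. The sketch below assumes the standard elementary rotation matrices about the x-, y- and z-axes multiplied in that order, under which the relations above hold. It uses `atan2` instead of a plain arctangent of the ratio so that the quadrant is preserved, which is a design choice of this example rather than something stated in the paper. Function names are hypothetical, and angles are in radians.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def rotation_from_euler(roll, pitch, yaw):
    """Return R = Rx(roll) * Ry(pitch) * Rz(yaw)."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    rx = [[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]]
    ry = [[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]]
    rz = [[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]]
    return matmul(matmul(rx, ry), rz)


def pose_from_transform(rotation, translation):
    """Return (x, y, z, roll, pitch, yaw) from a rotation matrix and a translation vector."""
    roll = math.atan2(-rotation[1][2], rotation[2][2])        # arctan(-r23 / r33)
    pitch = math.asin(max(-1.0, min(1.0, rotation[0][2])))    # arcsin(r13), clamped for safety
    yaw = math.atan2(-rotation[0][1], rotation[0][0])         # arctan(-r12 / r11)
    return tuple(translation) + (roll, pitch, yaw)
```

The recovery is unambiguous only while the pitch stays away from ±90°, where r₁₁ and r₃₃ both vanish.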

4. Experiments and Analysis

To verify the validity of the algorithm, a mobile robot with Kinect (named KR-1) and a navigation platform was adopted in the experiment. As shown in Figure 4, Kinect was installed in front of the robot platform and the robot was connected to a PC by Wi-Fi. The configuration of the PC was: Intel Pentium G645 2.9GHz, 4GB memory, Windows 7, 32-bit. The software platform was Visual Studio 2010, based on Point Cloud Library and Kinect for Windows SDK.

Figure 4. The KR-1 mobile robot navigation platform

As shown in Figure 5, the robot moves straight at a constant speed and stops after a period of time to measure its true pose, which is (0.061, 0, 0.573, 0, 0, 6.07) (distances are measured in metres and angles in degrees). The point cloud data obtained by Kinect at the beginning represent the target point cloud and the point cloud data obtained when the robot stopped represent the input point cloud. The real test environment is shown in Figure 6, the target point cloud is shown in Figure 7, the input point cloud is shown in Figure 8 and the output point cloud is shown in Figure 9.

Figure 5. The test trajectory of the robot

Figure 6. The real testing environment

Figure 7. The target point cloud

Figure 8. The input point cloud

Figure 9. The output point cloud

4.1. Comparative analysis

In order to illustrate and validate the performance of the improved 3D-NDT algorithm (I-NDT), experiments testing the pose estimation were conducted with our improved algorithm, the normal 3D-NDT algorithm and the traditional ICP algorithm. The target point cloud contains 214600 points, and the initial transformation matrix of the robot used by the 3D-NDT and I-NDT algorithms is (0.018, 0, 0.522, 0, 0, 1.97°). In the normal 3D-NDT algorithm, the voxel grid size is (0.05, 1.0), where 0.05 is the size of the voxel grid in the Approximate Voxel Grid Filter and 1.0 is the size in the 3D-NDT algorithm; the Epsilon parameter is 0.01, the More-Thuente parameter is 0.1 and the number of iterations is 135. In the initial registration of the I-NDT algorithm, the voxel grid size is (0.2, 1.5), the Epsilon parameter is 0.02, the More-Thuente parameter is 0.2 and the number of iterations is 135. In the precise registration, the voxel grid size is (0.035, 0.45), the Epsilon parameter is 0.01, the More-Thuente parameter is 0.1 and the number of iterations is 135. In ICP, the Epsilon parameter is 0.01 and the number of iterations is 135. Comparisons of the experimental results of the three registration algorithms are shown in Table 1.

Table 1. Comparison of experimental results of three registration algorithms

Algorithm	Input points (near)	Input points (far)	Input points (total)	Position (x, y, z)/(m)	Orientation (roll, pitch, yaw)/(°)	Error	Time/(s)
I-NDT	5651	2336	7987	(0.079, 0, 0.579)	(0, 0, 7.80)	0.012 m, 1.70°	1.944
3D-NDT	_	_	11894	(0.092, 0, 0.588)	(0, 0, 8.89)	0.023 m, 2.82°	6.464
ICP	_	_	110274	(0.112, 0, 0.608)	(0, 0, 10.4)	0.043 m, 4.33°	27.686

As shown in Table 1, in our I-NDT algorithm the size of the voxel grid is determined by the distance of the input point cloud data from the sensor. The near input point cloud contains about 2.4 times as many points as the far input point cloud. After filtering, the input point cloud is about 7% of the original point cloud, which improves the speed of the algorithm. Compared with the other two algorithms, I-NDT has the highest accuracy and the smallest pose estimation error. In the normal 3D-NDT algorithm, a fixed voxel grid size was applied and the number of input points after filtering is invariant, so it is about 3.3 times slower than the I-NDT algorithm (6.464 s versus 1.944 s). Although its filtered input point cloud was larger than that of I-NDT, its pose estimation error is about twice that of the I-NDT algorithm. In the ICP algorithm, which runs without the Approximate Voxel Grid Filter, the input is the full original input point cloud; hence it is the slowest of the three (27.686 s, roughly 4 times slower than 3D-NDT and 14 times slower than I-NDT) and its pose estimation error is the largest. The reason is that the ICP algorithm requires a sufficient degree of overlap between the input point cloud and the target point cloud. In general, the improved 3D-NDT algorithm (I-NDT) ensures the accuracy of registration without sacrificing real-time performance.
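The point-count and run-time ratios can be recomputed directly from the figures in Table 1. The short script below only restates the table values; the dictionary layout and variable names are chosen for this illustration.

```python
# Point counts and run times copied from Table 1.
table1 = {
    "I-NDT": {"points": 7987, "time_s": 1.944},
    "3D-NDT": {"points": 11894, "time_s": 6.464},
    "ICP": {"points": 110274, "time_s": 27.686},
}
near_points, far_points = 5651, 2336  # I-NDT split of the filtered input cloud

# ICP uses the unfiltered input cloud, so its point count is the original size.
kept_fraction = table1["I-NDT"]["points"] / table1["ICP"]["points"]
near_far_ratio = near_points / far_points
ndt_vs_indt = table1["3D-NDT"]["time_s"] / table1["I-NDT"]["time_s"]
icp_vs_indt = table1["ICP"]["time_s"] / table1["I-NDT"]["time_s"]

print(f"points kept after filtering: {kept_fraction:.1%}")
print(f"near/far point ratio:        {near_far_ratio:.2f}")
print(f"3D-NDT time / I-NDT time:    {ndt_vs_indt:.1f}")
print(f"ICP time / I-NDT time:       {icp_vs_indt:.1f}")
```

The script reports that about 7.2% of the points are kept after filtering, that the near cloud holds about 2.42 times as many points as the far cloud, and that 3D-NDT and ICP take about 3.3 and 14.2 times as long as I-NDT, respectively.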

To verify the robustness of the algorithms, the initial transformation matrix of the robot obtained by odometry is changed into (0,0,0,0,0,0) with the I-NDT algorithm and the 3D-NDT algorithm separately; the comparison results of the pose estimation are shown in Table 2.

Table 2. Comparison of experimental results of registration algorithms based on an erroneous initial transformation matrix from the odometry method

Algorithm	Input points (near)	Input points (far)	Input points (total)	Position (x, y, z)/(m)	Orientation (roll, pitch, yaw)/(°)	Error	Time/(s)
I-NDT	5651	2336	7987	(0.081, 0, 0.579)	(0, 0, 7.86)	0.013 m, 1.79°	3.561
3D-NDT	_	_	11894	(0.094, 0, 0.590)	(0, 0, 9.05)	0.025 m, 2.98°	16.173

According to the results in Table 2, there are obvious differences between the two algorithms; for example, the run time of the 3D-NDT algorithm is much longer than that of the I-NDT algorithm. The reason is that the 3D-NDT algorithm, started from a wrong initial transformation matrix, moves the registration in the wrong direction. The purpose of the initial registration of the I-NDT algorithm is to refine the initial transformation matrix from odometry by running 3D-NDT on the more distant points with a larger voxel grid than normal. Thus the I-NDT algorithm can correct the error quickly and avoid wasting time in the wrong direction. Compared with the normal 3D-NDT algorithm, the I-NDT algorithm has better robustness.

In addition, we compared the accuracy of the above algorithms with the RGB-D freiburg2_pioneer_360 dataset from the TUM RGB-D dataset, which has accurate ground truth for evaluation [25], and the ground truth trajectory length is 16.118m. During the testing process, the depth image of the datasets was changed into point cloud data (PCD) format for registration positioning. The comparisons of the experimental results of the two registration algorithms are shown in Table 3 and Table 4.

Table 3. Mean and standard deviation of translational and rotational errors of two algorithms

Algorithm	Translation MAX/(m)	Translation MIN/(m)	Translation MEAN/(m)	Translation StdDev/(m)	Rotation MAX/(°)	Rotation MIN/(°)	Rotation MEAN/(°)	Rotation StdDev/(°)
I-NDT	0.045	0.0004	0.014	0.012	3.495	0.017	0.974	0.859
3D-NDT	0.067	0.0002	0.016	0.017	3.838	0.011	1.948	1.748

Table 4. Mean and standard deviation of the run times of each algorithm

Algorithm	MAX/(s)	MIN/(s)	MEAN/(s)	StdDev/(s)
I-NDT	4.71	1.41	2.604	0.719
3D-NDT	24.04	72.05	27.33	7.774

According to the results in Table 3 and Table 4, the I-NDT algorithm performs better in both accuracy and speed. Particularly in terms of speed, the mean running time of the I-NDT algorithm is only about 1/10 of that of the 3D-NDT algorithm (2.604 s versus 27.33 s), because the I-NDT algorithm has a more accurate initial guess matrix than the 3D-NDT algorithm and greatly reduces the amount of input point cloud data.

5. Conclusions

To solve the problem of indoor mobile robot localization, a pose estimation method based on an improved 3D-NDT algorithm was proposed in this paper. Building on the normal 3D-NDT algorithm and the Approximate Voxel Grid Filter, the improved algorithm adopted point cloud segmentation based on Euclidean distance and adapted the size of the voxel grid to the distribution of the point clouds. The initial registration combined odometry with the 3D-NDT algorithm based on large voxel grids to obtain the transformation matrix, and the precise registration used the 3D-NDT algorithm based on small voxel grids. Finally, the pose estimation of the indoor mobile robot was obtained from the transformation matrix after registration. The experimental results showed that, compared with the normal 3D-NDT algorithm, the improved 3D-NDT algorithm (I-NDT) significantly reduced the running time, which could meet the real-time requirement of a robot SLAM system. Meanwhile, the accuracy of registration was also improved and better robustness was validated. In the experiments, the obtained pose estimate provided a more accurate representation of the current position of the robot. Future work will focus on the classification of voxel grids in the NDT algorithm and the accuracy of the filtering algorithm.

6. Acknowledgements

This work was supported by the National Natural Science Foundation of China (NO. 61303183), the Natural Science Foundation of Jiangsu Province (NO. BK20130204) and the Student Innovation and Entrepreneurship Foundation of China University of Mining and Technology (NO. 201413).
