Abstract
1. Introduction
Simultaneous Localization and Mapping (SLAM) has long been a research topic in the field of robotics, with pose estimation one of the key subjects in SLAM research [1]. Due to the restriction on the positioning accuracy of GPS and the blocking of obstacles, the pose estimation of an indoor mobile robot usually relies on inertial sensors. However, it is difficult to obtain the desired positioning accuracy because of the inherent systematic errors and the accumulated measurement error [2]. Therefore, recently developed technologies such as LASER point cloud and depth sensor have been applied to SLAM in recent years [3–6]. Point cloud registration is a technology to calculate the displacement of two point clouds. The essence of point cloud registration is to minimize the distance between point cloud data measured from two perspectives by virtue of coordinate transformation. The transformation matrix obtained from point cloud registration will be used to calculate the current pose of the robot.
Currently, the most common algorithm for point cloud registration is the Iterative Closest Point (ICP) algorithm [7]. The ICP method provides the framework of the subsequent iteration-based registration algorithms. Many efforts have been made to improve the speed and accuracy of this algorithm [8–10]. An improved method for point cloud registration algorithm based on Probability Density Functions (PDFs) is the Normal Distributions Transform (NDT) algorithm [11]. The NDT algorithm was combined with the Monte Carlo Localization and used for mobile robot localization in reference [12]. The 3D-NDT algorithm was proposed and compared with the ICP algorithm in reference [13]. Another improved NDT algorithm, the Multi-Layered Normal Distributions Transform (ML-NDT) algorithm was mentioned in reference [14]. The NDT algorithm and the other improved algorithms have been applied in point cloud classification [15], mobile robotic mapping [16] and path planning [17]. For example, the Continuous Normal Distributions Transform (C-NDT) algorithm was proposed and applied in robotic mapping in reference [18] and in reference [19] the NDT algorithm was applied in an autonomous wheel loader. As the NDT algorithm has improved in speed and accuracy, it has been more widely used in robot SLAM. Among the improved algorithms above, the size of voxel grids affects the accuracy and speed of the point cloud registration directly and the appropriate size of voxel grids was achieved with minimum fuss and maximum efficiency.
A mobile robot pose estimation method based on the improved 3D-NDT point cloud registration algorithm is proposed in this paper. The method is based on the distribution of point cloud data from the surrounding environment and the point cloud data are divided by the distance to the sensor. The divided point cloud data registration is performed using the 3D-NDT algorithm based on voxel grids of different sizes and the voxel grid size can be adapted according to the point cloud data distribution; thus the efficiency and accuracy of point cloud registration is improved. Meanwhile, the distant point cloud data use the 3D-Normal Distributions Transform algorithm (3D-NDT) with large-sized voxel grids for initial registration based on the transformation matrix from the odometry method. This improved method could avoid any influence on the subsequent registration algorithm caused by the wrong transformation matrix and could improve the robustness of the point cloud registration. Based on these improvements, the new method can obtain the pose estimation of a mobile robot accurately in a short time.
2. 3D-NDT Algorithm
The 3D-NDT registration algorithm was used for precise registration. In contrast to other point cloud registration algorithms, the 3D-NDT algorithm does not directly use the point cloud data to describe the surface of the object or the environment [13]. If the point cloud data are directly adopted to describe the surface of the environment, a lot of redundant data will be generated when the point cloud data are collected and the complexity of the algorithm will also be increased, since the point cloud has unclear information with respect to surface features (such as direction, smoothness, hole, etc.). The main objective of the 3D-NDT point cloud registration algorithm is to obtain the transformation matrix, which represents the maximum probability of allowing the input point cloud to overlap with the target a point cloud. The rotation matrix and translation matrix in the transformation matrix should be optimal. The algorithm converts the point cloud data within a 3D voxel grid cell into a continuously differentiable PDF. First, the point cloud data are divided into uniformly distributed 3D voxel grid cells with a fixed size and then each 3D voxel grid cell is described by a PDF, which models the generation process of surface point cloud within each 3D voxel grid cell. In other words, the method with a normal distribution describes the position of each point cloud within the 3D voxel grid cell by a piecewise and continuously differentiable probability distribution function [21]:
where
We assume that there are two point cloud data X and
Initialize the point cloud: to represent the target cloud point
Coordinate transformation: each point in
Solve the objective function: calculate the objective function
Where
Where,
To minimize the objective function
Where
Each element
Return to Step 4 until the convergence conditions are met;
The 3D-NDT point cloud registration is completed and the transformation matrix is obtained.
During the 3D-NDT point cloud registration, the key step is to select the size of voxel grids. It is sensitive to segmentation because large spatial cells filter out relevant details, whereas small cells augment the computational cost [22]. If the size is large, the corresponding input point cloud data will be a large amount and the fusion range will increase and thus accelerate the speed of the algorithm. However, the non-overlapping part between goal point cloud data and the input point cloud data in voxel grids could cause reduced accuracy of the registration. If the size of voxel grids is appropriate, it can accurately describe the surface details of the surrounding environment - the smaller the size of voxel grids, the higher the accuracy of the registration that is achieved. However, if the size of voxel grids is too small, some input point cloud data may not find their corresponding voxel grids and these point cloud data cannot correct the position of the registration, which will increase the running time of the algorithm.
3. Pose Estimation of Indoor Mobile Robot Based on the Improved 3D-NDT Algorithm
In the 3D-NDT algorithm, the calculation of voxel data corresponding to the target point cloud uses the statistics of points in each voxel grid rather than a single point, so it is not required to process the target point cloud. However, the classic 3D-NDT algorithm does not process the input point cloud. In practice, the point cloud data collected by the Kinect sensor usually contain some invalid data due to the inherent system restriction. Therefore, the large scale of redundant data in the input point cloud greatly increases the complexity of the algorithm.
The improved 3D-NDT algorithm proposed in this paper uses the Approximate Voxel Grid Filter to process the input point cloud according to the distribution of point cloud data. In this way, the amount of input point cloud data can be reduced, which increases the computing speed of the algorithm. Meanwhile, the input point cloud data will be divided according to the distance between the point cloud data and the sensor. The divided point cloud data registration is performed using the 3D-NDT algorithm based on voxel grids of different sizes and the voxel grid size can be adapted according to the point cloud data distribution.
As shown in Figure 1, the main steps of the indoor mobile robot pose estimation method based on the improved 3D-NDT point cloud registration algorithm are the following:

Flow chart of improved 3D-NDT point cloud registration algorithm
Obtain the Point Data: the indoor mobile robots use Kinect sensor through the Open NI Grabber interface in the point cloud library (PCL) to obtain the point cloud data of surrounding-input point cloud
Segmentation of the Point Cloud: the input point cloud will be divided according to the distance
Point Cloud Filtering:
Initial Registration: the registration of
Precise Registration: combining with the transformation matrix of the initial registration, the registration of
Calculate Pose Estimation Matrix: the final transformation matrix obtained by precise registration uses a certain coordinate transformation and then obtains the mobile robot pose estimation matrix.
3.1. The point cloud segmentation based on Euclidean distance
Actually, due to depth range limit of the Kinect sensor, the obtained point cloud data may be not too accurate, as the distance to the sensor is more than limit range. At the same time, the selection of voxel grid size in the 3D-NDT algorithm and the Approximate Voxel Grid Filter will affect the pose estimation of the robot. Hence the input point cloud data are divided into two parts by distance based on their 3D coordinate information, in order to select a suitable voxel grid size for the subsequent filtering and registration. Supposing that
The main steps of the point cloud segmentation based on the Euclidean distance are as follows:
Calculate the point
Traverse each point
3.2. The approximate voxel grid filter
Given that the point cloud data collected by Kinect usually contain some invalid data, the goal of the approximate voxel filtering is therefore not only to reduce the amount of data, but also to remove invalid points. The main steps of the Approximate Voxel Grid Filter algorithm [21] are:
The point cloud data are divided into a certain volume of voxel grid.
The density of the point cloud data in the initial grid is compared with the preset density threshold. If the density of the point cloud data in the initial grid is no less than the preset density threshold, the point cloud data in the voxel grids are considered as valid and the voxel grid is reserved. Otherwise, it is considered invalid, thereby avoiding the impact of redundant data.
The centre of all points in each valid voxel is used as an approximate representation of all points in the voxel grid. The size of the voxel grid directly affects the quality of the filtered input data point cloud. Moreover, the size of the voxel grid is proportional to the amount of data compression. In other words, the bigger the size of voxel grid, the smaller the amount of filtered point cloud data, as shown in Figure 2.

The Approximate Voxel Grid Filter algorithm
During the approximate voxel filtering, the size of the voxel grids directly affects the quality of the input point cloud data and is proportional to the amount of data compression. In other words, the bigger the size of the voxel grids, the smaller the amount of point cloud data after filtering. Therefore, in the course of handling the near input point cloud, the small voxel grids should be used for filtering where possible, in order to maintain the distribution of the original point cloud. Conversely, the far input point cloud uses the large voxel grids for filtering. In this way the running speed of the algorithm is improved.
3.3. Odometry
Odometry is the most widely used localization method. The odometry data come from the relative movement of the robot (such as the data from an inertial sensor) and it is a simple, low-cost and real-time method. No information from an external sensor is required to estimate the pose of the robot. High positioning accuracy can be guaranteed in the short term. Odometry is based on a fundamental geometric operation, which is relatively simple. In this paper, based on the Markov hypothesis that the current state of the robot only relates to the previous state, the robot pose during the
Where
3.4. Initial registration and precise registration
In the actual registration, the 3D-NDT algorithm employs the voxel grid structure and More-Thuente line search method [24], which needs to set some parameters to ensure the stability and accuracy of the registration. First, to set the appropriate size of the Epsilon parameters, which are defined as the allowable minimum increment of the transformation vector [x, y, z,
For better performance of the algorithm, the initial and precise registration is considered separately. In the process of initial registration, in order to achieve a faster convergence of the algorithm, the registration parameters and the resolution of the voxel units use larger values. In the precise registration process, in order to obtain an accurate pose estimate of the robot, its registration parameters and resolution of voxel units are smaller values, so that the registration algorithm can obtain high accuracy results.
3.5. The pose estimation of mobile robot
The pose estimation of the indoor mobile robot is derived from calculating the optimal transformation matrix in the registration process, which represents the coordinate transformation of the input point cloud and the target point clouds. The point cloud data coordinate system acquired through the Kinect sensor is shown in Figure 3, where the origin is the sensor location, the z-axis of the sensor coordinate system is in front of it, the positive x-axis extends to the left and the positive y-axis extends upwards. For simple calculation, let the local coordinate system of the robot coincide with the global coordinate system when obtaining the target point; the robot pose is seen as the origin of the global coordinate system; the local coordinate of the robot coincides with the sensor global coordinate system in the process of the robot moving.

Local coordinate of Kinect sensor
In this paper, the rotation matrix adopts the Euler angle representation, which is more intuitive. if the mobile robot's six freedom degree of pose estimation is
Assuming the optimal transformation matrix is:
Where the rotation matrix is
The translation vector is
The relationship between the position of the robot and the translation vector
The relationship between the pose of the robot and the rotation matrix
4. Experiments and Analysis
To verify the validity of the algorithm, a mobile robot with Kinect (named KR-1) and a navigation platform was adopted in the experiment. As shown in Figure 4, Kinect was installed in front of the robot platform and the robot was connected to a PC by Wi-Fi. The configuration of the PC was: Intel Pentium G645 2.9GHz, 4GB memory, Windows 7, 32-bit. The software platform was Visual Studio 2010, based on Point Cloud Library and Kinect for Windows SDK.

The KR-1 mobile robot navigation platform
As shown in Figure 5, the robot moves straight at a constant speed and stops after a period of time to measure its true pose, which is (0.061, 0, 0.573, 0, 0, 6.07) (distances are measured in metres and angles in degrees). The point cloud data obtained by Kinect at the beginning represent the target point cloud and the point cloud data obtained when the robot stopped represent the input point cloud. The real test environment is shown in Figure 6, the target point cloud is shown in Figure 7, the input point cloud is shown in Figure 8 and the output point cloud is shown in Figure 9.

The test trajectory of the robot

The real testing environment

The target point cloud

The input point cloud

The output point cloud
4.1. Comparative analysis
In order to illustrate and validate the performance of the improved 3D-NDT algorithm (I-NDT), experiments testing the position estimation were conducted with our improved algorithm, normal 3D-NDT algorithm and the traditional ICP algorithm. The number of the target point cloud is 214600 and the initial transformation matrix of the robot obtained by the 3D-NDT algorithm and I-NDT algorithm is (0.018, 0, 0.522, 0, 0, 1.97°). In the normal 3D-NDT algorithm, the voxel grid size is (0.05, 1.0); 0.05 is the size of the voxel grid during the Approximate Voxel Grid Filter and 1.0 is the size during the 3D-NDT algorithm, hence the Epsilon parameter is 0.01, the More-Thuente parameter is 0.1 and the number of iterations is 135. In the initial registration of the I-NDT algorithm, the voxel grid size is (0.2, 1.5), the Epsilon parameter is 0.02, the More-Thuente parameter is 0.2 and the number of iterations is 135. In the precise registration the voxel grid size is (0.035, 0.45), the Epsilon parameter is 0.01, the More-Thuente parameter is 0.1 and the number of iterations is 135. In ICP, the Epsilon parameter is 0.01 and the number of iterations is 135. Comparisons of the experimental results of the three registration algorithms are shown in Table 1.
Comparison of experimental results of three registration algorithms
As shown in Table 1, in our I-NDT algorithm, the size of the voxel grid is determined by the distance of the input point cloud data. The size of the input point cloud (near) is about twice the size of the input point cloud (far). The amount of the input point cloud after filtering is 7% of the origin point cloud, which could certainly improve the speed of the algorithm. Compared with the other two algorithms, it has the highest accuracy and the smallest pose estimation error. In the normal 3D-NDT algorithm, the fixed size of voxel grid was applied and the number of the input point cloud after filtering is invariant, so the speed of the normal algorithm is about 10 times faster than the other improved algorithms. Compared with the I-NDT algorithm, although the number of the input point cloud was larger, the gap in its accuracy and error is about twice that of the I-NDT algorithm. In the ICP algorithm without the Approximate Voxel Grid Filter, the number of the input point cloud is the number of the original input point cloud, hence the algorithm running speed is 10 times faster than the other two algorithms and its pose estimation accuracy is far better than the other two algorithms. The reason is that the ICP algorithm has a degree of overlap between the input point and the target point cloud. In general, the improved 3D-NDT algorithm (I-NDT) ensures the accuracy of registration without reducing the real time.
To verify the robustness of the algorithms, the initial transformation matrix of the robot obtained by odometry is changed into (0,0,0,0,0,0) with the I-NDT algorithm and the 3D-NDT algorithm separately; the comparison results of the pose estimation are shown in Table 2.
Comparison of experimental results of registration algorithms based on the error initial transformation matrix of the odometry method
According to the results in Table 2, there are obvious differences between the two algorithms for example, the run time of the 3D-NDT algorithm is longer than that of the I-NDT algorithm. The reason is that, in the initial registration, the 3D-NDT algorithm with wrong initial transformation matrix results in the registration moving the wrong direction. The purpose of the initial registration of the I-NDT with odometry is to obtain the initial transformation matrix and improve the 3D-NDT algorithm with a longer distance and larger voxel grid than normal. Thus the I-NDT algorithm can correct the error quickly and avoid wasting time in the wrong direction. Compared with the normal 3D-NDT algorithm, the I-NDT algorithm has better robustness.
In addition, we compared the accuracy of the above algorithms with the RGB-D freiburg2_pioneer_360 dataset from the TUM RGB-D dataset, which has accurate ground truth for evaluation [25], and the ground truth trajectory length is 16.118m. During the testing process, the depth image of the datasets was changed into point cloud data (PCD) format for registration positioning. The comparisons of the experimental results of the two registration algorithms are shown in Table 3 and Table 4.
Mean and standard deviation of translational and rotational errors of two algorithms
Mean and standard deviation of Algorithm run times of each algorithm.
According to the results in Table 3 and Table 4, the I-NDT algorithm has a better performance in accuracy and speed. Particularly in terms of speed, the running time of the I-NDT algorithm is only 1/10 of the 3D-NDT algorithm, which is because the I-NDT algorithm has a more accurate initial guess matrix than the 3D-NDT algorithm and could reduce a large amount of the input cloud data.
5. Conclusions
To solve the problem of indoor mobile robot localization, a pose estimation method based on an improved 3D-NDT algorithm was proposed in this paper. On the basis of the normal 3D-NDT algorithm and the Approximate Voxel Grid Filter, the improved algorithm adopted the point cloud segmentation based on Euclidean distance and the size of the voxel grid was adapted to the distribution of point clouds. The initial registration process combined with the Odometry and 3D-NDT algorithm based on large voxel grids was to obtain the transformation matrix and the precise registration used the 3D-NDT algorithm based on small voxel grids. Finally, the pose estimation of the indoor mobile robot was obtained based on the transformation matrix after registration. The experiment results showed that, compared with the normal 3D-NDT algorithm, the running time of the improved 3D-NDT algorithm (I-NDT) significantly reduced the original running time, which could meet the real-time requirement of a robot SLAM system. Meanwhile, the accuracy of registration was also improved and better robustness was validated. The obtained pose estimation of the robot provided a more accurate representation of the current position from the experiments. Future work will focus on the classification of voxel grid in the NDT algorithm and the accuracy of the filtering algorithm.
