1. Introduction
In recent years, research on intelligent environments has been expanding (Jin et al., 2001; Lee et al., 1999; Lee et al., 2002). An intelligent environment is a space in which many intelligent devices, such as computers and sensors, are distributed. As these devices cooperate, the environment itself comes to have intelligence, and it supports the humans within it both physically and informationally, facilitating the use of advanced computers and complicated mechanical systems. The human-tracking robot in this research implements a method of maintaining a certain relative positional relationship between the human and the robot. It has been reported that a human being accompanied by a mobile robot leads to mutual interactions (Akiyama et al., 2002). Jin et al. (2006) describe a mobile robot that always faces and follows a human, acting as an assistant robot.
Lee et al. (1999) considered four-legged mobile robots for following humans. However, these studies only addressed the problem of how to follow humans, not how to detect their presence; they were developed on the premise that human detection is possible. To realize human-tracking behaviour, a robot requires technology for recognizing humans, a position estimation technique for the mobile robot and the humans, and a control strategy for stably following a walking human. To recognize humans, most human-tracking robots are mounted with many sensors, such as charge-coupled device (CCD) cameras, ultrasonic sensors, etc., which detect the relative position of the mobile robot to the target human. The mobile robot in Kim et al. (2001) recognizes a human's skin colour using a CCD camera and traces the target human by combining this with pan-tilt control of the camera. In addition to the vision sensor, a voice recognition sensor and LED sensors are mounted in the mobile robots of Lee et al. (1999) and Lee et al. (2002), which are able to follow humans in an outdoor environment. Most of the proposed human-tracking robots burden the target human with special equipment. Moreover, it is very difficult for a stand-alone mobile robot to keep following a human along the shortest-time path while avoiding other obstacles, without missing a target walking at natural speed, because of its limited recognition performance (Inamura et al., 1998).
In this research, a robotic space is used to solve these problems. A mobile robot cooperates with multiple intelligent sensors distributed around the environment. The distributed sensors recognize the walking human and the mobile robot, and give the robot control commands to follow the walking human along the shortest-time path. We aim to achieve human tracking without placing any burden on the human, using a mobile robot that is simple in structure. We propose a predictable robotic space, an intelligent environment with many intelligent sensors that estimate the shortest path and time, and are building an environment where humans and mobile robots can coexist. The human-tracking robot of this research is one of the physical agents for human support in robotic space.
This paper first briefly introduces the configuration of the robotic space and the basic human-tracking concept in Section 2. Section 3 explains the coordinates for mobile robot control, and Section 4 deals with trajectory estimation of a walking human using computer simulations. Section 5 presents the motion planning involved in tracking a walking human and experimental results that verify the efficiency of the system. Finally, Section 6 presents conclusions and possible future work.
2. Structure of the Robotic Space
2.1 Human-Tracking Concept
Robotic space (Lee et al., 2002) is a space in which many intelligent devices are distributed, as shown in Fig. 1. These devices have sensing, processing and networking functions, and are called distributed networked sensors (DNSs). They observe the positions and behaviour of the humans and robots coexisting in the robotic space. The information acquired by each DNS is shared with the others through the network communication system. Based on the accumulated information, the environment as a system is able to understand the intentions of humans, and it utilizes machines, including computers and robots, to support them.

Structure of robotic space by distributed cameras.
As the basic concept for the human-tracking robot in robotic space, a new scheme is proposed in which a mobile robot estimates and follows a walking human using the images of the cameras in robotic space. The positions of the walking human and the mobile robot are estimated using the kinematics of the cameras adopted as the sensors in robotic space and the images of the human and robot, under the assumption that each can be treated as a point object on the floor.
The linear and angular velocities of the walking human are estimated so that the human-tracking robot can predict the human's future trajectory; the robot then estimates the shortest-time path to follow the walking human. A state estimator based on a Kalman filter is designed to overcome the uncertainties in the image data caused by the point-object assumption and physical noise. Based on the estimated velocities, the position of the human-tracking robot is controlled so that the walking human stays at the centre of the image frame.
2.2 System Structure
Six DNSs are used in total: three recognize the mobile robot and generate its control commands, and the other three recognize the position of the human. The DNSs are placed as shown in Fig. 1. The placement of the three DNSs for human recognition is optimized to expand the viewable area of the cameras so that the head and hands of the human can be recognized over a wide area. The placement of the DNSs for the mobile robot, on the other hand, has to be decided by trial and error. It is desirable that these DNSs cover the whole of the area observed by the three human-recognition DNSs, in order to achieve reliable human tracking and mobile robot control; thus, they are placed so that the human-recognition area is completely covered. Human walking information is extracted by background subtraction and by detecting the skin colour of the face and hands in the captured images.
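The extraction step described above can be sketched as follows. This is a minimal illustration, assuming a stored background frame and a simple RGB skin-colour box; the thresholds and colour bounds are hypothetical, not the values used by the DNSs.

```python
import numpy as np

# Hypothetical RGB bounds for skin-coloured pixels (illustrative only).
SKIN_LO = np.array([90, 40, 30])
SKIN_HI = np.array([255, 180, 150])

def extract_human(frame, background, diff_thresh=30):
    """Boolean mask of pixels that are both moving and skin-coloured."""
    # background subtraction: a pixel is "moving" if any channel differs enough
    moving = np.abs(frame.astype(int) - background.astype(int)).max(axis=2) > diff_thresh
    # skin-colour detection: all channels inside the assumed colour box
    skin = np.all((frame >= SKIN_LO) & (frame <= SKIN_HI), axis=2)
    return moving & skin

def centroid(mask):
    """Image coordinates of the detected region (None if nothing detected)."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return float(xs.mean()), float(ys.mean())
```

The centroid of the resulting mask serves as the image-plane position of the human used in the later estimation steps.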
A differential wheel velocity-type mobile robot is used for the human-tracking robot. Since the DNSs take charge of the sensing and processing in robotic space, the mobile robots do not need any special functions or devices, except for an ability to move and a wireless network device to allow for communication with the DNSs.
This mobile robot (Pioneer 3-DX datasheet, 2012) is connected to the DNS network via wireless LAN, as shown in Fig. 2, and shares the resources of the DNSs.
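Since the robot only needs the ability to move, its command interface can be very simple. The sketch below shows how a (v, ω) command from the DNSs could be converted into wheel velocities for a differential-drive robot; the axle length is an assumed value, not taken from the Pioneer 3-DX datasheet.

```python
# Assumed axle length in metres (illustrative, Pioneer-class scale).
WHEEL_BASE = 0.33

def wheel_velocities(v, w, wheel_base=WHEEL_BASE):
    """Convert a linear/angular velocity command (v, w) into
    (v_left, v_right) wheel velocities of a differential-drive robot."""
    v_left = v - w * wheel_base / 2.0
    v_right = v + w * wheel_base / 2.0
    return v_left, v_right
```

Driving straight gives equal wheel speeds, while a pure rotation gives opposite ones, which is all the DNS-side controller needs to command the robot.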

Human and mobile robot in robotic space.
3. Coordinates for Human-Tracking Control
The camera system in robotic space has panning and tilting abilities, as shown in Fig. 3. The position and posture of the camera are defined with respect to the base frame. Following the Denavit-Hartenberg convention, the homogeneous matrix is obtained after establishing the coordinate system and its parameters; the attitude vector of the homogeneous matrix represents the roll, pitch and yaw angles of the camera.

Estimation of position information, r̂₀ (upper figure) and θ̂₀ (lower figure).
To measure the distance from a camera to an object using camera images, at least two image frames captured for the same object at different locations are usually necessary; a stereo-camera system is typically used to obtain the distance information (Yoda et al., 2006; Agrawal et al., 2006). However, stereo matching suffers from uncertainties in feature-point correspondence and takes too long to be implemented in real time. The approach proposed here requires only a single frame to measure the distance to the object from the CCD camera. Since this becomes possible by assuming that a point object is located on the floor, uncertainties remain in the position estimation. To minimize these uncertainties and simultaneously estimate the velocities of the moving object, a state estimator is designed based on the Kalman filter. The image coordinates of the point object are transformed into its real location on the floor using the kinematics of the camera, where θY represents the angle between the mobile robot and the camera of the DNS.
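The single-frame measurement under the point-object-on-the-floor assumption can be illustrated with a simple pinhole model: the ray through the pixel is intersected with the floor plane using the camera height and tilt angle. The camera model and all parameter names below are our illustrative assumptions, not the paper's calibration.

```python
import math

def floor_position(u, v, cam_height, tilt, f, cx, cy):
    """Project image pixel (u, v) onto the floor plane.

    cam_height: camera height above the floor [m]
    tilt:       downward tilt of the optical axis [rad]
    f:          focal length in pixels; (cx, cy): principal point
    Returns (forward, lateral) floor coordinates in the camera's ground frame.
    """
    # depression angle of the ray below the horizontal
    alpha = tilt + math.atan2(v - cy, f)
    if alpha <= 0:
        raise ValueError("ray does not intersect the floor")
    forward = cam_height / math.tan(alpha)   # horizontal distance to the object
    slant = cam_height / math.sin(alpha)     # distance along the ray to the floor
    lateral = -(u - cx) / f * slant          # offset to the left of the axis
    return forward, lateral
```

A pixel at the principal point of a camera tilted 45° at 1 m height maps to a point 1 m ahead on the floor, which gives a quick sanity check of the geometry.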
4. Trajectory Estimation of a Human
4.1 Human Modelling
When the velocity and acceleration of the walking human and the mobile robot can each be estimated, the next human position can be predicted from the current state over the sampling interval δt. The human is modelled as a point object whose state vector consists of the position on the x-y plane, the linear and angular velocities, and the linear and angular accelerations. From (8)-(12), we can obtain the state transition matrix Φ, which propagates this state vector over one sampling period. Notice that Φ must be updated at every sampling instant, since the transition depends on the current heading of the human.
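One propagation step for such a state vector can be sketched as follows, assuming a first-order unicycle-style model with state (x, y, θ, v, ω, a, α) and sampling interval dt; the notation is ours and does not reproduce the paper's exact Φ.

```python
import math

def propagate(state, dt):
    """One first-order step of the assumed human motion model.

    state = (x, y, theta, v, w, a, alpha):
    position, heading, linear/angular velocity, linear/angular acceleration.
    """
    x, y, theta, v, w, a, alpha = state
    x_next = x + v * math.cos(theta) * dt   # position advances along heading
    y_next = y + v * math.sin(theta) * dt
    theta_next = theta + w * dt             # heading advances with angular velocity
    v_next = v + a * dt                     # velocities advance with accelerations
    w_next = w + alpha * dt
    return (x_next, y_next, theta_next, v_next, w_next, a, alpha)
```

Because θ enters through sine and cosine, the effective transition matrix changes with the state, which is why it must be recomputed each sampling period.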
4.2 State Estimation
Input data such as image information include uncertainties and noise generated during the data capturing and processing steps. The state transition of a moving object also includes irregular components. Therefore, as a state estimator robust against these irregularities, a Kalman filter was adopted to form a state observer (Adam et al., 2000; Jang et al., 1997; Sorenson, 1996). The Kalman filter minimizes the estimation error by modifying the state transition model based on the error between the estimated and measured vectors, with an appropriate filter gain. The state vector, which consists of the position on the x-y plane, the linear/angular velocities and the linear/angular accelerations, can be estimated from the measured vectors representing the position of the moving object on the image plane.
The covariance matrix of the estimation error must be calculated to determine the filter gain. The projected (a priori) estimate of the error covariance is represented as

P⁻(k) = Φ P(k−1) Φᵀ + Q,

where Q is the covariance matrix of the process noise. The optimal filter gain K(k) is

K(k) = P⁻(k) Hᵀ [H P⁻(k) Hᵀ + R]⁻¹,

where H is the measurement matrix and R is the covariance matrix of the measurement noise. The estimate of the state vector x̂(k) is then corrected by the measurement z(k):

x̂(k) = x̂⁻(k) + K(k) [z(k) − H x̂⁻(k)].

Therefore, the error covariance is updated as

P(k) = [I − K(k) H] P⁻(k).

After the current time is updated to k + 1, the prediction and correction steps above are repeated recursively.
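The predict/correct cycle above can be written compactly as a generic linear Kalman step; Φ, H, Q and R are assumed given, and this sketch does not reproduce the paper's specific matrices.

```python
import numpy as np

def kalman_step(x, P, z, Phi, H, Q, R):
    """One predict/correct cycle of a linear Kalman filter.
    Returns the updated state estimate and error covariance."""
    # prediction (a priori estimate)
    x_pred = Phi @ x
    P_pred = Phi @ P @ Phi.T + Q
    # optimal filter gain
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    # correction by the measurement z
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```

In the scalar case with unit measurement noise and unit prior covariance, the gain evaluates to 0.5, i.e. the filter splits the difference between prediction and measurement, which matches the textbook behaviour.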

Trajectory of walking human.
To incorporate the measurement noise, which is empirically assumed to be zero-mean Gaussian random noise with a variance of 2, the linear and angular velocities of the object were set as follows, where the linear and angular velocities of the walking human are denoted by ξ and ω, respectively.

State estimations
4.3 Trajectory Estimation
The states of a moving object can be estimated if the initial state and the inputs to the state transition model are given. Therefore, the future states can be estimated by using the Kalman filter as a state estimator to obtain the linear and angular velocities of the moving object as inputs. From the linear velocity/acceleration and angular velocity/acceleration data, the next states can be approximated by the following first-order equations:
The result in Fig. 5 still includes noise, since the system is dynamically varying, although the noise is suppressed by the Kalman filter. The least-squares estimation method, which has robust anti-noise characteristics (Jang et al., 1997), is therefore utilized to smooth the estimated inputs. From the estimated inputs and the state transition model, the trajectory of the moving object can be estimated.
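The two steps of this subsection, least-squares smoothing of the velocity estimates followed by dead-reckoning of the future positions, can be sketched as follows; all names and the simple linear fit are illustrative assumptions.

```python
import numpy as np

def fit_velocity(times, samples):
    """Least-squares linear fit of noisy velocity samples.
    Returns (fitted value at the latest time, slope)."""
    A = np.vstack([np.ones_like(times), times]).T
    c0, c1 = np.linalg.lstsq(A, samples, rcond=None)[0]
    return c0 + c1 * times[-1], c1

def predict_trajectory(x, y, theta, v, w, dt, steps):
    """Dead-reckon future (x, y) positions from fitted v and w."""
    path = []
    for _ in range(steps):
        x += v * np.cos(theta) * dt
        y += v * np.sin(theta) * dt
        theta += w * dt
        path.append((x, y))
    return path
```

The fitted velocities play the role of the estimated inputs, and the predicted path is what the planner in the next section uses as the target trajectory.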
5. Experiments
For the first task, following a walking human, the mobile robot must be controlled by considering the relation between its own position and that of the walking human. Fig. 6 shows the motion planning process of a mobile robot for following a walking human.

Estimation of the trajectory for human-tracking.
Each DNS computer system estimates the position of the moving object within its own field of view during each sampling period.
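One simple way to realize the shortest-time goal selection is to scan the predicted human trajectory and pick the earliest point the robot can reach within the available time, given its velocity limit. The straight-line reachability test below is our assumption, not necessarily the paper's planner.

```python
import math

def interception_point(robot_xy, v_max, predicted_path, dt):
    """Earliest point on the predicted trajectory the robot can reach.

    predicted_path: list of (x, y) positions at dt intervals from now.
    v_max:          velocity limit of the robot.
    """
    rx, ry = robot_xy
    for k, (px, py) in enumerate(predicted_path, start=1):
        t_available = k * dt                              # when the human arrives
        t_needed = math.hypot(px - rx, py - ry) / v_max   # straight-line travel time
        if t_needed <= t_available:
            return (px, py)       # earliest reachable point: shortest-time goal
    return predicted_path[-1]     # fall back to the last predicted point
```

Choosing the earliest reachable point rather than the human's current position is what turns the follower into a shortest-time tracker.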
With regard to the second task, we present an example to demonstrate the proposed method. It is assumed that the velocity limit of the mobile robot is 30 cm/sec and that the initial locations of the mobile robot and the moving object are (−50, −50) and (−250, 300) cm, respectively, with respect to the reference frame. The velocity and angular velocity of the moving object are set as follows.
The forward direction and rotational angular velocity of the moving object are Gaussian random variables with variances of 2 and 0.1, respectively, which are obtained experimentally.
Fig. 7(a) shows the trajectory of the walking human and the mobile robot trying to follow the human by estimating the trajectory. Fig. 7(b) shows the distance between the mobile robot and the walking human, the error between the estimated and real velocity, and the error between the estimated and real angular velocity, respectively. Although the errors of the estimated velocities are large at first, they converge to zero quickly.

Results of experiment.
The experiment was performed to generate the shortest-time trajectory for following the walking human along the curved path shown in Fig. 7. Fig. 8 shows the experimental results for estimating information about and following a walking human. Red panels of 70 × 20 cm are attached to the mobile robot. The human walks at random velocities in the range of 45-50 cm/sec. First, the mobile robot detects the moving human using the cameras in robotic space. When the walking human is detected within view, the mobile robot tracks it using the proposed method. Fig. 8 illustrates the mobile robot following the human along the shortest-time path, as at the temporary position in Fig. 7(a). The shortest path was estimated using the trajectories of the mobile robot and the human while the robot was tracking the walking human.

Experimental results for human tracking at each human position.
6. Conclusions
This paper proposes a human-tracking method for a mobile robot to track and follow a walking human in robotic space. First, a control algorithm based on the position estimation of the moving objects (the walking human and the mobile robot) from the kinematic relationship of consecutive image frames was proposed, enabling a mobile robot to follow a walking human whose position is estimated incompletely. The proposed model is able to absorb the gap between the motion of the human and that of the mobile robot. Movement estimation of the objects using a Kalman filter based on the DNSs was shown, as well as the motion planning of a mobile robot for following a walking human along its estimated trajectory within the shortest time.
Future studies will involve applying this system to complex environments where many people, mobile robots and obstacles coexist. Since the proposed algorithm absorbs the kinematic differences between humans and robots, any kind of mobile robot, including legged robots, can be used as a human-tracking robot, as long as the robot is able to move at the speed of human walking.
