Abstract
Our purpose is to allow an autonomous robot to find and categorise objects in a visual scene according to the actions it performs. The robot's visual input comes from a grey-level CCD camera. Edges are extracted and a simple DOG (difference of Gaussians) filter is used to find 'corner'-like forms in the image; these positions serve as candidate focus points. The robot eye performs saccadic movements over the whole visual scene. A log-polar transform of the image is performed in the neighbourhood of the focus points to mimic the projection of the retina onto the primary cortical areas. This transform simplifies object recognition by providing size and rotation invariance. These local views are learned on a self-organised topological map according to a vigilance level. At the same time, the robot tries to associate them with a particular action. For instance, we want the robot to learn to turn left when it sees a 'turn-left' arrow in the image. The problem is that the robot never sees a single object in isolation: the visual scene contains many distractors, such as doors, holes, and other objects irrelevant to the robot's behaviour. Initially, a probabilistic conditioning rule lets the robot associate all the objects it sees with the movement it performs. The robot then repeatedly removes or creates synaptic links so that only salient associations are retained. As a result, object categorisation is not performed at the visual level (pure recognition of visual shape), but at the motor level (the action the robot has to perform). Our experiments show that the learning and recognition of an object can be greatly simplified if we take into account the sensory-motor loop of the robot in its environment.
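
The following is a minimal sketch, not the authors' implementation, of the visual front end described above: a DOG filter applied to an edge map to find 'corner'-like focus points, followed by a log-polar resampling around each focus point. The grid sizes, sigma values, and radius are illustrative assumptions. Note how a rotation or rescaling of the pattern around the focus point becomes a plain shift along the theta or rho axis of the output, which is the source of the invariance mentioned above.

```python
# Sketch of the front end: DOG-based focus points + log-polar local views.
# All parameter values below are assumptions, not the paper's settings.
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def focus_points(edges, sigma1=1.0, sigma2=2.0, n_points=10):
    """Return the n strongest DOG responses on the edge map as (row, col)."""
    dog = gaussian_filter(edges, sigma1) - gaussian_filter(edges, sigma2)
    # Keep only positive local maxima of the DOG response.
    peaks = (dog == maximum_filter(dog, size=5)) & (dog > 0)
    rows, cols = np.nonzero(peaks)
    order = np.argsort(dog[rows, cols])[::-1][:n_points]
    return list(zip(rows[order], cols[order]))

def log_polar_view(image, center, n_rho=32, n_theta=32, r_max=40.0):
    """Sample a log-polar grid around `center`; rotations and scale changes
    of the underlying pattern become shifts along the theta and rho axes."""
    cy, cx = center
    rho = np.exp(np.linspace(0.0, np.log(r_max), n_rho))          # radii 1..r_max
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    ys = cy + rho[:, None] * np.sin(theta[None, :])
    xs = cx + rho[:, None] * np.cos(theta[None, :])
    ys = np.clip(np.round(ys).astype(int), 0, image.shape[0] - 1)
    xs = np.clip(np.round(xs).astype(int), 0, image.shape[1] - 1)
    return image[ys, xs]                                          # (n_rho, n_theta)
```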
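The learning of local views on a self-organised map under a vigilance level could look like the prototype-based sketch below: a view is assimilated by the best-matching unit if it is similar enough, otherwise a new unit is recruited. The cosine similarity measure, learning rate, and vigilance value are our assumptions, and the map's topology is omitted for brevity.

```python
# Sketch of vigilance-controlled category learning for log-polar local views.
import numpy as np

class VigilanceMap:
    def __init__(self, vigilance=0.8, lr=0.2):
        self.prototypes = []          # learned local views (unit-norm vectors)
        self.vigilance = vigilance    # minimum similarity to reuse a unit
        self.lr = lr

    def learn(self, view):
        """Return the index of the unit coding `view`; recruit a new unit
        when no prototype matches above the vigilance level."""
        x = view.ravel() / (np.linalg.norm(view) + 1e-9)
        if self.prototypes:
            sims = [float(p @ x) for p in self.prototypes]
            best = int(np.argmax(sims))
            if sims[best] >= self.vigilance:
                # Assimilate: move the winning prototype toward the view.
                p = self.prototypes[best]
                p += self.lr * (x - p)
                p /= np.linalg.norm(p) + 1e-9
                return best
        self.prototypes.append(x.copy())
        return len(self.prototypes) - 1
```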
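Finally, the conditioning stage might be sketched as follows, under our own assumptions about the learning rule: every local view recognised during a movement is first linked to that action, and links whose strength stays low are pruned, so that distractors seen with many different actions drop out while salient view-action associations survive. The update rate and pruning threshold are illustrative.

```python
# Sketch of probabilistic view-action conditioning with link pruning.
import numpy as np

class ViewActionConditioning:
    def __init__(self, n_views, n_actions, lr=0.1, prune_below=0.2):
        self.w = np.zeros((n_views, n_actions))   # link strengths in [0, 1]
        self.lr = lr
        self.prune_below = prune_below

    def update(self, seen_views, action):
        """Reinforce links from all currently seen views to the performed
        action, decay their links to other actions, then prune weak links."""
        target = np.zeros(self.w.shape[1])
        target[action] = 1.0
        for v in seen_views:
            # Move link strengths toward the observed co-occurrence.
            self.w[v] += self.lr * (target - self.w[v])
        # Remove links that never became salient (distractors).
        self.w[self.w < self.prune_below] = 0.0

    def predict(self, seen_views):
        """Vote for the action most strongly associated with the seen views."""
        return int(np.argmax(self.w[list(seen_views)].sum(axis=0)))
```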
