Abstract
Recognizing manipulations performed by a human and the transfer and execution of this by a robot is a difficult problem. We address this in the current study by introducing a novel representation of the relations between objects at decisive time points during a manipulation. Thereby, we encode the essential changes in a visual scenery in a condensed way such that a robot can recognize and learn a manipulation without prior object knowledge. To achieve this we continuously track image segments in the video and construct a dynamic graph sequence. Topological transitions of those graphs occur whenever a spatial relation between some segments has changed in a discontinuous way and these moments are stored in a transition matrix called the semantic event chain (SEC). We demonstrate that these time points are highly descriptive for distinguishing between different manipulations. Employing simple sub-string search algorithms, SECs can be compared and type-similar manipulations can be recognized with high confidence. As the approach is generic, statistical learning can be used to find the archetypal SEC of a given manipulation class. The performance of the algorithm is demonstrated on a set of real videos showing hands manipulating various objects and performing different actions. In experiments with a robotic arm, we show that the SEC can be learned by observing human manipulations, transferred to a new scenario, and then reproduced by the machine.
Keywords
Get full access to this article
View all access options for this article.
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
