Abstract

This paper describes a framework that enables robots to efficiently learn human-centric models of their environment from natural language descriptions. Typical semantic mapping approaches are limited to augmenting metric maps with higher-level properties of the robot’s surroundings (e.g. place type, object locations) that can be inferred from the robot’s sensor data, but do not use this information to improve the metric map. The novelty of our algorithm lies in fusing high-level knowledge that people can uniquely provide through speech with metric information from the robot’s low-level sensor streams. Our method jointly estimates a hybrid metric, topological, and semantic representation of the environment. This semantic graph provides a common framework in which we integrate information that the user communicates (e.g. labels and spatial relations) with metric observations from low-level sensors. Our algorithm efficiently maintains a factored distribution over semantic graphs based upon the stream of natural language and low-level sensor information. We detail the means by which the framework incorporates knowledge conveyed by the user’s descriptions, including the ability to reason over expressions that reference as-yet-unknown regions of the environment. We evaluate the algorithm’s ability to learn human-centric maps of several different environments and analyze the knowledge inferred from language and the utility of the learned maps. The results demonstrate that incorporating information from free-form descriptions increases the metric, topological, and semantic accuracy of the recovered environment model.

Keywords

Semantic mapping, Rao-Blackwellization, mapping, particle filter, natural language understanding, human-robot interaction
