History of Robotics Research and Development of Japan
2012
Integration, Intelligence, etc.
Robots that learn concepts and words through interaction

Takayuki Nagai, The University of Electro-Communications
Tomoaki Nakamura, The University of Electro-Communications
Takaya Araki, The University of Electro-Communications
We have proposed a framework for object concept formation based on multimodal categorization by robots using statistical models such as latent Dirichlet allocation (LDA) [1]. We showed that multimodal categorization allows the robot to categorize objects in the same manner as humans do. This means that suitable object concepts can be formed through multimodal categorization, and that such concepts help the robot predict unobservable properties of unseen objects. We strongly believe that this ability establishes a basis of “true understanding” and is a very important factor for human–robot coexistence.

To achieve this type of learning autonomously in real environments, the robot has to obtain multimodal data, such as visual, auditory, and haptic information, by itself. Hence, the robot requires an autonomous multimodal information acquisition mechanism. The goal of this research is to develop a robot that learns object concepts online by autonomously obtaining multimodal information on a daily basis.

Linguistic labels are also important for object concept formation. In [2], we showed that the robot can learn the meanings of words by connecting multimodal concepts, which are formed by multimodal categorization, with the corresponding words. Here, we go a step further and consider learning entire object concepts, including word meanings, at once from both words and multimodal perceptual information. Word information presumably carries useful cues for human-like categorization, which motivates us to include linguistic information in our multimodal categorization. The formed object concepts can, of course, also be used to infer suitable words for unseen objects. It should be noted that the word information must be given by a human user. However, it is not practical for a human user to always accompany the robot to provide linguistic information.
The robot can expect linguistic information only when a human user is available. Therefore, the robot must be able to form object concepts from perceptual information and partially given (incomplete) words.

In this research, we realized a robot that acquires multimodal information fully autonomously using its mounted sensors. The robot acquires visual information from a 3D visual sensor, auditory information by shaking the object, and haptic information by grasping it. We also developed an online algorithm for multimodal categorization based on the autonomously acquired multimodal information and the partial words given by the human user. For this purpose, we first developed a batch-type learning algorithm based on multimodal LDA using Gibbs sampling. The batch algorithm was then extended to an online version so that the robot can discard the data after using them for learning.

1st RSJ Advanced Robotics Best Paper Award in 2013.
IROS 2011 Best Paper Award Finalist.
IROS 2011 Best Student Paper Award Finalist.
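The batch learning step described in the summary above (multimodal LDA trained with Gibbs sampling, with words given for only some objects) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `multimodal_lda_gibbs`, the hyperparameters `alpha` and `beta`, the toy data, and the use of a standard collapsed Gibbs sampler are all assumptions made for illustration, and the online extension is not shown. Each object shares one concept (topic) distribution across modalities, and an object without word input simply has an empty word list.

```python
import numpy as np

def multimodal_lda_gibbs(docs, vocab_sizes, n_topics, alpha=1.0, beta=0.1,
                         n_iters=200, seed=0):
    """Collapsed Gibbs sampling for a multimodal LDA (illustrative sketch).

    docs[j][m] is a list of feature indices for object j in modality m.
    An empty list means that modality (e.g. words) was not observed.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_mod = len(docs), len(vocab_sizes)
    n_jk = np.zeros((n_docs, n_topics))                    # object-topic counts
    n_kw = [np.zeros((n_topics, v)) for v in vocab_sizes]  # topic-feature counts
    n_k = np.zeros((n_mod, n_topics))                      # topic totals per modality

    # Random initial topic assignment for every observed feature.
    z = [[rng.integers(n_topics, size=len(docs[j][m])) for m in range(n_mod)]
         for j in range(n_docs)]
    for j in range(n_docs):
        for m in range(n_mod):
            for w, k in zip(docs[j][m], z[j][m]):
                n_jk[j, k] += 1
                n_kw[m][k, w] += 1
                n_k[m, k] += 1

    for _ in range(n_iters):
        for j in range(n_docs):
            for m in range(n_mod):
                for i, w in enumerate(docs[j][m]):
                    k = z[j][m][i]
                    # Remove the current assignment from the counts.
                    n_jk[j, k] -= 1
                    n_kw[m][k, w] -= 1
                    n_k[m, k] -= 1
                    # Conditional topic probability given all other assignments.
                    p = ((n_jk[j] + alpha) * (n_kw[m][:, w] + beta)
                         / (n_k[m] + vocab_sizes[m] * beta))
                    k = rng.choice(n_topics, p=p / p.sum())
                    z[j][m][i] = k
                    n_jk[j, k] += 1
                    n_kw[m][k, w] += 1
                    n_k[m, k] += 1

    # Point estimates of per-object concept mixtures and per-modality features.
    theta = (n_jk + alpha) / (n_jk + alpha).sum(axis=1, keepdims=True)
    phi = [(c + beta) / (c + beta).sum(axis=1, keepdims=True) for c in n_kw]
    return theta, phi

# Toy data (hypothetical): modality 0 = visual features, modality 1 = words.
# Objects 1 and 3 have no words because no human user was present.
docs = [
    [[0, 0, 1, 0, 1], [0]],
    [[0, 1, 1, 0, 0], []],
    [[2, 3, 3, 2, 3], [1]],
    [[3, 2, 2, 3, 2], []],
]
theta, phi = multimodal_lda_gibbs(docs, vocab_sizes=[4, 2], n_topics=2)

# Word prediction for every object, including those without given words:
# p(word | object j) = sum_k theta[j, k] * phi_word[k, word].
word_probs = theta @ phi[1]
```

Because the concept mixture `theta` is shared across modalities, an object observed only through perceptual features still receives a word distribution through `word_probs`, which mirrors the idea of inferring words for objects when linguistic input is only partially available.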
Robot platform DiGORO


Correspondence papers

Takaya Araki, Tomoaki Nakamura, Takayuki Nagai, Kotaro Funakoshi, Mikio Nakano, and Naoto Iwahashi: Online Object Categorization Using Multimodal Information Autonomously Acquired by a Mobile Robot

Advanced Robotics, Vol. 26, Issue 17, pp. 1995-2020, 2012.

Takaya Araki, Tomoaki Nakamura, Takayuki Nagai, Shogo Nagasaka, Tadahiro Taniguchi, and Naoto Iwahashi: Online Learning of Concepts and Words Using Multimodal LDA and Hierarchical Pitman-Yor Language Model

IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 1623-1630, Portugal, Oct. 2012.

Takaya Araki, Tomoaki Nakamura, and Takayuki Nagai: Long-term Learning of Concept and Word by Robots: Interactive Learning Framework and Preliminary Results

IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 2280-2287, Tokyo, Nov. 2013.

Related papers

[1] Tomoaki Nakamura, Takayuki Nagai, Naoto Iwahashi, "Multimodal Object Categorization by a Robot", IEICE Trans. on Information and Systems, Vol. J92-D, No. 10, pp. 2507-2518, 2008.10 (in Japanese).

[2] Tomoaki Nakamura, Takaya Araki, Takayuki Nagai, Naoto Iwahashi, "Grounding of Word Meanings in LDA-Based Multimodal Concepts", Advanced Robotics, Vol. 25, pp. 2189-2206, 2011.

[3] Tomoaki Nakamura, Takaya Araki, Takayuki Nagai, Naoto Iwahashi, "Multimodal Object Categorization Based on Hierarchical Dirichlet Process by a Robot", Trans. of the Society of Instrument and Control Engineers, Vol. 49, No. 4, Apr. 2013 (in Japanese).

[4] Yoshiki Ando, Tomoaki Nakamura, Takaya Araki, Takayuki Nagai, "Formation of Hierarchical Concepts Using Hierarchical Multimodal LDA by Robots", Journal of the Robotics Society of Japan, Vol. 31, No. 7, pp. 2-12, Sep. 2013 (in Japanese).
