History of Robotics Research and Development of Japan
2012
Integration, Intelligence, etc.
Robots that learn concepts and words through interaction

Takayuki Nagai	The University of Electro-Communications
Tomoaki Nakamura	The University of Electro-Communications
Takaya Araki	The University of Electro-Communications

We have proposed a framework for object concept formation in which a robot performs multimodal categorization using statistical models such as latent Dirichlet allocation (LDA) [1]. We showed that multimodal categorization allows the robot to categorize objects in the same manner as humans do. This means that suitable object concepts can be formed through multimodal categorization, and that the robot can use such concepts to predict unobservable properties of unseen objects. We strongly believe that this ability establishes a basis for “true understanding” and is a very important factor for human–robot coexistence. To achieve this type of learning autonomously in real environments, the robot has to obtain multimodal data such as visual, auditory, and haptic information by itself. Hence, the robot requires an autonomous mechanism for acquiring multimodal information. The goal of this research is to develop a robot that learns object concepts online by autonomously obtaining multimodal information on a daily basis.

Linguistic labels are also important for object concept formation. In [2], we showed that the robot can learn the meanings of words by connecting multimodal concepts, which are formed by multimodal categorization, with the corresponding words. Here, we take a step further and consider learning entire object concepts, including word meanings, at once from both words and multimodal perceptual information. Word information should carry useful cues for human-like categorization, which motivates us to include linguistic information in our multimodal categorization. The formed object concepts can in turn be used to infer suitable words for unseen objects. It should be noted that the word information must be given by a human user. However, it is not practical for a human user to always accompany the robot to provide linguistic information.
The robot can expect linguistic information only when a human user is available. Therefore, the robot must be able to form object concepts from perceptual information and partially given (incomplete) words. In this research, we realized a robot that acquires multimodal information fully autonomously using its mounted sensors. The robot acquires visual information from a 3D visual sensor, auditory information by shaking the object, and haptic information by grasping it. We also developed an online multimodal categorization algorithm that uses the autonomously acquired multimodal information together with the partial words given by the human user. For this purpose, we first developed a batch-type learning algorithm based on multimodal LDA using Gibbs sampling. We then extended the batch algorithm to an online version so that the robot can discard the data after using them for learning.

1st RSJ Advanced Robotics Best Paper Award in 2013. IROS 2011 Best Paper Award Finalist. IROS 2011 Best Student Paper Award Finalist.
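
To make the batch-type learning step more concrete, the following is a minimal sketch of collapsed Gibbs sampling for a multimodal LDA in which the word modality is given for only some objects. It is not the authors' implementation: the function names (`mlda_gibbs`, `predict_words`, `make_object`), the toy bag-of-features data, and the hyperparameters `alpha` and `beta` are all assumptions made for illustration, and the online extension is not shown.

```python
import numpy as np

def mlda_gibbs(objects, vocab_sizes, n_topics, n_iters=100,
               alpha=0.5, beta=0.1, seed=0):
    """Collapsed Gibbs sampling for a toy multimodal LDA (illustrative only).

    objects: list of dicts {modality: list of feature indices}; an object
             without words simply has no "word" entry.
    Returns theta (object-category proportions) and phi (per-modality
    category-feature distributions).
    """
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(objects), n_topics))            # object-category counts
    n_kw = {m: np.zeros((n_topics, v)) for m, v in vocab_sizes.items()}
    n_k = {m: np.zeros(n_topics) for m in vocab_sizes}   # per-modality totals

    # Flatten every observed feature occurrence into (object, modality, feature).
    tokens = [(d, m, w) for d, obj in enumerate(objects)
              for m, feats in obj.items() for w in feats]
    z = rng.integers(n_topics, size=len(tokens))         # random initial categories
    for (d, m, w), k in zip(tokens, z):
        n_dk[d, k] += 1; n_kw[m][k, w] += 1; n_k[m][k] += 1

    for _ in range(n_iters):
        for i, (d, m, w) in enumerate(tokens):
            k = z[i]
            # Remove the current assignment from the counts.
            n_dk[d, k] -= 1; n_kw[m][k, w] -= 1; n_k[m][k] -= 1
            # Conditional over categories given all other assignments.
            p = ((n_dk[d] + alpha) * (n_kw[m][:, w] + beta)
                 / (n_k[m] + vocab_sizes[m] * beta))
            k = rng.choice(n_topics, p=p / p.sum())
            z[i] = k
            n_dk[d, k] += 1; n_kw[m][k, w] += 1; n_k[m][k] += 1

    theta = (n_dk + alpha) / (n_dk + alpha).sum(axis=1, keepdims=True)
    phi = {m: (n_kw[m] + beta) / (n_kw[m] + beta).sum(axis=1, keepdims=True)
           for m in vocab_sizes}
    return theta, phi

def predict_words(theta, phi, d):
    """Word distribution for object d, marginalized over its categories."""
    return theta[d] @ phi["word"]

def make_object(category, with_word):
    """Hypothetical toy object: each category uses its own feature indices."""
    base = 2 * category
    obj = {"visual": [base, base, base + 1] * 3,
           "audio": [base, base + 1] * 3,
           "haptic": [base] * 4}
    if with_word:                      # words are given for only some objects
        obj["word"] = [category] * 2
    return obj

# Two categories, three objects each; only the first object of each is labeled.
objects = [make_object(c, with_word=(i == 0)) for c in (0, 1) for i in range(3)]
vocab_sizes = {"visual": 4, "audio": 4, "haptic": 4, "word": 2}

theta, phi = mlda_gibbs(objects, vocab_sizes, n_topics=2)
print("word distribution for unlabeled object 1:", predict_words(theta, phi, 1))
```

In this sketch all modalities of an object share one category proportion `theta[d]`, so perceptual features alone can place an unlabeled object in a category, and the words attached to labeled objects of that category can then be inferred for it through `predict_words`.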
