This document reports on the international session on AI, Learning, and Control (2/2). Below, we look at several interesting works that use machine learning techniques to augment robot control.
2. Session Report
The session kicks off in the field of manipulation. Grasping poses and task completion have high priority in industrial manipulation tasks. Chen et al. [1] propose a novel planar grasping method for robot arms that aims to address these issues. The method works in two steps. First, using RGB-D information, a transformer-based model estimates the object’s 3D pose and determines a grasping pose from it. Then the model evaluates whether the object’s position must be adjusted to make it suitable for grasping. The authors claim that providing a template improves grasp stability and intelligence, and that the proposed method enhances efficiency and flexibility in industrial assembly situations.
Imitation learning is a popular method that trains a robot from a set of expert demonstrations so that its motion stays close to the demonstrations. One caveat is the limited efficiency of demonstrations, which causes the robot to fail in long-horizon tasks. Ochoa et al. [2] aim to address this limitation via interactive imitation learning with sub-goal regression planning for a cooking task. The authors show that the policy can execute various task goals with fewer demonstrations.
The proposed method uses a hierarchical policy consisting of a high-level policy that generates sub-goals, a low-level policy that chooses actions to be executed in the environment, and a goal-switching policy that indicates whether a new sub-goal is necessary.
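The interplay of the three policies can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `HierarchicalPolicy`, the `rollout` method, and the integer-line toy problem (all `toy_*` functions) are hypothetical names introduced only to show how a goal-switching policy might gate calls to the high-level policy.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class HierarchicalPolicy:
    """Three cooperating policies, named after the roles described in the text."""
    high_level: Callable[[Any, Any], Any]    # (state, task_goal) -> sub_goal
    low_level: Callable[[Any, Any], Any]     # (state, sub_goal) -> action
    goal_switch: Callable[[Any, Any], bool]  # (state, sub_goal) -> new sub-goal needed?

    def rollout(self, state: Any, task_goal: Any,
                step: Callable[[Any, Any], Any],
                horizon: int) -> Tuple[Any, List[Any]]:
        """Run for `horizon` steps; return the final state and the sub-goal history."""
        sub_goal = self.high_level(state, task_goal)
        history = [sub_goal]
        for _ in range(horizon):
            # Ask the high-level policy for a new sub-goal only when requested.
            if self.goal_switch(state, sub_goal):
                sub_goal = self.high_level(state, task_goal)
                history.append(sub_goal)
            action = self.low_level(state, sub_goal)
            state = step(state, action)
        return state, history


# Hypothetical toy problem (an assumption): move along an integer line from 0 to 10.
def toy_high_level(state: int, task_goal: int) -> int:
    return min(state + 3, task_goal)      # place the sub-goal at most 3 cells ahead


def toy_low_level(state: int, sub_goal: int) -> int:
    return 1 if state < sub_goal else 0   # step forward until the sub-goal is reached


def toy_goal_switch(state: int, sub_goal: int) -> bool:
    return state == sub_goal              # sub-goal reached, so request the next one


def toy_step(state: int, action: int) -> int:
    return state + action                 # trivial deterministic environment


policy = HierarchicalPolicy(toy_high_level, toy_low_level, toy_goal_switch)
final_state, sub_goals = policy.rollout(state=0, task_goal=10, step=toy_step, horizon=10)
print(final_state, sub_goals)
```

With a horizon of 10 steps, this toy rollout reaches state 10 via the sub-goals 3, 6, 9, and 10. In the reported work the three components are learned policies; here they are hand-written rules purely for illustration.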
From the results, the authors show that the proposed method can be applied to train robot policies for a diverse range of long-horizon tasks. The authors also express interest in implementing their method in a real environment.
Figure 1: Framework of hierarchical policy for long-horizon tasks 
Tools used in engineering applications, typically in manufacturing, undergo fatigue and stress that shorten their service life, so reducing this effect saves costs. Wu et al. [3] propose a framework that takes both task completion and tool lifespan into account by predicting the tool’s ‘Remaining Useful Life’ (RUL) and including it in the objective of the problem. The authors show that this method effectively increases the life of a tool.
The method uses Finite Element Analysis to obtain the tool’s stress distribution history from the external forces (states of the environment), and then uses Miner’s Rule to evaluate the cumulative damage up to that point in time. The policy is trained with Soft Actor-Critic and learns to adapt its usage of the tool based on the assessed damage. The method was tested on a T-shaped tool and an F-shaped tool.
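As a rough illustration of the damage bookkeeping, Miner’s Rule sums the ratio of applied cycles to cycles-to-failure over all stress levels, D = Σ n_i / N_i, with failure predicted when D reaches 1. The sketch below is an assumption-laden example, not the authors' code: the Basquin-type S-N curve and its constants `c` and `m`, the definition of RUL as 1 − D, and the `shaped_reward` penalty are all hypothetical choices made only to show how a damage estimate could enter a reinforcement learning objective.

```python
from typing import Iterable, Tuple


def cycles_to_failure(stress_amplitude: float, c: float = 1e12, m: float = 3.0) -> float:
    """Hypothetical S-N curve (assumption): N = c * S**(-m)."""
    if stress_amplitude <= 0:
        return float("inf")  # no stress, no fatigue damage
    return c * stress_amplitude ** (-m)


def miner_damage(load_history: Iterable[Tuple[float, float]]) -> float:
    """Miner's Rule: D = sum(n_i / N_i) over (stress_amplitude, cycle_count) blocks."""
    return sum(n / cycles_to_failure(s) for s, n in load_history)


def remaining_useful_life(damage: float) -> float:
    """Normalized RUL (assumption): fraction of fatigue life left, clipped at 0."""
    return max(0.0, 1.0 - damage)


def shaped_reward(task_reward: float, rul_before: float, rul_after: float,
                  weight: float = 10.0) -> float:
    """Hypothetical objective: task reward minus a penalty for RUL consumed in the step."""
    return task_reward - weight * (rul_before - rul_after)


# Example load history: (stress amplitude, number of cycles) per block.
history = [(100.0, 250_000), (200.0, 25_000)]
damage = miner_damage(history)
rul = remaining_useful_life(damage)
print(f"damage={damage:.2f}, RUL={rul:.2f}")
```

With the assumed constants, the first block contributes 250000 / 10^6 = 0.25 and the second 25000 / 125000 = 0.20, so D = 0.45 and the normalized RUL is 0.55. In the reported work the stress history comes from Finite Element Analysis rather than a hand-written list.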
Current object detection methods struggle to recognize novel objects during drone flights. Collecting a dataset from scratch is expensive and time-consuming, and established datasets lack diversity in orientation and environmental settings. Nihal et al. [4] propose a two-stage fine-tuning framework for few-shot object detection and introduce a novel splitting method, composed of a group-wise hierarchy and t-SNE visualization, that helps the model handle new classes not encountered before and improves its generalizability. Using the VisDrone dataset, the authors show the effectiveness of their framework in terms of object detection performance for seen and unseen classes.
Although the framework still struggles to detect novel objects in aerial imagery, the authors plan further investigation to improve the model’s generalizability and its detection accuracy in challenging aerial scenarios.
In vision-based control, autoencoders can be used to compress RGB data into latent representations and reconstruct it, but these representations may contain noise that degrades the performance of control policies. Mustafa et al. [5] introduce Force-Map, which uses force information to optimize an autoencoder for manipulating delicate objects. The authors verify their framework on two tasks, compression and horizontal stacking, and show that their model works for force-sensitive applications. They plan to employ Sim2Real techniques so that the policy can be used in real-world scenarios.
The session concludes with SenLane [6], a road lane segmentation dataset that captures the challenging conditions of snowy roads in Northeast Japan and aims to provide the quality and diversity needed to test the robustness of models in harsh weather. It currently comprises 1000 hand-annotated frames and is planned to be extended to 1500 frames. SenLane is intended for training models for autonomous driving in harsh winter conditions. For future work, the authors intend to evaluate the dataset using SOTA machine learning models for road lane detection and instance segmentation. Through evaluation and benchmarking, the authors wish to establish a robust dataset covering conditions that have not yet been considered and to contribute to the machine learning community.
In this document, we have seen different works that exploit machine learning techniques in robot control. [1] proposes a novel planar grasping method to optimize the grasping of objects. [2] augments imitation learning through hierarchical learning to solve long-horizon tasks. [3] aims to extend tool life by using FEM and Miner’s Rule in the objective of the problem. [4] and [6] address the limitation of unseen data by proposing a model to detect novel categories and by creating a robust dataset, respectively. [5] incorporates a Force-Map into an autoencoder for force-sensitive applications to improve policy performance.
[1] J. Chen, R. Ishikawa, and T. Oishi: “Task-Related Planar Grasping with Object Pose Re-Adjustment”, Proceedings of the 41st Annual Conference of the Robotics Society of Japan, 1F4-01, 2023.
[2] C. Ochoa, H. Oh, and T. Matsubara: “Interactive Imitation with Sub-goal Regression Planning for Long-horizon Tasks”, Proceedings of the 41st Annual Conference of the Robotics Society of Japan, 1F4-02, 2023.
[3] P. Y. Wu, C. Y. Kuo, H. Tahara, and T. Matsubara: “Reinforcement learning of tool-use policies that maximize remaining useful life”, Proceedings of the 41st Annual Conference of the Robotics Society of Japan, 1F4-03, 2023.
[4] R. A. Nihal, B. Yen, K. Itoyama, and K. Nakadai: “Few-shot detection on Drone Captured Scenarios”, Proceedings of the 41st Annual Conference of the Robotics Society of Japan, 1F4-04, 2023.
[5] A. Mustafa, R. Hanai, I. Ramirez, F. Erich, Y. Domae, and T. Ogata: “Force-Map for Robust Feature Representation and Its Application to Object Manipulation”, Proceedings of the 41st Annual Conference of the Robotics Society of Japan, 1F4-05, 2023.
[6] W. Wang, N. Chiba, H. Yang, and K. Hashimoto: “SenLane: a road lane instance segmentation dataset captured during snowy conditions”, Proceedings of the 41st Annual Conference of the Robotics Society of Japan, 1F4-06, 2023.
Siddharth Padmanabhan
Received a Bachelor of Technology in Mechanical Engineering from Vellore Institute of Technology, India, in 2019 and a Master of Engineering from Osaka University in 2021. Currently enrolled in the doctoral program at Osaka University. Research interests lie in whole-body motion and control of full-sized humanoids using reinforcement learning.