Title: Learning Reactive Motor Skills of Reaching and Grasping via Deep Reinforcement Learning
Abstract: Adaptive and reactive grasping of objects is an essential capability for autonomous robots. Yet it is challenging to learn sensorimotor control that coordinates hand and finger motions and is robust to disturbances and failed grasps. In this project, we proposed a learning scheme to train feedback control policies that coordinate reactive reaching and grasping actions. We formulated geometric metrics and task-oriented quantities for the reward design. Further, to improve the success rate, we included key initial states with difficult poses in the training, since such poses can induce failures due to uncertainties. Extensive simulation validation and benchmarking demonstrated that the learned policy stably grasped both static and moving objects. Moreover, the policy recovered from failures within a short time in difficult configurations and remained robust to synthetic noise in the state feedback that was unseen during training.
Title: Synthetic Data Generation for Global Human Pose Estimation
Abstract: We propose a supervised learning-based approach for predicting camera-space 3D human body pose, body shape, position and ground plane from a long monocular video that includes occlusions. Using 3D motion capture data sets, we render minute-long videos by randomly changing the body shape, camera parameters, occluders and backgrounds. For the body shape and camera estimation specifically, we leverage long-term information using attentive neural networks similar to the BERT transformer.
Title: Universal Representation Learning from Multiple Domains for Few-shot Classification
Abstract: In this work, we look at the problem of few-shot classification, which aims to learn a classifier for previously unseen classes and domains from few labeled samples. Recent methods use adaptation networks to align their features to new domains, or select the relevant features from multiple domain-specific feature extractors. We propose instead to learn a single set of universal deep representations by distilling the knowledge of multiple separately trained networks after co-aligning their features with the help of adapters and centered kernel alignment. We show that the universal representations can be further refined for previously unseen domains by an efficient adaptation step in a similar spirit to distance learning methods. We rigorously evaluate our model on the recent Meta-Dataset benchmark and demonstrate that it significantly outperforms previous methods while being more efficient.
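To make the centered kernel alignment (CKA) mentioned in this abstract more concrete, here is a minimal sketch of the linear variant, written in plain Python. It is an illustration only: the abstract does not say which kernel the authors use or how CKA enters their distillation loss, and the function names (center_columns, gram, frob_inner, linear_cka) and the toy feature matrices are hypothetical. Each input is a list of per-sample feature vectors, with both inputs computed on the same samples.

```python
import math

def center_columns(feats):
    """Subtract the per-dimension mean so every feature column has zero mean."""
    n, d = len(feats), len(feats[0])
    means = [sum(row[j] for row in feats) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in feats]

def gram(feats):
    """Linear-kernel Gram matrix: pairwise dot products between samples."""
    return [[sum(a * b for a, b in zip(u, v)) for v in feats] for u in feats]

def frob_inner(a, b):
    """Frobenius inner product of two equally shaped matrices."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def linear_cka(feats_a, feats_b):
    """Linear CKA between two feature sets computed on the same n samples.

    The feature dimensions may differ; only the number of samples must match.
    """
    k = gram(center_columns(feats_a))
    l = gram(center_columns(feats_b))
    return frob_inner(k, l) / math.sqrt(frob_inner(k, k) * frob_inner(l, l))

# Toy features (assumed values): four samples, two feature dimensions.
feats_x = [[1.0, 2.0], [3.0, 4.0], [5.0, 7.0], [2.0, 0.0]]
# Same features with the columns swapped and scaled by 2: an orthogonal
# transform plus isotropic scaling, both of which linear CKA ignores.
feats_y = [[2.0 * b, 2.0 * a] for a, b in feats_x]

print(linear_cka(feats_x, feats_y))  # 1.0 up to floating-point error
```

Because the score is invariant to orthogonal transforms and isotropic scaling of the features, it compares what two networks encode about the samples rather than how individual feature dimensions happen to be arranged, which is plausibly why it suits aligning features from separately trained networks.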