IPAB Workshop 12/05/2022

Speaker: Brian Seipp

Title: Active reinforcement learning for tactile feature exploration

Abstract: Tactile sensing is an essential and underexplored modality for robots to explore their world. Existing methods are frequently guided by prior knowledge of their environment or augmented with other sensing modalities. While these methods have merit, there are many cases where they fail, resulting in expensive sampling of the target areas. Our work attempts to address this gap by learning a locally sized prediction map of where the feature of interest is likely to exist. Reinforcement learning is then leveraged to balance exploration against exploitation when interpreting this map. Our results show significant improvement over other sampling strategies, such as Bayesian optimization and random sampling, when tasked with discovering the entire feature in a specified space. I will discuss these methods and results and describe our sim-to-real experiments confirming our phantom results.
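
To make the exploration/exploitation trade-off concrete, here is a minimal sketch of choosing the next touch location from a prediction map. It is not the speaker's method: the talk uses a learned reinforcement learning policy, whereas this sketch substitutes a simple epsilon-greedy rule. The function name choose_next_touch, the grid representation, and the epsilon parameter are all assumptions made for illustration.

```python
import random


def choose_next_touch(prediction_map, visited, epsilon, rng):
    """Pick the next grid cell to probe.

    prediction_map: 2D list of estimated probabilities that the feature is present.
    visited: set of (row, col) cells already probed.
    epsilon: probability of exploring a random unvisited cell.
    rng: a random.Random instance, passed in for reproducibility.
    """
    candidates = [
        (r, c)
        for r, row in enumerate(prediction_map)
        for c in range(len(row))
        if (r, c) not in visited
    ]
    if not candidates:
        return None  # nothing left to probe
    if rng.random() < epsilon:
        # Explore: probe a random unvisited cell.
        return rng.choice(candidates)
    # Exploit: probe the unvisited cell the map rates most likely.
    return max(candidates, key=lambda rc: prediction_map[rc[0]][rc[1]])
```

With epsilon set to 0 the rule always follows the map; with epsilon set to 1 it reduces to random sampling, one of the baselines mentioned in the abstract.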

Speaker: Wanming Yu

Title: Accessibility-Based Clustering for Efficient Learning of Locomotion Skills

Abstract: For model-free deep reinforcement learning of quadruped locomotion, the initialization of robot configurations is crucial for data efficiency and robustness. This work focuses on improving data efficiency and robustness simultaneously through automatic discovery of initial states, which is achieved by our proposed K-Access algorithm based on accessibility metrics. Specifically, we formulated accessibility metrics to measure the difficulty of transitions between two arbitrary states, and proposed a novel K-Access algorithm for state-space clustering that automatically discovers the centroids of the static-pose clusters based on the accessibility metrics. By using the discovered centroidal static poses as the initial states, we can improve data efficiency by reducing redundant exploration, and enhance robustness through more effective exploration from the centroids to sampled poses. Focusing on fall recovery as a very hard set of locomotion skills, we validated our method extensively using an 8-DoF quadrupedal robot, Bittle. Compared to the baselines, the learning curve of our method converges much faster, requiring only 60% of the training episodes. With our method, the robot can successfully recover to standing poses within 3 seconds in 99.4% of the test cases. Moreover, the method generalizes successfully to other difficult skills, such as backflipping.
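
As a rough illustration of clustering under a directed transition cost, the sketch below runs a k-medoids-style loop on an asymmetric cost matrix. It is not the K-Access algorithm itself, whose details are not given in the abstract. The function name, the matrix cost[i][j] (read here as the difficulty of moving from state i to state j), and the deterministic initialization are all assumptions. The sketch also assumes a zero diagonal and positive off-diagonal costs, so every medoid stays in its own cluster.

```python
def k_medoids_asymmetric(cost, k, max_iters=100):
    """Pick k medoid states under a directed (asymmetric) transition cost.

    cost[i][j] is a hypothetical difficulty of moving from state i to state j.
    Returns the sorted indices of the chosen medoids.
    """
    n = len(cost)
    medoids = list(range(k))  # simple deterministic initialization (assumption)
    for _ in range(max_iters):
        # Assign each state to the medoid from which it is cheapest to reach.
        clusters = {m: [] for m in medoids}
        for j in range(n):
            best = min(medoids, key=lambda m: cost[m][j])
            clusters[best].append(j)
        # Re-pick each medoid as the member that reaches its cluster most cheaply.
        new_medoids = []
        for m, members in clusters.items():
            if not members:
                new_medoids.append(m)  # keep an empty cluster's medoid unchanged
                continue
            new_medoids.append(
                min(members, key=lambda i: sum(cost[i][j] for j in members))
            )
        if sorted(new_medoids) == sorted(medoids):
            break  # assignments are stable
        medoids = new_medoids
    return sorted(medoids)
```

Measuring cost from the medoid outward mirrors the abstract's description of exploring from the centroids to sampled poses; a real accessibility metric would replace the hand-written matrix.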

Speaker: Bo Zhao

Title: Dataset Condensation for Data-efficient Deep Learning

Abstract: Increasingly large datasets are required to achieve state-of-the-art results in many fields. Storing these datasets and training models on them become significantly more expensive, especially when validating multiple model designs and hyper-parameters. We propose a training set synthesis technique for data-efficient learning, called Dataset Condensation, that learns to condense a large dataset into a small set of informative synthetic samples for training deep neural networks from scratch. In this talk, I will present our recent progress, in which we learn the informative synthetic training set by gradient matching and distribution matching. Beyond classic supervised learning, we also explore the use of our method in continual learning and neural architecture search and report promising gains.
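
The idea behind distribution matching can be sketched in a heavily simplified form: adjust a few synthetic points until their mean matches the mean of the real data. This is an illustration under strong assumptions, not the speaker's implementation. In particular, it matches means in the raw input space, whereas a practical method would compare features produced by neural networks. The function name and the lr and steps parameters are hypothetical.

```python
def condense_by_mean_matching(real, synthetic, lr=0.5, steps=200):
    """Move synthetic points so that their mean matches the mean of the real data.

    real, synthetic: lists of equal-length lists of floats.
    Returns a new list of synthetic points; the inputs are left unchanged.
    """
    dim = len(real[0])
    real_mean = [sum(x[d] for x in real) / len(real) for d in range(dim)]
    synthetic = [list(s) for s in synthetic]  # copy so the caller's data is untouched
    m = len(synthetic)
    for _ in range(steps):
        syn_mean = [sum(s[d] for s in synthetic) / m for d in range(dim)]
        # Gradient of ||syn_mean - real_mean||^2 with respect to each synthetic point.
        grad = [2.0 * (syn_mean[d] - real_mean[d]) / m for d in range(dim)]
        for s in synthetic:
            for d in range(dim):
                s[d] -= lr * grad[d]  # plain gradient descent step
    return synthetic
```

Matching only the mean discards most of the structure in the data, which is why the choice of feature space matters so much in practice.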

May 12 2022 13.00 - 14.00

Brian Seipp, Wanming Yu, Bo Zhao

G.07, IF