IPAB Workshop - 28/07/2022

Georgios Papoudakis

Title: Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks.

 

Abstract:

Multi-agent deep reinforcement learning (MARL) suffers from a lack of commonly used evaluation tasks and criteria, making comparisons between approaches difficult. In this work, we provide a systematic evaluation and comparison of three different classes of MARL algorithms (independent learning, centralised multi-agent policy gradient, value decomposition) in a diverse range of cooperative multi-agent learning tasks. Our experiments serve as a reference for the expected performance of algorithms across different learning tasks, and we provide insights regarding the effectiveness of different learning approaches. We open-source EPyMARL, which extends the PyMARL codebase to include additional algorithms and allow for flexible configuration of algorithm implementation details such as parameter sharing. Finally, we open-source two environments for multi-agent research which focus on coordination under sparse rewards.

 

Arrasy Rahman

Title: Addressing Partial Observability in Open Ad Hoc Teamwork

 

Abstract:

Open ad hoc teamwork is the problem of training a single agent to efficiently collaborate with a previously unseen group of teammates whose composition may change over time. Prior work in open ad hoc teamwork has focused on problems where the learning agent has full observability of its environment and teammates. In our work, we tackle open ad hoc teamwork in environments where the learning agent only has partial observability of its surroundings. We address the challenge of ad hoc teamwork under partial observability by proposing different methodologies to maintain belief estimates over the latent environment states and team composition. Our previous solution for fully observable open ad hoc teamwork then uses these belief estimates to compute the learning agent's optimal policy under partial observability. Empirical results demonstrate that our approach can learn efficient policies in open ad hoc teamwork environments with partial observability.

Jul 21 2022

Georgios Papoudakis & Arrasy Rahman

G.03, IF and Zoom