Friday, 26th April, 11am - Max Bartolo: Seminar

Title: Dynamic Adversarial Data Collection and LLMs

Abstract:

Dynamic Adversarial Data Collection (DADC) tasks human annotators with finding examples that continuously improving models struggle to predict correctly. Models trained on DADC-collected data have been shown to be more robust in adversarial and out-of-domain settings, and are considerably harder for humans to fool. However, DADC is more time-consuming than traditional data collection and thus more costly per example. In this talk, I'll discuss research around DADC and ways we can improve on it without incurring the additional cost, including synthetic data generation and Generative Annotation Assistants (GAAs): generator-in-the-loop models that provide real-time suggestions that annotators can approve, modify, or reject entirely. I'll also share thoughts and observations on how these ideas can be applied to developing state-of-the-art Large Language Models.

Apr 26 2024

This event is co-organised by ILCC and by the UKRI Centre for Doctoral Training in Natural Language Processing, https://nlp-cdt.ac.uk.

Online only (Teams)