26th April 2022 - 4pm - Yoon Kim: Seminar

Title: Efficient Transfer Learning with Large Language Models

Abstract:

Transfer learning with large pretrained language models is the dominant paradigm in natural language processing. With moderately-sized models (e.g., BERT), transfer learning involves full finetuning to obtain a task-specific model with its own parameters for each task, which makes the approach hard to scale to storage-constrained scenarios. With larger models (e.g., GPT-3), the model is adapted to each task via natural language prompts and thus the pretrained parameters remain fixed. However, few-shot learning capabilities via prompting emerge only when model sizes are large enough, and thus inference remains expensive. This talk explores two approaches for improving the memory- and inference-efficiency of large language models within the transfer learning paradigm. For finetuned models, we show that only a small subset of the model parameters (0.5%) need to be updated to match the performance of fully-finetuned models. For prompted models, we show that co-training (wherein two models are trained on confidently-labeled outputs from each other) can produce much smaller models that outperform the original prompted model.
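The first approach described above amounts to freezing nearly all pretrained weights and updating only a small, named subset. The following is a minimal PyTorch sketch of that general idea; the choice of bias terms (plus the task head) as the trainable subset is an illustrative assumption, not necessarily the specific method presented in the talk.

```python
# Parameter-efficient finetuning sketch: freeze the pretrained backbone and
# update only a small named subset of parameters.
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

trainable, total = 0, 0
for name, param in model.named_parameters():
    total += param.numel()
    # Illustrative choice: keep only bias terms and the task head trainable.
    if name.endswith(".bias") or name.startswith("classifier"):
        param.requires_grad = True
        trainable += param.numel()
    else:
        param.requires_grad = False

print(f"Updating {trainable / total:.2%} of parameters")

# Only the small trainable subset needs per-task storage and optimizer state;
# the frozen backbone can be shared across all tasks.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```

In this setup, each task only requires storing the small trainable subset rather than a full copy of the model, which is what makes the approach attractive in storage-constrained settings.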

Bio:

Yoon Kim is an assistant professor at MIT in the Department of Electrical Engineering and Computer Science. He obtained his PhD from Harvard University, where he was advised by Alexander Rush.


This event is co-organised by ILCC and the UKRI Centre for Doctoral Training in Natural Language Processing (https://nlp-cdt.ac.uk).
