Friday, 15th March - 11am Francesco Tudisco : Seminar

 

Title:    Exploiting low-rank geometry in deep learning

 

Abstract:

As model and data sizes continue to expand, modern AI faces pressing questions about training time, cost, energy consumption, and accessibility. In response, there has been a surge of interest in network compression techniques that mitigate computational costs while preserving model performance. While many existing methods focus on post-training pruning to reduce inference costs, a subset addresses the challenge of reducing training overhead, with layer factorization emerging as one of the prominent approaches. Indeed, a growing body of empirical and theoretical evidence shows that deep networks exhibit a form of low-rank bias, hinting at the existence of highly performing low-rank subnetworks.

This talk will focus on our recent work on analyzing and leveraging implicit low-rank bias for efficient model compression in deep learning. Taking advantage of the Riemannian geometry of the low-rank format, we devise a geometry-aware variant of SGD that trains small, factorized network layers while simultaneously adjusting their rank. We provide theoretical guarantees of convergence and approximation capabilities, together with an experimental evaluation showing competitive performance across various moderate-size network architectures.
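To make the factorization idea concrete, the sketch below trains a layer's weight matrix in factored form W ≈ U V by plain gradient descent on the factors, then estimates the effective rank of the learned product by truncating small singular values. This is a minimal, hypothetical illustration of low-rank factorized training only — it is ordinary SGD on the factors, not the geometry-aware, rank-adaptive Riemannian method the talk presents; all dimensions, learning rates, and thresholds are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# A target weight matrix that is exactly low rank (rank 2 here),
# standing in for the "low-rank bias" of a trained layer.
d, r_true = 8, 2
W_target = rng.standard_normal((d, r_true)) @ rng.standard_normal((r_true, d))

# Factorized parameters W ~= U @ V with an over-estimated rank r > r_true.
r = 4
U = 0.1 * rng.standard_normal((d, r))
V = 0.1 * rng.standard_normal((r, d))

lr = 0.05
for step in range(3000):
    # Loss: 0.5 * ||U V - W_target||_F^2, minimized over the factors.
    E = U @ V - W_target
    gU, gV = E @ V.T, U.T @ E          # gradients w.r.t. U and V
    U, V = U - lr * gU, V - lr * gV

err = np.linalg.norm(U @ V - W_target) / np.linalg.norm(W_target)

# Rank adjustment (illustrative): discard singular values of the learned
# product that fall below a relative threshold.
s = np.linalg.svd(U @ V, compute_uv=False)
eff_rank = int((s > 1e-3 * s[0]).sum())
print(f"relative error: {err:.2e}, effective rank: {eff_rank}")
```

Even though four rank components are allocated, the truncation step recovers the smaller true rank — the kind of automatic rank reduction that, in the actual method, is handled on the low-rank manifold during training rather than as a post-hoc SVD.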

 

Bio:

Francesco obtained his PhD in Mathematics from the University of Rome. He then moved to Saarbruecken (Germany) as a postdoc and, soon afterwards, to the University of Strathclyde as a Marie Curie Individual Fellow. Before joining the School of Mathematics (University of Edinburgh) as a Reader in Machine Learning, he was Associate Professor of Numerical Analysis at the Gran Sasso Science Institute graduate school (Italy).

Francesco’s research interests lie at the intersection of machine learning and scientific computing. His recent work includes the use of model-order reduction techniques from matrix and tensor differential equations to design training algorithms for deep learning, the analysis of neural networks in the infinite-width and infinite-depth limits, nonlinear spectral theory with applications to machine learning on graphs, and the design and analysis of physics-inspired deep learning models for scientific simulations.

 

 

Mar 15 2024

This event is co-organised by ILCC and by the UKRI Centre for Doctoral Training in Natural Language Processing, https://nlp-cdt.ac.uk.

IF G.03