ANC Workshop - Diego Oyarzun
Tuesday, 6th December 2022
Opportunities and challenges for deep learning in biotechnology - Diego Oyarzun
Abstract:
A central goal in biotechnology is engineering cells that produce high-value chemicals. These compounds feed into many products we use in our everyday lives, including food, cosmetics, medicines, and materials. Since cells can feed on sustainable sources (e.g. food waste), this technology offers a promising path away from petrochemical-based production and towards a more circular economy.
From a computational standpoint, the problem is to regress protein production from short sequences of DNA, i.e. strings of 50-200 characters from a four-letter alphabet (A, C, G, T). Such regressors can then be wrapped into optimisation routines to find new DNA sequences that produce more of the target protein.
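As a rough illustration of this setup, the sketch below one-hot encodes DNA strings, fits a regressor, and wraps it in a simple search for a higher-scoring sequence. It is not the method from the paper: a closed-form ridge regression stands in for the deep learning models, the greedy single-base search is just one possible optimisation routine, and the training data are synthetic. All names (one_hot, fit_ridge, hill_climb) are hypothetical.

```python
import numpy as np

ALPHABET = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a flat vector of length 4 * len(seq)."""
    idx = np.array([ALPHABET.index(c) for c in seq])
    out = np.zeros((len(seq), 4))
    out[np.arange(len(seq)), idx] = 1.0
    return out.ravel()

def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X^T X + lam * I)^-1 X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def predict(w, seq):
    """Predicted protein output for one sequence under the linear model."""
    return float(one_hot(seq) @ w)

def hill_climb(w, seq, n_rounds=5):
    """Greedily accept single-base substitutions that raise the prediction."""
    best, best_score = seq, predict(w, seq)
    for _ in range(n_rounds):
        improved = False
        for pos in range(len(best)):
            for base in ALPHABET:
                if base == best[pos]:
                    continue
                cand = best[:pos] + base + best[pos + 1:]
                score = predict(w, cand)
                if score > best_score:
                    best, best_score, improved = cand, score, True
        if not improved:
            break
    return best, best_score

# Synthetic toy data (assumption): random sequences with a made-up
# additive sequence-to-output map plus noise. Not real measurements.
rng = np.random.default_rng(0)
L, N = 50, 200
seqs = ["".join(rng.choice(list(ALPHABET), size=L)) for _ in range(N)]
true_w = rng.normal(size=4 * L)
X = np.stack([one_hot(s) for s in seqs])
y = X @ true_w + 0.1 * rng.normal(size=N)

w = fit_ridge(X, y)
start = seqs[0]
optimised, score = hill_climb(w, start)
```

Here N = 200 training sequences is an arbitrary choice; shrinking N is a quick way to see how prediction quality degrades in the low-data regime the talk is concerned with.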
In a recent paper to appear in Nature Communications (preprint here), we showed that off-the-shelf deep learning architectures can readily provide predictive accuracy that is sufficient for most applications in biotechnology. The real challenge is to make these algorithms work for end users in biology, particularly given the large amounts of data needed for training. Biological data is expensive, and few laboratories have the incentives or budgets to invest six-figure sums solely for the purpose of model training. In this talk, I will discuss some of our technical results and seek feedback from our ANC colleagues on approaches for low-N regression (regression from few training examples) that could be useful in this class of problems.
The work is the result of a collaboration between biology PhD students, machine learning researchers (Oisin Mac Aodha from ANC), and molecular biologists in France. The ideas in the talk are also the subject of an upcoming article in the journal Current Opinion in Biotechnology.
Event type: Workshop
Date: Tuesday, 6th December 2022
Time: 11:00
Location: G.03
Speaker(s): Diego Oyarzun
Chair/Host: Nigel Goddard