**CANCELLED** IPAB Seminar - 24/11/2016

Talk title: Systematic exploration of unsupervised methods for mapping behaviour.

Talk abstract: Quantifying behaviour precisely is essential for understanding how the brain generates it as an output. But ecologically relevant behaviours are often subtle and hard to quantify using simple methods such as kinematic thresholds. Multidimensional approaches can help, and have typically taken the form of supervised classification, in which a researcher manually annotates high-dimensional recordings of behaviour to generate a ground-truth training data set. Such approaches are tedious and can be distorted by unconscious bias. Newer unsupervised approaches, which aim to identify modes of variation and natural clusters in behavioural data without any prior assumptions, are promising but conceptually daunting, because they lack a ground truth against which their performance can be assessed. Moreover, there are many reasonable approaches to unsupervised clustering, all of which can produce plausible clusters. In an attempt to provide tools for the comparison of unsupervised methods, we developed metrics which characterize the extent to which clustering outputs conform to conservative, practically axiomatic, assumptions about the organization of behaviour. For example, it is reasonable to assume that behaviours exist on longer time scales than the neural signals (i.e. action potentials) that encode and induce them. Therefore, a clustering method that annotates a sequence of behaviour with many bouts lasting less than 10 ms is presumably worse than methods that label fewer such bouts. Using this approach of quantifying basic assumptions about behaviour in general, we performed a systematic comparison of alternative unsupervised approaches for clustering data from a custom instrument that tracks the positions of a fly's legs as it performs spontaneous walking behaviour on a floating ball. We found that methods which retain the data in as high-dimensional a representation as possible before clustering perform the best. Specifically, a pipeline consisting of moderate compression with PCA followed by Gaussian mixture modelling, and high-dimensional watershedding to combine overlapping Gaussian modes, was promising.
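
As an illustration of the kind of metric the abstract describes, the short-bout criterion could be sketched roughly as follows. This is a minimal sketch and not the speaker's implementation: the function names, the 2 ms frame interval, and the toy label sequences are all assumptions made for the example. It treats each run of identical consecutive cluster labels as one bout and reports the fraction of bouts shorter than 10 ms.

```python
from itertools import groupby


def bout_durations_ms(labels, frame_ms):
    """Return (label, duration in ms) for each run of identical consecutive labels."""
    return [(label, sum(1 for _ in run) * frame_ms) for label, run in groupby(labels)]


def short_bout_fraction(labels, frame_ms, min_ms=10.0):
    """Fraction of bouts shorter than min_ms (lower is presumably better)."""
    bouts = bout_durations_ms(labels, frame_ms)
    if not bouts:
        return 0.0
    n_short = sum(1 for _, duration in bouts if duration < min_ms)
    return n_short / len(bouts)


# Hypothetical per-frame cluster labels from two clustering methods,
# assuming one frame every 2 ms (an assumed sampling interval).
FRAME_MS = 2.0
smooth = [0] * 10 + [1] * 10             # two bouts of 20 ms each
choppy = [0, 1, 0, 1] + [0] * 8 + [1] * 8  # four 2 ms bouts, then two 16 ms bouts

print(short_bout_fraction(smooth, FRAME_MS))
print(short_bout_fraction(choppy, FRAME_MS))
```

Under these assumptions, the labelling with fewer sub-10 ms bouts scores lower, which matches the abstract's reasoning that such a method is presumably preferable.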

Nov 24 2016 -

Benjamin de Bivort (Harvard)

4.31/33