1 February 2019 - Vlad Niculae: Seminar
Learning with Sparse Latent Structure
Structured representations are a powerful tool in machine learning, and in particular in natural language processing: the discrete, compositional nature of words and sentences leads to natural combinatorial representations such as trees, sequences, segments, or alignments. At the same time, deep, hierarchical neural networks with latent representations are applied to language tasks with increasing breadth and success. Deep networks conventionally perform smooth, soft computations, resulting in dense hidden representations. We study deep models with structured and sparse latent representations, without sacrificing differentiability. This allows for fully deterministic models which can be trained with familiar end-to-end gradient-based methods. We demonstrate sparse and structured attention mechanisms, as well as latent computation graph structure learning, with successful empirical results on large-scale problems including sentiment analysis, natural language inference, and neural machine translation. Joint work with Claire Cardie, Mathieu Blondel, and André Martins.
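To make the dense-versus-sparse contrast in the abstract concrete, here is a minimal sketch that is not taken from the talk. It compares the usual softmax with sparsemax, one well-known sparse alternative that projects the scores onto the probability simplex. The function names and the example scores are illustrative assumptions, and the talk's own models may differ.

```python
import math


def softmax(scores):
    """Dense attention weights: every entry is strictly positive."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(scores):
    """Sparse attention weights: Euclidean projection onto the simplex.

    Low-scoring entries receive exactly zero weight, while the mapping
    stays differentiable almost everywhere.
    """
    z_sorted = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, z_k in enumerate(z_sorted, start=1):
        cumsum += z_k
        # The condition holds for the first k entries in the support,
        # so the last k that satisfies it fixes the threshold tau.
        if 1 + k * z_k > cumsum:
            tau = (cumsum - 1) / k
    return [max(s - tau, 0.0) for s in scores]


scores = [1.5, 1.0, -1.0]  # hypothetical attention scores
dense = softmax(scores)
sparse = sparsemax(scores)
```

Both outputs sum to one, but only the sparsemax weights contain an exact zero, which is the sense in which a sparse attention mechanism selects a subset of its inputs.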
Vlad is a postdoc in the DeepSpin project at the Instituto de Telecomunicações in Lisbon, Portugal, researching structure and sparsity for machine learning and natural language processing. He earned a PhD in Computer Science from Cornell University in 2018, advised by Claire Cardie. Vlad also maintains the polylearn library for factorization machines and polynomial networks in Python, and is a long-time core developer of scikit-learn.
More information at https://vene.ro