AIAI Seminar - 13 February 2023 - Talk by Victor Prokhorov
Speaker: Victor Prokhorov
Title: Multimodal Interpretability from Partial Sight
Abstract:
We seek to build deep generative models (DGMs) that capture the joint distribution over co-observed visual and language data (e.g. abstract scenes, COCO, VQA), while faithfully capturing the conceptual mapping between the observations in an interpretable manner. This relies on two key observations: (a) perceptual domains (e.g. images) are inherently interpretable, and (b) a key characteristic of useful abstractions is that they are low(er) dimensional (than the data) and correspond to some conceptually meaningful component of the observation. We will seek to leverage recent work on conditional neural processes (Garnelo et al., 2018) to develop partial-image representations that mediate effectively, and in an interpretable manner, between vision and language data. Evaluation of this framework will cover both its ability to generate multimodal data, compared against state-of-the-art approaches, and the human-measured interpretability of the learnt representations. Our project image represents multimodal data (images, text) as a "partial specification" that allows effective encoding and reconstruction of the data.
G.03, Informatics Forum