Dialogue and Multimodal Interaction
A list of potential topics for PhD students in the area of Dialogue and Multimodal Interaction.
Robot Learning via Trial and Error and an Extended Conversation with an Expert
A field of robotics known as Learning from Demonstration teaches robots new skills through a mix of trial and error and physical enactment or manipulation by a human expert. Some preliminary work augments this evidence with linguistic utterances, but the messages those utterances convey are rudimentary (e.g., "no") or pertain only to the current situation (e.g., "go left"). This project will investigate how current semantic parsing and symbol grounding techniques can enhance the learning of optimal policies when the expert's utterances include quantification and abstraction (e.g., "when putting fruit in a bowl, always grasp it softly and lower it slowly").
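As a minimal, purely illustrative sketch (not part of the project specification), a quantified piece of advice such as the fruit example above might be compiled into a rule whose condition ranges over states rather than referring only to the current one, and used to constrain trial-and-error action selection. All names below are assumptions for illustration:

```python
import random

class AdviceRule:
    """An expert utterance compiled into a (condition, allowed-actions) pair."""
    def __init__(self, condition, allowed_actions):
        self.condition = condition              # predicate over states
        self.allowed_actions = allowed_actions  # actions permitted when it holds

def permitted_actions(state, actions, rules):
    """Intersect the action set with every rule whose condition holds."""
    permitted = set(actions)
    for rule in rules:
        if rule.condition(state):
            permitted &= rule.allowed_actions
    return permitted or set(actions)  # fall back if advice over-constrains

def choose_action(q, state, actions, rules, epsilon=0.1):
    """Epsilon-greedy selection restricted to advice-permitted actions."""
    candidates = sorted(permitted_actions(state, actions, rules))
    if random.random() < epsilon:
        return random.choice(candidates)
    return max(candidates, key=lambda a: q.get((state, a), 0.0))

# "When holding fruit, grasp softly" as a single quantified rule: the
# condition applies to every fruit-holding state, not just the current one.
fruit_rule = AdviceRule(
    condition=lambda s: s[0] == "holding" and s[1] in {"apple", "pear", "banana"},
    allowed_actions={"grasp_soft"},
)
```

The point of the sketch is that one utterance with quantification constrains the whole class of fruit-handling states, whereas situation-specific advice ("go left") would constrain only one.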
The content of multimodal interaction
Supervisor: Alex Lascarides
Goal: To design, implement and evaluate a semantic model of conversation that takes place in a dynamic environment.
It is widely attested in descriptive linguistics that non-linguistic events dramatically affect the interpretation of linguistic moves and, conversely, that linguistic moves affect how people perceive or conceptualise their environment. For instance, suppose I look upset and so you ask me "What's wrong?" I look over my shoulder towards a scribble on the living room wall and then utter "Charlotte's been sent to her room". An adequate interpretation of my response can be paraphrased as: Charlotte has drawn on the wall, and as a consequence she has been sent to her room. In other words, you need to conceptualise the scribble on the wall as the result of Charlotte's actions; moreover, this non-linguistic event, under this description, is a part of my response to your question. Traditional semantic models of dialogue don't allow for this type of interaction between linguistic and non-linguistic contexts. The aim of this project is to fix this by extending and refining an existing formal model of discourse structure so that it supports the semantic role of non-linguistic events in the messages that speakers convey. The project will draw on data from an existing corpus of people playing Settlers of Catan, which contains many examples of complex semantic relationships between the players' utterances and the non-linguistic moves in the board game. The project involves formally defining a model of discourse structure that supports the interpretation of these multimodal moves, and developing a discourse parser for it via machine learning on the Settlers corpus.
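One way to picture the target representation is a discourse graph whose units may be either utterances or non-linguistic events, connected by rhetorical relations. The sketch below encodes the Charlotte example; the relation labels follow common SDRT-style names (QAP, Explanation), but the data structures themselves are hypothetical, not the project's actual model:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Unit:
    """A discourse unit: an utterance or a (described) non-linguistic event."""
    ident: str
    content: str
    linguistic: bool

@dataclass
class DiscourseGraph:
    units: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)  # (label, source, target)

    def add_unit(self, u):
        self.units[u.ident] = u

    def relate(self, label, src, tgt):
        self.relations.append((label, src, tgt))

    def crossmodal(self):
        """Relations linking a linguistic unit to a non-linguistic one."""
        return [(l, s, t) for (l, s, t) in self.relations
                if self.units[s].linguistic != self.units[t].linguistic]

g = DiscourseGraph()
g.add_unit(Unit("q1", "What's wrong?", True))
g.add_unit(Unit("e1", "Charlotte drew on the wall (glance at scribble)", False))
g.add_unit(Unit("u1", "Charlotte's been sent to her room", True))
g.relate("QAP", "q1", "u1")          # question-answer pair
g.relate("Explanation", "u1", "e1")  # the drawing explains the punishment
```

The `crossmodal` query picks out exactly the relations that traditional dialogue models cannot represent: those whose arguments mix linguistic and non-linguistic units.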
Supervisor: Alex Lascarides
Goal: To learn and act upon the user's preferences, using evidence from an extended embodied dialogue with the user and the observable consequences of actions in the environment.
There are many tasks where robots or software agents must learn particular preferences of the user after pre-training and deployment (e.g., domestic robots). These preferences may lie outside the distribution of the training data. The most natural way to tackle such cases, at least for the user, is often an extended embodied conversation, where the user says things like "When stacking the dishwasher, always put the dinner plates in the left rack" and the robot may respond with "Even when the only space for the large saucepan is on the left?". The aim of this project is to design, implement and evaluate algorithms that: (a) extract preference information from extended dialogue; and (b) combine the qualitative, structured preference information extracted from dialogue with the quantitative reward functions learned from trial and error when acting in the environment.
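For part (b), one simple (and purely hypothetical) combination scheme is to treat each dialogue-derived preference as a soft constraint that shapes a reward function learned from interaction. The function names and penalty values below are illustrative assumptions, not a fixed design:

```python
def learned_reward(state, action):
    """Stand-in for a quantitative reward learned from trial and error."""
    return 1.0 if action == "place" else 0.0

def make_preference(condition, penalty):
    """Compile a dialogue-derived preference into a soft constraint:
    a penalty applied whenever the stated condition is violated."""
    def shape(state, action):
        return -penalty if condition(state, action) else 0.0
    return shape

# "When stacking the dishwasher, always put dinner plates in the left rack."
plate_pref = make_preference(
    condition=lambda s, a: s["item"] == "dinner_plate" and s["rack"] == "right",
    penalty=2.0,
)

def combined_reward(state, action, preferences):
    """Learned reward plus the shaping terms from all stated preferences."""
    return learned_reward(state, action) + sum(p(state, action)
                                               for p in preferences)
```

Because the preference enters as an additive shaping term rather than a hard rule, the agent can still trade it off against other rewards, which is one way to handle exchanges like the saucepan follow-up question above.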