26 January 2018 - Matt Post: Seminar
Grammatical error correction across the spectrum of error types
Grammatical Error Correction (GEC) is the task of detecting and correcting mistakes in natural language, most commonly motivated as a means of helping non-native speakers improve their written English.
Yet the task is hard to precisely define and encompasses a wide spectrum of error types.
Common test corpora annotate corrections that range from small, focused changes with little effect on a sentence's interpretation to large phrasal rewrites that defy easy categorization.
I will discuss perspectives on the role of text correction, ranging from pedantry and shibboleth to perspicuity and immersion.
This motivates an examination of common GEC metrics and our argument for an evaluation that emphasizes textual fluency over the correction of minor errors.
I conclude with work we've done at both ends of the spectrum: correcting small, common mistakes (via modifications to a dependency parser) and improving native-language fluency (via a neural translation model trained with reinforcement learning).
This is joint work with Keisuke Sakaguchi (JHU), Courtney Napoles (JHU), and Joel Tetreault (Grammarly).
Matt Post is a visiting research scientist at Amazon Research (Berlin), where he is spending a year on leave from his position as a research scientist at the Human Language Technology Center of Excellence at Johns Hopkins University (JHU).
His interests are mostly in machine translation and other text-to-text rewriting tasks.
He has helped run the manual evaluation for the annual Conference on Machine Translation (WMT), served on the NAACL board (2015--2016), and is a co-chair of the machine translation track at ACL 2018.
He obtained his Ph.D. in 2011 from the University of Rochester.