14 July 2017 - Bharat Ram Ambati: Seminar


A Mostly Data-driven Approach to Inverse Text Normalization


For an automatic speech recognition system to produce sensibly formatted, readable output, the spoken-form token sequence produced by the core speech recognizer must be converted to a written-form string. This process is known as inverse text normalization (ITN). Here we present a mostly data-driven ITN system that leverages a set of simple rules and a few hand-crafted grammars to cast ITN as a labeling problem. To this labeling problem, we apply a compact bi-directional LSTM. We show that the approach performs well using practical amounts of training data.


Bharat Ram Ambati is currently a Machine Learning Engineer on the Siri Speech team at Apple. Over the past year at Apple, he has mainly worked on text normalization and other text processing tasks. Before that, he completed his PhD, "Transition based CCG Parsing for English and Hindi", at the University of Edinburgh under Mark Steedman.


ILCC seminar by Bharat Ram Ambati in Informatics Forum (IF) 4.31/4.33