A Mostly Data-driven Approach to Inverse Text Normalization
For an automatic speech recognition system to produce sensibly formatted, readable output, the spoken-form token sequence produced by the core speech recognizer must be converted to a written-form string. This process is known as inverse text normalization (ITN). Here we present a mostly data-driven ITN system that leverages a set of simple rules and a few hand-crafted grammars to cast ITN as a labeling problem. To this labeling problem, we apply a compact bi-directional LSTM. We show that the approach performs well using practical amounts of training data.
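To make the idea of casting ITN as a labeling problem concrete, here is a minimal sketch in Python. The label scheme is an assumption made for illustration and is not taken from the talk: the `Label` fields (`rewrite`, `attach_left`) and the `apply_labels` function are hypothetical names. Each spoken-form token gets exactly one label, and applying the labels deterministically yields the written-form string. In the system described above the labels would be predicted by the bi-directional LSTM; here they are written by hand.

```python
from typing import List, NamedTuple, Optional


class Label(NamedTuple):
    """Hypothetical per-token edit label (illustrative only)."""
    rewrite: Optional[str] = None  # written-form replacement; None keeps the token
    attach_left: bool = False      # True joins to the previous output without a space


def apply_labels(tokens: List[str], labels: List[Label]) -> str:
    """Turn spoken-form tokens into a written-form string, one label per token."""
    if len(tokens) != len(labels):
        raise ValueError("need exactly one label per token")
    out = ""
    for token, label in zip(tokens, labels):
        written = token if label.rewrite is None else label.rewrite
        # Insert a space unless this is the first token or the label says to attach.
        if out and not label.attach_left:
            out += " "
        out += written
    return out


# Hand-written labels standing in for the output of a trained tagger.
tokens = ["save", "ten", "percent", "today"]
labels = [Label(), Label(rewrite="10"), Label(rewrite="%", attach_left=True), Label()]
print(apply_labels(tokens, labels))  # save 10% today
```

Under this framing, the learned model only has to choose among a finite set of labels per token, while the string rewriting itself stays simple and deterministic.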
Bharat Ram Ambati is a Machine Learning Engineer on the Apple Siri Speech team. Over the past year at Apple, he has worked mainly on text normalization and other text processing tasks. Before that, he completed his PhD, "Transition based CCG Parsing for English and Hindi", at the University of Edinburgh under Mark Steedman.
14 July 2017 - Bharat Ram Ambati: Seminar
Informatics Forum 4.31/4.33