TY  - JOUR
A1  - Ward, Nigel G.
A1  - Vega, Alejandro
A1  - Baumann, Timo
T1  - Prosodic and temporal features for language modeling for dialog
JF  - Speech communication
N2  - If we can model the cognitive and communicative processes underlying speech, we should be able to better predict what a speaker will do. With this idea as inspiration, we examine a number of prosodic and timing features as potential sources of information on what words the speaker is likely to say next. In spontaneous dialog we find that word probabilities do vary with such features. Using perplexity as the metric, the most informative of these included recent speaking rate, volume, and pitch, and time until end of utterance. Using simple combinations of such features to augment trigram language models gave up to an 8.4% perplexity benefit on the Switchboard corpus, and up to a 1.0% relative reduction in word error rate (0.3% absolute) on the Verbmobil II corpus.
KW  - Dialog dynamics
KW  - Dialog state
KW  - Prosody
KW  - Interlocutor behavior
KW  - Word probabilities
KW  - Prediction
KW  - Perplexity
KW  - Speech recognition
KW  - Switchboard corpus
KW  - Verbmobil corpus
Y1  - 2012
U6  - https://doi.org/10.1016/j.specom.2011.07.009
SN  - 0167-6393
VL  - 54
IS  - 2
SP  - 161
EP  - 174
PB  - Elsevier
CY  - Amsterdam
ER  -