A task-dynamic toolkit for modeling the effects of prosodic structure on articulation

Elliot Saltzman, Hosung Nam, Jelena Krivokapic, Louis Goldstein

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

98 Citations (Scopus)

Abstract

The original task-dynamic model of speech production incorporated the theoretical tenets of Articulatory Phonology and provided a dynamics of inter-articulator coordination for single and co-produced constriction gestures, given a gestural score that specifies a time-dependent vector of gestural activations for a given utterance. More recently, the model has been significantly extended to provide a framework for investigating the higher order dynamics of prosodic phrasing, syllable structure, lexical stress, and the prominence (accentual) properties associated with higher level prosodic constituents (e.g., foot, word, phrase, sentence). There are two new components in the model. The first is an ensemble of gestural planning oscillators that defines a dynamics of gestural score formation in that, once the ensemble reaches an entrained steady-state of relative phasing, the waveform of each oscillator is used to specify the activation function of that oscillator's associated constriction gesture and to trigger, thereby, the onset of the gesture. The second component is a set of modulation gestures (μ-gestures) that, rather than activating constriction formation and release gestures in the vocal tract, serve to modulate the temporal and spatial properties of all concurrently active constriction gestures. Modulation gestures are of two types: temporal modulation gestures (μT-gestures) that alter the rate of utterance timeflow by smoothly changing all frequency parameters of the planning oscillator ensemble; and spatial modulation gestures (μS-gestures) that spatially strengthen or reduce the motions of constriction gestures by smoothly changing the spatial target parameters of these constriction gestures.
Key to the representation of prosodic phrasing has been the use of clock-slowing temporal modulation gestures (called prosodic gestures [π-gestures] in previous work) that are locally active in the region of phrasal boundaries, and that slow the rate of utterance timeflow in direct proportion to the strength of the associated boundary. Central to the representation of syllable structure is the use of a coupling graph that defines the existence and strength of coupling in the network of gestural planning oscillators, and shapes the manner in which gestures are coordinated. Concepts from graph theory have been crucial to understanding how hypothesized differences among coupling graphs have correctly predicted empirically demonstrated intra-syllabic differences between onsets and codas in both the mean values and variabilities of C-C, C-V, and V-C timing patterns. In this paper, we describe a set of recent developments to our task-dynamic 'toolkit' (planning oscillator ensemble and temporal modulation gestures) and how they have been used to interpret and simulate experimental data on the interactions of stress and prominence in shaping the "prosodically driven phonetic detail" [14] of speech.
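
The two mechanisms named in the abstract, entrainment of planning oscillators to a steady-state relative phase and local clock slowing near a phrasal boundary, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes simple phase oscillators with sinusoidal coupling instead of the model's own oscillator dynamics, and every name and parameter value (`simulate_pair`, `target_phase`, `boundary_strength`, the Gaussian activation shape, the 4 Hz frequency) is hypothetical and chosen only for illustration.

```python
import math

def simulate_pair(target_phase, boundary_strength=0.0, coupling=2.0,
                  freq_hz=4.0, boundary_center=1.5, boundary_width=0.2,
                  duration=3.0, dt=0.001):
    """Euler-integrate two coupled phase oscillators (illustrative only).

    Returns (relative phase in [0, 2*pi), final phase of oscillator 1).
    All parameters are assumptions, not values from the paper.
    """
    omega = 2.0 * math.pi * freq_hz       # shared natural frequency (rad/s)
    theta1, theta2 = 0.0, 2.0             # arbitrary initial phases
    for step in range(int(round(duration / dt))):
        t = step * dt
        # Smooth, locally active "boundary" activation (assumed Gaussian).
        activation = math.exp(-0.5 * ((t - boundary_center) / boundary_width) ** 2)
        # Clock slowing: frequency is reduced in proportion to boundary strength.
        rate = omega * (1.0 - boundary_strength * activation)
        # Coupling pulls the relative phase theta2 - theta1 toward target_phase.
        error = math.sin(theta2 - theta1 - target_phase)
        theta1 += dt * (rate + coupling * error)
        theta2 += dt * (rate - coupling * error)
    relative_phase = (theta2 - theta1) % (2.0 * math.pi)
    return relative_phase, theta1
```

Calling `simulate_pair(math.pi / 2)` lets the pair settle near the requested relative phase, and passing a nonzero `boundary_strength` leaves that relative phase unchanged while reducing how far each oscillator advances over the same interval, which is the sense in which the sketch slows the "clock". A stronger boundary produces proportionally more slowing in this simplified setting.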

Original language: English
Title of host publication: Proceedings of the 4th International Conference on Speech Prosody, SP 2008
Publisher: International Speech Communications Association
Pages: 175-184
Number of pages: 10
ISBN (Print): 9780616220030
Publication status: Published - 2008
Externally published: Yes
Event: 4th International Conference on Speech Prosody 2008, SP 2008 - Campinas, Brazil
Duration: 2008 May 6 to 2008 May 9

Publication series

Name: Proceedings of the 4th International Conference on Speech Prosody, SP 2008

Conference

Conference: 4th International Conference on Speech Prosody 2008, SP 2008
Country/Territory: Brazil
City: Campinas
Period: 08/5/6 to 08/5/9

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction
  • Software
  • Mechanical Engineering
