|
10th Annual Conference of the International Speech Communication Association
Interspeech 2009 Brighton
|
Technical Programme
This is the final programme for this session. For oral sessions, the timing on the left is the current presentation order, but this may still change, so please check at the conference itself.
Mon-Ses3-P1: Human Speech Production I
| Time: | Monday 16:00 |
| Place: | Hewison Hall |
| Type: | Poster |
| Chair: | Shrikanth Narayanan |
| #1 | Probabilistic effects on French [t] duration
Francisco Torreira (Radboud Universiteit Nijmegen & Max Planck Institute for Psycholinguistics) Mirjam Ernestus (Radboud Universiteit Nijmegen & Max Planck Institute for Psycholinguistics)
The present study shows that [t] consonants are affected by probabilistic factors in a syllable-timed language such as French, and in spontaneous as well as in journalistic speech. Study 1 showed a word bigram frequency effect in spontaneous French, but its exact nature depended on the corpus on which the probabilistic measures were based. Study 2 investigated journalistic speech and showed an effect of the joint frequency of the test word and its following word. We discuss the possibility that these probabilistic effects are due to the speaker's planning of upcoming words, and to the speaker's adaptation to the listener's needs.
|
| #2 | On the production of sandhi phenomena in French: psycholinguistic and acoustic data
Odile Bagou (Groupe NeuroPsychoLinguistique, FLSH, University of Neuchâtel, Switzerland) Violaine Michel (Groupe NeuroPsychoLinguistique, FLSH, University of Neuchâtel, Switzerland) Marina Laganaro (Groupe NeuroPsychoLinguistique, FLSH, University of Neuchâtel, Switzerland)
This study addresses two complementary questions about the production of sandhi phenomena in French. First, we investigated whether the encoding of sandhi phenomena involves a processing cost compared to non-resyllabified sequences. The elicited sequences were then used to address our second question, namely how critical V1CV2 sequences are phonetically realized across different boundary conditions. Results on production latencies suggested that the encoding of liaison enchaînée involves an additional processing cost compared to enchaînement and non-resyllabified sequences. Moreover, acoustic analyses indicated durational differences across the three boundary conditions. Implications for both psycholinguistic and phonological models are discussed.
|
| #3 | Extreme reductions: Contraction of disyllables into monosyllables in Taiwan Mandarin
Chierh Cheng (Department of Speech, Hearing and Phonetic Sciences, University College London, UK) Yi Xu (Department of Speech, Hearing and Phonetic Sciences, University College London, UK)
This study investigates a severe form of segmental reduction known as contraction. In Taiwan Mandarin, a disyllabic word or phrase is often contracted into a monosyllabic unit in conversational speech, just as “do not” is often contracted into “don’t” in English. A systematic experiment was conducted to explore the underlying mechanism of such contraction. Preliminary results show evidence that contraction is not a categorical shift but a gradient undershoot of the articulatory target as a result of time pressure. Moreover, contraction seems to occur only beyond a certain duration threshold. These findings may further our understanding of the relation between duration and segmental reduction.
|
| #4 | Annotation and Features of Non-native Mandarin Tone Quality
Mitchell Peabody (MIT) Stephanie Seneff (MIT)
Native speakers of non-tonal languages, such as American English, frequently have difficulty accurately producing the tones of Mandarin Chinese. This paper describes a corpus of Mandarin Chinese spoken by non-native speakers and annotated for tone quality using a simple Good-Bad system. We examine inter-rater correlation of the annotations and highlight the differences in feature distribution between native, good non-native, and bad non-native tone productions. We find that the features of tones judged by a simple majority to be bad are significantly different from the features of tones judged to be good and of tones produced by native speakers.
|
| #5 | On-line Formant Shifting as a Function of F0
Kateřina Chládková (Amsterdam Center for Language and Communication, University of Amsterdam, The Netherlands) Paul Boersma (Amsterdam Center for Language and Communication, University of Amsterdam, The Netherlands) Václav Jonáš Podlipský (Department of English and American Studies, Palacký University Olomouc, Czech Republic)
We investigate whether there is a within-speaker effect of a higher F0 on the values of the first and the second formant. When asked to speak at a high F0, speakers turn out to raise their formants as well. In the F1 dimension this effect is greater for women than for men. We conclude that while a general formant raising effect might be due to the physiology of a high F0 (i.e. raised larynx and shorter vocal tract), a plausible explanation for the gender-dependent size of the effect on F1 values can only be found in the undersampling hypothesis.
|
| #6 | Production Boundary between Fricative and Affricate in Japanese and Korean Speakers
Kimiko Yamakawa (National Institute of Informatics) Shigeaki Amano (NTT Communications Science Laboratories) Shuichi Itahashi (National Institute of Informatics)
A fricative [s] and an affricate [ts] pronounced by both native Japanese and Korean speakers were analyzed to clarify the effect of the mother language on speech production. It was revealed that Japanese speakers have a clear individual production boundary between [s] and [ts], and that this boundary corresponds to the production boundary of all Japanese speakers. In contrast, although Korean speakers tend to have a clear individual production boundary, the boundary does not correspond to that of Japanese speakers. These facts suggest that Korean speakers tend to have a stable [s]-[ts] production boundary, but one that differs from that of Japanese speakers.
|
| #7 | Aerodynamics of Fricative Production in European Portuguese
Cátia M. R. Pinho (IEETA, Universidade de Aveiro, Portugal) Luis M. T. Jesus (IEETA and ESSUA, Universidade de Aveiro, Portugal) Anna Barney (ISVR, University of Southampton, UK)
The characteristics of steady-state fricative production, and those of the phone preceding and following the fricative, were investigated. Aerodynamic and electroglottographic (EGG) recordings of four normal adult speakers (two females and two males), producing a speech corpus of 9 isolated words with the European Portuguese (EP) voiced fricatives /v, z, Z/ in initial, medial and final word position, and the same 9 words embedded in 42 different real EP carrier sentences, were analysed. Multimodal data allowed the characterisation of fricatives in terms of their voicing mechanisms, based on the amplitude of oral flow, F1 excitation and fundamental frequency (F0).
|
| #8 | Contextual effects on protrusion and lip opening for /i,y/
Anne Bonneau (LORIA/CNRS) Julie Busset (LORIA/ UMR 7503) Brigitte Wrobel-Dautcourt (LORIA/UMR7503)
This study investigates the effect of “adverse” contexts, especially that of the consonant /S/, on labial parameters for French /i,y/. Five parameters were analysed: the height, width and area of lip opening, the distance between the corners of the mouth, as well as lip protrusion. Ten speakers uttered a corpus made up of isolated vowels, syllables and logatoms. A special procedure has been designed to evaluate lip opening contours. Results showed that the carry-over effect of the consonant /S/ can impede the opposition between /i/ and /y/ in the protrusion dimension, depending upon speakers.
|
| #9 | Speech Rate Effects on European Portuguese Nasal Vowels
Catarina Oliveira (University of Aveiro) Paula Martins (Health School, University of Aveiro) António Teixeira (DETI/IEETA, University of Aveiro)
This paper presents new temporal information regarding the production of European Portuguese (EP) nasal vowels, based on new EMMA data.
The influence of speech rate on the duration of velum gestures and their coordination with consonantal and glottal gestures were analyzed. As information on the relative speed of articulators is scarce, the stiffness parameter for the nasal gestures was also calculated and analyzed.
Results show clear effects of speech rate on the temporal characteristics of EP nasal vowels. A faster speech rate reduces the duration of velum gestures and increases both stiffness and inter-gestural overlap.
|
| #10 | Relation of formants and subglottal resonances in Hungarian vowels
Tamás Gábor Csapó (Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary) Zsuzsanna Bárkányi (Research Institute for Linguistics, Hungarian Academy of Sciences, Budapest, Hungary) Tekla Etelka Gráczi (Research Institute for Linguistics, Hungarian Academy of Sciences, Budapest, Hungary) Tamás Bőhm (Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary; Institute for Psychology, Hungarian Academy of Sciences, Budapest, Hungary) Steven M. Lulich (Speech Communication Group, MIT, Cambridge, MA 02139)
The relation between vowel formants and subglottal resonances (SGRs) has previously been explored in English, German, and Korean. Results from these studies indicate that vowel classes are categorically separated by SGRs. We extended this work to Hungarian vowels, which have not been related to SGRs before. The Hungarian vowel system contains paired long and short vowels as well as a series of front rounded vowels, similar to German but more complex than English and Korean. Results indicate that SGRs separate vowel classes in Hungarian as in English, German, and Korean, and uncover additional patterns of vowel formants relative to the third subglottal resonance (Sg3). These results have implications for understanding phonological distinctive features, and applications in automatic speech technologies.
|