Technical Programme
This is the final programme for this session. For oral sessions, the timing on the left is the current presentation order, but this may still change, so please check at the conference itself.
Wed-Ses1-S1:
Special Session: Lessons and Challenges Deploying Voice Search
| Time: | Wednesday 10:00 |
| Place: | East Wing 4 |
| Type: | Special |
| Chair: | Mike Cohen & Mike Phillips |
| 10:00 | Role of Natural Language Understanding in Voice Local Search
Junlan Feng (AT&T Labs Research), Srinivas Bangalore (AT&T Labs Research), Mazin Gilbert (AT&T Labs Research)
Speak4it is a voice-enabled local search system currently available for iPhone devices. The natural language understanding (NLU) component is one of the key technology modules in this system. The role of NLU in voice-enabled local search is twofold: (a) parse the automatic speech recognition (ASR) output (1-best and word lattices) into meaningful segments that contribute to high-precision local search, and (b) understand the user’s intent. This paper is concerned with the first task of NLU. In previous work, we presented a scalable approach to parsing, which is built upon a text indexing and search framework and can also parse ASR lattices. In this paper, we propose an algorithm to improve the baseline by extracting the “subjects” of the query. Experimental results indicate that lattice-based query parsing outperforms ASR 1-best based parsing by 2.1% absolute, and that extracting subjects in the query improves the robustness of search.
|
| 10:20 | Recognition and Correction of Voice Web Search Queries
Keith Vertanen (University of Cambridge), Per Ola Kristensson (University of Cambridge)
In this work we investigate how to recognize and correct voice web search queries. We describe our corpus of web search queries and show how it was used to improve the accuracy of recognition. We show that using a search-specific vocabulary with automatically generated pronunciations is superior to using a vocabulary limited to a fixed pronunciation dictionary. We conducted a formative user study to investigate recognition and correction aspects of voice search in a mobile context. In the user study, we found that despite a word error rate of 48%, users were able to speak and correct search queries in about 18 seconds. Users did this while walking around using a mobile touch-screen device.
|
| 10:40 | Voice Search and Everything Else – What Users Are Saying to the Vlingo Top Level Voice UI
Chao Wang (Vlingo)
No abstract available.
|
| 11:00 | Searching Google by Voice
Johan Schalkwyk (Google)
No abstract available.
|
| 11:20 | Multiple-hypotheses searches from deeply parsed requests to multiple-evidences scoring: the DeepQA challenge
Roberto Sicconi (IBM)
No abstract available.
|
| 11:40 | Research Areas in Voice Search: Lessons from Microsoft Deployments
Geoffrey Zweig (Microsoft)
No abstract available.
|