Conversational UI: Kai-Fu Lee's presentation slides (PPT)


Conversational Computers: Always 10 Years Away?
Kai-Fu Lee
Corporate Vice President
Microsoft Corporation

Why Conversational Interface?
  Speech: "invented" for interaction
    "Speech evolved from the need to communicate" (Michael Dertouzos)
  Benefits of "Conversational Interface"
    "To me, speech recognition will be a transforming capability when you can speak to your computer and it will understand what you're saying in context." (Gordon Moore)
    "Speech and natural language understanding are the key technologies that will have the most impact in the next 15 years." (Bill Gates)
  Future UI visions assume conversational UI
    Apple's "Knowledge Navigator"
    Microsoft's "information at your fingertips"
    Science fiction movies assume conversational UI

But "Always" 10 Years Away
  1950: Jerome Wiesner predicted that by 1960 machine translation may be possible
  1957: Herbert Simon predicted that by 1967 machines will match human performance in many areas
  1969: US Expert Panel predicted "voice I/O will be in common use by 1978"
  1993: I predicted that by 2003 every PC will ship with speech recognition
  1998: Gartner Group predicted PC UI will assume voice input by 2003

Decomposing the Prediction
  Speech recognition
  Text to speech
  Natural language understanding
  Why have we been a constant 10 years away?
  My 3-year [...]

[...] requires from-city, to-city, etc.
  Context (additional hints)
    Domain knowledge: No train from Hawaii to Chicago
    Statistics: "Book" as a noun, "Book" as a verb, "Book Chicago"
    Personal preferences: where you live, your calendar, how you pay
    Model of time, urgency, presence
  Dialog (resolving ambiguity [...]

[...] Time-normalization; Dynamic programming
  Isolated Words; Connected Digits; Continuous Speech
  Pattern recognition; LPC analysis; Clustering algorithms
  Continuous Speech; Speech Understanding
  Stochastic language understanding; Finite-state machines; Statistical learning
  Small Vocabulary, Acoustic Phonetics-based
  Medium Vocabulary, Template-based
  Large Vocabulary; Syntax, Semantics
  Connected Words; Continuous Speech
  Large Vocabulary, Statistical-based
  Hidden Markov models; Stochastic language modeling
  Spoken dialog; Multiple modalities
  Very Large Vocabulary; Semantics, Multimodal Dialog, TTS
  Concatenative synthesis; Machine learning; Mixed-initiative dialog
  Fueled by Moore's Law + Data + Research

Talk Outline
  Speech recognition
  Text to speech
  Natural language understanding
  Why have we been a constant 10 years away?
  My 3-year [...]

[...] Mainstream app
  2005
  Accessibility
  Mobile dictation
  2013
  Key part of Desktop UI; Planning Federation
  Question Answering
  Task-specific translation
  Home appliances
  Voice data
  Voicemail & Meeting Search
  Personal Annotations & Recording search
  Mining from audio data (e.g., call center)
  Voicemail & Meeting transcription

Conclusion
  Speech technologies will follow Moore's Law
    Faster CPU + more data + better algorithms
    Near-human quality possible in 7-10 years
  Natural language understanding is hard
    Domain-free reasoning & common sense hardest
    Truly human-level understanding likely elusive
  Smart, conversational systems will emerge
    2-3 years: telephony, multimodal, accessibility
    7-10 years: intelligent assistance, meeting search/transcription, speech everywhere

© 2001 Microsoft Corporation. All rights reserved.
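
Illustrative sketch (not from the slides): the "Book Chicago" example in the natural language understanding material suggests that a two-word request only becomes actionable once context fills the missing slots. The Python below is a minimal sketch under stated assumptions. The names `TravelRequest`, `UserProfile`, `interpret`, `missing_slots`, and the `NO_TRAIN_ROUTES` table are all hypothetical; a leading "book" is simply assumed to be a verb, standing in for the statistical noun/verb decision; and the two travel modes are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical domain knowledge, taken from the slide's example:
# there is no train from Hawaii to Chicago.
NO_TRAIN_ROUTES = {("Hawaii", "Chicago")}

# Assumed set of travel modes; the slides do not list any.
MODES = ("train", "flight")


@dataclass
class UserProfile:
    home_city: str  # personal preference: where the user lives


@dataclass
class TravelRequest:
    from_city: Optional[str] = None
    to_city: Optional[str] = None
    mode: Optional[str] = None


def interpret(utterance: str, profile: UserProfile) -> TravelRequest:
    """Fill travel slots from a short utterance plus context."""
    words = utterance.split()
    # Stand-in for statistics: a leading "book" followed by more words
    # is treated as the verb, and the rest as the destination.
    if len(words) < 2 or words[0].lower() != "book":
        return TravelRequest()
    to_city = " ".join(words[1:])
    # Personal preference supplies the unstated origin.
    from_city = profile.home_city
    # Domain knowledge prunes modes that are impossible for this route.
    possible = [
        m for m in MODES
        if not (m == "train" and (from_city, to_city) in NO_TRAIN_ROUTES)
    ]
    # Only commit to a mode when exactly one candidate remains.
    mode = possible[0] if len(possible) == 1 else None
    return TravelRequest(from_city, to_city, mode)


def missing_slots(request: TravelRequest) -> List[str]:
    """Slots still empty; a dialog turn would have to ask about these."""
    return [name for name, value in vars(request).items() if value is None]
```

With a profile whose home city is Hawaii, the route table rules out the train and every slot is filled. With any other home city, `missing_slots` reports `mode`, which is the kind of ambiguity the slides assign to dialog.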
