ACM Interactions, Volume 3, Number 6, November/December 1996. Copyright 1996 Sun Microsystems, Inc.

Sidebars

How Do Users Know What to Say?

Nicole Yankelovich
Sun Microsystems Laboratories

Some Speech Technology Terms

Following are definitions of speech-related terms used in this column as well as some suggestions for learning more about speech technologies.

Speech Recognition
The capability of a computer to take spoken input from the user and convert it to text.

Speech Recognizer
A software system (sometimes with hardware components) that processes a digital audio signal from a file, microphone, or telephone. The processing produces the best fit of possible word sequences based on a frequency table or predefined grammar specification. The output is one or more strings of text, often with associated measures of how closely the string matches the specification.
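
As a rough illustration of that output, the sketch below models a recognizer result as a ranked list of candidate strings with scores. The names `Hypothesis` and `n_best`, the 0-to-1 score scale, and the example scores are assumptions made for this illustration; they are not taken from any particular recognizer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """One candidate transcription (hypothetical structure)."""
    text: str
    score: float  # assumed scale: 0.0 (poor match) to 1.0 (close match)


def n_best(hypotheses, n=3, threshold=0.0):
    """Return up to n hypotheses scoring at least threshold, best first."""
    kept = [h for h in hypotheses if h.score >= threshold]
    return sorted(kept, key=lambda h: h.score, reverse=True)[:n]


# Made-up scores, for illustration only.
candidates = [
    Hypothesis("send this to the bank", 0.61),
    Hypothesis("send this to the back", 0.82),
    Hypothesis("spend this at the back", 0.17),
]

results = n_best(candidates, n=2, threshold=0.5)
for h in results:
    print(f"{h.score:.2f}  {h.text}")
```

An application might act on the top string directly, or ask the user to confirm when the top two scores are close.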

Discrete Speech
Some speech recognizers only allow users to speak one word or short phrase at a time, such as "Calendar" or "Send-to-back."

Continuous Speech
Other speech recognizers allow users to speak connected streams of words, although usually only one sentence at a time. For example, "I would like my calendar," or "Send this to the back."

Vocabulary
The words or phrases a speech recognizer can hear. Both discrete and continuous speech recognizers have a word list, sometimes called a lexicon.

Grammar
A specification used by some continuous speech recognizers for how words in the vocabulary can be strung together.
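
The difference between a vocabulary and a grammar can be sketched with a toy example. The slot-based format below is an assumption chosen for brevity, not the grammar notation of any real recognizer: a sentence is accepted only if it supplies one allowed word per slot, in order.

```python
# Hypothetical toy grammar: one set of allowed words per position.
GRAMMAR = [
    {"send", "move"},
    {"this", "that"},
    {"to"},
    {"the"},
    {"back", "front"},
]

# The vocabulary is simply every word the grammar mentions.
VOCABULARY = set().union(*GRAMMAR)


def in_vocabulary(utterance):
    """True if every word is known, regardless of order."""
    return all(word in VOCABULARY for word in utterance.lower().split())


def accepts(utterance):
    """True if the words fill the grammar's slots in order."""
    words = utterance.lower().split()
    return len(words) == len(GRAMMAR) and all(
        word in slot for word, slot in zip(words, GRAMMAR)
    )


print(accepts("Send this to the back"))        # allowed word order
print(in_vocabulary("back the to this send"))  # all words are known...
print(accepts("back the to this send"))        # ...but the order is not allowed
```

Real grammars are usually richer (optional words, nested rules, alternative phrases), but the idea is the same: the grammar narrows the set of word sequences the recognizer has to consider.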

Dictation
The use of speech for entering free text, as in a memo or the body of an electronic mail message. Commercially available dictation systems currently all use discrete speech recognition, but with very large vocabularies. Dictation systems are distinct from command-and-control systems, which are designed to allow the user to issue commands and to control application behavior.

Speech Output
The computer can either play recorded audio messages or convert text to speech using a speech synthesizer. Recorded audio provides higher quality output, but a synthesizer is almost always used when the content of the output is not known ahead of time.
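
The trade-off can be sketched as a simple selection rule: use a recording when the message is fixed and known in advance, and fall back to synthesis for content generated at run time. The function `plan_output`, the prompt identifiers, and the file names are all hypothetical, and the sketch only decides which mechanism to use; it does not produce audio.

```python
# Hypothetical table of messages recorded ahead of time.
RECORDED_PROMPTS = {
    "welcome": "welcome.wav",
    "goodbye": "goodbye.wav",
}


def plan_output(prompt_id=None, text=None):
    """Choose an output mechanism for one message.

    Returns ("play", filename) for a known recording, or
    ("synthesize", text) when the content is only known at run time.
    """
    if prompt_id in RECORDED_PROMPTS:
        return ("play", RECORDED_PROMPTS[prompt_id])
    if text:
        return ("synthesize", text)
    raise ValueError("need a known prompt_id or some text to speak")


print(plan_output(prompt_id="welcome"))
print(plan_output(text="You have 3 new messages"))
```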

Prompt
A recorded or synthesized message produced by the system for the user.

Speech-only Interface
An application interface that has no input or output mechanism other than speech. Telephone-based applications are the most common speech-only interfaces.

Multimodal Application
In the context of this article, an application that uses any number of input and output modalities, including speech.

Books that Include some Speech Design Issues

Interactive Speech Technology: Human Factors Issues in the Application of Speech Input/Output to Computers. Baber, Christopher, and Noyes, Janet M. (eds.), Taylor & Francis Ltd., London, 1993.

Schmandt, Christopher. Voice Communications with Computers. Van Nostrand Reinhold, New York, 1994.

Web Sites with General Information about Speech Technology

comp.speech FAQ http://www.speech.cs.cmu.edu/comp.speech/
Commercial Speech http://www.speechxp.com/commercial/speech.htm/
Speech Toys http://www.speechtoys.com/spchtoys/ (no longer available)

Speech Demonstrations Over the Telephone

Note: When calling these numbers, listen to the prompts and think about whether and how they could be improved.

CheckFree Corporation
(800) 392-0743
Electronic bill payment.

Linkon Demonstration Hotline
(800) 793-3667
Voice fax on demand and text-to-speech demos using the Lernout & Hauspie recognizer.

Nortel StockTalk
(514) 765-7862
Speech system for stock quotations.

Voice Control Systems (VCS)
(214) 404-9405
Alphabet Recognition Demo

VCS Barge-In
(214) 404-0777
Demonstrates user's ability to interrupt speech output.

VCS Connected
(214) 490-0767
Connected Digit Recognition Demo

VCS Credit Card
(214) 490-1210
Credit Card Validation Demo

Voice Processing Corporation (VPC)
(617) 577-8422
Demos include Voice Dial, Auto Attendant, Credit Card, Continuous Digits

Wildfire Communications, Inc.
(800) 945-3347
Not much speech recognition in the demo, but you can listen to a simulated session between a Wildfire user and the system.