ReviewEssays.com - Term Papers, Book Reports, Research Papers and College Essays

What Is Voice Recognition?

Essay by   •  October 15, 2010  •  Research Paper  •  3,833 Words (16 Pages)  •  2,176 Views


First off, what is voice recognition technology?

Voice recognition is a computer application that lets people control a computer by speaking to it. In other words, rather than using a keyboard to communicate with the computer, the user speaks commands into a microphone (usually on a headset) that is connected to the computer. By speaking into the microphone, users can do two things. First, they can tell their computers to execute commands such as opening a document, saving changes, deleting a paragraph, or even moving the cursor, all without touching a key. Second, users can write using voice recognition in conjunction with a standard word processing program. When users speak into the microphone, their words appear on the computer screen in a word processing format, ready for revision and editing.
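The two uses described above, commands and dictation, can be pictured as a simple routing step applied to each recognized phrase. The following is only a minimal sketch: the command table, the action names, and the function `handle_utterance` are hypothetical and do not come from any of the products discussed in this essay.

```python
# Hypothetical command table; phrases and action names are illustrative only.
COMMANDS = {
    "open document": "OPEN",
    "save changes": "SAVE",
    "delete paragraph": "DELETE_PARAGRAPH",
}


def handle_utterance(utterance, document):
    """Route recognized text to a command or append it as dictation.

    Returns the name of the action taken.
    """
    phrase = utterance.strip().lower()
    if phrase in COMMANDS:
        # The phrase matches a known command, so nothing is typed.
        return COMMANDS[phrase]
    # Anything else is treated as dictated text for the word processor.
    document.append(utterance.strip())
    return "DICTATE"


document = []
actions = [
    handle_utterance(u, document)
    for u in ["Open document", "Dear Sir", "Save changes"]
]
```

In this sketch only "Dear Sir" ends up in the document; the other two phrases are treated as commands.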

Voice recognition has gained tremendous popularity over the past few years. It has gone from imagination to rumor to reality, and this trend is not going to stop. Jeffrey C. Scott of Computer Shopper explains this in more detail in the passages quoted below.

"The ability to interact with computers by talking to them may help bridge the gap between human beings and machines. Once deemed a fantastical glimpse into a science-fiction-like future, voice-recognition technology has undergone several significant developments recently, and it now seems destined to move us toward a humanized computer interface. But that's for the future. At present, voice-recognition technology at its best lets us focus on our daily tasks and needs rather than on the computer's commands and syntax. Imagine inputting all your requests, numerous e-mail messages, routine correspondence and numerical data simply by telling them to the computer. Executives with limited typing skills, individuals who are physically challenged, and users who suffer from pain associated with the repetitive motion of typing can all benefit from voice-recognition applications. In fact, almost anyone can profit from this technology, though it may take some adjustment and a little patience to scale the learning curve.

Today's voice-recognition products are notably easier to use than those of the past. In addition, veterans of voice recognition should find that the new and improved dictation systems require substantially less training and customization."

"The Virtues of Training: Training a voice-recognition system consists of a defined dictation session where you are prompted to say words, phrases, and sentences. This exercise, which can take anywhere from 20 minutes to an hour and a half, allows the computer to become accustomed to your pronunciation. After this session, the program calculates the results of the speech samples. When the entire process is complete, the speech engine is thoroughly tuned to your particular voice. You continually train the speech engine by correcting its recognition mistakes.

This review focuses on four voice-recognition programs for the Windows environment: DragonDictate for Windows 2.01, IBM VoiceType 3.0 for Windows 95, Kurzweil Voice for Windows 2.0, and Listen for Windows 95. These packages all have two basic functions--command and control, and dictation. DragonDictate, VoiceType, and Voice for Windows all seek control of the Windows operating system using dictation modes based on large vocabularies. Listen for Windows 95 has a more strictly defined dictation capability, focusing on number dictation and certain vocabulary words.

The packages were tested on a variety of Pentium and 486-equipped machines having from 8MB to 32MB of RAM. As voice-recognition packages are memory- and resource-intensive, superior performance was achieved by the machines having more RAM and faster microprocessors. To compare the programs, we dictated a variety of information, including sample phrases, letters, paragraphs, price lists, forms, and numerical data. And to measure the recognition engines' robustness, the test passages included groups of similar words, such as "there," "their," and "they're," or "to," "too," and "two." Also, a command and control obstacle course was set up to test the packages' ability to manipulate the Windows interface.

An examination of speech-to-text systems over the past several years reveals that prices have decreased dramatically, from thousands to merely hundreds of dollars. At the same time, usability and adaptability have greatly increased. These products have moved from being hardware-dependent to hardware-independent programs, no longer requiring a special or proprietary sound card. Many are also speaker-independent as well, meaning that the programs recognize with minimal training the speech of any user.

Voice-recognition software consists of two broad categories: discrete- and continuous-speech packages.

Discrete-speech packages require the speaker to pause between words so that the engine can identify each word accurately. Most users find these short pauses to be relatively minor adjustments.

Continuous-speech packages, on the other hand, can recognize your speech in the way you naturally talk--flowing from word to word without constant interruptions or pauses. Though rare in mainstream software applications, continuous-speech capability can be found in many large-scale telephony operations. (See the sidebar "The Future of Voice Recognition.")

These four packages, excluding Listen for Windows, use discrete speech for text dictation and continuous speech for commands and numeric dictation. Listen for Windows is a continuous-speech product. Read on as we begin to dictate." (THE VOICES OF AUTOMATION: Once a technology that existed only in fantasy, voice recognition has made itself a reality in modern computing. By Jeffrey C. Scott, Computer Shopper, September 1996, http://www.zdnet.com/products/content/cshp/1609/cshp0048.html)
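Scott's distinction between discrete and continuous speech can be made concrete with a small sketch. If the speaker pauses between words, the pauses themselves mark the word boundaries. The code below is only an illustration of that idea, not a description of how any of the reviewed products work. The function name `split_on_pauses`, the loudness values, and the `threshold` and `min_pause` settings are all assumptions chosen for the example.

```python
def split_on_pauses(energies, threshold=0.1, min_pause=3):
    """Group audio frames into word segments separated by pauses.

    energies: per-frame loudness values (hypothetical, already computed).
    A pause is a run of at least min_pause frames below threshold.
    Returns a list of (start, end) frame indices, end exclusive.
    """
    segments = []
    start = None      # first frame of the current word, if any
    last_loud = None  # most recent frame at or above the threshold
    quiet = 0         # length of the current run of quiet frames
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            last_loud = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_pause:
                # The pause is long enough: close the current word.
                segments.append((start, last_loud + 1))
                start = None
                quiet = 0
    if start is not None:
        # The audio ended while a word was still open.
        segments.append((start, last_loud + 1))
    return segments


# Two words separated by a clear pause (discrete speech).
discrete = [0.0, 0.5, 0.6, 0.0, 0.0, 0.0, 0.7, 0.8, 0.05, 0.9, 0.0, 0.0]
discrete_words = split_on_pauses(discrete)

# The same loudness with no pauses (continuous speech) yields one segment.
continuous_words = split_on_pauses([0.5] * 5)
```

With pauses, the sketch finds two separate segments. Without them, the whole utterance comes back as a single segment, which suggests why continuous speech needs a more capable recognition engine than pause detection alone.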

"In grade school, you might have learned that reading your written creations aloud would help you identify errors. Now, the computer can help in this area.

A number of text-to-speech applications on the market let your computer convert onscreen text into reasonably human-sounding speech. Think of the uses: These systems can read aloud letters, online newspapers, e-mail, and sales forecasts.

In the past, speech-generating systems consisted of external voice synthesizers attached to computers. Soon after that, voice-synthesis devices were housed on internal soundboards. Though there are still many extensive hardware-based systems, a lot of speech-synthesis packages are hardware-independent. In the text-to-speech process, your computer uses a variety of language

...

...
