For most computer users, keyboard and mouse are still the primary means of interaction with a personal computer. Long-standing forecasts that they would be eclipsed by alternatives – such as handwriting and the human voice – are still unfulfilled. But for some applications the keyboard may not be the best option.
Handwriting recognition has come a long way since Apple’s Newton personal digital assistant launched in 1993. The Newton’s attempt to introduce a stylus-based alternative interface was let down by the poor performance of its handwriting recognition software.
I am a fan of today’s convertible-style tablet PCs. In my experience, their built-in handwriting recognition features work well, even with abysmal handwriting.
Controlling a PC and entering text by voice ought to be even more natural, but until recently it has not been. Speech recognition packages have needed heavy-duty PC hardware, users have had to speak in staccato fashion and the systems have worked properly only if the user spent ages “training” the software to recognise his or her voice.
That has changed. The latest software copes well with continuous speech and is largely speaker-independent. And it delivers high accuracy without special training.
Today the market is dominated by Nuance, the US software publisher (www.nuance.com), and its Dragon Naturally Speaking package. The most popular version of Dragon Naturally Speaking 9, aptly named Preferred, costs $199 (£150) and can handle dictation at up to 160 words per minute, about three times faster than most people can type.
Users can expect 95 per cent accuracy after about 15 minutes’ training. Accuracy improves if you let the software scan your e-mail and document files, because the software “learns” as it goes. Nuance claims that, in time, users can achieve almost 99 per cent accuracy.
The latest version is easy to install and integrates with a wide range of Windows applications, including Microsoft’s Office suite, Corel WordPerfect, and both Internet Explorer and Firefox web browsers. It is also easy to correct mistakes using this version.
But Dragon Naturally Speaking still requires powerful hardware to run smoothly. And getting the best out of the software still requires patience and skill in dictation.
During the past week, I have also tested Dragon with a Sony flash-memory-based digital voice recorder, the ICD-P320. The speech-recognition software worked when transcribing my voice, but it did not perform as well when transcribing an interview with several different voices.
If you already own Dragon and have spent time training the software, you may find it worth experimenting with a digital voice recorder for dictation.
But if the information you want to capture is already in text form, the most efficient way to process it is to use a scanner and OCR (optical character recognition) software such as OmniPage. This week Nuance launched the latest version of OmniPage, its OCR package.
Time savings may be dramatic. Inputting a 20-page, 6,000-word document takes the average person about two and a half hours. OmniPage can re-create the same document on a PC in less than two minutes.
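To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above; the variable and function names are illustrative, and the two-minute OCR time is taken as an upper bound, so the real speed-up could be higher.

```python
def words_per_minute(words: int, minutes: float) -> float:
    """Average input rate in words per minute."""
    return words / minutes


DOC_WORDS = 6_000        # 20-page document quoted in the text
TYPING_MINUTES = 150     # "about two and a half hours"
OCR_MINUTES = 2          # upper bound: "less than two minutes"

typing_rate = words_per_minute(DOC_WORDS, TYPING_MINUTES)
ocr_rate = words_per_minute(DOC_WORDS, OCR_MINUTES)
speedup = TYPING_MINUTES / OCR_MINUTES

print(f"Typing: {typing_rate:.0f} words per minute")
print(f"OCR:    at least {ocr_rate:.0f} words per minute")
print(f"OCR is at least {speedup:.0f} times faster")
```

On these assumptions the typist averages 40 words per minute, while the scanner and OCR software together manage at least 3,000, a speed-up of at least 75 times, before allowing for any time spent proofreading the result.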
Like Dragon Naturally Speaking, OmniPage comes in several versions. The basic package, OmniPage 16, costs $150 (£80) while Professional 16 costs $500 (£293) but includes additional features for the business professional.
Both versions share the same updated core and are built around the most powerful, flexible and competent OCR engine I have tested. They also represent an upgrade from the previous OmniPage package.
The latest packages are faster and more accurate. Nuance claims that OmniPage 16 runs up to 46 per cent faster and is 27 per cent more accurate than the previous version. It is also the first OCR package specifically designed for PCs built around the multi-core processors found in most of the latest models.
Other new features that are common to both versions include improved formatting and table handling and – my favourite – the ability to capture text using a digital camera.
That means you can capture a passage from a book, magazine or presentation, and even from larger objects such as signs, which could never fit through a scanner. OmniPage can process the image and turn the text into a document ready for editing. In doing so it adjusts for skew, curved paper and three-dimensional perspective.
OmniPage 16 has also been designed with both archiving and file sharing in mind. The software supports the creation of searchable PDF (portable document format) files from any paper or electronic document. It also allows users to turn PDFs (even image-only files) into editable documents.
OmniPage Professional also comes with PaperPort, Nuance’s popular desktop document management software, and with PDF Create!, an industry-standard PDF creation tool. Other features include tools to streamline document processing workflows, capture data from forms, turn text documents into audio books and add digital signatures to files.
I found the software easy to install. The feature set can be a little daunting, but it comes with excellent “how-to” guides to help users get to grips with basic terminology and product features. Individual users should stick with the basic package. Corporate users should consider the more expensive Professional.
■ SOFTWARE THAT SPEAKS YOUR LANGUAGE
Q: Should I invest in voice recognition software?
Yes, provided you have a powerful PC and are willing to “train” the software and improve your dictation skills.
Q: What are the target markets?
Specialist verticals, such as healthcare and the legal profession, which generate lots of similar files packed with technical terminology. And anyone who has problems using a keyboard and mouse. But packages such as Dragon Naturally Speaking are designed for the mass market and are relatively accurate, fast and easy to use.
Q: Why should I buy optical character recognition software?
If you own a scanner, OCR makes quick work of capturing text or images and delivers productivity gains.