It's not unusual for user interfaces to get "stuck" on one model, notes Bill Meisel, president of TMA Associates and publisher of Speech Strategy News. The layout of keyboards hasn't changed for decades, for example, despite some efforts to make it easier to use (by putting oft-used letters under the strongest fingers). The telephone's 12-button keypad is similarly persistent.
Persistence of the user interface is a major barrier to increased use of mobile devices beyond communication, Meisel contends. He believes the alternative that will come to dominate mobile phones is "voice search," a specific use of speech recognition technology. Meisel and the non-profit Applied Voice Input Output Society are highlighting this approach in the Voice Search Conference, being held in San Diego, California, March 10-12 (www.voicesearchconference.com).
Challenges for the Graphical User Interface
The dominant user model on mobile phones is the Graphical User Interface, the GUI familiar from PCs and Web browsers. The GUI has established almost the same "stickiness" as the keyboard and keypad, Meisel notes. People are familiar with GUIs, and a safe approach to innovation might appear to be to make the standard GUI incrementally easier to use by tweaking its elements (menus, icons, windows, and the pointing or scrolling mechanism). Some of the more innovative approaches have come from Apple, in part through changes to the pointing and scrolling mechanism that make it easier to use and more powerful on small devices, as in the iPod's scroll wheel and the iPhone's multi-touch screen.
The difficulty with tweaking the GUI model on small devices is that the adaptations only partially address the basic issues: small screen size, difficulty in entering text, and the distraction of a screen-based interface when mobile (driving being the obvious example). The GUI alone is not likely to encourage the mass of consumers to use the many services and features that are being promoted for mobile devices, or to make mobile marketing acceptable, Meisel argues. It does not solve the most fundamental problems:
- Frustrating navigation through multiple levels of choices; and
- An overabundance of information or a long list of choices that does not fit in a small window.
The Voice Search alternative
The basic philosophy of Voice Search is "Just say what you want," cutting through layers of navigation. The result can be delivered by voice, but in many cases it will be delivered as text or graphics on the device, taking full advantage of the screen when the information fits. In some cases, the result will be to activate a feature of the device, for example when the request is "take a picture."
A second element of the Voice Search paradigm is clarification through dialog. If the initial request is unclear or generates too many alternatives to list, a contextually sensitive request for narrowing information can be spoken to the user. Back-and-forth speech dialog is a natural and efficient way to clarify ambiguities.
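To make the clarification step concrete, here is a minimal sketch of the decision described above: answer directly when the matches fit on a small screen, otherwise ask a narrowing question. It is an illustration only, not a description of any product Meisel discusses. The `Listing` type, the `MAX_LISTED` limit, the sample data, and the choice to narrow by city are all assumptions made for the example, and typed text stands in for recognized speech.

```python
from dataclasses import dataclass

MAX_LISTED = 3  # assumed limit on how many choices fit on a small screen


@dataclass
class Listing:
    name: str
    category: str
    city: str


# Hypothetical sample data, invented for this sketch.
LISTINGS = [
    Listing("Luigi's", "pizza", "San Diego"),
    Listing("Mario's", "pizza", "San Diego"),
    Listing("Slice House", "pizza", "La Jolla"),
    Listing("Crust", "pizza", "La Jolla"),
    Listing("Taco Hut", "tacos", "San Diego"),
]


def search(listings, words):
    """Return listings whose text contains every request word (substring match)."""
    words = [w.lower() for w in words]
    hits = []
    for item in listings:
        text = f"{item.name} {item.category} {item.city}".lower()
        if all(w in text for w in words):
            hits.append(item)
    return hits


def respond(listings, request):
    """Answer directly when the results fit; otherwise ask a narrowing question."""
    hits = search(listings, request.split())
    if not hits:
        return ("clarify", "Sorry, I found nothing. Can you rephrase?")
    if len(hits) <= MAX_LISTED:
        return ("results", [h.name for h in hits])
    # Too many matches to list: ask for the detail that splits them.
    cities = sorted({h.city for h in hits})
    return ("clarify", "Which city: " + ", ".join(cities) + "?")
```

With this sample data, `respond(LISTINGS, "pizza")` finds four matches, which is more than the assumed limit, so it returns a question about the city. A follow-up such as `respond(LISTINGS, "pizza La Jolla")` narrows the matches to two names that can be shown on the screen.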
A personal assistant
The mental model for a well-executed Voice Search interface is that of a "personal assistant": just tell your assistant what you want. Like an assistant, the application should learn over time what a request means for a specific user, reducing the need for dialog. The personal-assistant model is a particularly powerful way to think about designing a voice user interface for a device that is always with you, Meisel observes.
What if you are in a situation where you can't speak? Voice Search can still serve as the primary model for the user. If you can't talk, type what you would say into a text box. In effect, you are texting your personal assistant what you would say to it.
Speech technology has reached the level of maturity needed to support the Voice Search user interface. The Voice Search Conference is the first conference to focus on this breakthrough and the resulting products and services.