Andrew C. Oliver
Contributing Writer

A keyboard? How quaint

Hello, computer: Hands-free computing a la 'Star Trek' is coming soon

Image: Blurry hands typing on a computer keyboard (credit: Thinkstock)

The era of voice search and voice-operated software is upon us. As a developer I live and die by the keyboard, but I can already see the signs. Like many people, I talk to my Android phone to get directions (for example, "Navigate to Lowe's [or Starbucks or Harris Teeter]").

In her 2016 Internet Trends Report, Mary Meeker reports that Google voice search queries have risen by a factor of seven since 2010. I've also noticed that my 12-year-old son does nearly all of his searches via voice, and my girlfriend texts me this way on a regular basis. Also, the company I work for, Lucidworks, recently announced a new partnership with IBM to integrate Watson and text-to-speech capabilities into our enterprise search product.

The technology works a lot better than it used to, and it's easier to integrate into applications. If you develop for Android or iOS, you can easily hook into the APIs for speech recognition. But speech recognition doesn't begin and end with simple speech-to-text and voice commands.

Understanding the intent of the search is a very contextual task, especially with spoken language. Moreover, people tend to use more words in natural spoken language than when they are confronted with a search bar. There are more "noise words" in spoken language than in a normal textual search.
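To make the "noise words" point concrete, here is a minimal Python sketch of the crudest possible cleanup: tokenize the transcribed query and drop filler words. The word list and the function name `strip_noise_words` are my own assumptions for illustration, not part of any speech API, and a real system would need far more than this to recover intent.

```python
import re
from typing import List

# Hypothetical noise-word list (an assumption for this sketch);
# a real system would tune it against its own query logs.
NOISE_WORDS = {
    "a", "an", "the", "to", "on", "of", "for", "me", "my", "i", "you",
    "hey", "um", "uh", "please", "can", "could", "would", "like",
    "want", "show", "find", "some",
}

def strip_noise_words(spoken_query: str) -> List[str]:
    """Lowercase and tokenize a transcribed query, then drop noise words."""
    tokens = re.findall(r"[a-z0-9']+", spoken_query.lower())
    return [t for t in tokens if t not in NOISE_WORDS]

if __name__ == "__main__":
    print(strip_noise_words("Hey, could you please show me some deals on shoes?"))
```

With this list, the spoken request above collapses to the two terms someone would have typed into a search bar, `deals` and `shoes`. That is also why it is only a starting point: throwing away words like "me" or "closest" can throw away exactly the context the query depends on.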

These are significant AI challenges. But as we overcome the context problem, developers will learn that more can be done with voice than with text. Emotional context will play a role. If you're looking for a gas station, do you want the cheapest one or the closest one? The emotive content of your voice could imply that. Sure, you might clarify, but you might not have to.

Your talkative future

The voice-driven epoch isn't about search alone. It will affect the entire way we interact with computers. In the not-too-distant future, keyboards will be considered "quaint," as Scotty famously described them in "Star Trek IV."

But that shift also demands a whole new UI. Here's an ancient illustration of what I mean: When Windows 95 came out, IBM had integrated voice commands into its PCs. At the time, I was working as a salesperson at Office Depot, and it quickly became apparent how impractical voice commands were. The windowed interface didn't lend itself to this form of interaction at all.

I mean, how the hell do you move a window out of the way of another window and resize them both to fit on the screen in an efficient manner with voice commands? You don't. You ditch those windows (and probably Windows) altogether. A voice-driven UI doesn't use the same motifs. You never see a windowed interface on "Star Trek."

Speaking of "Star Trek," when people start coding or doing something technical, they always switch to a tactile interface (OK, not exactly tactile; it looks more like a microwave keyboard overlaid with art nouveau renderings of a circuit board). But is the regression to "typing" necessary? True, I can't imagine using a voice interface to code in Scala. Maybe new languages (devoid of parentheses, unlike Scala and my articles) will be developed that are specially suited to voice.

Websites surely won't look the same and will offer new navigation paradigms. You'll say "show me deals on shoes," and what you get back will probably be better organized and more contextually sensitive than your average website ("deals" && "shoes"). Moreover, I won't want to scroll or say "next page" a lot, so the interactions will have to be personalized. The system should already know I want men's shoes and I don't want hard-heeled shoes due to my Achilles' tendonitis. Maybe it knows I prefer dark colors. Maybe I told it or maybe it analyzed my behavior.
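As a rough sketch of what that personalization could look like in code, the Python below starts from the plain "deals" && "shoes" match and then applies a stored profile: things the shopper always requires, things to exclude, and things merely preferred. Everything here (the `Product` and `Profile` classes, the tag vocabulary, `personalized_search`) is hypothetical and invented for illustration; it is not how any particular search product works.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set

@dataclass
class Product:
    name: str
    tags: Set[str]

@dataclass
class Profile:
    # Hypothetical preference buckets, whether told to the system or learned.
    required: Set[str] = field(default_factory=set)   # e.g. men's sizes
    excluded: Set[str] = field(default_factory=set)   # e.g. hard heels
    preferred: Set[str] = field(default_factory=set)  # e.g. dark colors

def personalized_search(terms: Iterable[str], catalog: List[Product],
                        profile: Profile) -> List[Product]:
    """AND the query terms with the profile's requirements, drop excluded
    items, and rank the rest by how many preferred tags they carry."""
    wanted = set(terms) | profile.required
    hits = [p for p in catalog
            if wanted <= p.tags and not (profile.excluded & p.tags)]
    return sorted(hits, key=lambda p: len(profile.preferred & p.tags),
                  reverse=True)

if __name__ == "__main__":
    catalog = [
        Product("tan loafer", {"shoes", "deals", "mens", "soft-heel", "tan"}),
        Product("black oxford", {"shoes", "deals", "mens", "hard-heel", "dark"}),
        Product("navy sneaker", {"shoes", "deals", "mens", "soft-heel", "dark"}),
        Product("red pump", {"shoes", "deals", "womens"}),
    ]
    me = Profile(required={"mens"}, excluded={"hard-heel"}, preferred={"dark"})
    for product in personalized_search(["deals", "shoes"], catalog, me):
        print(product.name)
```

Run against this toy catalog, the oxford drops out for its hard heel, the pump drops out for not being men's, and the navy sneaker is listed ahead of the tan loafer because it matches the dark-color preference. The point of ranking rather than filtering on preferences is that a voice user gets the likeliest answer first without ever having to say "next page."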

Is this a website at all? Sure, if I'm shoe shopping, I'll want a visual representation, but if I'm talking maybe the machine is talking back. Maybe it shows me shoes, then asks: "Are you looking for a particular type of shoe? What purpose are these shoes for? Are you wearing them hiking or to a party?"

The era of voice search will change everything from how we interact with machines to how we code. Many of the technologies we need are already available to us today, while others are yet to be invented. The effect on user interfaces could be more profound than the switch from punch cards to keyboards.

This sweeping change won't come all at once. Today isn't the day to throw out your keyboard. But it might be the day to start thinking about redesigning your website to be truly voice-accessible.