At this year’s CES, some of the hottest gadgets and services were those that interacted with Amazon’s virtual assistant, Alexa. It’s estimated that the market for such virtual assistant technologies will reach $3.6 billion by 2020. Voice appears to be winning on other frontiers too. Global banking giant HSBC recently launched a voice authentication system for personal banking customers in the UK that allows them to access services using their voice as their password.
I recently tracked the rise and rise of voice interface technology from its beginnings in the 1970s to its adoption as a consumer gadget today. If you’re wondering if voice interfaces are relevant to you, I’m going to hazard a guess that they are, even though you may not know it yet! There are a few simple reasons why voice interfaces are booming right now.
- Voice user interfaces, which draw on AI and natural language processing, have finally matured to the point where they work more or less intuitively.
- As more devices become connected, the main interface will become the voice, not the graphical display to which we’ve grown accustomed. Indeed, take out the need for a display, and you open up a myriad of opportunities for devices to become ‘smart’, take up less space and cost less to manufacture.
- Cloud-based processing now gives miniature devices access to high processing capacity, which was simply not feasible earlier.
Voice – the most natural command center in the world
Screens are getting increasingly cluttered with a plethora of applications. Voice interfaces, by contrast, are a direct way of getting the task done. As voice recognition and natural-conversation capabilities improve, users will find it much easier to dictate a task to a digital assistant than to navigate a graphical user interface on their own.
Whereas traditional smartphone graphical displays require users to engage brain, hands and eyes, voice interfaces let users control their devices while they’re otherwise occupied. No surprise, then, that one industry that stands to benefit from voice interaction is the automobile industry. Android Auto and Apple CarPlay have led the field so far. You could be driving your car while simultaneously accessing smartphone functions such as navigation, weather and your phonebook. Voice interfaces also offer an opportunity to integrate in-car controls such as climate control and music. Slowly, command-based dialogue will be replaced by natural conversation between the driver and the vehicle.
Voice interfaces will increasingly be used in smartwatches, Bluetooth headsets and other communications-focused wearables, because the small screen (or lack thereof) makes a touch interface less intuitive. In the burgeoning AR/VR/MR headset market, the voice user interface has become the default mode of user interaction due to the lack of any graphical user interface (GUI).
When it comes to the adoption of smart devices, voice interfaces will play a big part. With the exponential rise in the number of smart devices, it is no longer feasible to have a separate app-based interface for user interaction for every device.
Even mobile phones won’t escape the “voice” treatment! Voice-based assistants like Apple’s Siri and Google Now are already present on most smartphones. As natural language processing makes voice interfaces more reliable, more and more functions can be integrated with them. Fast forward a few years and it’s likely that we’ll see apps as we know them wane in popularity as increasingly sophisticated virtual assistants take their place.
Of course, there will be challenges if voice user interfaces are to transition from novelty to mainstream. They need to be reliable. Users will need to feel they offer the same dependable functionality as a conventional graphical user interface.
They will also need to be as adaptable in different user environments as a conventional touch display. The current generation of voice interfaces has become effective enough to operate in an environment with some interfering noise. But will they be able to adapt to other commonplace noisy surroundings, such as windy weather, industrial environments and public spaces? Any environment with multiple distinct voices can pose problems too. Which leads me to another challenge: regional variations in language, the use of slang and, let’s not forget, emotions and tone of voice. Nobody expects a graphical interface to detect if you’re happy or sad, but as people increasingly carry out conversations in a natural fashion with their devices, they will come to expect some degree of emotional awareness from the machines.
If the prospect of voice user interfaces excites you and you would like to explore their suitability to your digital initiatives, please do contact us.