It should be clear to anyone who follows the technology industry that voice-enabled devices represent the next wave of computing interaction. After punch cards, green-screen terminals, windows-mice-and-pointers, and touchscreens, voice-enabled UIs will become the prevalent way humans interact with technology in the coming years.
The outspoken tech entrepreneur Gary Vaynerchuk, while kicking off his awesome voice tech conference VoiceCon 2018, gave a rousing keynote slating any organisation that isn’t thinking about what its brand sounds like to talk to. What’s its personality?
He describes his bullishness towards VUIs as a purely transactional business reaction – he claims no vested emotional interest. He argues that the new channels to consumers opened up by VUI tech will cause seismic shifts in the landscape of big business: who sells to whom, and how consumers consume. What happens when web searches transition to voice (predictions pitch 40% of all searches to be screen-less within a year and a half), only the #1 search result is returned, and Google can’t sell as many AdWords? And how much more will brands need to spend to encourage consumers to say “Alexa, get me more Windowlene”, rather than just “get me more window cleaner”? The companies that control the voice platforms will be the new toll-booths to consumers, and you know Amazon is going to offer that customer whatever product makes it the most money: how much does Windowlene pay to get the top spot? Has a competitor paid more to be offered instead? Or can Amazon make more money selling its own-brand window cleaner instead?
As more brands tune in to the importance of having a voice presence – of becoming the first thought in consumers’ minds when they request a product – we’re going to see a ruthless arms race (there are no runner-up prizes here), with brands pouring more and more marketing spend into capturing less and less of our precious awareness.
So, voice is the future, and if we’re heading towards an interaction shift of this magnitude, what should we be doing – as industry professionals – to ensure the whole world can take part in this ride? Voice interfaces have a fundamental flaw: they may be the most natural interaction mechanism for those with working hearing and speech, but what if you’re deaf? How do you interact with a voice-first app if your primary method of physical-world communication is sign language?
Tentative steps have been made to explore these questions recently – igniting the right kinds of discussion, at just the right time.
Amazon Tap To Alexa
The first interesting exploration is from Amazon themselves: a software update will soon be coming that makes the Alexa interface interactive via tapping on the Echo Show screen.
Giving those with speech impairments a way to interact with Alexa is a great first step in truly democratising this new wave of technology. Users are presented with a selection of common actions that are normally activated via voice commands, and can also configure their own – making the device as responsive to those with speech impairments as it is to those without.
Optionally, captions can also be enabled to show on-screen what Alexa is saying. Arguably, this is the more useful accessibility feature, as it works with every feature and third-party skill the smart speaker supports, rather than the small subset of actions Amazon decides to enable.
Helping Alexa Understand Sign Language
Another great example is from Abhishek Singh – a coder, inventor, and generally awesome dude.
In this demo, Abhishek has created a system that recognises a common set of signs, translates them to words, and speaks them with a preceding “Alexa…” to wake up the smart speaker.
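The demo’s actual code isn’t reproduced here, but the shape of the pipeline it describes – camera frames in, recognised signs translated to words, a wake word prepended, and the result spoken aloud – can be sketched in a few lines. This is a hypothetical illustration only: the `classify_signs` and `speak` stages are stubs standing in for a real gesture-recognition model and text-to-speech engine.

```python
# Hypothetical sketch of a sign-to-Alexa pipeline (not the demo's actual code).
# The classifier and text-to-speech stages are stubbed out purely to show
# the data flow: signs -> words -> wake-word-prefixed utterance -> speech.

def classify_signs(frames):
    """Stub: a real implementation would run a trained gesture-recognition
    model over webcam frames and return one token per recognised sign."""
    return ["what", "is", "the", "weather"]

def build_utterance(sign_tokens, wake_word="Alexa"):
    """Prefix the translated signs with the wake word, so the smart speaker
    starts listening before the command itself is spoken."""
    return f"{wake_word}, {' '.join(sign_tokens)}"

def speak(text):
    """Stub: a real system would hand this string to a TTS engine."""
    print(f"[TTS] {text}")

if __name__ == "__main__":
    frames = []  # placeholder for captured webcam frames
    utterance = build_utterance(classify_signs(frames))
    speak(utterance)  # prints: [TTS] Alexa, what is the weather
```

Even this toy version makes the latency problem discussed below visible: every stage – capture, classification, speech synthesis, then Alexa’s own round-trip – adds delay before the user gets a response.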
In practical terms, the value of this mod may be limited – the benefit of voice-activated interfaces is their speed and convenience. By the time the user has signed, had their signs recognised and translated, had the translation spoken to Alexa, and waited for the response to be processed, it would probably have been quicker to simply pick up a smartphone and do whatever they needed there instead!
But that’s not the point – I think the real value of this project is in sparking discussions and awareness (such as this article!). What if this research project encouraged Amazon to bake sign language recognition directly into the Echo Show, using the built-in camera to translate signs directly into intents for Alexa to recognise? This would make signing a first-class interaction mechanism, alongside speech and Tap to Alexa, preserving the platform’s speed and efficiency benefits for all.
The truth – the truly democratic approach we should be taking with our new-found technological abilities – is to develop these capabilities in a way that is accessible to every tech enthusiast, regardless of their impairments. Amazon’s vision for Alexa is to have a device in every room, for every occasion, ubiquitously available. Arguably, they’re well on their way to realising this vision – the pace of adoption of Alexa-compatible devices is simply staggering. But for those with accessibility needs, every step along that path drives the wedge of exclusion even deeper – shutting them out of this social, cultural and economic revolution in a way that is hopelessly unfair.
Let’s continue the discussion. Let’s hope these first tentative steps towards a fairer future are just the beginning. Let’s bring everyone along for this one hell of a ride!