When Alexa Can't Understand You

Weekly Article
Oct. 18, 2018

When Whitney Bailey bought an Amazon Echo, she wanted to use the hands-free calling feature in case she fell and couldn’t reach her phone. She hoped that it would offer her family some peace of mind and help make life a little easier. In some ways, she says, it does. But because she has cerebral palsy, speaking strains her voice, and she struggles to get Alexa to understand her. To make matters worse, having to repeat commands strains it even more.

Thanks to technologies like Google Assistant and Amazon Alexa, we’re living in an increasingly voice-first world. With a simple “Alexa,” we ask a personal voice assistant to tell us the weather, make reservations, or play ambient music. For most people, this is a pleasant convenience. For the 39.5 million American adults with limited mobility and the 22.5 million with limited vision, this technology means much more: With a single command, they can turn on lights, lock doors, have search results read aloud, or send text messages to loved ones. But for those who struggle to vocalize their speech, it’s a different story. If voice is the future, tech companies need to prioritize developing software that is inclusive of all speech.

In the United States, 7.5 million people have trouble using their voice and more than 3 million people stutter, which can make it difficult for them to take full advantage of voice-enabled technology. Speech disabilities can stem from a wide variety of causes, including Parkinson’s disease, cerebral palsy, traumatic brain injuries, and even aging. In many cases, those with speech disabilities also have limited mobility and motor skills, which makes voice-enabled technology especially beneficial for them: It doesn’t involve pushing buttons or tapping a screen. For disabled people, this technology can provide independence, making speech recognition that works for everyone all the more important.

Yet voice-enabled tech developers struggle to meet the needs of this population. People with speech disabilities use the same language and grammar that others do. But their speech musculature—things like the tongue and jaw—is affected, resulting in slurred consonants and vowels that blend together, says Frank Rudzicz, an associate professor of computer science at the University of Toronto who studies speech and machine learning. These differences present a challenge for developers of voice-enabled technologies.

Tech companies rely on user input to fine-tune their algorithms. The machine learning that makes voice-enabled tech possible requires massive amounts of data, which comes from the commands you give and the questions you ask devices. Most of these data points come from younger abled users, says Rudzicz. This means that it can be challenging to use machine-learning techniques to develop inclusive voice-enabled technology that works consistently for populations whose speech varies widely, such as children, the elderly, and the disabled.
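To see why skewed data matters, consider a toy illustration: a simple model fit almost entirely on one group’s speech features will quietly fail on a group whose speech is distributed differently. The sketch below is purely synthetic: the “features,” groups, and model are illustrative stand-ins, not anyone’s real system.

```python
# A toy illustration of the data-skew problem: a classifier fit almost
# entirely on one group performs far worse on an underrepresented group
# whose "speech features" are distributed differently. All data here is
# synthetic and purely illustrative.
import numpy as np

rng = np.random.default_rng(0)

def make_group(n_per_class, shift):
    """Two command classes per group; `shift` stands in for speech that
    differs systematically from the majority's."""
    X = np.concatenate([rng.normal(0.0 + shift, 1.0, (n_per_class, 8)),
                        rng.normal(2.0 + shift, 1.0, (n_per_class, 8))])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y

X_major, y_major = make_group(500, shift=0.0)  # well-represented users
X_minor, y_minor = make_group(500, shift=1.5)  # underrepresented users

# The training set mirrors the skew: plenty of majority data, a sliver
# of minority data.
X_train = np.concatenate([X_major, X_minor[:10]])
y_train = np.concatenate([y_major, y_minor[:10]])

# Nearest-centroid classifier, the simplest possible stand-in model.
centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])

def accuracy(X, y):
    dists = ((X[:, None, :] - centroids) ** 2).sum(axis=-1)
    return (dists.argmin(axis=1) == y).mean()

print("majority-group accuracy:", accuracy(X_major, y_major))  # high
print("minority-group accuracy:", accuracy(X_minor, y_minor))  # much lower
```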

But as Americans age, the need to develop better software is becoming more pressing. Today, there are roughly 50 million Americans over the age of 65. By 2035, that number will be 78 million, meaning more people will be at risk of stroke and degenerative conditions that impair speech. At the same time, voice-enabled tech is becoming even more integrated into our lives, with some experts predicting that by 2020 nearly three-quarters of U.S. households will have a voice assistant and half of our internet searches will be done via voice. As voice-enabled technology becomes more ubiquitous, companies must adapt to a changing population—and its speech.

Andy Theyers, who’s written about his struggle to use voice assistants due to his stutter, says that this is, in part, a reflection of an industry that doesn’t always prioritize accessibility from the beginning of a product’s development. Sean Lewis, a motivational speaker with cerebral palsy, agrees. “Unless [tech developers] personally know someone with a disability,” Lewis says, they “have no idea how a lack of technology affects people’s lives.”

Lewis is grateful for his Samsung phone, which came with Google Assistant. Before his Samsung, he wasn’t able to send emails or texts on his own. Now, he can do it all through voice commands. But he finds that he often has to repeat himself at least once or twice before the device understands him (a problem that many people without speech disabilities face as well, to be sure). Though voice-enabled tech has improved, he says, “we’re not where we need to be.”

Twenty-some years ago, Steven Salmon, an author with cerebral palsy, began using DragonDictate voice recognition software to write his books, spelling out words letter by letter. It was a time-consuming process that required his pronunciation to be perfectly consistent. “If I had a cold,” he says, “I couldn’t write.” When Salmon received an iPhone in 2015, he tried to get the device to respond to his commands. Voice-enabled tech was more accurate than ever—Siri’s error rate was 5 percent. Yet his phone couldn’t understand his commands, so he ended up returning the device.

For Theyers, the biggest problem with using his Alexa is triggering it to listen to him. Until 2017, when Amazon added “computer” as a wake word, voice-enabled devices required wake words that began with a hard vowel—think “Alexa.” The hard vowel triggers Theyers’ stutter, and often by the time he’s said the trigger word, the device has stopped listening.

He wishes there were a way to better personalize devices—and that technologists would seek out the opinions of people with voice impairments. Whitney Bailey agrees: “It can be disheartening when a person has trouble using [technology] because they have a unique speech pattern.”

Rudzicz predicts that in the future, technology will be more individualized. Moreover, as tech companies collect massive amounts of data from millions of users, he anticipates that we’ll begin to look at individual user data to see how it differs from the general population, allowing us to adapt models to the individual.
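To make that concrete, here is a minimal, hypothetical sketch of what per-user adaptation can look like: a model trained on the general population keeps its shared layer frozen while only its final layer is fine-tuned on a handful of one user’s utterances. Every name, shape, and data point below is an illustrative stand-in, not a description of any company’s actual system.

```python
# A hypothetical sketch of speaker adaptation: fine-tune only the last
# layer of a "general population" command classifier on a few examples
# from one user. Shapes and data are illustrative stand-ins.
import torch
import torch.nn as nn

NUM_COMMANDS = 4  # e.g. "lights on," "lock door," "call," "weather"
FEATURE_DIM = 40  # stand-in for per-utterance acoustic features

# Stand-in for a model pretrained on many speakers.
general_model = nn.Sequential(
    nn.Linear(FEATURE_DIM, 64),
    nn.ReLU(),
    nn.Linear(64, NUM_COMMANDS),
)

# Freeze the shared layer; adapt only the final classification layer, so
# a few user examples go a long way without erasing the base model.
for param in general_model[0].parameters():
    param.requires_grad = False

# A handful of labeled utterances from one user (synthetic here).
user_features = torch.randn(12, FEATURE_DIM)
user_labels = torch.randint(0, NUM_COMMANDS, (12,))

optimizer = torch.optim.Adam(
    (p for p in general_model.parameters() if p.requires_grad), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for _ in range(50):  # a short adaptation pass, not full retraining
    optimizer.zero_grad()
    loss = loss_fn(general_model(user_features), user_labels)
    loss.backward()
    optimizer.step()
```

Freezing most of the model is what lets a handful of examples matter: the user’s data only has to nudge the last layer, not relearn speech from scratch.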

Some companies are already working to develop more individualized software. Voiceitt, a startup, is currently beta-testing a speech-recognition app that translates nonstandard speech to standard speech in real time using a closed dictionary. Users, often with the help of a speech therapist or caregiver, create their own dictionary by reading short phrases or sentences. After they create the dictionary—a process that can take 30 minutes to three hours—they can begin to use the app. Voiceitt’s goal for its first iteration is to help people vocalize their wants and needs, says Sara Smolley, Voiceitt’s vice president of strategy. But as the company collects more data, it is exploring how it might find commonalities within demographics that would allow it to develop more tailored software—such as special algorithms for native English-speaking 40-year-old males or native Spanish-speaking 20-year-old females.
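Voiceitt has not published the details of its method, but a closed dictionary suggests a classic small-vocabulary setup: compare each new utterance against the user’s own recordings and output the closest phrase. The sketch below illustrates that general idea with off-the-shelf MFCC features and dynamic time warping; the phrases and synthetic “recordings” are placeholders, and this is an assumption about the approach, not Voiceitt’s actual code.

```python
# A hypothetical closed-dictionary matcher: find the phrase in a small
# personal dictionary whose recording best aligns with a new utterance,
# using MFCC features and dynamic time warping (DTW).
import numpy as np
import librosa

SR = 16000  # sample rate

def features(signal, sr=SR):
    """Per-frame acoustic features for one utterance."""
    return librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)

def dictionary_match(utterance, personal_dictionary):
    """Return the dictionary phrase whose recording best matches."""
    best_phrase, best_cost = None, float("inf")
    query = features(utterance)
    for phrase, recording in personal_dictionary.items():
        D, wp = librosa.sequence.dtw(X=query, Y=features(recording))
        cost = D[-1, -1] / len(wp)  # length-normalized alignment cost
        if cost < best_cost:
            best_phrase, best_cost = phrase, cost
    return best_phrase

# Stand-in "recordings": in a real app these would be the short phrases
# the user reads aloud while building the dictionary.
t = np.linspace(0, 1, SR)
personal_dictionary = {
    "I am thirsty": np.sin(2 * np.pi * 220 * t),
    "Turn on the light": np.sin(2 * np.pi * 440 * t),
}
print(dictionary_match(np.sin(2 * np.pi * 225 * t), personal_dictionary))
```

Because the dictionary is closed, the system never has to transcribe arbitrary speech; it only has to decide which of the user’s own phrases an utterance most resembles, a far easier problem.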

Larger tech firms also recognize their role in developing inclusive assistive technologies that are both widely available and relatively inexpensive. A spokesperson from Amazon said the company frequently receives positive feedback from “aging-in-place” customers who use Alexa’s smart-home features as an alternative to going up and down stairs. Amazon did not comment on its future plans regarding accessibility, but it pointed to the Echo Show—which offers Tap to Alexa, a screen interface that lets users who are deaf or hard of hearing tap common commands—as well as Alexa Captioning, which allows users to read Alexa’s responses. And Microsoft recently launched an AI for Accessibility program to create inclusive, affordable tech. These features bring us closer to better technology, but they still present barriers for people with limited mobility and poor fine motor skills, who may be unable to easily walk over to a screen or tap small buttons.

As tech developers continue designing voice-enabled products, the key will be scaling up solutions and supporting their integration into existing technologies. Rudzicz predicts that we’ll see better technologies as the population ages and companies try to cater to people with degenerative conditions. And as these technologies are developed with aging populations in mind, people with congenital disorders like cerebral palsy will benefit, too—pushing us one step closer to truly inclusive voice-enabled tech.

This article originally appeared in Future Tense, a collaboration among Arizona State University, New America, and Slate.