When Amazon launched the Alexa virtual assistant nine years ago, its ability to decode voice commands to set a timer or play a song seemed almost magical. Today, the bar for impressive language skills is much higher, thanks to OpenAI’s ChatGPT. Amazon is giving its voice assistant a reboot that takes advantage of the technology behind the new wave of chatbots that can engage in remarkably lifelike conversation.
Amazon announced the upgrade to Alexa at an event held at its second headquarters in Arlington, Virginia. The assistant will answer much more complex questions and engage in more flowing, open-ended conversation, dropping the need for users to say “Alexa …” at each turn.
In a few weeks, users who say, “Alexa, let’s chat,” will get access to the new, more capable voice assistant. Amazon calls it an “early preview” because the new capabilities remain a work in progress.
Demos given onstage on Wednesday showed Alexa exhibiting more simulated personality with its intonation and efforts at humor. Devices equipped with cameras, such as the Echo Show, will try to detect when a person is expecting Alexa to continue the conversation and when the conversation is over.
The new Alexa will also modulate its own voice to create a more natural-seeming back-and-forth. “If I ask Alexa how the Red Sox are doing, and they have just lost, it will come back with an empathetic tone,” says Rohit Prasad, who leads AI development at Amazon and is based in Cambridge, Massachusetts.
Prasad says that upgrading Alexa’s language skills required extensive engineering, because the large language models that power services like ChatGPT can make up facts, blurt out nonsense, and be downright inappropriate. “Especially given certain limitations of language models, this is a huge leap,” Prasad says.
Justine Cassell, a professor at Carnegie Mellon University who studies the way humans interact with AI agents, says it will be fascinating to see how people respond to a voice-enabled chatbot capable of richer responses. “The goals are great, and I’m excited to see what they do,” she says.
However, Cassell says some of the things Amazon is promising, like responding to body language, remain extremely challenging. “There is no grammar of body language, the way there is for spoken and written language,” she says. If Alexa misreads someone’s posture or movements and responds incorrectly, things could get awkward.
Cassell says that even if Alexa gains more ChatGPT-like fluency, its efforts to mimic human personality and feeling through characteristics like intonation are unlikely to match human capabilities for some time yet. Expect the new Alexa to sometimes feel stilted in its responses.
Amazon says users will be able to apply for access to an additional test of its new technology, in which Alexa’s new capabilities can be used to control other devices, including some not made by Amazon. Over time, the company plans to add new features to Alexa, potentially including the ability to discuss and recommend products from the company’s vast inventory.