Hume AI, a New York startup, launched a novel “empathetic voice interface” today, adding a range of emotionally expressive voices and emotion-sensitive listening to large language models from Anthropic, Google, Meta, Mistral, and OpenAI, heralding an era in which AI helpers routinely engage with our emotions.
“We specialize in building empathetic personas that speak the way humans speak, not the stereotypes of AI assistants,” says Hume AI co-founder Alan Cowen, a psychologist who has co-authored a number of research papers on AI and emotion and who previously worked on emotional technologies at Google and Facebook.
WIRED tested Hume’s latest voice technology, called EVI 2, and found that it performs similarly to the voice interface OpenAI developed for ChatGPT. (When OpenAI gave ChatGPT a flirty voice in May, the company’s CEO Sam Altman praised the interface as “like AI from movies.” Real-life movie star Scarlett Johansson later claimed that OpenAI had stolen her voice.)
Like ChatGPT, Hume is much more emotionally expressive than most conventional voice interfaces. If you tell it that your pet has died, for example, it will adopt an appropriately somber and sympathetic tone. (Also like ChatGPT, you can interrupt Hume mid-sentence, and it will pause and adapt with a new response.)
OpenAI hasn’t said how much its voice interface attempts to measure users’ emotions, but Hume’s is designed specifically for that purpose. During interactions, Hume’s developer interface displays values indicating the levels of qualities like “determination,” “fear,” and “happiness” in users’ voices. If you talk to Hume in a gloomy tone, it will pick up on that too, something ChatGPT does not appear to do.
Hume also makes it simple to give a voice a specific emotional character by adding a prompt in its UI. Here’s what happened when I asked it to be “sexy and flirty”:
Hume AI’s “sexy and flirty” message
And when it was told to be “sad and gloomy”:
Hume AI’s “sad and gloomy” message
And here’s the particularly unpleasant message it produced when it was asked to be “angry and rude”:
Hume AI’s “angry and rude” message
Hume’s technology didn’t always feel as polished and smooth as OpenAI’s, and it sometimes behaved in strange ways. At one point, for example, the voice suddenly sped up and began spewing gibberish. But if the voice can be refined and made more reliable, it has the potential to help make humanlike voice interfaces more widespread and varied.
The idea of recognizing, measuring, and simulating human emotion in technological systems dates back decades and is studied in a field known as “affective computing,” a term introduced by Rosalind Picard, a professor at the MIT Media Lab, in the 1990s.
Albert Salah, a professor at Utrecht University in the Netherlands who studies affective computing, is impressed by Hume AI’s technology and recently demonstrated it to his students. “EVI seems to assign valence and arousal values [to the user], and then modulate the agent’s speech accordingly,” he says. “It’s a very interesting twist on LLMs.”
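Salah’s description points to a classic affective-computing pattern: estimate a speaker’s valence (how positive the emotion is) and arousal (how energetic it is), then adjust the agent’s speaking style to match. Here is a minimal, hypothetical sketch of that idea using Russell’s circumplex quadrants; it is not Hume’s actual API, and all function names, labels, and style values are illustrative:

```python
# Hypothetical sketch (not Hume's API): classify a (valence, arousal)
# estimate into a coarse emotion quadrant, then look up a speaking-style
# adjustment for the agent's reply. Labels and numbers are illustrative.

def emotion_quadrant(valence: float, arousal: float) -> str:
    """Map a (valence, arousal) pair, each in [-1, 1], to a quadrant label."""
    if valence >= 0 and arousal >= 0:
        return "excited"      # positive, high energy
    if valence >= 0:
        return "content"      # positive, low energy
    if arousal >= 0:
        return "distressed"   # negative, high energy
    return "gloomy"           # negative, low energy

# Illustrative style table: how an agent might modulate its own speech.
STYLE = {
    "excited":    {"rate": 1.10, "tone": "upbeat"},
    "content":    {"rate": 1.00, "tone": "warm"},
    "distressed": {"rate": 0.95, "tone": "calm, reassuring"},
    "gloomy":     {"rate": 0.90, "tone": "somber, sympathetic"},
}

if __name__ == "__main__":
    # A sad, low-energy user voice should elicit a somber agent style.
    quadrant = emotion_quadrant(valence=-0.6, arousal=-0.4)
    print(quadrant, "->", STYLE[quadrant]["tone"])
```

In a real system the valence and arousal estimates would come from a speech-analysis model rather than being passed in by hand, but the modulation step can be as simple as this lookup.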
