What We Lose When ChatGPT Sounds Like Scarlett Johansson
When Spike Jonze’s romance “Her” was released in 2013, it sounded both like a joke (a man falls in love with his computer) and a fantasy. The iPhone was about six years old. Siri, the mildly reliable virtual assistant for that phone, had come along a few years after the phone itself. You could converse in a limited way with Siri, whose default female-coded voice had the timbre and tone of a self-assured middle-aged hotel concierge. She did not laugh; she did not giggle; she did not tell spontaneous jokes, only Easter egg-style gags written into her code by cheeky engineers. Siri was not your friend. She certainly wasn’t your girlfriend.
So Samantha, the A.I. assistant with whom the sad-sack divorcé Theodore Twombly (Joaquin Phoenix) fell in love in “Her,” felt like a futuristic revelation. Voiced by Scarlett Johansson, Samantha was similar to Siri, if Siri liked you and wanted you to like her back. She was programmed to mold herself around the individual user’s preferences, interests and ideas. She was witty and sweet and quite literally tireless. In theory, everyone in “Her” was using their own version of Samantha, presumably with different names and voices. But the movie, which I love, was less the tale of a near-future society and more the coming-of-age story of one man. Theodore found the strength to return to life in a brief, beautiful relationship with a woman who fit his needs perfectly and healed his wounds.
It was thus a tad jarring to hear the voice of the virtual assistant, Sky, in last week’s announcement of the newest version of ChatGPT, probably the best-known artificial intelligence engine in the very real world of 2024. Among other things, the new iteration, dubbed GPT-4o, can interact verbally with the user and respond to images shown to it through the device’s camera. Those who watched the live demo from OpenAI, the company that makes ChatGPT, were quick to note that Sky sounded a whole lot like Samantha, which is to say, like Johansson.
Mira Murati, OpenAI’s chief technology officer, told The Verge that the resemblance was incidental, and that ChatGPT’s nascent speech capabilities have used this voice for a while. But once you hear it, you can’t unhear it. That’s probably why OpenAI announced on Monday that it was suspending Sky, though not four other voices — Breeze, Cove, Ember and Juniper — that reflect the same strategy.
Furthermore, OpenAI’s co-founder and chief executive, Sam Altman, has professed his love of “Her” in the past. Following the announcement, he posted the word “her” to his X account. And in his blog post about the news, he wrote, “It feels like A.I. from the movies; and it’s still a bit surprising to me that it’s real. Getting to human-level response times and expressiveness turns out to be a big change.”