Microsoft tries to justify AI’s tendency to give wrong answers by saying they’re ‘usefully wrong’
Microsoft CEO Satya Nadella speaks at the company’s Ignite Spotlight event in Seoul on Nov. 15, 2022.
SeongJoon Cho | Bloomberg | Getty Images
Thanks to recent advances in artificial intelligence, new tools like ChatGPT are wowing consumers with their ability to create compelling writing based on people’s queries and prompts.
While these AI-powered tools have gotten much better at producing creative and sometimes humorous responses, they often include inaccurate information.
For instance, in February, when Microsoft debuted its Bing chat tool, built using the GPT-4 technology created by Microsoft-backed OpenAI, people noticed that the tool gave wrong answers about financial earnings reports during a demo. Like other AI language tools, including similar software from Google, the Bing chat feature can occasionally present fake facts that users might believe to be the ground truth, a phenomenon that researchers call a “hallucination.”
These problems with the facts haven’t slowed down the AI race between the two tech giants.
On Tuesday, Google announced it was bringing AI-powered chat technology to Gmail and Google Docs, where it can help people compose emails or documents. On Thursday, Microsoft said that its popular business apps like Word and Excel would soon come bundled with ChatGPT-like technology dubbed Copilot.
But this time, Microsoft is pitching the technology as being “usefully wrong.”
In an online presentation about the new Copilot features, Microsoft executives brought up the software’s tendency to produce inaccurate responses, but pitched that as something that could be useful. As long as people realize that Copilot’s responses could be sloppy with the facts, they can edit the inaccuracies and more quickly send their emails or finish their presentation slides.
For instance, if a person wants to create an email wishing a family member a happy birthday, Copilot can still be helpful even if it presents the wrong birth date. In Microsoft’s view, the mere fact that the tool generated the text saves the person some time and therefore makes it useful. People just need to take extra care and make sure the text doesn’t contain any errors.
Researchers might disagree.
Indeed, some technologists like Noah Giansiracusa and Gary Marcus have voiced concerns that people may place too much trust in modern-day AI, taking to heart the advice that tools like ChatGPT present when asked about health, finance and other high-stakes topics.
“ChatGPT’s toxicity guardrails are easily evaded by those bent on using it for evil and as we saw earlier this week, all the new search engines continue to hallucinate,” the two wrote in a recent Time opinion piece. “But once we get past the opening day jitters, what will really count is whether any of the big players can build artificial intelligence that we can genuinely trust.”
It’s unclear how reliable Copilot will be in practice.
Microsoft chief scientist and technical fellow Jaime Teevan said that when Copilot “gets things wrong or has biases or is misused,” Microsoft has “mitigations in place.” In addition, Microsoft will be testing the software with only 20 corporate customers at first so it can discover how it works in the real world, she explained.
“We’re going to make mistakes, but when we do, we’ll address them quickly,” Teevan said.
The business stakes are too high for Microsoft to ignore the enthusiasm over generative AI technologies like ChatGPT. The challenge will be for the company to incorporate that technology so that it doesn’t create public mistrust in the software or lead to major public relations disasters.
“I studied AI for decades and I feel this huge sense of responsibility with this powerful new tool,” Teevan said. “We have a responsibility to get it into people’s hands and to do so in the right way.”