Why Elon Musk’s OpenAI Lawsuit Leans on A.I. Research From Microsoft

When Elon Musk sued OpenAI and its chief executive, Sam Altman, for breach of contract on Thursday, he turned claims by the start-up’s closest partner, Microsoft, into a weapon.

He repeatedly cited a contentious but highly influential paper written by researchers and top executives at Microsoft about the power of GPT-4, the breakthrough artificial intelligence system OpenAI released last March.

In the “Sparks of A.G.I.” paper, Microsoft’s research lab said that — though it didn’t understand how — GPT-4 had shown “sparks” of “artificial general intelligence,” or A.G.I., a machine that can do everything the human brain can do.

It was a bold claim, and came as the biggest tech companies in the world were racing to introduce A.I. into their own products.

Mr. Musk is turning the paper against OpenAI, saying it showed how OpenAI backtracked on its commitments not to commercialize truly powerful products.

Microsoft and OpenAI declined to comment on the suit. (The New York Times has sued both companies, alleging copyright infringement in the training of GPT-4.) Mr. Musk did not respond to a request for comment.

How did the research paper come to be?

A team of Microsoft researchers, led by Sébastien Bubeck, a 38-year-old French expatriate and former Princeton professor, started testing an early version of GPT-4 in the fall of 2022, months before the technology was released to the public. Microsoft has committed $13 billion to OpenAI and has negotiated exclusive access to the underlying technologies that power its A.I. systems.

As they chatted with the system, they were amazed. It wrote a complex mathematical proof in the form of a poem, generated computer code that could draw a unicorn and explained the best way to stack a random and eclectic collection of household items. Dr. Bubeck and his fellow researchers began to wonder if they were witnessing a new form of intelligence.

“I started off being very skeptical — and that evolved into a sense of frustration, annoyance, maybe even fear,” said Peter Lee, Microsoft’s head of research. “You think: Where the heck is this coming from?”

What role does the paper play in Mr. Musk’s suit?

Mr. Musk argued that OpenAI had breached its contract because it had agreed not to commercialize any product that its board considered A.G.I.

“GPT-4 is an A.G.I. algorithm,” Mr. Musk’s lawyers wrote. They said that meant the system never should have been licensed to Microsoft.

Mr. Musk’s complaint repeatedly cited the Sparks paper to argue that GPT-4 was A.G.I. His lawyers said, “Microsoft’s own scientists acknowledge that GPT-4 ‘attains a form of general intelligence,’” and given “the breadth and depth of GPT-4’s capabilities, we believe that it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (A.G.I.) system.”

How was it received?

The paper has had enormous influence since it was published a week after GPT-4 was released.

Thomas Wolf, co-founder of the high-profile A.I. start-up Hugging Face, wrote on X the next day that the study “had completely mind-blowing examples” of GPT-4.

Microsoft’s research has since been cited by more than 1,500 other papers, according to Google Scholar. It is one of the most cited articles on A.I. in the past five years, according to Semantic Scholar.

It has also faced criticism from experts, including some inside Microsoft, who worried that the 155-page paper supporting the claim lacked rigor and fed an A.I. marketing frenzy.

The paper was not peer-reviewed, and its results cannot be reproduced because the experiments were conducted on early versions of GPT-4 that were closely guarded at Microsoft and OpenAI. As the authors noted in the paper, they did not use the GPT-4 version that was later released to the public, so anyone else replicating the experiments would get different results.

Some outside experts said it was not clear whether GPT-4 and similar systems exhibited behavior that was something like human reasoning or common sense.

“When we see a complicated system or machine, we anthropomorphize it; everybody does that — people who are working in the field and people who aren’t,” said Alison Gopnik, a professor at the University of California, Berkeley. “But thinking about this as a constant comparison between A.I. and humans — like some sort of game show competition — is just not the right way to think about it.”

Were there other complaints?

In the paper’s introduction, the authors initially defined “intelligence” by citing a 30-year-old Wall Street Journal opinion piece that, in defending a concept called the Bell Curve, claimed “Jews and East Asians” were more likely to have higher I.Q.s than “blacks and Hispanics.”

Dr. Lee, who is listed as an author on the paper, said in an interview last year that when the researchers were looking to define A.G.I., “we took it from Wikipedia.” He said that when they later learned of the Bell Curve connection, “we were really mortified by that and made the change immediately.”

Eric Horvitz, Microsoft’s chief scientist, who was a lead contributor to the paper, wrote in an email that he personally took responsibility for inserting the reference, saying he had seen it referred to in a paper by a co-founder of Google’s DeepMind A.I. lab and had not noticed the racist references. When they learned about it, from a post on X, “we were horrified as we were simply looking for a reasonably broad definition of intelligence from psychologists,” he said.

Is this A.G.I. or not?

When the Microsoft researchers initially wrote the paper, they called it “First Contact With an AGI System.” But some members of the team, including Dr. Horvitz, disagreed with the characterization.

He later told The Times that they were not seeing something he “would call ‘artificial general intelligence’ — but more so glimmers via probes and surprisingly powerful outputs at times.”

GPT-4 is far from doing everything the human brain can do.

In a message sent to OpenAI employees on Friday afternoon that was viewed by The Times, OpenAI’s chief strategy officer, Jason Kwon, explicitly said GPT-4 was not A.G.I.

“It is capable of solving small tasks in many jobs, but the ratio of work done by a human to the work done by GPT-4 in the economy remains staggeringly high,” he wrote. “Importantly, an A.G.I. will be a highly autonomous system capable enough to devise novel solutions to longstanding challenges — GPT-4 can’t do that.”

Still, the paper fueled claims from some researchers and pundits that GPT-4 represented a significant step toward A.G.I. and that companies like Microsoft and OpenAI would continue to improve the technology’s reasoning skills.

The A.I. field is still bitterly divided on how intelligent the technology is today or will be anytime soon. If Mr. Musk gets his way, a jury may settle the argument.