Microsoft’s AI Guru Wants Independence From OpenAI. That’s Easier Said Than Done
After setbacks, Mustafa Suleyman appears to be making slow progress on his mission to reduce Microsoft's dependence on OpenAI.
**By Aaron Holmes**
Mar 7, 2025, 6:00am PST
Mustafa Suleyman wasn't getting the answers he wanted. Last fall, during a video call with senior leaders at OpenAI and Microsoft, Suleyman, who leads Microsoft's in-house artificial intelligence unit, wanted OpenAI staffers to explain how its latest model, o1, worked, according to someone present for the conversation and two other Microsoft employees who were briefed on it. He was peeved that OpenAI wasn't providing Microsoft with documentation about how it had programmed o1 to think about users' queries before answering them. The process, known as chain of thought, is a key ingredient in the secret recipe of any AI model.
**The Takeaway**
- Microsoft is developing in-house AI reasoning models to compete with OpenAI and may sell them to developers.
- It has begun testing models from xAI, Meta, and DeepSeek as potential OpenAI replacements in Copilot.
- Microsoft AI CEO Mustafa Suleyman clashed with OpenAI leaders over the latter's withholding of technical information.
Raising his voice, Suleyman told OpenAI employees, including Mira Murati, then OpenAI's chief technology officer, that the AI startup wasn't holding up its end of the wide-ranging deal it has with Microsoft, the people familiar with the conversation said. The call ended abruptly.
The episode reflects a central challenge Suleyman has faced since he assumed the lofty title of AI CEO at Microsoft, which has invested more than $13 billion in OpenAI in exchange for a share of its revenue and the rights to reuse its technology. Suleyman's job, which he started a year ago, has two somewhat conflicting mandates, according to executives in his unit. First, he must carry on the OpenAI partnership, through which the two companies share information about AI research and development. At the same time, he's under orders to put Microsoft on a path to self-sufficiency in AI so it won't have to rely indefinitely on OpenAI's technology for the majority of Microsoft's AI products.
Now AI researchers in Suleyman's unit believe they have achieved a significant milestone on the second of those two priorities. A team led by Suleyman's deputy, Karén Simonyan, recently completed the training of a family of Microsoft models, internally referred to as MAI, that performed nearly as well as leading models from OpenAI and Anthropic on commonly accepted benchmarks, according to someone involved in the effort. The team is also training reasoning models—which use chain-of-thought techniques to think through problems before solving them—that could compete directly with OpenAI's, this person said.
Suleyman's staff are already experimenting with swapping the MAI models, which are far larger than an earlier family of Microsoft models called Phi, in for OpenAI's models in Microsoft's Copilot, the person said. The company is considering releasing the MAI models later this year as an application programming interface, a software hook that will allow outside developers to weave the Microsoft models into their own apps, this person said. Those plans, which haven't been previously reported, would bring Microsoft's models into direct competition with similar API offerings from OpenAI and other AI labs.
And at Suleyman's direction, Microsoft has been hedging its bets further by trying out models from OpenAI's competitors to power Copilot, the family of AI tools built into Windows, the Edge web browser, and other Microsoft products that currently run on OpenAI's technology. Microsoft's tests of alternative models for Copilot include ones from Anthropic and Elon Musk's xAI, along with open-source models from DeepSeek and Meta Platforms, according to the person with direct knowledge of the effort.
"It's an incredibly competitive and creative time," Suleyman said in an interview. "We are now using, under the hood, essentially all of the models from major labs, including all of the open-source models. We are experimenting with them, we're flighting them, and that's not what anyone thought would happen."
It's too soon to tell whether Suleyman, a rock star in the AI field, will succeed in his efforts to achieve more self-sufficiency for Microsoft in AI. While Simonyan's team has internally celebrated the performance of the MAI models, it hasn't yet released them publicly, nor has Microsoft made them widely available within the company, making it hard to gauge their quality.
The effort to train the MAI models took nearly a year due to technical setbacks, abrupt changes in strategy, and the departures of several Microsoft veterans who disagreed with Suleyman's management and technical approach, according to interviews with six current and former Microsoft employees who worked with Suleyman and his deputies. During that time, OpenAI trained and released several batches of cutting-edge models. Those setbacks have left some outsiders unconvinced that Suleyman can deliver on his grand strategy. "From the outside, it's still not clear what they have to show for the creation of this unit under Suleyman a year later," said Nathan Benaich, a venture capitalist who invests in AI startups. "They need to make Copilot a serious competitor to ChatGPT, but I'm not sure how they aim to do that, or what their strategy is beyond chasing what OpenAI has already done."
In a statement after this story was published, Microsoft spokesperson Frank Shaw said that MAI also trained a family of models last summer that achieved "state of the art" performance at the time, but declined to specify how it used those models in practice.
No bet is more important at Microsoft than the one it's making on AI. Last month, the company told shareholders it was generating more than $13 billion in annualized AI revenue—its most recent month's revenue times 12—across all of its businesses, up from $10 billion just three months earlier. Most of that revenue comes through Microsoft's Azure cloud computing unit—including from OpenAI's heavy use of the service as well as from OpenAI-powered Office 365 products for corporate customers and AI tools for developers like GitHub Copilot, none of which Suleyman oversees. Instead, his remit is limited to AI within Microsoft's consumer applications, such as Bing and Windows.
But Suleyman says his biggest focus is setting up Microsoft for AI self-sufficiency in the coming decade, not short-term results. While he's responsible for only a small share of Microsoft's business, the products he oversees could have implications for the future of the entire company. "What we're really trying to do at [Microsoft's AI unit] is not get too obsessed with what's happening this year or next year and really focus on what the next 10 years looks like," Suleyman said. "[We're] making sure that the company builds this muscle internally to be able to build the very best models in the world and also use the best models in partnership with everyone else."
**An Outsider**
Suleyman arrived at Microsoft after a close call with OpenAI. In late 2023, OpenAI was on the verge of imploding when the board of its nonprofit fired CEO Sam Altman, alleging that he had been dishonest about the company's progress. It was hardly a reassuring development for Microsoft, which had launched an OpenAI-powered chatbot in Bing earlier that year and had big plans to do much more with its partner. When OpenAI rehired Altman weeks later and most of the board that ousted him resigned, that calmed nerves at Microsoft somewhat. Still, the company's board of directors responded to the turmoil by pressuring Microsoft CEO Satya Nadella to alter its AI strategy so it didn't hinge entirely on OpenAI, according to two people who spoke to Nadella.
A few months later, Nadella hired Suleyman, a pioneer in AI who co-founded DeepMind, the lab Google acquired in 2014 that is responsible for several scientific breakthroughs in AI. Suleyman joined Microsoft from his startup, Inflection AI, which he had publicly boasted had developed AI on par with that of OpenAI. As part of his hiring, Microsoft signed a $650 million licensing deal to get access to Inflection's technology.
Suleyman moved quickly to reorganize Microsoft's internal AI teams, consolidating various fiefdoms that had been working on separate AI projects into a new unit focused on building models, led by Simonyan, his co-founder and chief scientific officer at Inflection. Among Suleyman's new reports was Mikhail Parakhin, then Bing's CEO, who had overseen the integration of OpenAI's products with the search engine. With Nadella's blessing, Suleyman commandeered several engineers from other units, including staff from the research arm that had previously trained Microsoft's Phi family of small language models. Executives from other divisions, including Saurabh Tiwary, who had led development of earlier models that powered Bing, and Misha Bilenko, an Azure executive who had more recently overseen the Phi effort, also began reporting to Suleyman.
But creating harmony between the various AI teams under Suleyman proved difficult. By the end of last March, just a few weeks after Suleyman joined, Parakhin, a former Yandex engineer who took a hands-on approach to overseeing Bing and sometimes personally wrote and edited code, resigned from Microsoft. Parakhin expressed frustrations to his colleagues about losing control over Bing's business, according to someone who spoke to him. A few weeks later, Parakhin appeared to take a public jab at his old boss, after Suleyman gave a TED Talk about safety risks to humans from advanced forms of AI. Some critics blasted Suleyman's comments, including Andreessen Horowitz partner Martin Casado, who called the remarks "total fucking nonsense" in a post on X. One user on X tagged Parakhin in the replies, stating, "We need you." "Hard to disagree," Parakhin responded, along with a smiley face emoticon. Parakhin didn't respond to requests for comment.
More defections followed in the months ahead. In August, Bilenko and Tiwary, both of whom had spent more than a decade at Microsoft, departed to join Google. Meanwhile, some Bing engineers grumbled that Suleyman, who has more experience as an entrepreneur and executive than as a hands-on programmer, was less focused on the nitty-gritty of AI development than his predecessor. While Parakhin held routine meetings with engineers to review their code, Suleyman has taken a different approach. He holds monthly all-hands meetings with his entire unit and biweekly breakout sessions with small groups of 10 to 20 employees at a time, and he writes weekly emails to staff covering his priorities and thoughts on the broader AI industry. The meetings have been helpful for setting organization-wide agendas, according to two people who have attended them. At other times the conversations range widely, and discussions at the biweekly breakouts typically focus on nontechnical topics such as employees' hobbies outside work, these people said.
More recently, Suleyman has made a string of high-profile hires, including four former researchers from Google's DeepMind earlier this year.
**Training Wheels**
Even before Suleyman came aboard, Microsoft had made some progress on creating its own AI models that could one day substitute for those from OpenAI. The most prominent example was Phi, which staff in Microsoft's more academically focused research arm, Microsoft Research, first developed in 2023. Microsoft researchers trained the Phi models on so-called synthetic data generated with OpenAI's technology. As a result, Phi was able to mimic the quality of OpenAI's models while requiring only a fraction of the computing power and cost needed to run OpenAI's technology. Microsoft soon began swapping out some of OpenAI's models in its Copilot products for Phi in order to save money, and made Phi available to developers on its Azure cloud computing platform.
After Suleyman's hiring, Sébastien Bubeck, lead researcher on Phi, began reporting directly to him. At Suleyman's direction, Bubeck began employing his AI training methods on larger models akin to those from OpenAI, required for handling the more complex tasks Phi and smaller models weren't equipped for. Under Suleyman, the Phi team also soon had the benefit of much larger clusters of computers for training new models than it had previously had access to within Microsoft Research, according to two people involved in the effort.
But the additional computing muscle didn't help the group produce the results it was looking for. At least three training runs for the larger models, each of which cost millions of dollars, produced subpar results, according to the people involved in the effort. The models had high rates of the false or misleading results known as hallucinations and weren't as reliable as OpenAI's models at providing succinct answers to questions, they said. Suleyman and Simonyan believed the issues stemmed from Bubeck's methods of training Phi on AI-generated data, while Bubeck argued that the issues arose later in the training process, according to someone present for the discussions.
By September, Bubeck had left Suleyman's unit to return to Microsoft Research. He told colleagues he planned to continue training Phi models that would be at least five times larger than the previous versions, according to a copy of an internal presentation he gave that The Information viewed. Microsoft Research teams continued to publish updates to Phi. But in October, Bubeck left Microsoft for OpenAI, where he has continued to work on training models using synthetic data and has since poached multiple former Microsoft researchers to join him at the startup, according to two people involved in the effort.
**Taking Time**
Around the time Bubeck left Microsoft, Suleyman's frustration with OpenAI began to intensify. The startup had pioneered a new model dubbed o1 that was programmed to solve math and logic problems more accurately by taking more time to respond to users' queries. But because OpenAI wouldn't let Microsoft see the chain of thought for o1's reasoning, Suleyman and his team had a difficult time figuring out how to replicate that programming in Microsoft's own AI models.
Despite that, in the subsequent months Suleyman's AI team was able to recreate some of what OpenAI had achieved. The model-training team, led by Simonyan, began using chain-of-thought reasoning to improve output quality. That helped them overcome the struggles stemming from the models' training data. But OpenAI has not been standing still, and its pace of model advances has outstripped Microsoft's. Over the past two months, the startup has released a preview of o3, an even more powerful reasoning model, and GPT-4.5, its largest model yet. Whether Microsoft can catch up remains to be seen.
Stephanie Palazzolo also contributed to this article. Aaron Holmes is a reporter covering tech with a focus on enterprise and cybersecurity. You can reach him at aaron@theinformation.com or on Signal at 706-347-1880.