When OpenAI’s Sam Altman spoke to US senators in May, he made a startling admission. He didn’t really want people to use ChatGPT. “We’d love it if they use it less,” he said. The reason? “We don’t have enough GPUs.”
Altman’s admission underscores a troubling dynamic in the growing generative AI business, where the power of incumbent tech firms is becoming more entrenched thanks to the value and scale of their infrastructure. Rather than create a thriving market for innovative new companies, the boom appears to be helping Big Tech consolidate its power.
GPUs — graphics processing units — are special chips that were originally designed to render graphics in video games, and have since become fundamental to the artificial intelligence arms race. They are expensive, scarce and mostly come from Nvidia Corp., whose market value breached $1 trillion last month because of the surging demand. To build AI models, developers typically buy access to cloud servers from companies like Microsoft Corp. and Amazon.com Inc. — GPUs power those servers.
During a gold rush, sell shovels, goes the saying. It’s no surprise that today’s AI infrastructure providers are cashing in. But there’s a big difference between now and the mid-19th century, when the winners of the California Gold Rush were upstarts such as Levi Strauss with his durable miners’ trousers, or Samuel Brannan, who sold enough pans to make himself a millionaire. Today, and for at least the next year or so, most of the profits from selling AI services will go to the likes of Microsoft, Amazon and Nvidia, companies that have dominated the tech space for years already.
Part of the reason is that while the costs of cloud services and chips are going up, the price of accessing AI models is coming down. In September 2022, OpenAI lowered the cost of accessing GPT-3 by a third. Six months later, it made access 10 times cheaper. And in June OpenAI slashed the fee for its embeddings model — which converts words into numbers to help large language models process their context — by 75%. Sam Altman has said the cost of intelligence is “on a path towards near-zero.”
Meanwhile, the price of building AI models is rising because purchasing a GPU today is like trying to buy toilet paper during the Covid-19 pandemic. Nvidia’s A100 and H100 chips are the gold standard for machine-learning computations, but the price of H100s has climbed to $40,000 or more from less than $35,000 just a few months ago, and a global shortage means Nvidia can’t make the chips fast enough. Many AI startups have found themselves waiting in line behind bigger customers like Microsoft and Oracle to buy these much-needed microprocessors. One Silicon Valley-based startup founder with links to Nvidia told me that even OpenAI was waiting on H100 chips that it won’t receive until spring 2024. An OpenAI spokeswoman said the company doesn’t release that information, but Altman himself has complained about his struggle to get chips.
Big Tech companies have a major advantage over upstarts like OpenAI, thanks to having direct access to those all-important GPUs as well as established customer bases. When Sam Altman traded 49% of OpenAI for Microsoft’s $1 billion investment in 2019, that seemed like a remarkable amount of equity to give up, until you consider that hitching themselves to a major cloud vendor might be the safest way for AI companies to stay in business.
So far, that bet is paying off for Microsoft. Amy Hood, the company’s chief financial officer, told investors in June that the AI-powered services it was selling, including those powered by OpenAI, would contribute at least $10 billion to its revenue. She called it “the fastest growing $10 billion business in our history.” That Microsoft product, called Azure OpenAI, is more expensive than OpenAI’s own offering, but allows companies like CarMax and Nota to access GPT-4 in a more enterprise-friendly way, ticking boxes for security and compliance issues, for instance.
Makers of AI models, meanwhile, face a constant migration of talent between their companies, making it difficult to maintain secrecy and product differentiation. And their costs are never-ending; once they’ve spent the money on cloud credits to train their models, they also have to run those models for their customers, a process known as inference. AWS has estimated that inference accounts for up to 90% of total operational costs for AI models. Most of that money goes to cloud providers.
That sets the stage for a two-tiered system for AI businesses. Those at the top have the money and prestigious connections. Founders graduating from the elite startup accelerator Y Combinator have been offered computing credits worth hundreds of thousands of dollars from cloud vendors like Amazon and Microsoft. A lucky few have managed to hook up with venture capital investor Nat Friedman, who recently spent an estimated $80 million on his own batch of GPUs to set up a bespoke cloud service called the Andromeda Cluster.
AI companies in the second tier will make up a long tail that lacks these kinds of connections and the resources to train their AI systems, no matter how clever their algorithms are.
The glimmer of hope for smaller companies is that Big Tech firms will one day find their products and services becoming commoditized too, forcing them to loosen their stranglehold on the market for building AI. The chip shortage will eventually ease, making GPUs easier to access and cheaper. Competition should also heat up between the cloud providers themselves as they encroach on each other’s territories, for instance with Google developing its own alternative to the GPU, called a TPU, and Nvidia building up its own cloud business to compete with Microsoft.
And, as researchers develop techniques like LoRA and PEFT to make the process of building AI models more efficient, those models will need less data and computing power. AI models are now on course to get smaller. That will require fewer GPUs and less infrastructure, which means Big Tech’s grip won’t last forever.
Parmy Olson is a Bloomberg Opinion columnist covering technology. A former reporter for the Wall Street Journal and Forbes, she is author of “We Are Anonymous.”