The fiercest technology race in history has, so far, one clear financial winner, and it is not any of the racers. Nvidia, which designs the chips on which nearly all frontier AI models are trained, booked $193.7bn of data-centre revenue in its 2026 fiscal year, up 68% on the year before, at gross margins that, as of mid-2026, hover between 71% and 75%. Roughly speaking, for every dollar the AI industry hands Nvidia, more than seventy cents is gross profit. Analysts put its share of the market for AI accelerator chips at around 80–90% by revenue. How does one supplier extract most of the profit from a contest among the richest companies on earth?
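
The arithmetic behind those figures is simple enough to check. The sketch below, in Python, applies the quoted margin range to the quoted data-centre revenue and backs out the previous year's revenue from the growth rate. One assumption deserves a flag: the gross margin is treated as if it applied to data-centre sales alone, which the text does not claim. The function and constant names are illustrative.

```python
# Figures quoted in the text. Applying the gross-margin range to
# data-centre revenue alone is an ASSUMPTION made for illustration.
DATA_CENTRE_REVENUE_BN = 193.7       # fiscal 2026, in $bn
GROWTH = 0.68                        # year-on-year growth
MARGIN_LOW, MARGIN_HIGH = 0.71, 0.75


def gross_profit_range(revenue_bn, low, high):
    """Return (low, high) gross profit in $bn for a given revenue."""
    return revenue_bn * low, revenue_bn * high


def prior_year_revenue(revenue_bn, growth):
    """Back out last year's revenue from this year's and the growth rate."""
    return revenue_bn / (1 + growth)


low_bn, high_bn = gross_profit_range(
    DATA_CENTRE_REVENUE_BN, MARGIN_LOW, MARGIN_HIGH
)
previous_bn = prior_year_revenue(DATA_CENTRE_REVENUE_BN, GROWTH)

print(f"Implied gross profit: ${low_bn:.1f}bn to ${high_bn:.1f}bn")
print(f"Implied prior-year revenue: ${previous_bn:.1f}bn")
```

On these assumptions the implied gross profit comes to roughly $138bn to $145bn, on revenue that stood at about $115bn a year earlier.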

Economists answer questions like this with a spectrum. At one end sits perfect competition: many sellers of an identical product, none with any say over price — the wheat farmer, who takes what the market gives. At the other end sits monopoly: one seller, free to pick the price that suits it best, restrained only by buyers’ willingness to walk away. In between lie oligopolies, where a handful of firms eye each other warily. The position of an industry on this spectrum, more than the virtue or villainy of its managers, determines who keeps the money. And the useful trick with AI is to stop thinking of it as one industry. It is a stack of four, each sitting at a different point on the spectrum.

At the bottom are chips, and that market sits nearest the monopoly end. Nvidia’s grip rests only partly on its silicon; the deeper moat is CUDA, the software layer in which a generation of AI engineers learnt to work, which makes switching to a rival’s chip an expensive rewrite. Above chips sits cloud computing — the data centres where models are trained and run — a tight oligopoly in which Amazon (roughly 30%), Microsoft (around 25%) and Google (12–13%) control about two-thirds of the world market, as of mid-2026. Above that sit the model-makers: an oligopoly with perhaps half a dozen serious frontier contenders and a noisy fringe. And at the top sit applications — the thousands of firms wrapping models into products for lawyers, doctors and marketers — a layer that looks much like ordinary competition: easy to enter, hard to defend, thin margins.
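
Economists often compress market shares like these into two numbers: a concentration ratio (the combined share of the largest firms) and the Herfindahl-Hirschman index (the sum of squared percentage shares, which runs from near zero under perfect competition to 10,000 under pure monopoly). A minimal sketch in Python, using only the shares quoted in this chapter, follows. It assumes the midpoint of Google's 12–13% range and the low end of Nvidia's 80–90%. Because the shares of the smaller firms are not given, the index values are floors rather than estimates.

```python
# Shares quoted in the text, in percent.
# ASSUMPTIONS: Google is taken at the midpoint of 12-13%;
# Nvidia is taken at the low end of 80-90%.
CLOUD_SHARES = {"Amazon": 30.0, "Microsoft": 25.0, "Google": 12.5}
CHIP_SHARES = {"Nvidia": 80.0}


def concentration_ratio(shares):
    """Combined share, in percent, of the firms listed."""
    return sum(shares.values())


def hhi_lower_bound(shares):
    """Sum of squared percentage shares for the named firms only.

    Unnamed firms would each add a positive term, so the true index
    can only be higher than this figure.
    """
    return sum(share ** 2 for share in shares.values())


cloud_cr3 = concentration_ratio(CLOUD_SHARES)
cloud_hhi = hhi_lower_bound(CLOUD_SHARES)
chip_hhi = hhi_lower_bound(CHIP_SHARES)

print(f"Cloud: top-three share {cloud_cr3:.1f}%, HHI at least {cloud_hhi:.0f}")
print(f"Chips: HHI at least {chip_hhi:.0f}")
```

On these figures the floor for the chip layer (6,400) is nearly four times that for the cloud layer (about 1,680), which puts a number on the gap between a near-monopoly and a tight oligopoly.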

The AI stack as four layers, from near-monopoly in chips at the bottom to near-competition in applications at the top

Why does concentration pile up at the bottom? Chapter 4 supplied the first reason: enormous fixed costs and near-zero marginal costs reward whoever is biggest, at every layer where fixed costs bite. The other sources of market power are subtler. Network effects and data feedback loops — more users produce more usage data, which makes the model better, which attracts more users — were expected to make model-makers unassailable, though the loop has so far proved leakier than the theory promised. Switching costs do real work: retraining staff, rewriting code, re-certifying systems. And vertical integration knits the layers together — the big cloud firms have invested billions in the leading labs, and Britain’s competition authority warned in 2024 that a small number of firms controlling compute, data and expertise could shut challengers out of the whole stack.

The strangest competitive force in this market is the one that gives the product away. Meta has released its Llama models with open weights — anyone may download and run them — and Chinese labs have followed with gusto. This is not charity but a strategy economists recognise: commoditise your complement. If models become cheap and abundant, the money flows instead to whatever scarce thing must be bought alongside them — Meta’s advertising platform, or a cloud contract, or, indeed, chips. The open fringe acts as a permanent ceiling on what closed-model firms can charge, which is one reason prices at the model layer keep falling even as capabilities rise.

January 2025 showed how fragile the stack’s pecking order can look. DeepSeek, a Chinese lab, released an open-weights model that matched much of the western frontier and claimed its final training run had cost about $5.6m in rented computing time — a figure that excludes research and hardware and is hotly disputed, but that landed regardless. Nvidia’s shares fell 17% in a day, erasing nearly $600bn of market value, as investors briefly repriced the assumption that frontier intelligence must always require oceans of premium silicon. The shares recovered; the lesson stands. Rents in this industry rest on scarcity, and scarcity in software has a way of evaporating.

The resolution of the puzzle is what managers of older industries call the “smiling curve”: profits pool at the ends of a value chain and drain from the commoditised middle. At one end sits the scarcest physical input — Nvidia’s chips, and behind them a single Taiwanese manufacturer and a single Dutch maker of the machines that etch them. At the other end sit customer relationships and distribution. The middle — the models themselves, miraculous and increasingly interchangeable — is where competition is most furious and profit thinnest. Nvidia earns most of the money not despite the ferocity of the race but because of it: every contestant, whatever its strategy, must buy from the same armourer. Whether they are wise to keep buying at this pace is the question of the next chapter.

What to watch

  • Nvidia’s gross margin and market share (about 71–75% and 80–90% as of mid-2026): sustained erosion would signal real competition arriving at the bottom of the stack.
  • The share of hyperscalers’ AI chips that they design themselves (Google’s TPUs, Amazon’s Trainium) — vertical integration nibbling at the monopoly layer.
  • The price gap between frontier closed models and the best open-weights models: a narrowing gap squeezes the model layer’s rents.
  • Antitrust action on cloud–lab partnerships in the EU, Britain and America.

Dig deeper