🎶 The more the world is changing, the more it stays the same. Life is full of small surprises, it’s a never ending game. If nothing is impossible, will you believe your eyes? If the unexpected brings a smile, that’s a big surprise!
-Cilla Black “Surprise! Surprise!”
Are the AI labs actually pyramid schemes?
To be clear, AI labs are not Herbalife––there is no fraud.
A pyramid scheme requires endless recruitment of new members to pay returns to existing ones. Labs don’t operate like that. The core observation, though, is that labs run their own pyramid: quickly recruiting surprising, unexpected new problem spaces to justify returns on existing and growing R&D investment.
Labs spend enormous sums on research and development, creating models with incredible new capabilities in code, legal, finance, and medicine.

But open-source models with similar capabilities always follow quickly. The gap between a closed release and an open model of comparable capability has shrunk from years to months. The purpose of this post is not to identify any deceit but to investigate whether the supply of hard problems is structurally inexhaustible.
I’ve spent this blog arguing that value in the AI era will accrue to the edges––to Hayek’s “compute on the spot”; to asset owners bearing non-deferrable liability. The middle––the application layer reselling cognition––gets squeezed. I’ve yet to apply a similar lens to labs themselves––I attempt that here.
The timing of this attempt feels right: “the top of the cycle is near.” Last week, Anthropic seed investor Anjney Midha noted:
In the early days, a couple Anthropic cofounders and I… needed a coherent view of our market size and arrived at an answer that felt reasonable: ~$800B/year. [But] that was a gross underestimate.
Dwarkesh believes so much wealth will flow to the model layers that founders and early investors will “own a galaxy”. Even Tyler Cowen is distancing himself from these characterizations.
But what would actually have to be true for Dwarkesh and Anjney to be right? Under what conditions does distillation––the specialization of models purpose-built for their context of use––not consume frontier labs’ value?
This essay identifies three necessary conditions:
- High type-entropy: the space of problem types is large and unpredictable
- High K-complexity: solutions require reasoning, not lookup
- Heteroskedasticity: the distribution of problem types shifts over time
I’ll argue the conjunction of all three is rarer than it appears, that I’m not sure it exists at economically meaningful scale, and that even if it does, it may not be sufficient for durable value capture.
As a preview, large labs need the world––the space of AI-addressable problems––to be meta-surprising: to have surprise about surprise. Surprise! Surprise!
Let’s see if this space exists.
Variety and difficulty
Selling closed-model inference effectively runs on a value treadmill: a constant search for hard problems that aren’t readily solved by cheaper models that don’t charge a margin for intelligence. Hard problems come in two forms: those with huge variety and those with great difficulty.
Variety: type-entropy
You need the space of problem types to be large and surprising.
Break entropy into two forms. Problem instance-entropy is unpredictability within a known problem type––which contract, which clause, which edge case. Type-entropy is unpredictability about the problem frame itself. So, for instance:
- A problem-instance: “Draft this contract for a Delaware LLC.”
- A problem-type: “Legal reasoning under a regulatory regime.”
Instance-entropy is likely bounded. How many different problems does “customer service on Shopify” actually emit? Lots, but enumerable and likely finite. Type-entropy might not be––the space of problem types could be too large to map in advance.
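To make the distinction concrete, here’s a toy calculation (the problem labels and probabilities are invented): Shannon entropy over the instances of one known type is bounded by the size of the menu, whereas type-entropy isn’t even well-defined if you can’t enumerate the types.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Instance-entropy: a known problem type ("customer service on Shopify")
# emits many instances, but the menu is enumerable, so entropy is bounded.
instance_dist = {
    "refund request": 0.4,
    "shipping status": 0.3,
    "product question": 0.2,
    "account issue": 0.1,
}
print(f"instance-entropy: {shannon_entropy(instance_dist.values()):.2f} bits")
# Can never exceed log2(4) = 2 bits, however the mass is arranged.

# Type-entropy is unpredictability over the problem *frame* itself. If you
# can't enumerate the support ("what problem types exist at all?"), you can't
# even write the distribution down -- that open-endedness is the labs' bet.
```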
Labs need type-entropy
High problem-instance entropy doesn’t protect the labs.
Most business applications know in advance what problem types they’ll be responsible for––the space of problem instances is at most countable, by assumption. With harnesses like Claude Code and capable tool-calling models, problems of a fixed type can be resolved with static enumerations of solutions, represented as skills or external services.
Where many skills are available to service a given problem type, the model searches for the right skill and plugs it in. Claude Desktop appears to do this for generating diagrams. Uncertainty around the problem instance doesn’t much matter, since all instance-level solutions can be enumerated and found via search. You may even be able to distill a cheaper model trained on the key application use cases!
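A minimal sketch of that routing pattern, with the skill registry and keyword matcher invented for illustration (this is not a real Claude Code API): once the problem types are fixed in advance, “intelligence” reduces to search over an enumerated set.

```python
# Hypothetical skill registry for a fixed-type application.
# Each "skill" is a pre-built solution path; no frontier model required.
SKILLS = {
    "generate_diagram": lambda req: f"rendering diagram for: {req}",
    "draft_refund_email": lambda req: f"refund email drafted for: {req}",
    "summarize_ticket": lambda req: f"summary of: {req}",
}

# Toy keyword router standing in for a small classifier or distilled model.
KEYWORDS = {
    "diagram": "generate_diagram",
    "refund": "draft_refund_email",
    "ticket": "summarize_ticket",
}

def route(request: str) -> str:
    """Pick a skill by searching the enumerated registry, then run it."""
    for keyword, skill_name in KEYWORDS.items():
        if keyword in request.lower():
            return SKILLS[skill_name](request)
    # Only the unmatched residue would ever need a frontier model.
    return "escalate to a general-purpose model"

print(route("Customer wants a refund for order #1234"))
print(route("Make me a diagram of the checkout flow"))
```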

Difficulty: “K-complexity”
The second axis measures how hard it is to figure out what to do once you’ve identified the problem type. You might loosely call this Kolmogorov complexity, or “K-complexity”:
what is the shortest computer program or reasoning trace that produces the desired solution?
That is, can the solution be compressed into a decision tree, template, or lookup table? Or is the mapping irreducibly complex, or so rare that enumerating it in a lookup table is too costly relative to the benefit? The latter has high K-complexity.
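A toy contrast under made-up rules: the low-K-complexity mapping compresses into a literal lookup table, while the high-K-complexity stand-in (a small reconciliation search) has no useful shortcut––each new input has to be worked out.

```python
from itertools import combinations

# Low K-complexity: the entire solution compresses into a lookup table.
CLAIM_ACTIONS = {
    "windshield": "approve at fixed payout",
    "theft": "request police report",
    "flood": "route to adjuster",
}

def handle_claim(claim_type):
    return CLAIM_ACTIONS.get(claim_type, "escalate to a human")

# High K-complexity (toy stand-in): the answer depends on the whole input and
# can't be precomputed for every case -- each instance needs its own search.
def reconcile_payment(invoice_amounts, payment_total):
    """Find a subset of invoices that exactly explains a payment."""
    for r in range(1, len(invoice_amounts) + 1):
        for combo in combinations(invoice_amounts, r):
            if sum(combo) == payment_total:
                return combo
    return None

print(handle_claim("theft"))                            # lookup, no reasoning
print(reconcile_payment([120, 75, 310, 45, 200], 395))  # (120, 75, 200)
```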
Businesses monetizing inference find more durable value when their target action spaces have high K-complexity. In applications with high type-entropy but low K-complexity, you face many types of problems, but going from “what kind of problem is this” to “what should I do” is actually simple. In these cases you don’t need a powerful reasoner. A classifier with a lookup table could suffice. Finite-state-machine solutions from the 2016 chatbot era (today’s “agentic workflows”) could work. Here intelligence is spent on routing the problem, not on delivering the solution, which has been worked out in advance.
Many applications satisfy this high type-entropy, low K-complexity condition:
- insurance claims
- general loan underwriting
- IT helpdesk
Each has many problem types, but once the type is identified, the pathway to a solution is simple––likely a decision tree.
The cases with high K-complexity are those where reasoning itself is the work––where it must happen on the fly. The problem’s characteristics are so numerous or uncertain that every solution requires fresh reasoning, or enumerating all possible inputs and verifying them individually is cost-inefficient. Coding is a clear example.
Labs need both

To continue to charge rents for intelligence, labs need the space of
- high-entropy
- and high-K-complexity
problems to be large.
Low-entropy, low-K-complexity problems can probably be resolved with a lookup table or in a prompt. High-entropy, low-K-complexity problems need some routing, but once the problem type is fixed, the paths to a solution are simple.
Where entropy is low and K-complexity is high, problems are harder but few. Once you’ve seen a few examples, you can compress the solutions. You use the large model to find solutions, but its value escapes to small models that make the original large model unnecessary and its rents superfluous.
The moat lives in the bottom right––keep thinking, Claude!!––where you don’t know what type of problem you’ll face, and where each one actually requires reasoning to reach a solution.
Today, anyway, coding is in this quadrant.
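Condensed into code, as a sketch of the argument rather than of any real system: only one of the four cells leaves the frontier model holding the work.

```python
def serving_strategy(high_type_entropy, high_k_complexity):
    """Map each cell of the 2x2 to the cheapest system that can do the work."""
    if not high_type_entropy and not high_k_complexity:
        return "a lookup table or a prompt"
    if high_type_entropy and not high_k_complexity:
        return "a classifier routing into pre-built decision trees"
    if not high_type_entropy and high_k_complexity:
        return "solve once with a big model, then distill and drop it"
    # High type-entropy AND high K-complexity: the bottom-right moat.
    return "a frontier model reasoning on the fly"

for te in (False, True):
    for kc in (False, True):
        print(f"type-entropy={te!s:>5}, K-complexity={kc!s:>5} -> {serving_strategy(te, kc)}")
```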
Heteroskedasticity: temporally variable quadrants
The 2x2 matrix above is a snapshot.
For instance, you might have a new research problem that today lives in the bottom right. It’s
- high type-entropy: you don’t really understand its problem type
- high K-complexity: figuring it out requires real reasoning
But then … you solve it.
The previously unknown steps are assembled and perhaps even generalized. What yesterday you explored you now exploit. The next time this problem type appears you know how to solve it. Type-entropy drops. You can now use a distilled solution. Today’s frontier is tomorrow’s template.
So you have to wonder––what domains actually stay in the bottom right quadrant? What structurally stops migration over time?
Non-stationarity!
The only plausible antidote to this distillation dynamic is non-stationarity. Not just high variance over problem types, but unstable high variance: surprise! surprise! You need meta-surprise––surprise squared––to keep the types shifting faster than you can learn them. This looks like heteroskedasticity: the distribution of problem types at time t has shifted by time t'.

Heteroskedasticity would break a compounding distillation process. You learn the game at t but the game itself keeps changing at t' > t.
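A toy simulation of that race, with the regime length and learning rule invented for illustration: a distiller catalogues every problem type it has seen once, while the world periodically swaps in a fresh set of types. In a stationary world the catalogue quickly covers almost everything; under regime shifts it never catches up.

```python
import random

random.seed(0)

def catalogue_coverage(shift_every, steps=200, types_per_regime=10):
    """Fraction of incoming problems a distilled catalogue already handles.

    Every `shift_every` steps the world mints a fresh set of problem types
    (the heteroskedastic case); shift_every > steps means a stationary world.
    """
    known = set()          # problem types the distiller has already compressed
    current_types = []
    regime = -1
    hits = 0
    for step in range(steps):
        if step % shift_every == 0:
            regime += 1
            current_types = [regime * types_per_regime + i for i in range(types_per_regime)]
        problem_type = random.choice(current_types)
        if problem_type in known:
            hits += 1                  # the cheap, distilled path handles it
        else:
            known.add(problem_type)    # seen once -> distilled for next time
    return hits / steps

print(f"stationary world       : {catalogue_coverage(shift_every=10**9):.0%} handled cheaply")
print(f"regime shift every 15  : {catalogue_coverage(shift_every=15):.0%} handled cheaply")
```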
That said, it’s difficult to think of what actual situations have this property.
- Adversarial domains? Players are adapting to each other’s strategies, but the core strategies themselves aren’t really changing. New technologies––mobile, cloud, AI––can shift the game type, but that’s independent of adversarial dynamics.
- Open-ended research? It’s true that solving problems can unlock new problem types, but this seems rare. “How do we synthesize Y?” is a recurring problem type. Heteroskedasticity seems to exist only at paradigm boundaries, when a discovery opens up an entirely new field––and those are rare.
- Markets? Human chaos? This seems most fertile. Words assume new meanings: “cottage” found a new one in December 2025. A pandemic or a government takeover could require novel reasoning.
These might be domains where large models have an advantage. All three require vertical integration for maximal value capture, which falls outside the inference business.
On the other hand, you could say these examples are contrived and that the real source of surprise is regime-shift “shocks”. Most domains are probably homoskedastic––“the more the world changes, the more it stays the same.” But if that’s true, is the lab business model essentially underwriting shocks––acting as the economy’s shock absorber? Specialist models outcompete labs most of the time, but would people buy expensive models as protection against world-changing events?
Are large closed models actually selling insurance––or time savings––against having to address unknown unknowns?
There’s a related case worth naming: static long-tail entropy. The distribution of problem types doesn’t shift, but there are so many rare types that specializing in any one never pays. You need frontier models not because the game keeps changing, but because the game has a thousand niche corners that individually don’t justify distillation. This is a different moat––high type-entropy without non-stationarity. Whether this space is large enough to sustain lab economics is an empirical question I don’t have an answer to.
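A back-of-the-envelope version of that break-even, with every number invented: specializing for a niche type only pays when the per-call savings times the call volume exceeds the fixed cost of building and maintaining the specialist, so a long tail of low-volume types stays on the frontier model by default.

```python
def distillation_pays(calls_per_year,
                      frontier_cost_per_call=0.50,         # assumed $/call, with margin
                      distilled_cost_per_call=0.02,        # assumed $/call, self-hosted
                      fixed_specialization_cost=150_000):  # assumed $/year to build & maintain
    """Does specializing for this problem type beat paying frontier margins?"""
    savings = (frontier_cost_per_call - distilled_cost_per_call) * calls_per_year
    return savings > fixed_specialization_cost

for volume in (10_000, 100_000, 1_000_000):
    print(f"{volume:>9,} calls/year -> worth distilling: {distillation_pays(volume)}")
```

On these made-up numbers the break-even sits around 300k calls a year; every type below that line keeps paying the frontier margin, which is exactly the long-tail moat.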
The price of ignoring unknown-unknowns
What actually is the intersection of
- high type-entropy
- high K-complexity
- heteroskedastic
domains?
I don’t think coding is an example. Languages evolve slowly, and model harnesses are proving adept at navigating them. Perhaps what people build with code is heteroskedastic?? It’s hard to think of a trillion-dollar market with many hard problems where the game is always changing.
In essence, we currently pay labs margins as a form of cognitive insurance, or a time-savings fee, against unknown unknowns. There is so much uncertainty today about the types of problems AI can address, and about their difficulty, that we pay extra for models to “just deal with it.” But will this be true in 5 years? Or 10? How durable is perpetual novelty, actually?
This is why agentic systems are so important to lab strategy. They are the best candidates for problem spaces where you don’t know what’s coming next; where problems evolve faster than their solutions can be compressed.
On the other hand, if these broad, deep, always-changing problem spaces don’t exist at scale, the compounding AI wealth accumulation might not happen. Hayek will win: models will specialize to their use cases and run free on purpose-built hardware. Energy and intelligence prices will decouple, with the marginal cost of intelligence going to zero.
Load-bearing chaos
If heteroskedastic entropy exists, the lab business model is clear: sell exposure to load-bearing chaos that only large models can navigate. Operate in spaces where complexity evolves faster than compression and compete in domains that resist specification.
But if it doesn’t exist, the lab inference business is an arbitrage on intelligence that hasn’t yet been compressed. For instance, consumers can’t tell the difference between GPT-5 and GPT-5.2. I remain heavily bullish on cybernetic arbitrage as a supreme value-capture mechanism for the AI age: own the physical or institutional nodes where chaos emerges and monetize it directly:
- Waymo owns and operates the fleet (chaos of driving)
- Base owns the batteries (chaos of energy markets)
- Law firms own the legal risk (chaos of new regulatory regimes)
If the chaos isn’t permanent, the rent isn’t durable. This is true for the Cybernetic Arbitrageur as well.
The $800B market Anjney imagined is really a market for contextual lock-in at the edge of human entropy. The question, however, is whether capturing it will forever require frontier intelligence.
For now, I’m guessing not. Hayek will take revenge.
But I’d be delighted to welcome an unexpected smile.