From Stripe’s 2024 Annual Letter:
Much as SaaS started horizontal and then went vertical (first Salesforce and then Toast), we’re seeing a similar dynamic playing out in AI: we started with ChatGPT, but are now seeing a proliferation of industry-specific tools. Some people have called these startups “LLM wrappers”; those people are missing the point.
The O-ring model in economics shows that in a process with interdependent tasks, the overall output or productivity is limited by the least effective component, not just in terms of cost but in the success of the entire system. In a similar vein, we see these new industry-specific AI tools as ensuring that individual industries can properly realize the economic impact of LLMs, and that the contextual, data, and workflow integration will prove enduringly valuable.
The “O-ring” of economist Michael Kremer’s original 1993 paper is a reference to the tiny failure point that caused the 1986 Challenger space shuttle disaster. Without diving into obscure references, we could summarize: a chain is only as strong as its weakest link, and, as a corollary: in workflows with multiple interdependent steps, errors can multiply.
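The multiplicative intuition behind the O-ring model can be sketched in a few lines. This is a minimal illustration with made-up numbers, not anything from Kremer's paper or Stripe's letter: if every step of a workflow must succeed, overall reliability is the product of per-step reliabilities, so modest per-step failure rates compound quickly.

```python
# O-ring intuition: in a workflow of interdependent steps, the whole chain
# succeeds only if every step does, so overall reliability is the product
# of per-step reliabilities.

def chain_reliability(step_success: float, n_steps: int) -> float:
    """Probability an n-step workflow succeeds when every step must succeed."""
    return step_success ** n_steps

# Illustrative numbers only: a 10-step workflow whose steps are each
# 95% reliable succeeds only about 60% of the time end to end.
print(round(chain_reliability(0.95, 10), 3))
```

This is also why the weakest link dominates: raising one step from 80% to 95% does more for the chain than polishing a step that is already at 99%.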
The better we as humans perform and the better our systems become, the more each smaller failure begins to dominate the pain and inefficiency landscape.
Stripe is a massive payments processor and works with a large fraction of the world’s companies, and I suspect they are right.
Many have argued that AI models are likely to become the utilities that power new products and less likely to become the dominant products themselves. That remains to be seen, of course, but at least with current LLMs, some specialized wrapping is necessary to make these tools function in high-stakes, specific environments with real work products, beyond making a generic knowledge worker generically more productive or generating some fun art.
Healthcare, in particular, complicates this by adding in a long enterprise sales cycle and layers of slow human bureaucracy.
We’ll have to see what kind of durable moat, if any, players from this nascent stage will have. Capabilities are going up and training costs are coming down, so even the simple question of how much wrapping you need to successfully deploy new tools is impossible to answer given the unstable, constantly shifting sands at the frontiers of AI. With enough compute and time, perhaps dominant frontier players like OpenAI and Anthropic will simply make models robust enough to learn and do basically anything.
Stripe’s argument is that, like a plug adapter, we need those wrappers to solve the O-ring problems for implementing new products and improving processes. As a payments processor, it’s clear why Stripe would want to see a world where lots of companies make great businesses serving lots of other companies.
I’m not sure the current state of the art is useful for making longer-term predictions. Will we see a few novel foundational healthcare models as total farm-to-table solutions? Will everyone choose from a buffet of efficient, cheaper, narrower models a la carte via a handful of marketplace aggregators, as some of the nascent players have made possible? A few dominant models (e.g. OpenAI) but with numerous wrappers, with customers mostly choosing between different implementations, largely ignorant of the machinery under the hood? Or perhaps just one or two dominant models able to be or make everything for everyone? How long the wrapping stage of AI deployment lasts is likely both a function of how optimistic you are about hyperscaling and of a bunch of idiosyncratic industry, regulatory, and human factors for whatever use-case you care about.
Is wrapping a durable play or just a temporary necessity?