- Value will overwhelmingly accrue to incumbents
- Personalized generative models (LLMs, imagery, video, etc.) will be deeply integrated into platforms (OS, browser, major social platforms)
- Foundation models will be used to train highly specialized models in various domains, while "conductor" models seamlessly route queries to specialized models
- Apple will be the primary driver of mass consumer adoption of ML
In observing ML-powered software over the last 9 months or so, I’ve been ruminating on where the market might go. These are some of my thoughts. I’ll hold myself accountable by indicating any material edits in a changelog at the end.
None of these predictions is based on my work at Google. These are my personal opinions.
We're in the popup shop era
The broad availability of ML models like ChatGPT and Stable Diffusion has produced an explosion in cool product ideas. But cool ideas do not a business make.
Remember when the most popular iOS app was a toy that used the accelerometer to “pour beer”? Or when you could use your iPhone as a lightsaber? That’s where we are on the ML timeline.
“But Frank, some companies are making a lot of money!” you protest.
Unfortunately, dear reader, even among successful apps like Lensa, which generates stylized avatars, these “businesses” are more like popup shops. They’ll make some money for now, but unless they can transition from toy to engaging product, they’ll languish and eventually fade away.
Value will largely accrue to incumbents
As much as I love an underdog story, I don’t see many of them happening at scale as part of the ML shift. Incumbents with strong distribution and large existing user bases will win over newcomers.
Of course there will be exceptions, but largely, ML will help incumbents extend their lead through ML-powered features more than it’ll help underdogs chip away at that lead with ML-powered products.
Contrast this with the shift to mobile, where mobile platforms represented a new place incumbents needed to go. ML models will be brought to where incumbent products already are.
A hypothetical example
Even still, it’s tempting to imagine an “ML-native” product disrupting an incumbent. Let’s take an ML-native email marketing product, for example, to see why that mostly won’t work.
Our email marketing product might use ML to generate copy and imagery, to create hyper-personalized schedules for each recipient, and to automatically run A/B tests using generated content variations.
Sounds compelling, but an ML-native email marketing product still needs to do the things that a “traditional” email marketing product does. It still needs to allow you to send one-off promotions, to manually compose an email, to integrate with your CRM, to handle unsubscribes.
It is 100x cheaper to build broadly available ML models into a traditional product than it is to build an “ML-native” product from the ground up, because for every use case I can think of, you still need to build the traditional stuff.
There would need to be absolutely tremendous ROI in the ML features to drive displacement. It’s expensive for companies to change their operational software. ML-native products need to account for switching costs in their ROI models.
Not to mention, incumbents already have a ton of domain-specific data they can use to train higher quality, specialized, cheaper models that give them better product performance and pricing power.
Exception: business model disruption
One exception: if ML-powered products enable business model disruption (think major changes to pricing and packaging) to which incumbents simply cannot react without cannibalizing their existing business, there may be an opportunity for ML-native startups to win.
Platform integration of personalized generative models
Language models and generative media models will be baked into the platforms consumers use every day.
Today, there are dozens or hundreds of implementations of various image generation and writing assistant features. Basically the same UIs over and over in service of various goals.
Platforms such as operating systems, browsers, voice assistants, and large social networks (including messaging apps) will build native, personalized, general-purpose “assistant” models, which will cannibalize the litany of lowest-common-denominator features being built into today’s popup shops.
You’ll have a writing assistant that can help you speak in your authentic voice and an art director/design assistant that can generate media in your personal style, built right into the heart of your phone, your computer, and your browser.
Companies like Grammarly will be marginalized unless they’re able to specialize in specific domains.
Foundation models like ChatGPT are expensive to run at scale. Were there mass demand today, there might not even be enough specialized compute hardware in the world to meet it.
In order for even incumbents with deep pockets to efficiently meet demand at scale, they’ll need to substantially improve the efficiency of these models.
Of course, they will continue to improve the efficiency of foundation models, but mostly, they will use foundation models to train dozens (or hundreds or thousands) of highly specialized models, which are much cheaper to run and are just as good as foundation models at specific tasks.
Users will interact with those models without knowing it. “Conductor models,” specialized models themselves, will seamlessly and invisibly route user requests to a suitable specialty model under the hood and return high-quality results to users much more cheaply, and they’ll do it with style and personality.
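To make the conductor idea concrete, here is a minimal sketch of the routing step. Everything in it is an assumption for illustration: the `Specialist` type, the keyword lists, and the stub handlers are hypothetical, and the keyword-overlap score is a crude stand-in for what would really be a small trained model. It only shows the shape of the thing: one cheap decision up front, then a handoff to a cheaper specialty model, with a general model as the fallback.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Specialist:
    """A hypothetical specialized model plus the hints used to route to it."""
    name: str
    keywords: List[str]
    handle: Callable[[str], str]


def make_stub(name: str) -> Callable[[str], str]:
    # Stand-in for a call to a small specialized model (assumption:
    # a real system would run inference here instead of echoing).
    return lambda query: f"[{name}] {query}"


SPECIALISTS = [
    Specialist("writing", ["email", "rewrite", "tone", "draft"], make_stub("writing")),
    Specialist("imagery", ["image", "photo", "avatar", "logo"], make_stub("imagery")),
]

# Used when no specialist matches; stands in for a larger general model.
FALLBACK = Specialist("general", [], make_stub("general"))


def route(query: str) -> Specialist:
    """Pick the specialist whose keywords overlap the query the most.

    Keyword overlap is a toy scoring rule, not how a real conductor
    model would decide.
    """
    words = set(query.lower().split())
    best, best_score = FALLBACK, 0
    for specialist in SPECIALISTS:
        score = len(words & set(specialist.keywords))
        if score > best_score:
            best, best_score = specialist, score
    return best


def answer(query: str) -> str:
    """Route the query, then let the chosen specialist produce the reply."""
    return route(query).handle(query)
```

Calling `answer("Draft an email to my team")` hands the request to the writing stub, while a query with no matching keywords falls through to the general one. The user never sees which model answered.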
Apple will drive mass consumer adoption at the edge
Last one: Apple has been quiet so far, other than a few under-the-radar optimizations of open source models for their extremely powerful, ML-ready processors (M1, M2).
From the horse’s mouth:
“With the growing number of applications of Stable Diffusion, ensuring that developers can leverage this technology effectively is important for creating apps that creatives everywhere will be able to use.”
Translation: generative models are an opportunity for Apple to improve its platform and the apps that run on it.
“There are a number of reasons why on-device deployment of Stable Diffusion in an app is preferable to a server-based approach. First, the privacy of the end user is protected because any data the user provided as input to the model stays on the user’s device.”
Apple is going to lead with privacy in their positioning.
“Finally, locally deploying this model enables developers to reduce or eliminate their server-related costs.”
For iOS and Mac development, Apple is going to make it a no-brainer for developers to use these models on-device because it won’t cost developers anything.
Apple knows that ChatGPT and Stable Diffusion aren’t products. They’re infrastructure. Your interaction with Stable Diffusion on your iPhone won’t look like DALL-E. It’ll look a lot more like, say, lifting a subject from an image, the technology totally transparent.
What do you think? Where am I wrong? Where do you agree? What did I leave out? My hope is to continually edit this article as I have new predictions or as elements are proven right or wrong. Looking forward to hearing from you.
- Published 24 March 2023