ARE LLMS THE NEW SURFACES OR JUST ANOTHER MARKETPLACE?
The real question is whether they can become a market.
Written by Adrian Maharaj
Writing in a personal capacity. Views are my own, not my employer's. This reflects experience across ad ecosystems and public information only.
I have been lucky enough to chase ideas that were early, messy, and often wrong. More of them failed than succeeded. The constant was trust. I have worked where product, commercial reality, and what real people do all meet. The job was never the integration. It was creating a market that did not exist yet.
Large language models are changing how people discover, decide, and buy. A surface becomes a market when three things close at the same time. The economics. The incentives. The safety. That is not branding. That is mechanics.
THE CASE FOR: There are real signs that assistant surfaces can mature into markets.
Distribution is arriving. ChatGPT is now wired into Apple’s system level flows, which means assistant answers can sit closer to identity, payments, and intent than a browser tab ever did. (OpenAI)
Answer inventory is taking shape. OpenAI rolled out ChatGPT Search after piloting SearchGPT, standardizing answers with citations. That is the grammar you need before you can sell attention in a way publishers can live with. (OpenAI)
Rights are being licensed. News organizations are signing deals that allow grounding and display. The Financial Times and News Corp are clear examples. Rights to ground and show are the legal spine you need before you can build paid units on top. (About Us)
Ads are moving from rumor to live tests. Perplexity has been running sponsored follow-ups alongside answers, and X has said it will place paid ads inside Grok's answers. That is the earliest form of an answer ad market, not a screenshot for a pitch deck. (Perplexity AI)
Enterprise rails are forming. Claude added a web search API with citations and tied into Google Workspace, which makes assistant answers easier to measure and audit inside real companies. (Anthropic)
There is executive intent to commercialize applications. OpenAI stood up an Applications leadership track, hired Fidji Simo to run it, and then acquired Statsig while naming Vijaye Raji CTO of Applications. That reads like a decision to scale products, measurement, and markets, not only models. (OpenAI)
THE CASE AGAINST: Most surfaces are not yet real markets. Here is why.
Lift is not proven often enough. You know you have a market when you can show incremental outcomes without leaning on last-click stories. That requires regional holdouts, and in some cases a small placebo arm. Too few public tests show that rigor.
Incentives still leak. If a feature can take credit for a sale it did not cause, it will. If a measurement move makes the metric move, it will. Markets punish that behavior over time with lower bids and shorter budgets.
Speed gets too little attention. A smart decision that arrives late is wrong in production. Most assistants do not publish a response-time target with a clear degrade-out rule, where a late paid unit simply does not show and does not count. Buyers will not fund latency drift forever.
Governance is uneven. Policy by press release does not replace an audit trail. A market needs versioned developer interfaces, reasonable sunset windows, migration kits, and an off switch for sensitive features. Without this, you create partner pain and you churn the very supply you want to grow.
Identity and settlement are not finished. The closer an assistant sits to sign in, consent, and payments, the easier it is to measure and settle. OS level hooks help, and some ecosystems are moving there fast, but most surfaces still rely on stitched flows. (OpenAI)
WHAT WOULD CHANGE MY MIND FAST? Three receipts would turn the debate.
Publish a clean, third party reviewed holdout that shows incremental sales or leads from answer units.
Adopt a public response-time target where late paid units do not render and do not count. A sketch of that rule follows below.
Ship a rights policy that links licensed grounding and licensed display to labeled sponsored units with citations under the answer.
If I see those three together from any surface, I will call it a market.
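To make the second receipt concrete, here is a minimal sketch of the late-means-not-shown rule in Python. The budget value and the fetch_paid_unit and resolve_paid_unit names are illustrative assumptions, not any assistant's real interface; the only point is that a decision which misses its published budget is dropped rather than rendered, billed, or counted.

```python
import time
import concurrent.futures

# Illustrative response-time budget for the paid decision, in seconds.
DECISION_BUDGET_S = 0.150

def fetch_paid_unit(query: str) -> dict:
    """Stand-in for an ad decision call. Here it is deliberately slower than the budget."""
    time.sleep(0.300)
    return {"advertiser": "example", "unit": "sponsored follow-up", "query": query}

def resolve_paid_unit(query: str) -> dict | None:
    """Return a paid unit only if the decision lands inside the published budget."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(fetch_paid_unit, query).result(timeout=DECISION_BUDGET_S)
    except concurrent.futures.TimeoutError:
        return None  # late means not shown: nothing renders, nothing is billed or counted
    finally:
        pool.shutdown(wait=False)  # do not make the user wait on a decision that already missed

if resolve_paid_unit("best running shoes for flat feet") is None:
    print("Paid unit missed its budget: serve the organic answer alone and log the miss.")
```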
THE MARKET TEST YOU CAN RUN TODAY
Score any assistant with five plain questions. One point each.
Truth. Can you turn on a regional holdout or a simple time switchback without custom work, and do you declare the smallest effect that counts before you test. A sketch of this check follows below.
Signals. Do you accept consented server-side conversions and do you score the quality of each event at the door.
Mechanism. Do you publish auction rules, pacing behavior, a protected learning budget for new advertisers and new products, and fairness rules against self-attribution.
Speed. Do you publish a response-time target for decisions and does a paid unit simply not appear when you are late.
Governance. Do you version interfaces, offer migration kits, and post an audit trail of changes with reasonable sunset windows.
Four or five is a market. Two or three is promising. One or zero is a demo with better prose.
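To show what the Truth question asks for, a minimal sketch in Python follows. The regions, numbers, and the five percent threshold are made up for illustration, and a real read would add significance testing and matched-market selection. The point is that the smallest effect that counts is declared before the test, and a reading below it is not claimed as a win.

```python
# Illustrative numbers only: conversions per 1,000 sessions in regions that saw
# answer units (exposed) versus matched regions where the units were held out.
MIN_EFFECT_THAT_COUNTS = 0.05  # declared before launch: below 5% lift, no win is claimed

exposed = {"north": 31.0, "south": 27.5, "west": 35.2}
holdout = {"north": 29.4, "south": 27.1, "west": 32.8}

def incremental_lift(exposed: dict[str, float], holdout: dict[str, float]) -> float:
    """Relative lift of exposed regions over their matched holdout regions."""
    exposed_mean = sum(exposed.values()) / len(exposed)
    holdout_mean = sum(holdout.values()) / len(holdout)
    return (exposed_mean - holdout_mean) / holdout_mean

lift = incremental_lift(exposed, holdout)
verdict = "counts" if lift >= MIN_EFFECT_THAT_COUNTS else "does not count"
print(f"Measured lift: {lift:.1%} -> {verdict}")  # prints roughly 4.9% -> does not count
```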
WHAT GOOD LOOKS LIKE IN PRACTICE
A large commerce surface connects through ads and action interfaces. Both sides agree on the unit that matters, for example profit or lifetime value. They agree to the experiment design up front. They publish a decision time target and a graceful fallback. They fund a learning budget and protect early signals from noise and abuse. They ship with consented events that move quickly and are scored for quality. They set stop losses. They give customers real controls. Budget. Reach. Risk. Learning rate. They version developer interfaces and provide migration kits before any switch is flipped.
The outcome is not a one time launch. It is a compounding engine. Better signals produce a better truth set. That produces a fitter mechanism. That produces more efficient spend. That produces higher lifetime value. Which produces even better signals.
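One way to keep that engine honest is to write the agreed terms down as configuration rather than slideware. The sketch below is a hypothetical shape for that in Python; the field names and values are illustrative, not any platform's schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IntegrationTerms:
    """Hypothetical record of what both sides agreed to before launch."""
    success_metric: str = "incremental_profit"      # the unit that matters, agreed up front
    experiment_design: str = "regional_holdout"     # agreed before launch, not reconstructed after
    min_effect_that_counts: float = 0.05            # declared before the first test
    decision_budget_ms: int = 150                   # late paid units do not render and do not count
    learning_budget_usd: float = 50_000.0           # protected spend for new advertisers and products
    stop_loss_usd_per_day: float = 5_000.0          # automatic cutoff when spend outruns signal
    consented_events_only: bool = True              # server-side conversions, scored at the door
    customer_controls: tuple = ("budget", "reach", "risk", "learning_rate")
    interface_version: str = "v1"                   # versioned, with migration kits before any change

print(IntegrationTerms())
```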
SO WHAT FOR BRANDS AND PLATFORMS?
If you buy media. Treat answer inventory as mid-funnel intent with a higher bar. Ask for holdouts, ask for server-side conversions, and ask for the "late means not shown" rule. Your finance leader will thank you.
If you build surfaces. Hire for mechanism, not only model. Write down your rules. Fund a learning budget. Publish your response-time target. Connect to identity and payments that respect consent.
If you run an ecosystem. The preferred partner is the network that already handles intent at scale, proves lift with real experiments, degrades late units out, and has versioned interfaces with migration paths. When you run this checklist, the answer tends to choose itself.
MY TAKE
Are LLMs the new surfaces or just another marketplace. They are surfaces today. They become markets the moment they clear the test above. Some are close. A few have the distribution and the rights. One or two have early ad units in the wild. The distance between a clever surface and a durable market is measurement, mechanism, speed, and governance. Close that distance and the budgets move.
TWO PREDICTIONS TO ARGUE WITH
Answer-native lead generation will be standard on at least two major assistants within twelve months. Verification will use server-side conversions and answer-arm holdouts. The early signals are already live. (Perplexity AI)
Sponsored citations will become a labeled unit by late 2026. Pricing will reflect incremental lift. Eligibility will depend on licensed grounding and display rights with publishers. The licensing runway is being laid now. (About Us)
THE QUIET PART
There are global intent networks where people declare what they want at scale. They already run on real experiments, consented signals, clear response-time targets, stop losses, and versioned developer interfaces. If you care about sponsored answers, real lead capture, or "shop the answer," you will want your assistant to meet those standards and you will want supply that already does.
INVITE THE DEBATE
What is the cleanest public holdout you have seen for assistant ads.
Would you buy a paid answer that misses its response-time target if it were labeled as delivered.
Publishers. What labeling and rev share would you accept for a sponsored citation under an answer.
Drop your receipts. Change my mind.
CALL TO ACTION
If you are building the next distribution layer for commerce or for AI and you are willing to optimize for truth, speed, and trust, we will work well together. If you want help turning a surface into a market, reach out.
This note is independent analysis prepared outside of my employment. It does not speak for my employer, relies only on publicly available information, and is not an endorsement of any company. Where I suggest practice standards, this is an individual editorial view, not a call for coordination among competitors.
……………………………………………………………………………………………………
SELECT SOURCES FOR READERS
OpenAI and Apple integrate ChatGPT into Siri and Apple system flows. (OpenAI)
ChatGPT Search rollout and the prior SearchGPT prototype that established answers with citations. (OpenAI)
Content licensing that enables grounding and display. Financial Times. News Corp. (About Us)
Perplexity advertising experiments as sponsored follow-ups and side rail. (Perplexity AI)
Grok ads announced to appear inside assistant answers. (TechCrunch)
Anthropic web search and citations, plus Workspace integrations for enterprise workflows. (Anthropic)
OpenAI Applications leadership and experimentation capability. Fidji Simo named CEO of Applications. Statsig acquisition with Vijaye Raji as CTO of Applications. (OpenAI)