The Model, the Chat and the Application
A patient sits down in a doctor’s office and begins to talk. A persistent cough. A tightness in her chest that worsens at night. The doctor nods, listens, asks follow-up questions. From her side of the desk, it looks like every medical visit she has ever had.
On the doctor’s screen, something else is happening. The moment she stated her name and date of birth, an application pulled her full history — the asthma diagnosis at fourteen, the ER visit last March, the ACE inhibitor prescribed six months ago. As she describes the cough, the system is already cross-referencing her medications against known side effects. ACE inhibitor-induced cough: incidence rate 5–35%. A flag appears. The application suggests a question: *When exactly did the cough begin relative to starting the new medication?* The doctor asks it. The patient thinks. “Actually — about two weeks after.” Now the screen offers alternatives: ARBs with equivalent efficacy, filtered against her sulfa allergy, ranked by her insurance formulary.
This is what an LLM-native application looks like. Not a conversation with a machine, but a machine that makes the human conversation better. No chatbot. No prompt. No one waiting for a response. Dozens of model calls running in the background — fetching, cross-referencing, suggesting, filtering — each one a small, disposable step in a larger workflow.
This is not what most people picture when they think about AI.
The world got its first taste of LLMs through ChatGPT — a low-stakes experiment that became the fastest-growing consumer app in history. But it fused AI to a chat interface. You type, it responds. You prompt, it generates. As models got smarter and benchmarks fell quarter after quarter, the interface stayed the same. A text box. A blinking cursor. The burden of orchestration placed entirely on the user.
Most people have been exposed only to this interface. Because of that, most people believe they have already seen what AI can do. They haven’t. They have seen a powerful engine bolted to the wrong chassis.
The first crack in the chat paradigm came from software engineering. Claude Code is an application where LLMs are not the interface — they are the infrastructure. The developer works; the models run in the background, decomposing problems, attempting solutions, verifying results, discarding failures. It happened in the developer market first not because the pattern is unique to coding, but because developers adopt tools fast, their workflows tolerate failure, human review cycles already exist, and the people building the models are developers themselves.
Under the hood, an LLM-native application looks nothing like a chat. It decomposes complex workflows into many simple, interconnected steps. Each step is carried out by a hybrid of models and deterministic logic. Models are invoked multiple times, attempting different approaches, until the system converges on something sound. Results are pieced together and verified — again by a combination of models and hard checks.
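In code, that loop is simple to state. The sketch below is illustrative only — the function names (`solve_step`, `run_workflow`) and the toy stand-in model are invented for this example, not drawn from any real application. It shows the core pattern: each step invokes a model up to a few times, a deterministic check discards failed attempts, and verified results are pieced together.

```python
from typing import Callable, List, Optional

def solve_step(prompt: str,
               call_model: Callable[[str], str],
               verify: Callable[[str], bool],
               max_attempts: int = 3) -> Optional[str]:
    """Invoke the model repeatedly, discarding candidates that fail
    the deterministic check, until one converges or attempts run out."""
    for _ in range(max_attempts):
        candidate = call_model(prompt)
        if verify(candidate):
            return candidate
    return None  # give up: escalate to a human or a stronger model

def run_workflow(steps: List[str],
                 call_model: Callable[[str], str],
                 verify: Callable[[str], bool]) -> List[str]:
    """Decompose a workflow into simple steps; each step is a hybrid of
    a model call and a hard check. Failed attempts are simply thrown away."""
    results = []
    for step in steps:
        result = solve_step(step, call_model, verify)
        if result is None:
            raise RuntimeError(f"no verified result for step: {step!r}")
        results.append(result)
    return results

# Toy stand-ins so the sketch runs without a real model API:
# the "model" fails twice on the first step before producing a valid answer.
attempts = iter(["bad", "bad", "42", "7"])
fake_model = lambda prompt: next(attempts)
is_number = lambda s: s.isdigit()  # the deterministic "hard check"

print(run_workflow(["step one", "step two"], fake_model, is_number))
# → ['42', '7']
```

The point of the structure is that no single model call is trusted: verification is cheap and deterministic, so the system can afford to let individual attempts fail.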
The economics are aligning. Tokens get cheaper. Models get smarter. Applications can afford to be generous — trying multiple approaches, verifying results, discarding failed attempts. The cost of intelligence is dropping fast enough that invoking an LLM once to produce a final answer will soon seem as quaint as the chat interface itself.
What such a system produces is an artifact. A diagnosis. A tax return. A product roadmap. An investment thesis. The artifact is what matters — not the interface.
The optimistic outlook: jobs become more interesting. The professional drives an LLM-deterministic hybrid toward a solution in a complex space. Agents are embedded into the application like cogs in a machine — invisible, subordinate, useful. The professional is not replaced. The professional is the one who still decides, still judges, still carries the responsibility.
But beyond the optimism, something is clear. Every white-collar profession will look fundamentally different within five years. Not because a chatbot will take anyone’s job — but because professionals who work with LLM-native applications will be so much more thorough, so much faster, and so much more consistent that the old way of working simply won’t compete.
The chat was the demo. The application is the product.

