The Existence of Interpretation Drift and The Hidden Truth of LLM Instability
Elin Nguyen - December 2025
On November 13th, my best friend sent me a YouTube tutorial called "How to Build an AI Agent." I genuinely thought: this is it—I don't have to code anymore. Just type what you want, AI generates it, upload it online, boom: SaaS product, money. That's how stupidly naive I was.
So I set out to build a GTM business intelligence dashboard. The kind of thing that would analyze customer journeys, figure out buying intent, tell you which deals were real and which were going nowhere. Pretty straightforward, I thought. But reality had other plans. Same prompt. Same Salesforce export. Five completely different answers.
One run classified a deal as "Closed Won." Next run, same data: "High-Value." One model declared the deal "In Negotiation" before the first email was even sent. Another declared the same deal "Awareness."
At first, I thought I was messing up—bad prompts, unclear instructions, or some secret technique everyone else knew. It felt like I was missing something obvious, and it really bothered me. I couldn't let it go.
The Night It Clicked
One night around 4 AM, I couldn't sleep. I kept thinking about why AI kept changing its mind, why the outputs wouldn't stay the same. So I tried something simple: run the exact same prompt across different models and just watch how they disagreed. That's when I saw it.
The outputs weren't random. They were patterned. Each model made different mistakes, but the mistakes followed patterns—like watching four people climb an invisible staircase, each one missing a different step in a weirdly consistent way.
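Here's roughly what that by-hand comparison looks like as code. It's a minimal sketch: the model names and recorded labels are made-up stand-ins, and in practice each string would be the raw reply from one model given the identical prompt and the identical Salesforce export.

```python
from collections import Counter

# Made-up stand-ins: each list holds the labels one model returned when given
# the identical prompt and the identical Salesforce export, several runs each.
runs = {
    "model_a": ["Closed Won", "High-Value", "Closed Won"],
    "model_b": ["In Negotiation", "In Negotiation", "Awareness"],
    "model_c": ["Awareness", "Awareness", "Awareness"],
    "model_d": ["High-Value", "Closed Won", "In Negotiation"],
}

# Tally what each model said, and what all models said combined.
per_model = {name: Counter(labels) for name, labels in runs.items()}
overall = Counter(label for labels in runs.values() for label in labels)

for name, tally in per_model.items():
    print(f"{name}: {dict(tally)}")
print(f"all runs combined: {dict(overall)}")
print(f"distinct interpretations of one deal: {len(overall)}")
```

Same input every time, yet the tallies differ per model, and each model's mistakes cluster in its own characteristic way.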
And that's when it hit me: I can't build my product until I fix this first. Business intelligence can't run on hallucinations. I needed outputs that were exact, repeatable, auditable. Not "close enough" or "probably correct." But exact, every single time.
Solving Unstable Outputs
I became obsessed. I wasn't trying to solve some grand industry problem—I just needed my GTM dashboard to stop hallucinating.
I started treating it like a game. One model was fast but sloppy. Another tried to break everything on purpose. But there was one that behaved like a final boss: it overthought everything, found nuance in a single spreadsheet cell, even refused prompts with "I can't provide an answer in the style you're requesting."
I realized: if I could force this final boss to produce the exact same output as the other LLMs, I could actually build something that didn't run on hallucinations.
So I kept prompting relentlessly and ruthlessly. On November 22nd, the drift boss caved. It returned the exact same answer as every other model I'd tested. 100% cross-model convergence. The ultimate knockout.
The Hidden Truth Revealed
That 4 AM realization led somewhere unexpected. When I finally forced all four models to produce byte-for-byte identical outputs—meaning the exact same interpretation of the same data—I understood something fundamental: AI agent disasters aren't due to prompting failures or model bugs. They're caused by structural and systematic instability in how models understand meaning.
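A note on "byte-for-byte identical": it's a deliberately strict test, with no fuzzy matching and no "close enough." Here is a minimal sketch of that kind of check, assuming you've already collected one output string per model for each test case; the example outputs are made up for illustration.

```python
def converged(outputs: list[str]) -> bool:
    """True only if every model's output is byte-for-byte identical (no normalization)."""
    encoded = [o.encode("utf-8") for o in outputs]
    return all(e == encoded[0] for e in encoded)

def convergence_rate(cases: list[list[str]]) -> float:
    """Fraction of test cases on which all models produced identical output."""
    return sum(converged(outputs) for outputs in cases) / len(cases)

# Made-up example: three test cases, four model outputs each.
cases = [
    ["Closed Won", "Closed Won", "Closed Won", "Closed Won"],   # converged
    ["Awareness", "Awareness", "In Negotiation", "Awareness"],  # one dissenter
    ["Closed Won", "Closed Won", "Closed Won", "Closed Won"],   # converged
]
print(f"cross-model convergence: {convergence_rate(cases):.0%}")  # prints 67%
```

100% cross-model convergence simply means every test case passes this exact-equality check.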
Machine learning has studied two types of drift since the 1990s: data drift (when input distributions change) and concept drift (when the relationship between inputs and outputs shifts). But this is different. This is semantic drift—when the model's interpretive frame or reasoning changes internally, even when the data and relationships remain identical.
Data drift and concept drift are well documented; semantic drift is only now emerging as a scientific frontier. I call this AI drift, and it's what makes AI agents in production truly dangerous.
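One way to make that instability measurable rather than anecdotal: hold everything constant (same model, same prompt, same data, same settings) and count how often the interpretation changes across repeated runs. Here is a minimal sketch of such a metric; the recorded runs are made up, and this is an illustration, not a definitive protocol.

```python
from collections import Counter

def interpretation_drift(outputs: list[str]) -> float:
    """Share of runs that deviate from the most common interpretation.

    0.0 means perfectly stable; anything above 0.0 is drift under identical conditions.
    """
    if not outputs:
        raise ValueError("need at least one run")
    _, modal_count = Counter(outputs).most_common(1)[0]
    return 1.0 - modal_count / len(outputs)

# Made-up example: ten runs of one model on the same prompt, same data, same settings.
runs = ["Closed Won"] * 7 + ["In Negotiation"] * 2 + ["Awareness"]
print(f"distinct interpretations: {len(set(runs))}")                   # 3
print(f"interpretation drift rate: {interpretation_drift(runs):.0%}")  # 30%
```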
The Silent AI Agent Crisis
When an AI system interprets the same query differently on every run, production behavior becomes unpredictable. That matters most in domains that can't afford conflicting interpretations: financial forecasting, legal compliance, medical diagnosis, fraud detection.
Here's what AI drift looks like in production:
Chatbot hallucinations leading to legal liability
Autonomous vehicle perception errors linked to injuries
AI-generated false citations in legal filings
Companies scaling back production pilots, despite billions invested in AI, because of hallucinations and reliability issues
Financial impacts from AI-related disputes
Naming The Existence of Interpretation Drift
I searched the literature, scoured LinkedIn, and asked every large language model I could about AI interpretation drift. What I found was worse than finding no answers—there was no floor.
There was no shared understanding of what “drift” even meant. No clear definition. No taxonomy. No coherent way to talk about it. Just scattered observations, vague complaints about “inconsistency,” and ad hoc workarounds that never addressed the core problem.
Meanwhile, the AI industry has been building billion-dollar systems on top of a fundamental instability that has never been clearly named, let alone scientifically measured.
To ground this in something concrete, I’ve written a technical preprint that defines and measures what I call Interpretation Drift: output instability in large language models under identical task conditions.
It’s not a solution paper. It’s an attempt to make an invisible reliability problem visible, measurable, and reproducible—so that others can test it, challenge it, or build on it.
You can read the paper here: Empirical Evidence Of Interpretation Drift In Large Language Models [white paper]