The Vibe Coding Hangover

The Study Everyone Tried To Explain Away

METR's study "Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity," published in July 2025, was one of the most carefully designed evaluations of AI coding tools to date. It used experienced open-source developers working on real tasks from real projects they were familiar with, not contrived benchmark problems. Each task was randomly assigned to be completed with or without AI assistance, so every participant worked under both conditions. The result: with AI tools allowed, participants were 19% slower on average.

The reaction was predictable. Critics of the study pointed to sample size, task selection, the learning curve of the tools used, and the fact that the participating developers already knew their codebases, which reduces the advantage AI retrieval provides in unfamiliar territory. Defenders pointed out that the study controlled for experience level, used realistic tasks, and recorded the developers' own expectations: the same developers who were measurably slower had predicted that AI would make them 24% faster.
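To make the gap between forecast and measurement concrete, here is a small arithmetic sketch. It assumes "24% faster" means a 24% reduction in completion time and "19% slower" means a 19% increase in completion time. The 60-minute baseline and the function name are hypothetical, chosen only for illustration.

```python
def time_with_ai(baseline_minutes: float, pct_change: float) -> float:
    """Scale a baseline completion time by a signed percent change."""
    return baseline_minutes * (1 + pct_change / 100)


# Hypothetical task that takes one hour without AI assistance.
baseline = 60.0

# Assumption: "24% faster" is read as 24% less time, "19% slower" as 19% more time.
predicted = time_with_ai(baseline, -24)
observed = time_with_ai(baseline, +19)
gap = observed - predicted

print(f"predicted with AI: {predicted:.1f} min")   # 45.6 min
print(f"observed with AI:  {observed:.1f} min")    # 71.4 min
print(f"perception gap:    {gap:.1f} min")         # 25.8 min
print(f"observed vs. predicted: {100 * (observed / predicted - 1):.0f}% longer")  # 57% longer
```

Under these assumptions, a developer expecting to finish in about 46 minutes would actually take about 71, which is why the forecast matters as much as the headline number.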

The Wall Street Journal covered it in July 2025 under the headline "AI Makes Some Programmers Worse At Their Jobs." That framing is too simple, but the study's core finding deserves to be taken seriously rather than explained away through motivated reasoning.

Why Experienced Developers Slowed Down

The METR study's finding is counterintuitive but mechanically explicable once you look at the task structure. The slowdown was most pronounced in debugging and comprehension tasks — understanding what existing code does, identifying why something is broken, and modifying code in ways that preserve subtle invariants. These are fundamentally comprehension problems, and AI tools introduce a comprehension tax.

When an AI generates a 50-line solution, the developer still has to read, understand, and verify it before committing. For a developer who would have written a 20-line solution themselves in five minutes, the AI path requires reading 50 lines of generated code, determining whether it's correct, potentially debugging it, and revising the prompt when it's not. The review cost exceeds the generation speed advantage.

This pattern reverses for generation-heavy tasks where the developer would have spent significant time writing boilerplate. Creating a new REST API with standard CRUD operations from a description is faster with AI assistance because the generated code largely matches what the developer would have written, the review cost is low, and generating the code takes far less time than typing it.
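The trade-off described above can be sketched as a toy cost model. This is not METR's methodology; it is a minimal illustration in which every field name and every number is a hypothetical assumption. The AI path costs prompting time plus per-line review time plus any rework, and it wins only when that total is below the time to write the code by hand.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical per-task costs, all in minutes except generated_lines."""
    manual_minutes: float        # time to write the solution by hand
    generated_lines: int         # lines of code the AI produces
    review_min_per_line: float   # time to read and verify each generated line
    prompt_minutes: float        # time spent writing and revising prompts
    rework_minutes: float = 0.0  # time spent debugging output that diverged


def ai_minutes(task: Task) -> float:
    """Total time on the AI path: prompting, reviewing, and reworking."""
    review = task.generated_lines * task.review_min_per_line
    return task.prompt_minutes + review + task.rework_minutes


def ai_saves_time(task: Task) -> bool:
    """True when the AI path is faster than writing the code by hand."""
    return ai_minutes(task) < task.manual_minutes


# Assumed numbers: a small fix an expert could write in five minutes,
# where the AI returns 50 lines that need careful reading.
small_fix = Task(manual_minutes=5, generated_lines=50,
                 review_min_per_line=0.15, prompt_minutes=1, rework_minutes=2)

# Assumed numbers: CRUD boilerplate that is quick to skim because it
# matches what the developer would have written anyway.
crud_api = Task(manual_minutes=60, generated_lines=300,
                review_min_per_line=0.05, prompt_minutes=3)

for name, task in [("small fix", small_fix), ("CRUD API", crud_api)]:
    print(f"{name}: manual {task.manual_minutes:.1f} min, "
          f"AI {ai_minutes(task):.1f} min, AI faster: {ai_saves_time(task)}")
```

With these made-up inputs the small fix costs about 10.5 minutes on the AI path against 5 by hand, while the CRUD API costs about 18 minutes against 60. The point of the sketch is that review cost per line, not generation speed, decides which side of the break-even a task lands on.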

The Expertise Inversion

There's a paradox embedded in the 2025 AI coding research: the less you know, the more you benefit, and the more you know, the more you're slowed down by the tools designed to help you. This expertise inversion explains a lot of the conflicting signals from the field.

A junior developer who doesn't know the idiomatic way to handle database connections in their framework gets a huge lift from AI assistance, because the model bridges the knowledge gap directly. A senior developer who knows exactly what they need gets a mixed bag: sometimes the AI's output matches their mental model and saves keystrokes, and sometimes it diverges and creates a debugging task that wouldn't have existed without the AI.

The Y Combinator Winter 2025 (W25) data point, that 25% of the batch had codebases where 95% of lines were AI-generated, fits this pattern. Many of those founders were building their first meaningful codebase. AI assistance was filling in knowledge gaps, not slowing down experienced practitioners. The product shipped, but the technical debt question is still being answered as those teams scale.

The Development Hell Nobody Writes About

There's a category of AI coding failure that doesn't make the benchmarks and barely makes the blog posts: the spiral. A developer gets AI to write a component. The component works. They get AI to write the next component. That one mostly works but has a subtle incompatibility with the first. They ask AI to fix the incompatibility. The fix introduces a regression in a third component. After three hours of prompting their way through AI-generated fixes of AI-generated problems, the codebase is a pile of technically running code that nobody fully understands.

Senior developers recognize the spiral and break out of it by reading the actual code, understanding what it does, and often rewriting sections themselves. Developers without that foundational comprehension keep prompting. The WSJ piece in July 2025 documented several startup teams that had shipped products they couldn't maintain because the founding engineers had vibe-coded the core product without understanding it well enough to modify it safely.

This is not an argument against AI coding tools. It's an argument for using them with your brain engaged rather than with your brain delegated. The model writes code; you are still responsible for understanding it.

What Actually Works

The most productive AI-assisted developers in 2025 share a pattern: they use AI heavily for generation and test-writing, they read everything the AI produces before accepting it, they maintain a mental model of the overall architecture that they don't outsource to the model, and they treat AI-generated code with the same skepticism they'd apply to code from a competent but unfamiliar contractor.

The 19% slowdown in the METR study may have been in part a learning-curve effect: the developers had relatively little experience with the specific tools used. Teams that had been using AI-assisted coding for six or more months consistently reported productivity gains, suggesting that working with AI coding tools is a real and learnable skill. It just takes longer to develop than the demos suggest, and it requires genuine understanding of software engineering rather than treating the AI as a replacement for that understanding.