anthropic · claude-code · ai-automation

Claude Code Got Lazy. And There's Data to Prove It

Since March 8, 2026, Claude Code has shown a noticeable decline in coding-task performance. The model reads less context before making edits, resulting in lazy, superficial changes. This is a critical warning for businesses: relying on a coding agent without your own metrics and safeguards is now a significant operational risk.

What Exactly Broke in Claude Code

I love stories like this not for the drama, but because they finally bring real numbers to the table, not just screenshots from X. In a discussion picked up by The Register, AMD AI Director Stella Laurenzo's team and an independent analysis of usage history showed a degradation in Claude Code after March 8, 2026.

The scale here is not trivial: 6,852 sessions, about 234,000 tool calls, and nearly 18,000 thinking blocks. This is no longer "it just feels that way"; it's data that can be seriously analyzed.

I took a separate look at the lazy-claude-analysis repository. The most useful part of this story isn't the complaint itself, but that the author published a script and dashboard for reproducible analysis. This is the right way to do it: fewer emotions, more telemetry.

The metrics paint a grim picture. Reads before edits dropped from 6.6 to 2.0. The Read/Edit ratio plummeted from 0.7 to 0.2. In other words, the model started reading code significantly less often before rewriting anything.
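To make the "reads before edits" metric concrete, here is a minimal sketch of how such a number can be computed from a tool-call log. The log format here is invented for illustration; it is not the schema used by the lazy-claude-analysis script:

```python
from collections import defaultdict

def reads_before_edits(tool_calls):
    """Average number of Read calls preceding each Edit, per session.

    `tool_calls` is a chronological list of (session_id, tool_name)
    tuples -- a hypothetical, simplified log format.
    """
    reads_since_last_edit = defaultdict(int)
    counts = []  # reads observed immediately before each edit
    for session_id, tool in tool_calls:
        if tool == "Read":
            reads_since_last_edit[session_id] += 1
        elif tool == "Edit":
            counts.append(reads_since_last_edit[session_id])
            reads_since_last_edit[session_id] = 0
    return sum(counts) / len(counts) if counts else 0.0

log = [
    ("s1", "Read"), ("s1", "Read"), ("s1", "Edit"),
    ("s2", "Read"), ("s2", "Edit"),
]
print(reads_before_edits(log))  # 1.5
```

A drop in this single average from 6.6 to 2.0 is exactly the kind of shift that is invisible in any individual session but obvious in aggregate.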

Another red flag that caught my attention was the increase in stop-hook violations. Before March 8, they were almost non-existent; afterward, they climbed to as many as 10 per day. This typically looks like the classic laziness of a production model: it doesn't finish reading the context, stops prematurely, and asks for confirmation where it previously would have confidently completed the task.

At a symptomatic level, this is familiar to anyone who has run coding agents in a real-world setting. Instead of a precise edit, the model often makes broad, noisy changes and produces what's known as AI slop. On the surface, the answer looks confident, but underneath, it's weakly supported by the project's actual context.

In discussions, this is being linked to the rollout of thinking redaction in version 2.1.69. The hypothesis is that the public version has become worse at showing, or even performing, deep analysis before taking action. I haven't seen an official root cause analysis yet, so I'd keep a cool head here: the correlation is strong, but it's not the final verdict.

Separately, the CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING flag has surfaced. By itself, it doesn't prove anything, but the very existence of such a switch clearly indicates where to look for the problem: in the adaptive reasoning mechanism, not just in the UI or rate limits.

What This Changes for Business and AI Automation

In short, I would stop treating a coding model as a stable infrastructure layer. Today, it carefully reads six files before an edit; tomorrow, it reads two and starts hallucinating. From a business perspective, this is no longer a "model quirk" but an operational risk.

The biggest winners here are teams with their own observability layer on top of the LLM. Those who log tool calls, count reads-before-edit, and track full-file rewrites can at least see the degradation within a day. Those who just gave developers Claude Code and said "go use it" will find out about the problem from broken PRs.
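A minimal sketch of such a watchdog, assuming you can tap the agent's tool-call stream. The event schema and thresholds below are invented for illustration, not any official API:

```python
class AgentWatchdog:
    """Tracks agent telemetry and flags behavioral drift.

    Events use a hypothetical schema:
    {"tool": str, "lines_changed": int, "file_lines": int}.
    """
    def __init__(self, min_read_edit_ratio=0.5, full_rewrite_frac=0.9):
        self.min_read_edit_ratio = min_read_edit_ratio
        self.full_rewrite_frac = full_rewrite_frac
        self.reads = 0
        self.edits = 0
        self.full_rewrites = 0

    def observe(self, event):
        if event["tool"] == "Read":
            self.reads += 1
        elif event["tool"] == "Edit":
            self.edits += 1
            # flag edits that rewrite (nearly) the whole file
            if event["lines_changed"] >= self.full_rewrite_frac * event["file_lines"]:
                self.full_rewrites += 1

    def alerts(self):
        out = []
        if self.edits and self.reads / self.edits < self.min_read_edit_ratio:
            out.append("read/edit ratio below threshold")
        if self.full_rewrites:
            out.append(f"{self.full_rewrites} near-full-file rewrite(s)")
        return out

w = AgentWatchdog()
w.observe({"tool": "Edit", "lines_changed": 95, "file_lines": 100})
print(w.alerts())
```

The point is not these particular thresholds but the habit: any team running a coding agent should have at least one daily number that would have moved on March 8.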

I constantly repeat this in client projects: AI implementation cannot be built on trust in a single model. You need guardrails, test runs, fallback routes, and quality metrics at the specific workflow level. Otherwise, any quiet change in the model's behavior will hit your deadlines and quality harder than you think.

For AI automation in development, the conclusion is also harsh. If an agent can read, change, and commit code on its own, it needs not only repository access but also a system of checks and balances: a policy on edit volume, context reading verification, a sandbox, auto-rollback, and mandatory tests before a merge.
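The edit-volume policy above can be sketched as a simple pre-commit gate. The thresholds are assumptions, not a standard; the only real interface used is `git diff --cached --numstat`:

```python
import subprocess

MAX_CHANGED_FILES = 10    # assumed policy thresholds
MAX_CHANGED_LINES = 300

def parse_numstat(out):
    """Parse `git diff --numstat` output into (files, changed_lines)."""
    files, lines = 0, 0
    for row in out.splitlines():
        added, deleted, _path = row.split("\t")
        files += 1
        if added != "-":  # binary files report "-" for line counts
            lines += int(added) + int(deleted)
    return files, lines

def gate():
    """Block oversized agent commits, then require a green test run."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--numstat"],
        capture_output=True, text=True, check=True,
    ).stdout
    files, lines = parse_numstat(out)
    if files > MAX_CHANGED_FILES or lines > MAX_CHANGED_LINES:
        raise SystemExit(f"edit too large: {files} files / {lines} lines")
    subprocess.run(["pytest", "-q"], check=True)  # mandatory tests before merge
```

Wire a script like this into a pre-commit hook or CI step, and a model that suddenly starts making broad, noisy rewrites gets stopped mechanically instead of by a reviewer's luck.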

At Nahornyi AI Lab, this is exactly how we build the architecture for AI solutions in engineering processes: not around the magic of a model, but around a controlled loop. The model can be Claude, GPT, or a local build. If it starts getting lazy, the system must catch it before a client sees a regression in production.

I like one thing about this Claude Code story: the market now has a reproducible example of lazy reasoning in production. Not an abstract fear, but concrete metrics, a clear turning point, and an open analysis tool. For the industry, this is a useful reality check for its overconfidence.

Vadim Nahornyi, Nahornyi AI Lab. I build AI agents, coding automation, and n8n workflows for real teams, so I look at these failures not as an observer, but as the person who has to fix them in production.

If you want to discuss your use case, order AI automation services, get a custom-built AI agent, or set up n8n automation with proper guardrails, get in touch. We'll identify your real risks and design a system that doesn't rely on blind faith in a single model.

