Your AI agent over-builds

“how to stop Claude Code from over-engineering and adding files I did not ask for”

Your AI agent turns a one-line fix into forty files: new abstractions, a config layer, a state manager, dependencies you never asked for. It optimizes for looking complete, not for being small. The fix is to make it state a short plan and get your yes before it writes code, then hold it to the smallest version that works.

Why it over-builds

Over-building is not confusion. It is the model doing what it was rewarded to do. In training, the thorough, complete-looking answer scores better than the terse one, so the agent learns that more files and more abstraction read as more helpful. Facing your one-line request, it reaches for the most complete-looking reply it can produce.

Nothing pulls the other way. The default loop never rewards subtraction, and the agent cannot feel the maintenance cost you inherit later, so the smallest version that works is never the most probable output. It writes the impressive version because impressive is what it was trained to produce.

We wrote about this as the Prototyper wearing a Builder's job: an agent stuck in explore-everything mode, treating your production request like a brainstorm. Naming the archetype it should be in is most of the fix.

Read: The five archetypes of AI engineering

A real shape of it: asked to add a rate limit to one API route.

What the agent shipped

+ lib/ratelimit/index.ts
+ lib/ratelimit/strategies/sliding-window.ts
+ lib/ratelimit/strategies/token-bucket.ts
+ lib/ratelimit/strategies/fixed-window.ts
+ lib/ratelimit/store/redis.ts
+ lib/ratelimit/store/memory.ts
+ lib/ratelimit/config.ts
+ lib/ratelimit/types.ts
+ middleware.ts  (rewritten)
+ npm i ioredis  (new dependency)
... 40 files, 3 strategies you will never switch between

What you asked for

// app/api/send/route.ts
const hits = new Map<string, number[]>();

function limited(ip: string, max = 5, windowMs = 60_000) {
  const now = Date.now();
  const recent = (hits.get(ip) ?? []).filter((t) => now - t < windowMs);
  recent.push(now);
  hits.set(ip, recent);
  return recent.length > max;
}
// one function, zero new deps, the thing you asked for

The manual fix (free, and complete)

You can stop most over-building today with a plan gate and a leanness rule. Paste this into your CLAUDE.md or AGENTS.md. It is the whole fix, not a teaser. A reader who never buys anything should be able to copy it and get a calmer agent within one session.

AGENTS.md

## Build lean

- Before writing code for anything non-trivial, state a one-paragraph plan and
  the files you intend to touch, then wait for my yes. Do not start on a guess.
- Make the smallest change that fully solves what I asked. Fewer files, fewer
  lines, no new abstraction until it is used in three places.
- Do not add a dependency, a config layer, or a design pattern unless I asked
  for it or you asked me first and I said yes.
- When you finish, tell me what you deliberately did NOT build, so I can see
  the restraint and add scope on purpose if I want it.

Add a plan gate. Require a one-paragraph plan and a file list before any non-trivial code, and wait for approval.
Add a leanness rule. Tell the agent to make the smallest change that works: no new abstraction until it is used three times.
Ban silent dependencies. Forbid new dependencies, config layers, or patterns unless you asked or approved them.
Ask for the restraint. Have the agent list what it deliberately left out, so you can add scope on purpose.

The kit ships a full starter AGENTS.md with these rules already tuned, plus the hooks that enforce them; what is in the kit lays out the whole system.

The durable fix (the kit)

Written rules move behavior, but they are a suggestion the model can weigh away when the pull to finish is strong, which is why the plan gate holds for a while and then quietly stops. The durable version makes the rule something the agent cannot skip. In kitstarter, a clarity-lock hook intercepts the moment before code is written on an ambiguous task and forces the plan-and-confirm step, so ask-first is enforced rather than merely requested.

The rule that makes your agent stay lean travels to Codex, Cursor, and Antigravity through AGENTS.md, so the behavior is cross-tool. The stateful engine, the hooks, the live navigator HUD, and the moment-triggered tutor that explains why a smaller version is better, runs in Claude Code only. It is one install, one command, one-time price, no subscription.

The real numbers

In our first clean-room benchmark suite, five scripted tasks with the same model in both arms, the kit-configured agent wrote about 40 percent less code on average for the same working outcome. These are directional results, n equals one per task, not a large-scale study.

We publish the losses too. On tasks that were already unambiguous, asking first is pure overhead: the plan-and-confirm step and the extra hook context made the kit slower and pricier there. On a plain todo app the kit ran roughly twice as slow and about 75 percent more expensive, and across the suite the hook context added about a 25 percent average cost premium. Over-building is worth stopping; the fix is not free of cost, and pretending otherwise would be its own kind of slop.

First clean-room suite, same model both arms, 2026-07-02.

Measure	Vanilla agent	Kit-configured
Code written for same outcome	baseline	about 40% less
Asked a clarifying question first	2 of 5 tasks	5 of 5 tasks
Slop tells in output	baseline	<= vanilla on every task
Already-clear task (todo app)	faster, cheaper	about 2x slower, ~75% pricier

Directional, n=1 per task. The overhead row is real and stays in the table on purpose.

It is in Anthropic's own tracker

This is not a niche complaint. It sits in the Claude Code issue tracker, and the honest reading is that agents miss scope in both directions, so we cite both.

The precise claim is not just over-building. It is that agents do not reliably honor scope, sometimes doing too much and sometimes too little. A page that over-claims is less credible than one that says exactly that.

Common questions

How do I stop Claude Code from over-building?

Add a plan gate and a leanness rule to CLAUDE.md or AGENTS.md: require a one-paragraph plan and file list before non-trivial code, forbid new dependencies and abstractions unless approved, and ask the agent to report what it left out. That alone calms most over-building. To make the plan gate un-skippable, a hook has to enforce it.

Why does my agent add libraries I did not ask for?

Because a reply that pulls in a well-known library reads as more complete than one plain function, and complete is what the model was rewarded for. It has no sense of the maintenance cost you inherit. A rule that forbids new dependencies without approval removes the default, and a hook makes the rule stick.

Is over-building just a prompting problem?

Prompting helps but does not settle it. A sharp prompt lowers the odds on the current task; it does not change the model's trained pull toward complete-looking output on the next one. Standing rules plus enforcement are what hold across a whole project rather than one lucky message.

Will asking first slow me down?

On genuinely ambiguous work it saves time by preventing the wrong build. On already-clear tasks it is overhead: in our benchmark the kit was about twice as slow on a plain todo app. The point is to spend a question where a wrong guess is expensive, not on every trivial request.

Make your agent ask before it builds

One kit. Ask-first behavior, lean output, no AI slop. Works with Claude Code, Codex, and Antigravity.

Get the kit · $29

One-time. Less than one hour of cleaning up AI slop.