LLM-generated content
This article has had a fair amount of human editing and is not a one-shot, but rather several sessions with different agents; the bulk of it, though, was written by an LLM.
The context in which it was written is explained in my other article from today.
It is written as an attempt to answer the question: "do I put an instruction in AGENTS.md along the lines of 'don't write files longer than 250 lines', or are there better ways to avoid god files?"
Coding agents tend to accumulate responsibilities into existing files rather than reasoning about structure. The result is files that grow across sessions, doing too many things — and the problem compounds over time.
The slop can be reduced with instructions, good workflows, and strictness, but done incorrectly, those same measures can also reduce accuracy.
Coding agents tend to follow the path of least resistance. It is computationally "easier" for an LLM to append logic to an existing file than to create a new file, manage imports, and update directory structures. This leads to 1,000+ line "God Files" that combine UI, state, API calls, and business logic.
Key Insight: When an agent builds a monolith, it isn't being smart; it's being lazy. As the human architect, your role is to provide the "structural friction" necessary to keep the codebase clean.
Five approaches came up, each with a different tradeoff profile:
Rules in AGENTS.md — proactive, shapes decisions before code is written. The risk is token cost: instructions sitting in context are paid on every turn, even when irrelevant. Also relies on the agent actually following them, with no enforcement mechanism.
Linter rule + lint after changes — objective and enforceable, near-zero token cost. The limitation is it only catches the problem after it exists, and agents can game it mechanically (splitting a 400-line file into two 200-line files without real refactoring, deleting comments or writing one-liners). Best used as a safety net, not a primary strategy.
Refactoring subagent — the most powerful option if you have solid test coverage. A dedicated subagent can reason about why to split, not just that something is too big. The downsides are latency, cost (extra token overhead for a new session), and regression risk if the subagent rewrites instead of just extracting.
Hooks — best option for automatic enforcement. A post-edit hook that runs checks and feeds results back to the agent closes the feedback loop without manual intervention. Setup cost is higher and hook failures can be noisy (see the sketch after this list).
Slash-commands, on-demand skills or reusable prompts — on-demand invocation of a well-crafted refactoring prompt. Zero cost when not in use, no automation risk, and you control the timing. More powerful than it first appears. The downside is that it is not automatically enforced and requires ongoing supervision.
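To make the hooks option concrete, here is a minimal sketch of a post-edit check, assuming an agent runtime that can run a command after each file edit and feed its stdout back to the agent. The convention of passing the edited file's path as the first argument is an assumption; adapt it to your tool.

```bash
#!/usr/bin/env bash
# post-edit-check.sh (hypothetical name): warn when an edited file grows too big.
# Assumption: the runtime passes the edited file's path as $1
# and surfaces stdout back to the agent.
file="$1"
max_lines=250

[ -f "$file" ] || exit 0   # file may have been deleted; nothing to check
count=$(wc -l < "$file")

if (( count > max_lines )); then
    # Printed on stdout so the runtime can feed it back into the session
    echo "WARN: ${file} has ${count} lines (limit ${max_lines})." \
         "Consider extracting responsibilities into separate modules."
fi
```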
There are two important considerations in how and when to constrain the LLM's file output:
- Reasoning Degradation. Enforcing strict line counts (e.g., "max 200 lines") too early may interrupt the LLM's flow. It may focus more on "fitting the code" than on "solving the logic," leading to obfuscated code or broken abstractions.
- Token Waste. Large files are "Token Killers." Every time a small change is requested in a 1,000-line file, the agent must re-read (and often rewrite) the entire context. This wastes money and increases latency.
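As rough, illustrative arithmetic: at an assumed ~10 tokens per line of code, a 1,000-line file costs about 10k tokens each time the agent re-reads it. Over a 20-turn session, that is on the order of 200k tokens spent just re-reading one file, before any new code is written.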
Does that mean constraints hurt reasoning? Yes, but the concern is nuanced. The risk is real when constraining how the model thinks, such as forcing a specific algorithm or architecture mid-problem. A file size limit is an output constraint, closer to formatting guidance, and modern LLMs handle that reasonably well.
That said, write first, refactor later is still directionally correct: the model solves one problem at a time instead of two. Upfront constraints should be soft and structural, not hard limits enforced mid-task.
Of the two risks, context bloat is more damaging, and it's not a close call.
Context bloat compounds: large files make context grow, which reduces accuracy, which can produce more bloated code, which grows context further. Each session the problem gets worse.
Constraint cost is fixed and mild by comparison. A slightly worse initial design is recoverable. Degraded model accuracy due to bloated context causes mistakes that are hard to trace.
Critically, the "write big then refactor" approach pays the context cost twice — once for the big file and once for the refactoring pass. A soft upfront constraint that keeps files small is worth it precisely because it controls context size across the session.
Three layers, in order of priority:
- Soft rules in AGENTS.md. Can be combined with or replaced by Skills or Rules.
- Linter rules.
- A reusable prompt for post-hoc refactoring.
Keep them concrete and structural, not vague. The goal is to shape decisions before files are written, not to micromanage output. Example:
Before adding logic to an existing file, check if it belongs there.
If a file already has more than one clear responsibility, create a new module instead.
Prefer creating a new file over extending an existing one beyond its scope.
- Avoid hard line limits here. They constrain reasoning without being reliably enforceable.
- Prioritize responsibility over lines. Instead of `max-lines: 250`, enforce one-responsibility-per-file.
Configure a max-lines rule (e.g. ESLint max-lines, Pylint max-module-lines) and run it via a hook or CI. Don't rely on it as the primary mechanism — it catches symptoms, not causes. Pair it with a complexity check (number of exports, cyclomatic complexity) for better signal.
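As a sketch of that safety net, ESLint's `--rule` flag lets a hook or CI step enforce both checks without touching project config. `max-lines` and `complexity` are ESLint's built-in rules; the thresholds below are illustrative, not recommendations.

```bash
#!/usr/bin/env bash
# Run ESLint with ad-hoc size and complexity rules from a hook or CI step.
# Thresholds (250 lines, cyclomatic complexity 10) are illustrative.
npx eslint src \
  --rule '{"max-lines": ["error", {"max": 250, "skipBlankLines": true, "skipComments": true}], "complexity": ["error", 10]}'

# Python equivalent (Pylint's format checker):
# pylint --max-module-lines=250 src/
```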
A reusable prompt (slash-command, on-demand skill, custom agent, ...) that instructs the agent to review and extract responsibilities from files that have grown. Invoke it when you decide the codebase needs it, not automatically after every turn. The prompt contract should be explicit:
The attached file is too big and has too many responsibilities. Refactor it.
Rules:
- One responsibility per file (name it after what it does)
- If a file needs more than 3 imports from the same module, extract shared logic
- Prefer creating a new file over extending an existing one beyond its scope.
- Only extract, never rewrite logic.
- All tests must pass before and after.
This approach keeps token cost under control (only paid on demand), avoids constraining original reasoning, and gives you a deliberate moment to review structure before it accumulates further debt.
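Mechanically, this can be as simple as saving the prompt where your tool discovers it. A sketch, assuming a runtime like Claude Code that loads slash-commands from `.claude/commands/*.md`; the `split-file` name is made up, so adapt the path and name to your tool.

```bash
# Store the refactoring prompt above as an on-demand slash-command.
mkdir -p .claude/commands
cat > .claude/commands/split-file.md <<'EOF'
The attached file is too big and has too many responsibilities. Refactor it.
Rules:
- One responsibility per file (name it after what it does)
- If a file needs more than 3 imports from the same module, extract shared logic
- Prefer creating a new file over extending an existing one beyond its scope.
- Only extract, never rewrite logic.
- All tests must pass before and after.
EOF
# Invoke it on demand inside a session as: /split-file
```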
A bash script to catch big files can also be useful:
#!/usr/bin/env bash
min_lines=200
# ripgrep respects .gitignore and supports --type filters
# (the type names are ts/js/py; "typescript"/"python" are not valid rg types)
# Ignore ./docs since we use it for examples, temporarily cloned repos, ...
rg --files \
    -g '!docs/**' --type ts --type js --type py \
    | while IFS= read -r file; do
        line_count=$(wc -l < "${file}")
        if (( line_count >= min_lines )); then
            printf '%10d %s\n' "${line_count}" "${file}"
        fi
    done \
    | sort -nr
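Run it from the repository root; the output is a sorted report of offenders, which can double as the input for the refactoring prompt above.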
Skills and Rules can provide accurate design guidance to the model for the concrete task at hand. A react-guidelines skill can promote architecture constraints. For example:
# Architectural Constraints
1. **Separation of Concerns:** UI components must not contain business logic or raw API calls.
2. **The "Hook" Rule:** If a component exceeds 3 state variables, move state logic to a custom Hook.
3. **The "Service" Rule:** All data fetching must reside in `/services`. Do not use `fetch` or `axios` directly in components.
4. **Complexity Trigger:** If you are about to add a second distinct responsibility to a file, you MUST propose a file split before writing the code.
5. **Pre-emptive Architecture:** Prefer creating 3 focused files (Type, Logic, View) over 1 multi-purpose file.
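As an illustration of rule 5, the three-file split might look like this (file names are hypothetical):

```
UserCard.types.ts   # Type: props and domain types
useUserCard.ts      # Logic: state + data fetching through /services
UserCard.tsx        # View: pure presentation, no business logic
```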