CORE DIRECTORY // SYSTEM.USER.DIANA_ISMAIL
Labs by Diana — Experiments that ship.
Side projects that got out of hand. AI tools built for problems I kept tripping over — now live, now yours.
Designing Rules for AI Agents
ARTICLE_003
PUBLISHED
2026.04.12
READ
~7 MIN
Part 1 documented the rule cascade -- global, project, and scoped layers that govern AI agents across fourteen projects. This article goes deeper into the rules themselves. How to write them, how to structure them, and critically, how to know when a rule is doing more harm than good. The core finding is counterintuitive: LLM reasoning degrades around 3,000 tokens of instructions, well below context window limits. The bottleneck is not how much the model can hold -- it is how much it can actually follow. Every rule that does not constrain behaviour is noise that actively degrades the rules that do. The design challenge is not what to write. It is what to cut.
The article draws on research across CSS cascade layers, ESLint configuration inheritance, and Terraform override files to identify three conflict resolution patterns that translate directly to agent governance. It documents the specific anti-patterns that appeared in Diana's own system -- duplicated conventions, documentation masquerading as constraints, rules that no agent could verify -- and the information architecture decisions that fixed them. The result is a framework for writing rules that stay under the cognitive load ceiling while carrying maximum signal per token.
Every_Rule_Has_a_Cost
The assumption behind most CLAUDE.md files is that more rules mean better behaviour. The research says the opposite. LLM reasoning measurably degrades around 3,000 tokens of instructions -- roughly 50-60 lines of dense prose. This is not a context window limitation. Models can hold 200,000 tokens in context. The degradation happens because instruction-following is a cognitive task, and like any cognitive task, it has a load ceiling.
Prompt formatting studies show up to 40% performance variance depending on how instructions are structured, and research on context saturation confirms the mechanism: every instruction that does not constrain behaviour actively degrades the ones that do. A rule like "write clean code" is not neutral. It occupies tokens, competes for attention weight, and dilutes the specific, verifiable rules around it.
The practical implication is that rule design for AI agents is a subtraction problem. The question is never "what else should I add?" It is "what can I remove without losing a constraint?"
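One way to keep the ceiling honest is a budget check before a rules file ever reaches an agent. The sketch below uses the common ~4-characters-per-token rule of thumb for English prose -- an approximation, not a real tokenizer -- and the 3,000-token figure from the research above:

```typescript
// Rough token-budget check for a rules file. The 4-chars-per-token
// ratio is a heuristic for English prose, not an exact tokenizer.
const TOKEN_BUDGET = 3000;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function overBudget(ruleFile: string): boolean {
  return estimateTokens(ruleFile) > TOKEN_BUDGET;
}

// A 16,000-character CLAUDE.md estimates to ~4,000 tokens: over budget,
// time to cut rather than add.
```

A check like this turns "what can I remove?" from a vibe into a gate you can wire into CI.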
Three_Inheritance_Patterns
Software solved rule inheritance decades ago, and three patterns translate directly to agent governance. The first is specificity-based inheritance, borrowed from CSS. Rules closer to the target win. Global rules apply everywhere by default. Project-level rules override globals where the project needs something different. File-level rules -- scoped to specific directories or file patterns -- are most specific and override everything above them. CSS cascade layers, which reached cross-browser support in 2022, made this explicit: layers are ordered, and proximity to the thing being styled determines authority. This is the model my system uses. The global CLAUDE.md sets TypeScript strict mode; the Labs project CLAUDE.md overrides the versioning workflow; FitChecker's admin.md scoped rule injects 26 route definitions only when editing admin files.
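The cascade can be sketched as a resolution function: layers are ordered from least to most specific, and the most specific layer that defines a key wins. The names here (RuleLayer, resolveRule, the rule keys) are illustrative, not part of any Claude tooling:

```typescript
// Specificity-based resolution: layers ordered least -> most specific;
// the most specific layer that defines a key supplies the value.
type RuleLayer = {
  name: string;                   // "global" | "project" | "scoped"
  rules: Record<string, string>;  // rule key -> instruction
};

function resolveRule(layers: RuleLayer[], key: string): string | undefined {
  let winner: string | undefined;
  for (const layer of layers) {   // later (more specific) layers overwrite
    if (key in layer.rules) winner = layer.rules[key];
  }
  return winner;
}

const layers: RuleLayer[] = [
  { name: "global",  rules: { versioning: "semantic-release", typescript: "strict" } },
  { name: "project", rules: { versioning: "manual" } },  // project-level override
];

resolveRule(layers, "versioning");  // "manual" -- project beats global
resolveRule(layers, "typescript");  // "strict" -- inherited from global
```

The key property: a project layer only mentions what it changes; everything else falls through to global.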
The second pattern is last-write-wins, from ESLint. When multiple configuration objects match the same file, later objects override earlier ones. Simple and predictable, but fragile -- accidental ordering changes break behaviour silently, and with LLMs, attention is non-uniform across the context window, so instruction ordering is unreliable.
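The fragility is easy to see in a flat-config sketch. The rule names are standard ESLint rules; the file layout is illustrative:

```typescript
// eslint.config.js-style flat config: an array of config objects.
// When two objects match the same file, the LATER one wins per rule.
export default [
  { files: ["**/*.ts"],    rules: { "no-console": "error" } },
  { files: ["src/**/*.ts"], rules: { "no-console": "off" } },
  // For src/foo.ts both objects match, so "no-console" resolves to "off"
  // purely because of array order. Swap the two entries and the behaviour
  // flips with no error -- the silent breakage described above.
];
```

With an LLM the equivalent of "array order" is position in the context window, which is exactly the thing you cannot rely on.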
The third is value-by-value merge, from Terraform. Override files replace individual values rather than entire blocks. Kubernetes extends this with explicit OVERWRITE versus PRESERVE declarations per field. This is the pattern behind the "Override:" declaration in my project CLAUDE.md files -- a single line that states "this project uses manual versioning instead of semantic-release" without restating the entire git workflow.
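The merge semantics are simple to state in code. The workflow shape and values below are illustrative stand-ins for the global and project rule content:

```typescript
// Value-by-value merge, Terraform-override style: the override replaces
// individual keys, never the whole block.
type GitWorkflow = { versioning: string; commitStyle: string; branching: string };

function mergeOverride(base: GitWorkflow, override: Partial<GitWorkflow>): GitWorkflow {
  return { ...base, ...override };  // only keys present in override change
}

const globalWorkflow: GitWorkflow = {
  versioning: "semantic-release",
  commitStyle: "conventional-commits",
  branching: "trunk-based",
};

// The project states ONLY its deviation; everything else is inherited.
const labsWorkflow = mergeOverride(globalWorkflow, { versioning: "manual" });
// labsWorkflow.versioning  -> "manual"
// labsWorkflow.commitStyle -> "conventional-commits" (preserved from global)
```

One line of override, two lines inherited untouched -- the same economy the "Override:" declaration buys in a CLAUDE.md.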
The_Override_Declaration
Implicit overrides break silently. When a project CLAUDE.md says "commit after every completed subtask" and the global CLAUDE.md says nothing about commit frequency, the rule works fine. When the project CLAUDE.md changes the git workflow but the global CLAUDE.md also has git workflow rules, the agent has two competing instructions with no explicit resolution.
The fix is borrowed from Terraform: explicit override declarations. Labs' project CLAUDE.md includes the line "Override: This project uses manual versioning steps. semantic-release only creates GitHub Releases." That single declaration does three things. It acknowledges the global rule exists. It states the specific deviation. And it scopes the deviation to this project only.
Without it, an agent reading both the global and project CLAUDE.md files would see two versioning workflows and apply whichever one it encountered most recently in the context window -- which, given how attention works in transformers, is essentially random. Every project CLAUDE.md in the system now follows this pattern: if it deviates from a global rule, it says so explicitly with an "Override:" prefix. The cost is one line per deviation. The alternative is silent rule conflicts that surface as inconsistent agent behaviour across projects.
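The convention is mechanical enough to lint. A sketch -- not real tooling -- that flags any key defined in both layers where the project's version lacks the explicit marker:

```typescript
// Flag silent conflicts: any rule key defined in BOTH the global and
// project layers must carry an explicit "Override:" marker in the
// project layer, or it is reported.
function findSilentConflicts(
  global: Record<string, string>,
  project: Record<string, string>,
): string[] {
  return Object.keys(project).filter(
    (key) => key in global && !project[key].startsWith("Override:"),
  );
}

const globalRules  = { versioning: "semantic-release handles releases" };
const projectRules = { versioning: "manual version bumps" };  // no marker

findSilentConflicts(globalRules, projectRules);  // ["versioning"] -- flagged
```

Prefix the project rule with "Override:" and the conflict disappears from the report -- the deviation is now declared, not accidental.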
What_Makes_a_Rule_Work
The difference between rules that constrain behaviour and rules that don't is verifiability. "Write clean code" is not verifiable -- no agent can determine whether it has been followed. "All API payloads validated with Zod" is verifiable -- the agent can check whether a Zod schema exists for each route. "Handle errors properly" is not verifiable. "Wrap every external call in try/catch; non-critical failures log and continue, never crash the request" is verifiable.
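The error-handling rule is verifiable precisely because it compiles to a shape. A minimal sketch, where fetchRecommendations is a hypothetical non-critical external call:

```typescript
// Hypothetical non-critical external call that sometimes fails.
async function fetchRecommendations(): Promise<string[]> {
  throw new Error("upstream timeout");
}

// The rule made concrete: external call wrapped in try/catch,
// failure logged, request never crashes.
async function getPage(): Promise<{ items: string[] }> {
  let items: string[] = [];
  try {
    items = await fetchRecommendations();
  } catch (err) {
    console.warn("recommendations unavailable, continuing:", err);
  }
  return { items };  // degraded response, not a 500
}
```

A reviewer (or an agent) can check this shape without judgment calls: is the call wrapped, is the failure logged, does the request still return.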
The pattern holds across every effective rule in the system. Exact commands work: npm run test:unit leaves no room for interpretation. Naming conventions with examples work: "Components PascalCase, functions camelCase, constants UPPER_SNAKE_CASE." Negative constraints work: "Never use dangerouslySetInnerHTML with unsanitized input" and "No hardcoded secrets -- all from env vars."
What doesn't work: aspirational language, documentation-style explanations, and anything that describes what good looks like without specifying the concrete action that produces it. HackerNoon's testing found that without these kinds of structured enforcement rules, Claude activated project-specific conventions only 25% of the time. With them, the number exceeded 90%. The rules didn't make the model smarter. They made the instructions unambiguous.
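"Verifiable" here means the rule reduces to a membership test a script or an agent can run. A sketch of the Zod rule in that form -- the route list and schema registry are hypothetical, standing in for real route files and z.object(...) schemas:

```typescript
// "All API payloads validated with Zod" as a mechanical check:
// every route must have an entry in the schema registry.
const routes = ["/api/users", "/api/orders", "/api/health"];

const zodSchemas: Record<string, unknown> = {
  "/api/users":  {},  // stand-ins for z.object(...) schemas
  "/api/orders": {},
};

const unvalidated = routes.filter((route) => !(route in zodSchemas));
// unvalidated -> ["/api/health"] -- the rule is violated, and we can say where
```

"Write clean code" admits no equivalent check; that is the whole difference.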
The_Anti-Pattern_Gallery
Five anti-patterns appeared in my own system before the restructuring, and all five show up in the community's CLAUDE.md files. First, duplicating what the code already says. A rule that restates the tech stack visible in package.json wastes tokens on information the agent can derive by reading a file. The rule file's job is to tell the agent what the codebase cannot -- conventions, architectural decisions, constraints. Second, treating rule files as documentation. A three-paragraph explanation of why the chat engine uses Redis is documentation, not a constraint. The constraint is "Redis singleton in redis.ts -- one ioredis instance per process."
Third, aspirational rules without verifiable outcomes. "Maintain high code quality" adds cognitive load with zero behavioural impact. Fourth, duplicating rules across layers. The same "strict TypeScript, no any types" rule written into five project CLAUDE.md files creates five maintenance obligations and guarantees drift. It belongs in the global file, stated once. Fifth, rules that no agent will encounter. A scoped rule about admin module conventions loaded on every file edit -- instead of only when the agent touches admin files -- injects irrelevant context that degrades performance on the actual task.
Each of these anti-patterns costs tokens. Tokens compete for attention weight. Attention weight is finite. The math is simple: remove the anti-patterns, and the remaining rules carry more weight per token.
Information_Architecture_for_Rules
The question of what goes where has a decision framework. Methodology goes in the global CLAUDE.md -- TypeScript strict mode, Zod validation, Conventional Commits, accessibility standards, security practices. These are conventions that should apply to every project unless explicitly overridden. Code conventions go in the project CLAUDE.md -- the stack, the architecture patterns, the error handling approach, the git workflow, the naming patterns specific to that codebase. Project operations go in the planning CLAUDE.md -- team assignments, current phase, deliverable standards, folder structure. These are the two different questions from Part 1: "how do we write code here" versus "what is this project and who works on it."
File-type-specific rules go in scoped .claude/rules/*.md files with glob patterns -- FitChecker's admin.md loads only on app/manage/** paths, carrying route definitions and auth patterns that would be noise in any other editing context.
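The gating logic is small. The matcher below handles only the "dir/**" glob form used here, not full glob syntax, and the file names are from the example above:

```typescript
// Scoped-rule loading: a rule file declares a glob, and its content is
// injected only when the edited path matches. Minimal "dir/**" matcher.
type ScopedRule = { file: string; glob: string };

function matches(glob: string, path: string): boolean {
  return glob.endsWith("/**") && path.startsWith(glob.slice(0, -2));
}

function rulesFor(path: string, rules: ScopedRule[]): string[] {
  return rules.filter((r) => matches(r.glob, path)).map((r) => r.file);
}

const scoped: ScopedRule[] = [{ file: "admin.md", glob: "app/manage/**" }];

rulesFor("app/manage/users.tsx", scoped);  // ["admin.md"] -- admin rules load
rulesFor("app/home/page.tsx", scoped);     // []           -- stays out of context
```

Everything outside the glob pays zero tokens for the admin conventions -- which is the entire point of scoping.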
The test for correct placement is straightforward: if a rule applies to all projects, it goes in global. If it applies to one project's code, it goes in the project CLAUDE.md. If it applies to one project's planning, it goes in the planning CLAUDE.md. If it applies only when editing specific files, it goes in a scoped rule. If it fails all four tests, it probably should not be a rule. Teams report 60-80% reduction in context errors after restructuring rules along these lines. The improvement comes not from writing better rules but from putting existing rules in the right layer.
KEY_TAKEAWAYS
TAKEAWAY_01
Every rule you write for an AI agent competes for the same finite attention budget. LLM reasoning degrades around 3,000 tokens of instructions -- not from context limits but from cognitive load. The design discipline is subtraction: remove rules that do not constrain behaviour, and the remaining rules carry more weight per token.
TAKEAWAY_02
Implicit overrides break silently and surface as inconsistent agent behaviour across projects. Explicit override declarations -- one line stating the deviation, scoped to the project -- eliminate rule conflicts at the cost of a single sentence per exception. The Terraform model (state the override, not just the new value) is more reliable than the ESLint model (last config wins) for LLM instruction-following.
TAKEAWAY_03
The difference between rules that work and rules that don't is verifiability. "Handle errors properly" occupies tokens without constraining behaviour. "Wrap every external call in try/catch; non-critical failures log and continue" is a concrete instruction an agent can follow and a reviewer can check. If a rule cannot be verified, it is noise -- and noise degrades everything around it.