    I Rewrote My CLAUDE.md From Scratch

    This is a first-person account of a CLAUDE.md rewrite. I had a 612-line file that I'd been adding to for fourteen months. The agent's behavior had been getting subtly worse for the last three of those — more clarifying questions, more ignored rules, more drift from the way I would have done things if I were doing them by hand. I decided the only fix was to delete the file and start over with a blank one. Here's what happened.

    Why I didn't just edit it

    The file had become a layered fossil. Rules from the project's first month, when we were still picking a stack. Rules from the migration to TypeScript. Rules from the post-incident reviews where someone had broken something and wanted a CLAUDE.md line to make sure nobody did it again. Rules from a refactor that never landed. Rules I had added in moments of frustration that I knew were rants more than directives.

    I tried editing it twice. Both times I ended up making it worse — adding "exception" clauses that contradicted earlier rules, or rewording a vague rule into a slightly less vague rule that still said nothing falsifiable. The file was so deeply entangled with itself that pulling on any one thread loosened ten others. Editing it was strictly inferior to deleting it.

    The blank file

rm CLAUDE.md && touch CLAUDE.md is satisfying in a way few shell commands are. The repo had a clean slate at session start. The agent — Claude Code, in this case — would now read an empty file and operate from its training-data defaults. That meant for one session, every decision the agent made was a candidate signal: either the agent did something I didn't want, in which case I had a real rule to add, or the agent did something I wanted, in which case the rule was probably already implicit and didn't need to be written.

    I gave myself one constraint: I would not write CLAUDE.md by hand. I would only write it as a reaction to specific agent behaviors over the next two weeks. Every line in the rewritten file would have a corresponding agent action that justified it. No aspirational rules. No "best practice" rules. Only "the agent did X and I didn't want X" rules.

    Two weeks of observed behavior

    I kept a running log in a scratch file. Every time the agent did something I would have done differently, I noted it. By the end of week one I had thirty-two notes; by the end of week two, fifty-one. They clustered into roughly six categories.

    Process violations — the agent pushed something to main directly, or skipped a hook, or amended a published commit. Eight notes. These became the first section: non-negotiables.

    Stack-specific drift — the agent reached for a library or pattern that the project had explicitly migrated away from. Ten notes. These became language and style rules.

    Operational unawareness — the agent assumed something about the dev environment that wasn't true (a path, a command, a service availability). Twelve notes. These became operational notes.

    Investigation depth misjudgment — the agent moved to action when I would have wanted more investigation, or vice versa. Six notes. These became principles.

    Communication style — the agent over-explained or under-explained, used formal English when I wanted terse, or hedged where I wanted directness. Nine notes. These mostly became one paragraph in the style section.

    False or stale assumptions about other tooling — the agent referenced features of CI or hooks that didn't exist, or assumed a CI step was running that wasn't. Six notes. These became enforcement rules and exposed three actual harness gaps that I fixed before writing the rules.
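A scratch log like this is easy to keep machine-countable if each note gets a category tag. Here's a minimal sketch of the tally step — the file name, the bracketed-tag convention, and the category names are my own invention, not something the log actually used:

```typescript
// tally.ts — count scratch-log notes per category.
// Assumes one note per line, prefixed with a bracketed tag,
// e.g. "[process] agent pushed to main without asking".
import { readFileSync } from "node:fs";

function tally(logText: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of logText.split("\n")) {
    const match = line.match(/^\[([a-z-]+)\]/);
    if (!match) continue; // skip blank lines and untagged scribbles
    const category = match[1];
    counts.set(category, (counts.get(category) ?? 0) + 1);
  }
  return counts;
}

// Usage: bun tally.ts agent-notes.log
if (process.argv[2]) {
  const log = readFileSync(process.argv[2], "utf8");
  for (const [category, n] of [...tally(log)].sort((a, b) => b[1] - a[1])) {
    console.log(`${n}\t${category}`);
  }
}
```

The sort puts the biggest clusters first, which is where the first CLAUDE.md sections come from.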

    Drafting from the log

    After two weeks I sat down with the log and turned it into a CLAUDE.md. The process was nearly mechanical: each cluster of notes became 3-8 rules, each rule had to point at a specific note from the log, and each rule had to be falsifiable. The first draft came in at 89 lines.
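The shape of a note-backed rule ended up looking roughly like this — the specific rules and log references below are invented for illustration, not lines from the actual file:

```markdown
## Non-negotiables
- Never push to main; always open a PR. (log: week 1, note 3 — pushed a hotfix directly to main)
- Never amend a commit that has already been pushed. (log: week 2, note 7)
```

The parenthetical citation is what kept the drafting honest: a rule with no note to point at didn't get written.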

    I then ran AgentLint against it. The linter flagged six issues — three vague rules, two missing thresholds, one stale tooling reference (I had written pnpm out of muscle memory but the project uses bun). Twenty minutes of edits and the file passed.

    The rewritten CLAUDE.md was 102 lines. The original had been 612. The new one carried more weight, not less, because every line was tied to a specific behavior I'd watched the agent get wrong without that line.

    What changed in the agent's behavior

In the first session after committing the new CLAUDE.md, the agent felt different in a way I hadn't expected. It asked fewer clarifying questions on the kinds of decisions the rewrite had covered (where to put a new test file, what naming to use, whether to push directly), and it asked more thoughtful questions on the kinds of decisions the rewrite had deliberately left out (what should this new feature actually do, what's the user expectation, what edge cases matter). The agent had stopped wasting its budget on already-decided questions and was using it on the questions that actually needed me.

    Over the next two weeks, the rate of "the agent did X and I didn't want X" notes dropped from about four per day to under one per day. The remaining notes were either genuine new patterns (which became new rules, slowly) or one-off oddities that didn't justify a rule. The file stabilized at about 110 lines.

    What I'd do differently next time

    Three things, looking back.

    I should have started the log sooner. I waited until I deleted the file. The two weeks of observation could have happened with the bloated file still in place; I would have learned the same things and the empty-file phase could have been a single weekend. The deletion is the cathartic part, but the learning happens in the observation, not in the deletion.

    I should have stripped the operational notes back even more. I included a few rules in the operational section that I had encountered exactly once. With more weeks of observation, those would have been cut. Rule of thumb: a rule needs to come up at least twice in the log before it earns a line.

I should have explicitly tagged the principles section. When I rewrote, I treated principles as load-bearing rules. They are, but they degrade differently from process rules — a stale principle stays plausible long after a stale path in a command has obviously broken. I now mark principles with a "last reviewed" date and revisit them quarterly.

    The rewrite is the easy part

    The deletion and rewrite were satisfying. The hard part is preventing the same fourteen-month accretion from happening again. My current discipline is: every CLAUDE.md edit must come with either a deletion of equal weight or a clear log entry justifying the addition. CLAUDE.md cannot grow during routine work. It can grow when a new pattern is observed, and it has to shrink when an old pattern stops mattering.
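The "cannot grow during routine work" discipline is mechanically checkable. Here's a hedged sketch of a pre-commit guard, assuming the repo layout — the log file name, the hook wiring, and the `--check` flag are all my assumptions, not part of any tooling described above:

```typescript
// check-claude-md.ts — fail the commit if CLAUDE.md grew
// without a matching entry in the observation log.
// Assumes CLAUDE.md and agent-notes.log live at the repo root.
import { execSync } from "node:child_process";

function lineCount(text: string): number {
  return text.length === 0 ? 0 : text.split("\n").length;
}

function claudeMdGrew(): boolean {
  const before = execSync("git show HEAD:CLAUDE.md", { encoding: "utf8" });
  const after = execSync("git show :CLAUDE.md", { encoding: "utf8" });
  return lineCount(after) > lineCount(before);
}

function logWasUpdated(): boolean {
  const staged = execSync("git diff --cached --name-only", { encoding: "utf8" });
  return staged.split("\n").includes("agent-notes.log");
}

// Run from a pre-commit hook: bun check-claude-md.ts --check
if (process.argv.includes("--check")) {
  if (claudeMdGrew() && !logWasUpdated()) {
    console.error("CLAUDE.md grew but agent-notes.log was not updated.");
    process.exit(1);
  }
}
```

A line-count check is a blunt proxy for "deletion of equal weight", but it forces the conversation at commit time instead of fourteen months later.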

    If your CLAUDE.md is over 300 lines and was last seriously rewritten more than six months ago, consider doing the same exercise. Two weeks of observation, one weekend of drafting, AgentLint as the final sweep. The shorter, sharper file is almost always better than the long, drifting one. The agent will tell you within a week whether the rewrite worked — and if it did, it will be one of the highest-leverage changes you made all quarter.