I learned about what I call the plan/execute technique for working with programming LLM agents like Claude Code.
- Describe a feature in a few sentences in a file called `PLAN.featurename.md`.
- The plan phase: Ask the LLM to write a detailed plan for implementing the feature in the same file.
- Read the plan and think through how the LLM is going to read it and do the work
- Add details for any problem areas
- Change the plan myself, or ask the LLM to do it (usually I have a few rounds of both)
- Loop this step until satisfied
- For larger changes this is often quite a few volleys of reading and editing
- The execute phase: Clear the LLM context and ask a new instance to implement the changes in the plan.
- Read the diff
- Ask it to make any necessary modifications
- Loop this step until satisfied
- Commit the change
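The file naming and the two prompts behind these steps can be sketched in TypeScript. This is only a minimal sketch: the function names and the hyphenated slug format are hypothetical, the plan prompt reuses the wording from the plan template shown later in this post, and the execute prompt wording is an assumption made up for illustration.

```typescript
// Hypothetical helpers illustrating the plan/execute steps; not part of any real tool.

/** Name of the plan file for a feature. The slug format is an assumption. */
function planFileName(feature: string): string {
  const slug = feature
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything else into single hyphens
    .replace(/^-+|-+$/g, ""); // drop leading and trailing hyphens
  if (slug === "") {
    throw new Error("feature name must contain at least one letter or digit");
  }
  return `PLAN.${slug}.md`;
}

/** Plan phase prompt, using the wording from the plan template. */
function planPrompt(fileName: string): string {
  return (
    `Based on the requirements listed in ${fileName}, fill out the rest of the ` +
    `document with a plan for implementing the feature. Don't write any code, ` +
    `just fill out the rest of the plan.`
  );
}

/** Execute phase prompt for a fresh LLM instance. The wording is an assumption. */
function executePrompt(fileName: string): string {
  return `Implement the changes described in ${fileName}.`;
}

const file = planFileName("Export CSV");
console.log(file); // PLAN.export-csv.md
console.log(planPrompt(file));
console.log(executePrompt(file));
```

Keeping the prompts this short is deliberate: the detail lives in the plan file, so a new instance with a cleared context only needs to be pointed at it.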
But I feel like I’m still stumbling around in the dark. There are plenty of people talking about these things, but much of the conversation is very low-quality, Twitter-style engagement bait or unsearchable in semi-public Discord channels (I assume; I’m not in any). What are the best techniques for this? I’m especially interested in hearing about:
- What techniques work well when writing an overview of your codebase for LLMs?
- As documentation intended for LLMs in a project grows, how do you organize it?
- What techniques work well when writing feature descriptions for LLMs?
- Is it worth keeping LLM-generated feature implementation plans? If so, why?
If you have anything that works well, I’m really interested in hearing from you. Drop a comment or send me an email.
The 8kloc barrier
After building a few programs with agentic LLMs by just asking them to implement a feature (not using plan/execute), I’ve noticed a couple of project size inflection points.
- They’re extremely fast and gratifying up to 4kloc
- They get slower and dumber for whole-codebase changes from 4kloc up to about 8kloc
- They become unusable for wide-ranging changes altogether after 8kloc
These are obviously finger-to-the-wind numbers, affected by my particular tool choices, programming languages, and so forth. I suspect that as models are actively developed into products these barriers will move, but I also suspect that they won’t just melt away. I’m interested in techniques for working at different kloc bands because I think those skills will remain useful in the long term.
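To make the bands concrete, here is a small sketch that maps a line count to the three rough bands above. The 4,000 and 8,000 thresholds are just the approximate numbers from the list, and the type and function names are hypothetical.

```typescript
// Rough size bands from the list above. The thresholds are approximate
// observations, not measured limits.
type SizeBand = "fast" | "degraded" | "unusable-for-wide-changes";

function sizeBand(linesOfCode: number): SizeBand {
  if (!Number.isFinite(linesOfCode) || linesOfCode < 0) {
    throw new RangeError("linesOfCode must be a non-negative finite number");
  }
  if (linesOfCode <= 4000) return "fast";
  if (linesOfCode <= 8000) return "degraded";
  return "unusable-for-wide-changes";
}

console.log(sizeBand(2700)); // fast
console.log(sizeBand(6000)); // degraded
console.log(sizeBand(16000)); // unusable-for-wide-changes
```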
The plan/execute pattern makes a huge difference in getting useful work out of LLMs on projects past the 8kloc barrier.
My projects
I got those quick and dirty kloc numbers above from my experience with a few personal projects. These were two of the earliest I tested with:
- ldapenforcer is less than 6kloc of Go, built very fast across a few days as my first Claude Code project. It was 2.7kloc in an afternoon, and an MVP the first day, according to my contemporaneous exuberant blog post.
- Understatement (yet unreleased) is about 16kloc of TypeScript, started before agentic LLMs were available. By the time I could use Claude Code on it, it was already 8-10kloc. After some initial experiments, I went back to old-style ChatGPT copy/paste for Understatement, because unless the change was unusually small, I got useless results from Claude. It wasn’t until I tried plan/execute that I was able to get real value out of it on this project.
How I plan and execute
I end up with the following files:
- `NOTES.md`, which is committed to the repo and contains a big list of things I want the LLM to keep in mind. This file is adapted from Claude’s recommended `CLAUDE.md`, renamed because sometimes I am using other agents like Cursor. It’s unique to each project. Claude Code prompts you to let it generate one in new projects, which is worth doing, but mine get longer pretty fast.

  It includes:

  - How I want the code laid out in the repo
  - Style prescriptions
  - Sometimes specific examples of things to do or not to do
    Example

    Try to avoid many levels of nested html strings. Zero is best, one is ok, two is not great, and three or more should be avoided. Example: do NOT do this:

        return html`
          <table>
            <!-- ... snip ... -->
            <tbody>
              ${repeat(
                groupedStatements,
                (group) => group.accountPersistentIntid,
                (group) => html`
                  ${repeat(
                    group.statements,
                    (statement) =>
                      `${statement.accountPersistentIntid}-${statement.statementYear}-${statement.statementMonth}`,
                    (statement) => html`
                      <tr>
                        <!-- ... snip ... -->
                        <td>
                          ${statement.retrievalDate
                            ? html`<a
                                class="download-link"
                                @click=${() => this.handleDownload(statement)}
                                >Download</a
                              >`
                            : html`<span>-</span>`}
                        </td>
                      </tr>
                    `,
                  )}
                `,
              )}
            </tbody>
          </table>
        `;
        }
  - Locations of key APIs or classes, for instance, how state management works for frontend components
  - Links to other documents describing project-specific decisions, terminology, etc.
- `PLAN.template.md`, a template for new features which is also committed to the repo. This is pretty short and generic, as most of the codebase-specific stuff goes in `NOTES.md`, and any feature-specific stuff goes in `PLAN.<feature>.md` (see below for that).

  Here it is in its entirety:
      # PLAN template for LLMs to build new features

      ## Requirements

      Common requirements:

      1. See `NOTES.md` for a description of this codebase
      2. Add sections to this file to describe implementation of this feature
      3. Be judicious in the code you add - don't add a bunch of boiler plate that we might need later, just what is needed for this feature
      4. When initially creating the plan, don't implement whole functions in this plan file. Write down what the implementer needs to know, but don't do their work for them.
      5. Once you've written the plan, go back over it and see what you can improve. Keep in mind the rules in this section.

      Feature requirements:

      - ...

      Prompt the LLM with:

      > Based on the requirements listed in FILENAME, fill out the rest of the document with a plan for implementing the feature. Don't write any code, just fill out the rest of the plan.
- `PLAN.<feature>.md`, copied from the template and filled in. I haven’t decided whether these should be kept forever in the repo or not. Maybe I’ll make a `docs/prompts/` directory or something for them. If I were to keep them, I’d need to ask the LLM to keep them up to date as I refine the instructions after the first result.
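Going back to the nested-template example quoted from `NOTES.md` above: one way to get down to zero levels of nesting is to pull each inner template into its own small function. The sketch below is an illustration, not the code from the project. The `html` and `repeat` functions are tiny string-based stand-ins so it runs without dependencies (a real project would import them from its templating library), the `Statement` and `StatementGroup` types are assumptions based on the field names in the example, and the click handler is left out because a string stand-in cannot bind events.

```typescript
// Stand-in for a templating library's `html` tag: joins strings and values.
function html(strings: TemplateStringsArray, ...values: unknown[]): string {
  return strings.reduce(
    (out, s, i) => out + s + (i < values.length ? String(values[i]) : ""),
    "",
  );
}

// Stand-in for a keyed `repeat` helper: the key function is accepted but unused.
function repeat<T>(
  items: T[],
  _keyFn: (item: T) => unknown,
  template: (item: T) => string,
): string {
  return items.map(template).join("");
}

// Assumed shapes, inferred from the field names used in the example.
interface Statement {
  accountPersistentIntid: number;
  statementYear: number;
  statementMonth: number;
  retrievalDate?: string;
}

interface StatementGroup {
  accountPersistentIntid: number;
  statements: Statement[];
}

function statementKey(s: Statement): string {
  return `${s.accountPersistentIntid}-${s.statementYear}-${s.statementMonth}`;
}

// Each function below holds a single template with no template nested inside it.
function downloadCell(statement: Statement): string {
  return statement.retrievalDate
    ? html`<a class="download-link">Download</a>`
    : html`<span>-</span>`;
}

function statementRow(statement: Statement): string {
  return html`<tr><td>${downloadCell(statement)}</td></tr>`;
}

function groupRows(group: StatementGroup): string {
  return repeat(group.statements, statementKey, statementRow);
}

function statementsTable(groupedStatements: StatementGroup[]): string {
  const rows = repeat(groupedStatements, (g) => g.accountPersistentIntid, groupRows);
  return html`<table><tbody>${rows}</tbody></table>`;
}

console.log(
  statementsTable([
    {
      accountPersistentIntid: 1,
      statements: [
        { accountPersistentIntid: 1, statementYear: 2024, statementMonth: 1, retrievalDate: "2024-02-01" },
        { accountPersistentIntid: 1, statementYear: 2024, statementMonth: 2 },
      ],
    },
  ]),
);
```

A pair like the original and a flattened version is the kind of do/don't example that could sit side by side in `NOTES.md`.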
What have you got?
Do you do this? What works well for you?