improve-codebase-architecture
by mattpocock

improve-codebase-architecture helps inspect a real repo, surface architectural friction, and propose deep-module refactoring candidates that improve testability, simplify seams, and raise maintainability.
This skill scores 78/100, which means it is a solid directory listing candidate: users get a clear trigger, a real architectural review workflow, and concrete decision guidance that should help an agent do better than a generic 'suggest refactors' prompt. It is most credible for architecture exploration and RFC-style recommendations, though execution details are lighter than the diagnosis framework.
- Strong triggerability: the description clearly says when to use it for architecture improvement, refactoring opportunities, coupling reduction, testability, and AI navigability.
- Real workflow content: SKILL.md outlines an exploration process, candidate presentation step, and RFC-oriented output rather than staying at high-level advice.
- Useful reference guidance: REFERENCE.md gives actionable dependency categories and testing strategy rules that help agents reason about when and how to deepen modules.
- Support material is thin beyond prose: there are no scripts, examples, install instructions, or code-fenced templates to reduce execution guesswork.
- The method leans on subjective 'friction' during exploration, which may make results less consistent across agents or codebases.
Overview of improve-codebase-architecture skill
What improve-codebase-architecture does
The improve-codebase-architecture skill helps an agent inspect a real repository, notice architectural friction, and turn that friction into concrete refactoring candidates. Its core idea is not “find code smells everywhere,” but “identify shallow module boundaries that make the code harder to understand, test, and change.”
Who this skill is for
This is best for engineers, tech leads, and maintainers who already have a working codebase and want better structure without a full rewrite. It is especially useful when you are dealing with scattered logic, brittle seams between modules, or tests that only pass by over-isolating behavior.
The real job-to-be-done
Most people looking for improve-codebase-architecture are not asking for abstract architecture advice. They want help answering practical questions like:
- Which part of this repo should we refactor first?
- Where are module boundaries making change harder than it should be?
- How can we make this area more testable without adding more indirection?
- What refactor is worth proposing as an RFC or GitHub issue?
This skill is built around that decision-making step.
What makes it different from a generic refactoring prompt
The main differentiator is its bias toward deep modules: small public interfaces that hide substantial implementation detail. Instead of recommending more wrappers, more tiny functions, or more layers by default, improve-codebase-architecture looks for places where combining logic behind a better boundary would reduce complexity and improve testability.
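The deep-module idea is easiest to see in code. Below is a minimal TypeScript sketch; the amount-handling functions are invented for illustration and are not part of the skill itself:

```typescript
// Shallow: the public interface mirrors the implementation, so every caller
// must understand all three steps and wire them together in the right order.
function parseAmount(raw: string): number {
  return Number(raw);
}
function validateAmount(n: number): boolean {
  return Number.isFinite(n) && n > 0;
}
function formatAmount(n: number): string {
  return n.toFixed(2);
}

// Deep: one small entry point hides parsing, validation, and formatting.
// Callers learn a single concept, and the internals can change freely.
function normalizeAmount(raw: string): string {
  const n = Number(raw);
  if (!Number.isFinite(n) || n <= 0) {
    throw new Error(`invalid amount: ${raw}`);
  }
  return n.toFixed(2);
}
```

The shallow version has three names to learn and an implicit calling order; the deep version has one name and no order to get wrong, which is the kind of boundary this skill is biased toward proposing.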
Best-fit use cases for refactoring
Use improve-codebase-architecture for refactoring when you need to:
- consolidate tightly coupled modules
- reduce integration bugs caused by seams
- improve testability at the module boundary
- make a repo easier for humans and AI agents to navigate
- turn vague “this area feels messy” feedback into specific candidates
What it does not replace
This skill does not automatically rewrite your architecture or generate a guaranteed-safe migration plan. It is strongest at exploration, candidate selection, and shaping high-value refactors. You still need repository context, engineering judgment, and validation in code review.
How to Use improve-codebase-architecture skill
How to install improve-codebase-architecture
For most skill-enabled setups, install with:
npx skills add mattpocock/skills --skill improve-codebase-architecture
If your environment already syncs skills from the mattpocock/skills repository, you may only need to enable the improve-codebase-architecture entry rather than install it separately.
Where to read first before using it
Read these files first:
- SKILL.md
- REFERENCE.md
SKILL.md gives the workflow. REFERENCE.md is the part many users skip, but it contains the dependency categories and testing guidance that strongly affect whether a proposed deepening refactor is realistic.
What inputs the skill needs to work well
The improve-codebase-architecture skill works best when you provide:
- the repository or target directory
- the product area or feature under review
- known pain points
- constraints on refactoring scope
- dependency realities such as databases, internal services, or third-party APIs
Weak input: “Improve the architecture of this app.”
Strong input: “Inspect src/billing and src/invoices. We keep changing both for one feature, tests mock too much, and regressions happen at the integration seam. Suggest 3 deep-module refactor candidates we could ship incrementally.”
How the improve-codebase-architecture workflow actually runs
The source skill follows a three-step pattern:
- Explore the codebase organically
- Present numbered deepening candidates
- Let the user pick one candidate to develop further
The important detail is that exploration is meant to feel like real navigation, not checklist scanning. Friction is treated as evidence. If understanding one behavior requires jumping across many files or layers, that is likely the signal the skill is supposed to surface.
What the skill is looking for during exploration
When using improve-codebase-architecture, the agent should notice issues such as:
- understanding a concept requires hopping through many tiny files
- interfaces are almost as complex as implementations
- logic is split into “testable” helpers but the real risk is in orchestration
- tightly coupled modules create unstable seams
- tests avoid the true behavior by over-mocking internals
That makes this skill more targeted than a broad style audit.
How to write a better prompt for improve-codebase-architecture
A high-quality prompt should specify:
- the part of the repo to inspect
- what kind of refactoring you want
- whether you want candidates only or a full RFC
- your testing constraints
- what not to touch
Example prompt:
“Use the improve-codebase-architecture skill on our checkout flow. Explore organically and identify 3 candidates where shallow modules or seam-heavy orchestration are hurting testability. Classify key dependencies as in-process, local-substitutable, remote but owned, or true external. Recommend one candidate we can implement without a full rewrite.”
How to turn a rough goal into a complete request
If your rough goal is “make this more maintainable,” convert it into a request with:
- scope: “look at packages/webhooks”
- symptom: “bugs happen in the handoff between parser and dispatcher”
- desired output: “3 candidates plus one recommended RFC”
- constraints: “keep public API stable”
- testing expectation: “prefer boundary tests over internal mocks”
This helps the skill produce actionable refactoring guidance instead of broad commentary.
What REFERENCE.md changes in practice
REFERENCE.md matters because it helps you judge whether a module can actually be deepened:
- In-process dependencies are easiest to merge and test directly.
- Local-substitutable dependencies can be deepened if a local stand-in exists.
- Remote but owned dependencies should usually use a ports-and-adapters shape.
- True external dependencies should be mocked at the boundary, not spread through the module.
If a recommendation ignores these categories, it may sound elegant but be hard to implement.
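One way to keep these categories front of mind is to encode them directly. The sketch below is illustrative only; the union type and the exact strategy strings are inventions, not something REFERENCE.md defines as code:

```typescript
// The four dependency categories described in REFERENCE.md, modeled as a
// discriminated union so a proposal must pick exactly one per dependency.
type DependencyKind =
  | "in-process"          // pure code living in the same process
  | "local-substitutable" // e.g. a database with a local stand-in
  | "remote-owned"        // an internal service your team controls
  | "true-external";      // a third-party vendor API

// Maps each category to the testing strategy the guidance implies.
// The exhaustive switch means adding a fifth category is a compile error.
function testingStrategy(kind: DependencyKind): string {
  switch (kind) {
    case "in-process":
      return "merge into the module and test directly";
    case "local-substitutable":
      return "run the real dependency locally (container, in-memory engine)";
    case "remote-owned":
      return "define a port; test against an in-memory adapter";
    case "true-external":
      return "mock at the boundary only, never deep inside the module";
  }
}
```

Asking the agent to label every major dependency with one of these four kinds is a cheap way to catch proposals that quietly assume an external API can be merged inline.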
Testing guidance that affects adoption decisions
A key principle in this skill is replace, don't layer. That means after creating a deeper module boundary, you should prefer tests at that boundary instead of keeping a pile of old shallow-module unit tests. For teams considering installing improve-codebase-architecture, this is an important fit check: the skill is opinionated about simplifying seams, not preserving every existing test slice.
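A hypothetical sketch of what "replace, don't layer" looks like in practice; `applyDiscount` is an invented example, not part of the skill:

```typescript
// After deepening, internals (the rate table, rounding) are hidden behind
// one boundary. There is nothing left to mock.
function applyDiscount(totalCents: number, code: string): number {
  const rates: Record<string, number> = { SAVE10: 0.1, SAVE25: 0.25 };
  const rate = rates[code] ?? 0; // unknown codes apply no discount
  return Math.round(totalCents * (1 - rate));
}

// Boundary tests: these assert observable behavior only, so they survive
// any internal refactor. They replace the old helper-level mock tests.
console.assert(applyDiscount(1000, "SAVE10") === 900);
console.assert(applyDiscount(1000, "UNKNOWN") === 1000);
```

The old shallow tests (one per helper, mocks in between) would pass even if the helpers were wired together incorrectly; the boundary tests fail in exactly that case, which is why the skill treats them as the replacement rather than an addition.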
Suggested usage workflow in a real repo
A practical workflow with improve-codebase-architecture looks like this:
- Pick one painful area, not the whole monorepo.
- Run exploration and ask for 3 candidates.
- Choose the candidate with the clearest seam-related pain and feasible dependency shape.
- Ask for an RFC-style issue with problem, proposal, dependency classification, and testing approach.
- Validate against your actual deployment and migration constraints before coding.
When this skill gives the best signal
You will get the best results from improve-codebase-architecture when the target area already shows real friction: repeated cross-module edits, seam bugs, hard-to-follow control flow, or tests that mostly verify mocks. It is less valuable on tiny, already cohesive modules or code that simply needs cleanup rather than architectural change.
improve-codebase-architecture skill FAQ
Is improve-codebase-architecture good for beginners?
Yes, but with limits. Beginners can use it to learn how module boundaries affect design, especially around testability. The catch is that the best results still require some ability to judge tradeoffs. Treat its output as a refactoring proposal, not an unquestioned prescription.
Is this better than asking an AI to "refactor the architecture"?
Usually yes. A generic prompt often produces abstract layering advice. The improve-codebase-architecture skill is more specific: it explores friction, prioritizes deep modules, and frames candidates around real boundaries and test strategy.
What kinds of repositories fit best?
It fits application codebases with meaningful orchestration and domain behavior: web apps, backend services, internal tools, and feature-rich products. It is most useful where complexity comes from interaction between modules, not just algorithmic code.
When should I not use improve-codebase-architecture?
Skip it when:
- you only need style cleanup
- the codebase is too small for architecture to be the bottleneck
- the main issue is missing requirements, not poor boundaries
- your team cannot make structural changes right now
In those cases, a focused bug-fixing or code cleanup prompt may be a better choice.
Does it work for microservices or networked systems?
Yes, but only if you respect the dependency model. The skill explicitly distinguishes remote-but-owned services from true external services. For internal services, the likely recommendation is a port with production and in-memory adapters rather than pretending the network boundary does not exist.
Will it recommend deleting tests?
Potentially, yes. The underlying guidance says old shallow tests may become waste once you have stronger boundary tests on the deepened module. That does not mean “delete tests carelessly”; it means replace low-value seam-preserving tests with tests that survive internal refactors.
Is installing improve-codebase-architecture enough to get value?
Installation alone is not the hard part. The real adoption question is whether you can provide enough repo context and whether your team is open to consolidating logic instead of adding more layers. This skill pays off when used on a concrete problem area with clear symptoms.
How to Improve improve-codebase-architecture skill
Give narrower scope for better improve-codebase-architecture results
Do not point improve-codebase-architecture at the entire repository first. Narrow scope to one subsystem, workflow, or package. Smaller scope leads to better candidate quality and fewer generic recommendations.
Provide friction, not just structure
The strongest inputs describe where the team feels pain:
- “We change three files for one behavior tweak”
- “Tests only pass if we mock the orchestrator heavily”
- “Parsing and persistence are separated, but bugs happen in the handoff”
This gives the skill better evidence than a folder tree alone.
Ask for dependency classification explicitly
A strong prompt asks the agent to classify major dependencies using the categories from REFERENCE.md. This prevents unrealistic proposals and makes the output easier to implement in production.
Request candidate ranking with tradeoffs
Do not just ask for “opportunities.” Ask for ranked candidates with:
- why this boundary is shallow
- what would become deeper
- migration risk
- expected testability gain
- whether the change is incremental
This improves decision quality after the first run.
Common failure mode: more abstraction, not deeper modules
One failure mode is getting recommendations that add wrappers, service classes, or helper layers without reducing conceptual surface area. If that happens, push back with: “Prefer fewer, deeper boundaries rather than more indirection.”
Common failure mode: ignoring operational constraints
A proposal may sound clean but fail your real constraints around API stability, deployment boundaries, or external vendors. Improve the output by naming those constraints upfront and asking for an incremental path.
Improve the first output with RFC-oriented follow-ups
After the first candidate list, ask for one selected candidate to be expanded into:
- problem statement
- current seam friction
- proposed deep module boundary
- dependency handling strategy
- testing replacement plan
- migration steps
- risks and rollback notes
This is usually the highest-leverage follow-up when using improve-codebase-architecture for refactoring.
Use concrete examples from the repo
If the first pass feels generic, point to specific files and call chains. Example:
“Focus on src/orders/createOrder.ts, src/payments/charge.ts, and src/notifications/sendReceipt.ts. We suspect orchestration is split too thinly. Re-evaluate with a deep-module lens.”
Concrete file anchors help the skill connect architecture advice to actual code movement.
Validate recommendations against boundary tests
The best way to assess a recommendation is to ask: “What would the public boundary test look like after deepening?” If the skill cannot describe a stable, observable boundary, the proposal may still be too shallow or too abstract.
Iterate toward one implementable change
Do not try to adopt every candidate. In practice the best approach is iterative: pick one high-signal refactor, ship it, replace the right tests, and then re-run the skill on the next painful area.
