One Claude Code Tip a Day: Refactor in Thin Slices
Refactor with Claude Code one behavior-preserving slice at a time, verifying each step with tests, logs, and git diff before asking for the next move.
This post is part of the “One Claude Code Tip a Day” series — a daily guide to using Claude Code more effectively.
A bad refactor rarely starts with bad taste. It usually starts with a reasonable request that is too wide: “clean up this checkout flow,” “simplify the dashboard state,” or “move this API client into a shared module.” Claude Code is capable of doing all of that, but if you let the session attack the entire shape of the codebase at once, you get the familiar mess: many files changed, tests that fail for unclear reasons, and a diff that mixes behavior changes with naming, formatting, and architecture opinions.
Today’s habit is simple: refactor in thin slices. Use Claude Code as a careful pair programmer that must preserve behavior at every checkpoint, not as a bulk rewrite engine.
Start with the failure-prone task
Imagine you are working in a React app where the billing page has grown a 320-line component. It fetches plans, reads feature flags, formats prices, handles coupon state, and renders three nested panels. The team wants to extract the pricing logic so the same calculation can be reused in an onboarding flow.
The tempting prompt is:
`Refactor BillingPage.tsx and extract the pricing logic into a reusable module.`
That prompt is not specific enough. Claude may move code, rename props, “improve” state handling, and alter edge cases before you have a stable review point. Instead, start the session by forcing a map and a slice:
`Read src/pages/BillingPage.tsx and the related tests. Do not edit yet. Identify the smallest behavior-preserving refactor that would make pricing reusable. Show the exact files you inspected and the verification command you would run after the first slice.`
The phrase “do not edit yet” matters. It turns Claude Code from an eager patcher into an analyst. You are asking for evidence: files read, seams found, and one verification command.
Make the first slice boring
A good first slice is almost disappointingly small. For the billing example, it might be: extract `calculateDisplayPrice(plan, coupon)` into `src/lib/pricing.ts` without changing the JSX, network calls, or UI state. The prompt for the edit can be narrow:
`Apply only the first slice: move the existing display-price calculation into src/lib/pricing.ts. Keep the function behavior identical. Do not rename unrelated variables. Do not change rendering. After editing, run the focused billing tests and show git diff --stat plus the relevant diff.`
This gives Claude Code a contract. It may edit two files, maybe three if a test import is needed. If it touches CSS, routing, or the API client, the session has drifted.
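As a rough sketch of what the first slice could leave behind in `src/lib/pricing.ts`: the `Plan` and `Coupon` shapes, the cents-based amounts, and the formatting rules below are assumptions made for illustration, not the real billing code. The point is only that the module holds one pure function and nothing else.

```typescript
// Hypothetical shapes: the app's real Plan and Coupon types may differ.
export interface Plan {
  id: string;
  amount: number; // assumed: price in cents per billing interval
  interval: "month" | "year";
}

export interface Coupon {
  percentOff: number; // assumed: whole-number percentage, 0-100
}

// Assumed logic, standing in for the calculation moved out of BillingPage.tsx.
// No React, no fetching, no state: input in, formatted string out.
export function calculateDisplayPrice(plan: Plan, coupon: Coupon | null): string {
  const discount = coupon ? Math.round((plan.amount * coupon.percentOff) / 100) : 0;
  const cents = plan.amount - discount;
  const suffix = plan.interval === "year" ? "/yr" : "/mo";
  return `$${(cents / 100).toFixed(2)}${suffix}`;
}
```

With a slice like this, `BillingPage.tsx` should gain one import line and lose the inline calculation, and nothing else in the component should move.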
I like to add one more constraint when the code is fragile:
`If you find that the extraction requires changing component behavior, stop and explain the blocker instead of continuing.`
That line prevents the model from silently “making it work” by changing the thing you were trying to preserve.
Verification is part of the refactor, not a cleanup step
After the first patch, do not read Claude’s summary first. Read the evidence. The minimum loop is:
`git diff -- src/pages/BillingPage.tsx src/lib/pricing.ts`
`npm test -- BillingPage pricing`
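One way to make the "did the session drift?" check mechanical is to compare the changed paths against the files the slice was allowed to touch. The helper below is a hypothetical sketch, not part of Claude Code or git: it takes the text printed by `git diff --name-only` and returns any path outside the allowed set.

```typescript
// Files the first slice may touch, taken from the billing example.
const allowedFiles: ReadonlySet<string> = new Set([
  "src/pages/BillingPage.tsx",
  "src/lib/pricing.ts",
]);

// Hypothetical helper: pass in the output of `git diff --name-only`.
// Returns every changed path that is not in the allowed set.
export function findDrift(
  nameOnlyOutput: string,
  allowed: ReadonlySet<string> = allowedFiles,
): string[] {
  return nameOnlyOutput
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !allowed.has(line));
}
```

An empty result means the diff stayed inside the slice. Anything else, such as a stylesheet or the API client, is a reason to stop and ask why that file changed.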
If the project has no focused test, ask Claude to create a characterization test before the next extraction:
`Before continuing the refactor, add a small test that locks the current coupon and annual-plan price behavior. Use the existing test style. Run only that test and show the test names.`
This is where Claude Code becomes especially useful. It can find the testing conventions, copy the local setup, and write the small guard you would otherwise skip. But it still needs a bounded instruction: lock current behavior, do not redesign the feature.
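A characterization test of this kind can be very small. The sketch below is runner-free so it stands alone; in a real project the same table would sit inside the existing test style. The function under test is a stand-in with assumed behavior, and the expected strings are simply what that stand-in returns today.

```typescript
// Stand-in with assumed behavior; in the real project, import the function under test instead.
function displayPriceUnderTest(
  amountCents: number,
  interval: "month" | "year",
  percentOff: number,
): string {
  const cents = amountCents - Math.round((amountCents * percentOff) / 100);
  return `$${(cents / 100).toFixed(2)}${interval === "year" ? "/yr" : "/mo"}`;
}

interface PriceCase {
  name: string;
  amountCents: number;
  interval: "month" | "year";
  percentOff: number;
  expected: string;
}

// Each case records what the code does today, not what it ought to do.
const cases: PriceCase[] = [
  { name: "monthly plan without coupon", amountCents: 1200, interval: "month", percentOff: 0, expected: "$12.00/mo" },
  { name: "annual plan without coupon", amountCents: 12000, interval: "year", percentOff: 0, expected: "$120.00/yr" },
  { name: "annual plan with 25% coupon", amountCents: 12000, interval: "year", percentOff: 25, expected: "$90.00/yr" },
];

// Returns a list of failure messages; an empty list means behavior is unchanged.
export function runCharacterization(): string[] {
  const failures: string[] = [];
  for (const c of cases) {
    const actual = displayPriceUnderTest(c.amountCents, c.interval, c.percentOff);
    if (actual !== c.expected) {
      failures.push(`${c.name}: expected ${c.expected}, got ${actual}`);
    }
  }
  return failures;
}
```

If a later slice changes any of these outputs, the failure message names the case, which is usually enough to decide whether the slice changed behavior or the test captured an accident.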
Ask for a second-pass correction
Even when tests pass, the first slice often has rough edges. Maybe the extracted function still accepts a whole `Plan` object when it only needs `amount` and `interval` alongside the coupon. Maybe the name is vague. Maybe the import created a cycle. The second pass should be a review prompt, not another broad refactor:
`Review the diff as a maintainer. Look for behavior changes, unnecessary movement, import cycles, confusing names, and missing tests. Do not edit yet. Return a short punch list ranked by risk.`
Then pick one correction:
`Apply only item 1 from the punch list. Keep the public behavior unchanged. Run the same focused test command again and show the new diff.`
This two-step pattern is slower than “refactor everything,” but it is faster than untangling a 900-line diff. It also teaches Claude Code your standard: inspect, patch, verify, critique, correct.
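To make one punch-list correction concrete, take the over-wide parameter mentioned above. A minimal sketch, assuming hypothetical `Plan` and `Coupon` types and an assumed formatting rule, is to narrow the parameter with TypeScript's built-in `Pick` utility type so the signature states exactly what the function reads.

```typescript
// Hypothetical types for illustration; the app's real Plan is likely larger.
interface Plan {
  id: string;
  name: string;
  amount: number; // assumed: cents per billing interval
  interval: "month" | "year";
}

interface Coupon {
  percentOff: number; // assumed: whole-number percentage, 0-100
}

// The function only reads these two fields, so it only asks for them.
type PricedPlan = Pick<Plan, "amount" | "interval">;

export function calculateDisplayPrice(plan: PricedPlan, coupon: Coupon | null): string {
  const discount = coupon ? Math.round((plan.amount * coupon.percentOff) / 100) : 0;
  const suffix = plan.interval === "year" ? "/yr" : "/mo";
  return `$${((plan.amount - discount) / 100).toFixed(2)}${suffix}`;
}

// Existing callers that hold a full Plan still type-check, so call sites do not change.
const proPlan: Plan = { id: "pro", name: "Pro", amount: 1200, interval: "month" };
export const proLabel = calculateDisplayPrice(proPlan, null);
```

Because TypeScript types are structural, this correction touches only the signature. That makes it a good candidate for a single-item follow-up: the focused tests should pass unchanged.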
Common failure modes
The first failure mode is mixed intent. A refactor that also fixes a bug, changes copy, and reformats files is no longer a refactor you can review cheaply. When you see that, stop the session and say:
`The diff mixes behavior changes with refactoring. Revert the non-refactor changes and keep only the extraction needed for pricing reuse.`
The second failure mode is test theater. Claude reports that tests pass, but it ran a broad command that skipped the relevant package or failed before reaching the target. Ask for the exact command and the last meaningful lines of output. For frontend work, pair tests with a runtime check when possible: open the page, reproduce the coupon state, and inspect the rendered price.
The third failure mode is abstraction inflation. Claude may create a generic “pricing engine” before the second caller exists. Push back:
`Do not generalize beyond BillingPage and the onboarding caller. Prefer a small pure function plus tests over a framework.`
Rule of thumb
A safe Claude Code refactor changes one seam at a time and earns the next seam with evidence. If you cannot describe the slice in one sentence, it is too large. If you cannot name the verification command, it is not ready. Make the first patch boring, make the review strict, and let the architecture improve one tested move at a time.