Loop engineering is not prompt engineering at scale
Prompt engineering optimizes one interaction. Loop engineering turns that interaction into a component inside a larger system with memory, tools, and termination logic. A cron job runs the same script on a schedule; a loop runs an agent that inspects current state, picks the next action, checks the outcome, and decides whether to continue, retry, roll back, or stop.
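The difference from a cron job can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: `run_loop`, `Decision`, and the callables passed to it are hypothetical names, and the toy state at the bottom stands in for a real repository.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Decision(Enum):
    CONTINUE = "continue"
    RETRY = "retry"
    ROLL_BACK = "roll_back"
    STOP = "stop"


@dataclass
class Step:
    action: str
    decision: Decision


def run_loop(
    observe: Callable[[], dict],
    choose_action: Callable[[dict], str],
    apply_action: Callable[[str], None],
    judge: Callable[[dict, dict], Decision],
    rollback: Callable[[], None],
    max_steps: int = 10,
) -> list[Step]:
    """Observe state, act, check the outcome, then decide what happens next."""
    history: list[Step] = []
    for _ in range(max_steps):
        before = observe()
        action = choose_action(before)
        apply_action(action)
        after = observe()
        decision = judge(before, after)
        history.append(Step(action, decision))
        if decision is Decision.ROLL_BACK:
            rollback()
            break
        if decision is Decision.STOP:
            break
        # CONTINUE and RETRY both go around again from a fresh observation.
    return history


# Toy state standing in for a repository with two failing tests.
state = {"failing": 2}
history = run_loop(
    observe=lambda: dict(state),
    choose_action=lambda s: "fix_one_test",
    apply_action=lambda a: state.update(failing=state["failing"] - 1),
    judge=lambda before, after: (
        Decision.STOP if after["failing"] == 0 else Decision.CONTINUE
    ),
    rollback=lambda: None,
)
```

A cron job would call `apply_action` on a timer and nothing else. The loop adds the two `observe` calls and the `judge` step, and `max_steps` keeps it from running forever.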
Practitioner writing in mid-2026 frames the shift plainly: stop babysitting agents with manual prompts and start designing the systems that prompt them. That does not remove engineering judgment. It moves judgment to loop design: what success looks like, what evidence counts, and when a human must intervene.
Where loops live in today's tools
Claude Code supports recurring work through /loop scheduling, hooks that fire at lifecycle points, subagents that split explore, implement, and verify roles, and headless or CI-style runs that persist after a laptop closes. Cursor supports long-running cloud agents, parallel agents on isolated branches, and Automations triggered by GitHub, Slack, Linear, or schedules. Codex and similar agents implement loops through tool calls, subagents, and repository instructions that name verification commands.
The surface differs by vendor, but the architecture repeats: goal, context, tools, observation, adjustment, termination. Pick the tool by where your team already works, then design the loop around observable repo evidence rather than model charisma.
Common patterns and when to use them
Plan-execute-verify fits bounded repo tasks with a clear pass command. Retry-with-cap helps flaky setup steps but needs a hard attempt limit per item. Evaluator-optimizer pairs work well for reviews and docs when criteria are explicit. Explore-narrow prevents premature edits in unfamiliar code. Scheduled wake-up loops handle recurring triage. Human-in-the-loop checkpoints belong before production, permission widening, or destructive operations.
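As one illustration of these patterns, here is a rough sketch of an evaluator-optimizer pair with explicit criteria. Everything in it is assumed for the example: the two criteria, the package name `example-tool`, and `toy_optimizer`, which stands in for the model call that would revise the draft.

```python
from typing import Callable

# Hypothetical explicit criteria for a docs review: name -> check function.
CRITERIA: dict[str, Callable[[str], bool]] = {
    "has a title line": lambda text: text.startswith("# "),
    "mentions the install command": lambda text: "pip install" in text,
}


def evaluate(draft: str) -> list[str]:
    """Evaluator: return the names of the criteria the draft does not meet."""
    return [name for name, check in CRITERIA.items() if not check(draft)]


def toy_optimizer(draft: str, unmet: list[str]) -> str:
    """Stand-in for the model call that revises the draft from feedback."""
    if "has a title line" in unmet:
        draft = "# Example tool\n" + draft
    if "mentions the install command" in unmet:
        draft = draft + "\nInstall with: pip install example-tool"
    return draft


def evaluator_optimizer(
    draft: str,
    optimize: Callable[[str, list[str]], str],
    max_rounds: int = 3,
) -> tuple[str, int, list[str]]:
    """Alternate evaluation and revision until all criteria pass or rounds run out."""
    for rounds_used in range(max_rounds):
        unmet = evaluate(draft)
        if not unmet:
            return draft, rounds_used, []
        draft = optimize(draft, unmet)
    return draft, max_rounds, evaluate(draft)


final, rounds, unmet = evaluator_optimizer("A small CLI.", toy_optimizer)
```

The pattern depends on `CRITERIA` being checkable. If the evaluator can only say "looks fine", the pair has nothing to converge on.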
Anthropic's agent guidance recommends adding complexity only when simpler flows fail. Start with one loop on one repository task, measure review effort and token use, then add subagents or schedules only when the simpler loop stalls.
Recommended play
- Start with one real repository task and a single plan-execute-verify loop before adding schedules or parallel agents.
- Write the done signal as a command or artifact, not a vibe: passing tests, green build, opened PR, or filed ticket.
- Cap iterations per item and escalate when the same failure repeats twice.
- Separate exploration from implementation so read-only passes cannot mutate production paths.
- Budget tokens and concurrency before running unattended cloud or scheduled loops.
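One way to keep exploration read-only, as the list above recommends, is to gate tool calls by phase. The sketch below assumes a hypothetical set of tool names and a hypothetical `authorize` check that an agent harness would call before each tool invocation. It does not reflect any specific product's permission model.

```python
from enum import Enum


class Phase(Enum):
    EXPLORE = "explore"
    IMPLEMENT = "implement"


# Hypothetical tool names; a real harness would map these to its own tools.
READ_ONLY_TOOLS = {"read_file", "search", "list_dir"}
ALLOWED_TOOLS = {
    Phase.EXPLORE: READ_ONLY_TOOLS,
    Phase.IMPLEMENT: READ_ONLY_TOOLS | {"write_file", "run_tests"},
}


class ToolNotAllowed(Exception):
    """Raised when a tool is requested outside the phase that permits it."""


def authorize(phase: Phase, tool: str) -> None:
    """Reject any tool call that the current phase does not allow."""
    if tool not in ALLOWED_TOOLS[phase]:
        raise ToolNotAllowed(f"{tool!r} is not allowed during {phase.value}")
```

Calling `authorize(Phase.EXPLORE, "write_file")` raises `ToolNotAllowed`, so an exploration pass cannot write even if the model asks to.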
When to prompt vs when to loop
Use this table to decide whether a task needs a durable loop or a single supervised agent session.
| Area | Prompt once when | Design a loop when | Stop rule to add |
|---|---|---|---|
| Task shape | The steps are predictable and fit one focused session | The agent must read errors, revise, and re-run verification | Name the verification command and maximum iterations |
| Duration | You can stay at the keyboard for the whole task | Work should continue while you review other items or close the laptop | Set a schedule or queue with a summary artifact per run |
| Risk | Changes are reversible and confined to a local branch | The loop touches shared files, CI, production config, or permissions | Require a human checkpoint before merge or deploy |
| Cost | Token use is small and visible in one sitting | Retries, parallel agents, or long horizons can compound quickly | Set per-run and per-day budgets with automatic stop |
| Team workflow | One engineer needs a quick answer or small patch | A team wants repeatable triage, review, or hygiene across repos | Publish run logs without secrets and name an owner for loop drift |
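The cost row's stop rule could look roughly like this. `TokenBudget` and its limits are illustrative assumptions, and resetting the daily counter at midnight is left to whatever scheduler runs the loop.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised to stop the loop when a token limit is crossed."""


@dataclass
class TokenBudget:
    per_run_limit: int
    per_day_limit: int
    used_this_run: int = 0
    used_today: int = 0

    def start_run(self) -> None:
        """Reset the per-run counter; the daily counter keeps accumulating."""
        self.used_this_run = 0

    def charge(self, tokens: int) -> None:
        """Record usage and stop as soon as either limit is crossed."""
        self.used_this_run += tokens
        self.used_today += tokens
        if self.used_this_run > self.per_run_limit:
            raise BudgetExceeded("per-run token budget exceeded")
        if self.used_today > self.per_day_limit:
            raise BudgetExceeded("per-day token budget exceeded")


# Placeholder limits, chosen only to make the example easy to follow.
budget = TokenBudget(per_run_limit=1000, per_day_limit=1500)
budget.start_run()
budget.charge(800)   # first run: within both limits
budget.start_run()
budget.charge(600)   # second run: 1400 used today, still within limits
```

A further `budget.charge(200)` in the second run would raise `BudgetExceeded`: the run total of 800 is within its limit, but the day total of 1600 is over.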
Execution steps
Name the goal and done signal
Write what finished means in observable terms: command output, PR state, ticket link, or report section. Avoid fuzzy goals like 'make it better' that let the loop run without a verdict.
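A done signal of this kind can be as small as an exit code or a file check. In the sketch below, `command_passes` and `artifact_exists` are hypothetical helper names, and the two stand-in commands exist only so the example runs anywhere. A real loop would use the project's own test or build command.

```python
import subprocess
import sys
from pathlib import Path


def command_passes(argv: list[str], timeout_s: int = 600) -> bool:
    """Done signal as a command: exit code 0 means finished."""
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    return result.returncode == 0


def artifact_exists(path: str) -> bool:
    """Done signal as an artifact: the file exists and is not empty."""
    p = Path(path)
    return p.is_file() and p.stat().st_size > 0


# Stand-in commands; replace with the real verification command.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]
```

Both helpers return a plain boolean, so the loop gets a verdict it cannot reinterpret.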
Choose the first pattern
Default to plan-execute-verify for code changes. Add evaluator-optimizer only when review criteria are explicit. Reserve scheduled wake-up loops for recurring triage after the single-task loop works once.
Wire observation before speed
Give the agent tests, linters, build commands, diff review, or MCP tools that return ground truth. A loop without observation is just expensive repetition.
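Observation means handing the agent what the checks actually printed, not only a pass or fail flag. This is a minimal sketch under assumptions: the `Observation` record, the `observe` helper, and the two fake checks are all made up for illustration.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Observation:
    name: str
    passed: bool
    output_tail: str


def observe(checks: dict[str, list[str]], tail_lines: int = 20) -> list[Observation]:
    """Run each check and keep the last lines of output as evidence for the agent."""
    observations = []
    for name, argv in checks.items():
        result = subprocess.run(argv, capture_output=True, text=True)
        lines = (result.stdout + result.stderr).strip().splitlines()
        observations.append(
            Observation(name, result.returncode == 0, "\n".join(lines[-tail_lines:]))
        )
    return observations


# Fake checks so the sketch is runnable; real ones would be test and lint commands.
checks = {
    "tests": [sys.executable, "-c", "print('2 passed')"],
    "lint": [
        sys.executable,
        "-c",
        "import sys; print('E501 line too long'); sys.exit(1)",
    ],
}
report = observe(checks)
```

Keeping only the tail of the output is a design choice to limit context size. A loop that needs full logs could write them to a file and pass the path.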
Set termination and escalation
Cap attempts per file or task, stop when the same error repeats, and name who approves production or permission changes. Document what the loop should do when blocked.
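The two stop rules, an attempt cap and a repeated-error check, can be combined in one small function. `run_with_stop_rules` is a hypothetical name, and the default of two repeats is an assumption. The returned strings stand in for whatever escalation channel the team uses.

```python
from typing import Callable, Optional


def run_with_stop_rules(
    attempt: Callable[[int], Optional[str]],
    max_attempts: int = 5,
    max_repeats: int = 2,
) -> str:
    """Call attempt(n) until it succeeds, repeats an error, or hits the cap.

    attempt(n) returns None on success or an error message on failure.
    """
    last_error: Optional[str] = None
    repeats = 0
    for n in range(1, max_attempts + 1):
        error = attempt(n)
        if error is None:
            return "done"
        # Count consecutive identical errors; a new error resets the count.
        repeats = repeats + 1 if error == last_error else 1
        last_error = error
        if repeats >= max_repeats:
            return f"escalate: same error {repeats} times: {error}"
    return f"escalate: attempt cap of {max_attempts} reached"
```

Comparing raw error strings is the simplest possible fingerprint. Real errors often carry timestamps or line numbers, so a production loop would likely normalize them before comparing.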
Pilot, measure, then parallelize
Run the loop on one repo task and record the review time, token use, and human interventions you actually observe. Add parallel agents or cloud handoff only when single-threaded loops are trustworthy.
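Pilot measurements can be recorded with something as plain as the sketch below. The `RunRecord` fields mirror the quantities named above. The two records are made-up placeholder values, not measurements from any real pilot.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RunRecord:
    task: str
    review_minutes: float
    tokens: int
    human_interventions: int
    succeeded: bool


def summarize(records: list[RunRecord]) -> dict:
    """Aggregate pilot runs into the numbers worth reviewing before scaling up."""
    if not records:
        raise ValueError("no runs recorded yet")
    return {
        "runs": len(records),
        "success_rate": sum(r.succeeded for r in records) / len(records),
        "mean_review_minutes": mean(r.review_minutes for r in records),
        "total_tokens": sum(r.tokens for r in records),
        "interventions": sum(r.human_interventions for r in records),
    }


# Placeholder records for illustration only.
records = [
    RunRecord("fix-flaky-test", 10.0, 40_000, 1, True),
    RunRecord("bump-dependency", 20.0, 60_000, 0, False),
]
summary = summarize(records)
```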
Common pitfalls
Fuzzy goals with no done signal
Translate goals into a verification command, required artifact, or explicit human acceptance step before the first unattended run.
Unbounded retries on the same mistake
Cap iterations per item and change strategy after repeated failures instead of paying for identical attempts.
Cron without an agent decision-maker
Ensure each run observes current state and chooses the next action; a fixed script on a timer is scheduling, not loop engineering.
Parallel agents on shared files
Isolate branches or assign disjoint ownership; merge results deliberately instead of letting agents overwrite each other.
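Disjoint ownership can be checked before any agent starts. The sketch below assumes each agent is handed an explicit set of file paths. The agent names and paths are hypothetical.

```python
from itertools import combinations


def find_conflicts(
    assignments: dict[str, set[str]],
) -> dict[tuple[str, str], set[str]]:
    """Return the files claimed by more than one agent, keyed by agent pair."""
    conflicts = {}
    for a, b in combinations(sorted(assignments), 2):
        shared = assignments[a] & assignments[b]
        if shared:
            conflicts[(a, b)] = shared
    return conflicts


# Hypothetical assignments: agent-a and agent-b both claim src/db.py.
assignments = {
    "agent-a": {"src/api.py", "src/db.py"},
    "agent-b": {"src/db.py", "tests/test_db.py"},
    "agent-c": {"docs/README.md"},
}
conflicts = find_conflicts(assignments)
```

An empty result means the agents can run in parallel. A non-empty one means the work should be reassigned or serialized before launch.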
Implementation checklist
- Write the goal and done signal in observable terms.
- Pick plan-execute-verify as the default loop pattern.
- Attach tests, linters, or builds as loop observation.
- Cap iterations and name escalation for repeated failures.
- Add human checkpoints before production or destructive actions.
- Budget tokens and parallel agents before unattended runs.
- Log outcomes without secrets and assign a loop owner.
Questions this guide answers
What should you do first?
Start with one real repository task and a single plan-execute-verify loop before adding schedules or parallel agents.
Who is this guide for?
Developers, staff engineers, and platform teams adopting agentic coding workflows in Cursor, Claude Code, Codex, or custom CI agents.
What evidence supports this guide?
This guide uses listed source material from Addy Osmani, Anthropic, and Kilo. Source links and scope notes are available on this page.