Claude Code /goal: How to Run AI Agents Autonomously for Days
Anthropic just shipped /goal — a built-in slash command that keeps Claude Code working until a definition of done is met. You type the condition, Claude runs a turn, and a small evaluator model checks whether you're there yet. If not, it starts another turn. Public reports already include 14-hour overnight sessions, 45-hour sessions on Codex, and one developer claiming a five-day continuous run.
This is the same Ralph-loop pattern people have been hand-rolling for months — now baked into the harness. OpenAI's Codex CLI shipped it first as an experimental config option about two weeks ago; Anthropic followed with a more polished implementation. This guide covers what /goal actually does, how to write conditions that don't burn tokens, the head-to-head against Codex on a real build (Claude Code finished a working Catan clone in 13 minutes), and how to use it to drain unused weekly tokens before they reset.
What /goal Actually Does
Most of Claude Code's history has been steering. You write a prompt, Claude does a thing, you steer the next prompt, Claude does the next thing. /goal inverts that. You write one thing — a completion condition — and Claude keeps running turns until that condition is satisfied or you cancel it.

The shape is familiar to anyone who's hand-rolled an agent loop. Set an objective. Run the model. Evaluate against the objective. Loop if not done. The novelty here isn't the pattern — it's that it's built into the official harness, with the evaluator wired up, the token accounting integrated, and a real "stop" mechanism so a runaway agent can't burn your weekly limit in an afternoon.
If you've heard the term "Ralph loop" from earlier this year — that's the pattern. /goal is the official, harness-native version.
How the Evaluator Loop Works
The mechanism is simple and worth understanding because it shapes everything you do with the feature. Every time Claude finishes a turn, the harness invokes a separate, smaller model with three inputs:
- Your condition — the literal text of your goal statement.
- The recent conversation — what Claude just did, what tools it called, what the results were.
- An instruction to return yes/no plus a short reason. No essays. Just a binary verdict.
If the evaluator says no, the reason becomes context for Claude's next turn. If yes, the goal clears and Claude reports done. The default evaluator is Claude Haiku — the smallest, fastest model in the lineup. Evaluator tokens are billed at Haiku rates and rarely show up as a meaningful share of total spend.
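The whole contract fits in a few lines. Here's a minimal sketch of the control flow in Python; it's an illustration of the pattern, not Anthropic's implementation, and run_turn / evaluate stand in for the main model and Haiku:

```python
from typing import Callable, Tuple

# Illustrative /goal loop: run the big model, ask a small evaluator for a
# yes/no verdict, and feed the "no" reason back into the next turn.
def run_goal(
    condition: str,
    run_turn: Callable[[str, str], str],               # (condition, feedback) -> transcript
    evaluate: Callable[[str, str], Tuple[bool, str]],  # (condition, transcript) -> (done, reason)
    max_turns: int = 30,
) -> bool:
    feedback = ""
    for _ in range(max_turns):
        transcript = run_turn(condition, feedback)      # main model does a full turn
        done, reason = evaluate(condition, transcript)  # small model's binary verdict
        if done:
            return True        # goal clears; Claude reports done
        feedback = reason      # the "no" reason becomes next-turn context
    return False               # hard stop: turn budget exhausted
```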

The architectural implication: your condition is read by a different, smaller model than the one doing the work. If you write a condition that needs deep reasoning to evaluate, Haiku may get it wrong — either declaring done too early (you stop with broken code) or never declaring done (you burn turns forever). Anchor on things Haiku can verify with confidence.
How to Use /goal — A Five-Minute Walkthrough
Open a real terminal — not the VS Code extension. Type claude to launch the CLI, then /goal followed by your condition. A minimal example:
```
/goal Build a Chrome extension that adds Slack-style emoji shortcuts.
Definition of done:
- the extension loads with no manifest errors
- typing :smile: in any input box pastes the 😀 emoji
- works in incognito mode
Stop after 30 turns regardless.
```
Claude reads the condition, plans the work, executes a turn (write the manifest, add the content script, wire up the input listener), then the evaluator checks whether the condition holds. If not, the reason ("manifest is valid but content script doesn't fire on inputs of type=email") feeds into the next turn. The session keeps looping until Haiku says yes or you hit the 30-turn cap.
For longer or more structured runs, write a plan file first and point /goal at it. Ask Claude to draft the plan with "plan for goal" and you get a markdown file with sections like:
- Brief — one-paragraph statement of the outcome
- Tech stack — languages, frameworks, runtimes
- In scope / Out of scope — surface the boundaries up-front
- Constraints — files Claude must not touch, conventions to honour
- Definition of done — the literal text Haiku evaluates against
- Acceptance criteria — discrete checks (test exits 0, file count = N, etc.)
- Verification recipe — exact commands to run to prove the criteria
- Turn budget — explicit cap so a runaway loop can't drain weekly tokens
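Here's what a minimal plan file following that shape could look like (the project, commands, and file names are invented for illustration):

```markdown
# PLAN.md (illustrative sketch; project details are hypothetical)

## Brief
Ship a CLI that converts CSV exports to JSON.

## Tech stack
Node 20, TypeScript, no runtime dependencies.

## In scope / Out of scope
In: a single csv2json command. Out: streaming mode, files over 100 MB.

## Constraints
Do not touch anything under src/legacy/.

## Definition of done
`npm test` exits 0 and `npm run build` produces dist/cli.js.

## Acceptance criteria
- fixtures/basic.csv converts to the expected JSON (snapshot test)
- a malformed row produces a non-zero exit code

## Verification recipe
npm test && npm run build && node dist/cli.js fixtures/basic.csv

## Turn budget
Stop after 40 turns regardless.
```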
Then point Claude at the file: /goal Execute the plan in PLAN.md. The model uses the file as ground truth and the evaluator checks each acceptance criterion separately. This is dramatically more reliable than a one-liner condition.
Writing a Condition That Doesn't Waste Turns
Because Haiku is the evaluator, the cardinal rule is: anchor on things Haiku can verify with a quick read of the conversation. Good and bad examples:
| Don't say | Do say |
|---|---|
| "Make the code production-ready" | "npm test exits 0 and there are no TODO or FIXME tokens in the diff" |
| "Improve performance" | "benchmark/run.sh reports p95 latency < 200ms" |
| "Refactor cleanly" | "All 12 files in src/auth/ use the new AuthClient import and the old legacyAuth symbol has zero usages" |
| "Build a working game" | "index.html loads without console errors and clicking 'Roll' updates the score display" |
The pattern: verifiable end state, observable signal, hard stop. A condition Haiku can answer with a yes/no after one glance is a condition that won't waste your weekly tokens.
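Put together, the refactor row above expands into a complete condition like this (the paths and file count are hypothetical):

```
/goal Migrate src/auth/ from legacyAuth to the new AuthClient.
Definition of done:
- all 12 files in src/auth/ import AuthClient
- grep -r "legacyAuth" src/ returns no matches
- npm test exits 0
Stop after 40 turns regardless.
```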
Head-to-Head: Claude Code vs Codex on /goal
The Codex implementation shipped first. Two weeks later Anthropic followed. That's a real win for users — same primitive on both harnesses. To test, the same prompt was given to both with their most capable models (Claude Code on Opus 4.7, Codex on GPT 5.5):
Build a single-file Settlers of Catan clone, Game of Thrones themed, 32-bit pixel art, where the four player houses are reskinned as the AI labs — Anthropic, Google, OpenAI, X. Single player versus three AI bots.

| Dimension | Claude Code (Opus 4.7) | Codex (GPT 5.5) |
|---|---|---|
| Wall-clock to "goal achieved" | 13 min 05 sec | 33 min |
| Game logic functionality | Dice rolls, settlement placement, AI turns — works | Works but UX hides affordances (no dice indicator, no disabled-state on unaffordable actions) |
| Visuals | Placeholder shapes — no native image generation | Real sprites — GPT image 2 generated assets for the house menu |
| Asset integration | N/A | Generated assets but didn't reuse them as map tiles — orphaned |
The takeaway: Codex wins on visual polish because it has GPT image 2 in the bag. Claude Code wins on functional logic in less than half the time. If your project needs real sprites or generated UI imagery, plan to call out to OpenAI's image model or Google's nano banana from inside Claude Code — Anthropic doesn't ship an image generator, and that's a real gap for game and design work.
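If you go that route, one pattern is to give Claude a small helper script it can shell out to during a run. Below is a minimal sketch using OpenAI's Python SDK; the script name and the exact model string are assumptions, so check what your account exposes:

```python
# sprite_gen.py: hypothetical helper Claude Code can invoke via Bash,
# since Claude has no native image generation. Requires the `openai`
# package and an OPENAI_API_KEY in the environment.
import base64
import sys

from openai import OpenAI


def generate_sprite(prompt: str, out_path: str) -> None:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.images.generate(
        model="gpt-image-1",  # assumption: swap in whatever your account exposes
        prompt=prompt,
        size="1024x1024",
    )
    # gpt-image-1 returns base64-encoded image data
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(resp.data[0].b64_json))


if __name__ == "__main__":
    generate_sprite(sys.argv[1], sys.argv[2])
```

Claude can then run something like python sprite_gen.py "Anthropic house banner, pixel art" assets/anthropic.png mid-goal and wire the output into the build.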
The Token-Drain Strategy: Burn Tokens Before They Reset
Claude's weekly rate limit resets on a fixed schedule. If you finish a week with 92% of your tokens unspent, that 92% just vanishes — there's no rollover. The Max plans in particular ship more tokens than most developers can consume in a normal coding week.
/goal turns that unused budget into output. Frontload token-heavy work into a single autonomous run on Sunday night before the Monday reset. Things that work well:
- Batch newsletter / blog drafts — point a skill at recent assets and ask /goal to produce N drafts. One run turned four YouTube videos into four weekly newsletter drafts: 8 minutes, 4 HTML files ready for review.
- Regression test backfill — "add test coverage for every uncovered function in src/billing/ until the coverage report reads ≥ 80%."
- Documentation sweeps — "every public API in api/ has a JSDoc block with @param and @returns."
- Refactor migrations — "every file importing legacyClient now imports v2Client and the type signatures still compile."
Add a token-budget clause to the condition: "Stop when 95% of weekly tokens have been spent OR when the task list is empty." Haiku reads the most recent /usage output and decides — a clean way to use the headroom without overshooting.
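A complete drain-run condition with the budget clause wired in might read as follows (BACKLOG.md stands for whatever task list you keep):

```
/goal Work through the unchecked items in BACKLOG.md, top to bottom.
Definition of done:
- every item in BACKLOG.md is checked off, OR
- the most recent /usage output shows ≥ 95% of weekly tokens spent
Stop after 200 turns regardless.
```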

Limitations and What to Watch For
- Terminal only. The VS Code extension doesn't expose /goal yet. Use a real shell, or your IDE's built-in terminal launching claude.
- Haiku is a small model. Vague conditions get vague evaluations. If Haiku declares done too early, the goal exits with broken code. If Haiku never declares done, you burn turns until the cap. Anchor on objective signals (exit codes, file contents, test results), not subjective ones.
- No native image generation. Real design/game projects need an external image API. Budget for it separately.
- Reports of 5-day runs are unverified. 14-hour overnight sessions are common; multi-day runs are rare and the failure modes (looping evaluator, stalled tool calls, runaway token spend) are real. Add hard caps even when you want a long run.
- Auto mode is complementary, not redundant. Auto mode auto-approves tool calls within a turn; /goal decides whether to start another turn. Use both for truly hands-off runs — but only after you trust the condition.
Long /goal sessions burn through weekly tokens fast. The Claude Max plan ships the highest weekly limits — meaningful headroom if you're running multi-hour autonomous sessions on top of normal coding work. The 20× tier ($200/mo) is the one most "I want to run /goal overnight" developers settle on.
If you're using /goal to batch-write articles, newsletters, or marketing copy, run the bulk output through Undetectr ($39 lifetime) before publishing. The text comes out indistinguishable from human writing — no AI-detection footprint, which matters when you're shipping at /goal volume.
Frequently Asked Questions
**How do I stop a running /goal session?**
Run /goal clear.

**How is /goal different from /loop?**
/loop re-fires the same prompt on a time interval; /goal keeps running until a verifiable condition holds. /goal evaluates after every turn with a separate model, while /loop just re-runs the same prompt on a schedule.

**Did Claude Code or Codex ship this first?**
Codex shipped /goal as an experimental config option roughly two weeks before Anthropic added it to Claude Code. Both implement the same Ralph-loop pattern — run a coding agent until a verifiable end state holds.

**Where does /goal run?**
/goal currently only runs from a terminal session — either your IDE's built-in terminal launching the claude CLI, or a standalone shell. It also runs non-interactively via claude -p with the goal condition embedded.

**What makes a good goal condition?**
Anchor on objective signals ("npm test exits 0"; "the new page renders without console errors") and list constraints that must not change. Add a turn-budget clause ("or stop after 60 turns") to bound runtime.

**Do I still need auto mode?**
Yes. Auto mode auto-approves tool calls within a turn; /goal decides whether to start another turn. Pair them and Claude can run a long session without any human prompts between tool calls or turns.