How to Debug a Failing Raid Task

June 3, 2026

by Alex Salerno

When a raid command fails, the first instinct is usually to scroll up looking for a stack trace. Raid is more disciplined than that: every failure carries a category, an error code, a message, and often a hint. This guide walks through how to read them, the common categories of failure, and a few flags that help isolate the cause.

1. Read the exit code first

Raid maps failures to five categorical exit codes. They tell you what kind of thing went wrong without reading any logs:

Exit code	Category	Typical cause
1	Generic	Unclassified — fall-through; treat as bug report material
2	Config	Invalid profile / repo YAML / argument
3	Task	A user task failed (shell exit non-zero, prompt missing default, wait timeout)
4	Network	Clone failed, HTTP task failed
5	Not Found	Profile / env / command / repo / file missing

Print $? (POSIX) / $LASTEXITCODE (PowerShell) right after the failed run to see it. The category alone narrows the cause to one of five buckets.
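If a wrapper script needs to branch on the category, the table above translates directly into a lookup. This is a minimal sketch, not part of Raid: the function name `categorize_exit` is made up for illustration, and treating 0 as success is the usual shell convention rather than something the table states.

```python
# Exit-code categories, copied from the table above.
RAID_EXIT_CATEGORIES = {
    1: "Generic",
    2: "Config",
    3: "Task",
    4: "Network",
    5: "Not Found",
}


def categorize_exit(code: int) -> str:
    """Return the failure category for a Raid exit code."""
    if code == 0:
        # Assumption: 0 means success, as with any well-behaved CLI.
        return "Success"
    # Codes outside the documented table are reported as unknown, not guessed.
    return RAID_EXIT_CATEGORIES.get(code, "Unknown")


if __name__ == "__main__":
    for code in (0, 2, 4, 9):
        print(code, categorize_exit(code))
```

In a real wrapper you would pass in the return code of the raid process you just ran instead of the hard-coded values.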

2. Decode the structured error

Raid surfaces a typed error object. In text mode it prints the error as prose, with the hint on its own line:

Error: profile 'web-platform' not found (PROFILE_NOT_FOUND)
Hint: use 'raid profile list' to see available profiles

In --json mode it emits a stable envelope:

{
  "error": {
    "code": "PROFILE_NOT_FOUND",
    "category": "not-found",
    "message": "profile 'web-platform' not found",
    "hint": "use 'raid profile list' to see available profiles"
  }
}

code is the most stable identifier — assert on it in scripts. hint is the fastest path to the fix.
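Asserting on code in a script might look like the sketch below. It parses the sample envelope from this section; `parse_error` is a hypothetical helper name, and how your script captures Raid's output (stdout, stderr, a file) is left out because that depends on your setup.

```python
import json
from typing import Optional

# The envelope shown above, as a script might receive it.
SAMPLE_ENVELOPE = """
{
  "error": {
    "code": "PROFILE_NOT_FOUND",
    "category": "not-found",
    "message": "profile 'web-platform' not found",
    "hint": "use 'raid profile list' to see available profiles"
  }
}
"""


def parse_error(raw: str) -> Optional[dict]:
    """Return the "error" object from an envelope, or None if there isn't one."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    error = payload.get("error")
    return error if isinstance(error, dict) else None


error = parse_error(SAMPLE_ENVELOPE)
if error is not None and error.get("code") == "PROFILE_NOT_FOUND":
    # Branch on the code; show the hint to the human reading the log.
    print(error.get("hint", ""))
```

Matching on the code rather than the message keeps the script working if the wording of the message changes.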

3. Common failure codes and how to react

Code	Category	Where it comes from	Likely fix
`PROFILE_INVALID`	Config	Profile file failed schema validation	Check the `# yaml-language-server` errors in your editor
`PROFILE_NOT_ACTIVE`	Not Found	No profile is active	`raid profile list` then `raid profile <name>`
`PROFILE_NOT_FOUND`	Not Found	Named profile isn't registered	`raid profile list` to confirm; re-add if missing
`REPO_NOT_CLONED`	Not Found	Command needs a repo that's not on disk	`raid install` or `raid install <repo>`
`COMMAND_NOT_FOUND`	Not Found	`raid foo` and no `foo` command exists	`raid --help` for the actual list
`ENV_NOT_FOUND`	Not Found	`raid env staging` but no `staging` env defined	`raid env list` to confirm naming
`TASK_SHELL_FAILED`	Task	A `Shell` task exited non-zero	Read the shell's own stderr; that's where the real cause is
`TASK_WAIT_TIMEOUT`	Task	`Wait` task exhausted its `timeout:`	The dependency isn't coming up — check it directly
`HEADLESS_PROMPT_NO_DEFAULT`	Task	Headless mode hit a Prompt without a `default:`	Add a `default:` or set the var via env before running
`CLONE_FAILED`	Network	Git couldn't clone a repo	SSH key, repo URL, or network
`TASK_HTTP_FAILED`	Network	`HTTP` task got a non-2xx or couldn't reach the URL	URL right? Auth right? Service up?
`SCHEMA_VALIDATION_FAILED`	Config	YAML doesn't match the published schema	Editor LSP usually pinpoints the line

The full list is longer, but these cover the majority of day-to-day failures.

4. Inspect the workspace state

When the cause isn't immediately obvious from the error, dump the current state:

raid context --json | jq .

That returns the active profile, the active env, every repo with its git state (branch, dirty?), and recently run commands. Half the "why doesn't this work" failures resolve here:

The repo isn't cloned (repos[*].cloned is false).
It's on the wrong branch (repos[*].branch).
Local edits are blocking a checkout (repos[*].dirty is true).
The env isn't what you thought it was.
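The first three checks are easy to automate once the context is saved as JSON. The sketch below runs them against a made-up sample: only the `cloned`, `branch`, and `dirty` fields are named above, so the top-level `repos` list, the `name` key, and the `expected_branch` parameter are assumptions to adjust against real output.

```python
import json

# Hypothetical sample. Only `cloned`, `branch`, and `dirty` come from the text;
# the "repos" list layout and the "name" key are assumptions.
SAMPLE_CONTEXT = """
{
  "repos": [
    {"name": "api", "cloned": true, "branch": "main", "dirty": false},
    {"name": "web", "cloned": true, "branch": "feature/login", "dirty": true},
    {"name": "docs", "cloned": false, "branch": null, "dirty": false}
  ]
}
"""


def find_repo_problems(context: dict, expected_branch: str = "main") -> list:
    """Return one human-readable line per problem found in the repo list."""
    problems = []
    for repo in context.get("repos", []):
        name = repo.get("name", "<unnamed>")
        if not repo.get("cloned", False):
            problems.append(f"{name}: not cloned")
            continue  # branch and dirty state mean nothing without a clone
        if repo.get("branch") != expected_branch:
            problems.append(
                f"{name}: on branch {repo.get('branch')!r}, expected {expected_branch!r}"
            )
        if repo.get("dirty", False):
            problems.append(f"{name}: has uncommitted local edits")
    return problems


for problem in find_repo_problems(json.loads(SAMPLE_CONTEXT)):
    print(problem)
```

An empty result doesn't prove the workspace is healthy, but a non-empty one usually points straight at the repo to look at.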

For agent-driven debugging, raid context serve gives the same data over MCP; see How to Use Raid as an MCP Server for AI Agents.

5. Re-run with more output

A failing Shell task already passes its stderr through. If you need more from Raid itself:

--json — emits the error envelope verbatim, including all detail fields (not all of which print in text mode).
Re-run the failing shell command directly. Find the cmd: in the YAML, change into the repo, and run it. If the bug reproduces, the problem is in the underlying tool, not in Raid.
Run a single task in isolation. Comment out the surrounding tasks in the command and re-run. Once you've isolated the failing task, the cause is usually obvious.

6. Tolerate expected failures

Sometimes a step is allowed to fail: a clean-up command that may or may not have anything to clean, or a probe that's purely informational. Set continueOnFailure: true under the task's options so the command keeps going:

- type: Shell
  cmd: rm -rf ./.cache
  options:
    continueOnFailure: true

The task's failure is logged but doesn't abort the command. The command's overall exit code is still 0 if everything else succeeded.

Don't use this to paper over real bugs — but for genuinely best-effort steps, it's the right tool.
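To make the semantics concrete, here is a toy model of the behaviour described above. It is not Raid's implementation: the task tuple, `run_command`, and the choice to return 3 (the Task category from the exit-code table) on an untolerated failure are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# A task here is (name, action, continue_on_failure); the action returns an exit status.
Task = Tuple[str, Callable[[], int], bool]

TASK_FAILURE_EXIT = 3  # "Task" category from the exit-code table


def run_command(tasks: List[Task]) -> int:
    """Run tasks in order; stop at the first failure that is not tolerated."""
    for name, action, continue_on_failure in tasks:
        status = action()
        if status == 0:
            continue
        if continue_on_failure:
            # Logged, but the remaining tasks still run.
            print(f"[{name}] failed with status {status}; continuing")
            continue
        print(f"[{name}] failed with status {status}; aborting")
        return TASK_FAILURE_EXIT
    return 0


if __name__ == "__main__":
    tasks = [
        ("clear-cache", lambda: 1, True),  # best-effort step that fails
        ("build", lambda: 0, False),       # real step that succeeds
    ]
    print("exit code:", run_command(tasks))
```

The tolerated failure leaves the overall result at 0, which is exactly why the flag should stay off for any step whose failure you would want to hear about.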

7. Headless / CI debugging

A few flags help when you're staring at a CI log instead of a live terminal:

--json — easier to grep than prose.
RAID_NO_PREFIX=1 — drops the [task-name] prefix on concurrent output so plain logs are easier to read.
Serialize via --threads 1: for raid install, reduce clone parallelism to 1 so the logs are strictly sequential.

8. When it's actually a Raid bug

If you've checked the error code, the error message is wrong or misleading, and raid context shows the expected state, you've probably hit a Raid bug. Open an issue at github.com/8bitAlex/raid/issues with the error code, the exit code, and a minimal raid.yaml that reproduces it.

Next steps

How to Wire Raid into CI — the flags that complement debugging in CI.
How to Use Raid as an MCP Server for AI Agents — get an agent to inspect the workspace state for you.
Raid technical deep dive — the underlying error model.
