SKILL.md

When to Activate

Before claiming any task is complete
Before saying tests pass
Before saying a bug is fixed
Before any positive status update
Before requesting a code review
Before marking a task done in a plan

Skip When

You are in an investigation phase (no completion claim yet) — use codi-evidence-gathering
You are debugging a known failure — use codi-debugging
You are in a brainstorming / planning phase where nothing is implemented — use codi-brainstorming / codi-plan-writer
You already ran the exact command this turn and have the output in scope — just cite it; no need to re-run

The Iron Law

NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE. Evidence means: you ran the command this session and read the output.

The Verification Gate

Follow these 5 steps before claiming completion:

IDENTIFY — What is the exact proof command? (e.g., pnpm test, cargo test, curl -s http://...)
RUN — Execute it now, in this session. Do not rely on previous runs.
READ — Read the complete output including exit codes. Do not skip past errors or warnings.
VERIFY — Does the output actually support the claim? Yes/No.
CLAIM — State the result with the specific evidence: “Tests pass: 142 passing, 0 failing” not “tests should pass”

Weasel Word Detection

If you are about to use any of these words, STOP and run the verification gate first:

“should work”, “should pass”, “should be fixed”
“probably works”, “probably passing”
“seems to work”, “seems correct”
“I believe it passes”, “I think it works”
“likely fixed”, “appears to work”
“looks good”, “looks correct” (without running a check)

These words mean you have not verified. Run the command first, then make the claim.

Evidence Table

Claim Type	Required Evidence	Not Sufficient
”Tests pass”	Run test suite, show passing count, 0 failing	Previous run, “should pass"
"Bug is fixed”	Reproduce the original bug first, then show it no longer occurs	Code changed, assumed fixed
”Feature works”	Run the specific user scenario, show actual output	Code looks correct
”Build succeeds”	Run the build command, show 0 errors	Linter passing, logs look good
”Linting passes”	Run the linter, show 0 errors/warnings	Partial check, extrapolation
”Security scan clean”	Run the scan, show findings count	Assumed clean
”Task complete”	Run all verification steps defined in the task	Tests passing
”Regression test works”	Red-green cycle verified: write, run (pass), revert fix, run (must fail), restore, run (pass)	Test passes once
”Agent completed”	Check VCS diff, verify changes exist	Agent reports “success"
"Requirements met”	Re-read plan, create checklist, verify each item	Tests passing

Red Flags

These situations signal you are about to make an unverified claim:

Expressing satisfaction (“Great, that should do it!”) before running a check
Planning the next step before verifying the current step is done
Citing a test run from earlier in the session as current evidence
Extrapolating from partial output (“the first 10 tests passed so all must pass”)
Trusting your own code reading over running the actual code
“I just changed X so Y must work now”

What Counts as Evidence

Fresh run this session. Complete output read. Exit code confirmed. If the test suite was run 10 messages ago, that is NOT fresh evidence.

Key Patterns

Tests:

Run test command -> see "34/34 pass" -> then claim "All tests pass"
NOT: "Should pass now" / "Looks correct"

Regression tests (TDD Red-Green):

Write test -> Run (must pass) -> Revert fix -> Run (MUST FAIL) -> Restore fix -> Run (must pass again)
NOT: "I've written a regression test" without completing the red-green cycle

Build:

Run build command -> see exit 0 -> then claim "Build passes"
NOT: "Linter passed" (linter does not verify compilation)

Requirements:

Re-read plan -> create checklist -> verify each item -> report gaps or completion
NOT: "Tests pass, phase complete"

Agent delegation:

Agent reports success -> check VCS diff -> verify changes exist -> report actual state
NOT: Trust agent report at face value

Rule Applies To

This rule applies to ALL of the following, not only exact phrases:

Any variation of success or completion claims
Any expression of satisfaction before running a check
Any positive statement about the state of the work
Any implication that work is done or correct
Committing, creating PRs, marking tasks done, moving to the next task
Delegating to agents and accepting their success reports

The spirit of the rule: no unverified claims, ever.

Integration

Use at the end of every codi-tdd cycle (Verify RED, Verify GREEN).
Use in codi-debugging Phase 4 before claiming fix is complete.
Use in codi-plan-execution before marking tasks done.
Use in codi-branch-finish before presenting completion options.