SKILL.md
When to Activate
- Before claiming any task is complete
- Before saying tests pass
- Before saying a bug is fixed
- Before any positive status update
- Before requesting a code review
- Before marking a task done in a plan
Skip When
- You are in an investigation phase (no completion claim yet) — use codi-evidence-gathering
- You are debugging a known failure — use codi-debugging
- You are in a brainstorming / planning phase where nothing is implemented — use codi-brainstorming / codi-plan-writer
- You already ran the exact command this turn and have the output in scope — just cite it; no need to re-run
The Iron Law
NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE. Evidence means: you ran the command this session and read the output.
The Verification Gate
Follow these 5 steps before claiming completion:
- IDENTIFY: What is the exact proof command? (e.g., pnpm test, cargo test, curl -s http://...)
- RUN: Execute it now, in this session. Do not rely on previous runs.
- READ: Read the complete output, including exit codes. Do not skip past errors or warnings.
- VERIFY: Does the output actually support the claim? Yes/No.
- CLAIM: State the result with the specific evidence: “Tests pass: 142 passing, 0 failing”, not “tests should pass”.
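The gate can be sketched as a small helper that runs the proof command, captures the complete output and exit code, and only then phrases the claim. This is a minimal illustration, not part of the skill itself: the names `Evidence`, `run_proof`, and `claim` are hypothetical, and the stand-in command simply prints a fake test summary so the sketch runs anywhere.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical record of one fresh verification run."""
    command: list[str]
    exit_code: int
    output: str  # combined stdout and stderr, kept in full


def run_proof(command: list[str]) -> Evidence:
    """RUN + READ: execute the proof command now and capture everything."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep warnings and errors with the output
        text=True,
        check=False,  # a failing command is evidence too, not an exception
    )
    return Evidence(command, result.returncode, result.stdout)


def claim(evidence: Evidence, what: str) -> str:
    """VERIFY + CLAIM: state success only when the exit code supports it."""
    cmd = " ".join(evidence.command)
    if evidence.exit_code == 0:
        return f"{what}: `{cmd}` exited 0"
    return f"NOT verified ({what}): `{cmd}` exited {evidence.exit_code}"


if __name__ == "__main__":
    # Stand-in proof command (assumption); replace with the real one.
    ev = run_proof([sys.executable, "-c", "print('142 passing, 0 failing')"])
    print(ev.output, end="")
    print(claim(ev, "Tests pass"))
```

An exit code of 0 is necessary but not always sufficient; the captured output still has to be read for the actual counts, errors, and warnings before the claim is made.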
Weasel Word Detection
If you are about to use any of these words, STOP and run the verification gate first:
- “should work”, “should pass”, “should be fixed”
- “probably works”, “probably passing”
- “seems to work”, “seems correct”
- “I believe it passes”, “I think it works”
- “likely fixed”, “appears to work”
- “looks good”, “looks correct” (without running a check)
These words mean you have not verified. Run the command first, then make the claim.
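As a rough illustration, the phrase list above could be turned into a check on a draft status message. The function name `find_weasel_words` and the regular expressions are assumptions made for this sketch; they cover only the listed phrases and will miss paraphrases.

```python
import re

# Patterns derived from the phrase list above; not an exhaustive detector.
WEASEL_PATTERNS = [
    r"\bshould (?:work|pass|be fixed)\b",
    r"\bprobably (?:works|passing)\b",
    r"\bseems (?:to work|correct)\b",
    r"\bI (?:believe it passes|think it works)\b",
    r"\blikely fixed\b",
    r"\bappears to work\b",
    r"\blooks (?:good|correct)\b",
]
_WEASEL_RE = re.compile("|".join(WEASEL_PATTERNS), re.IGNORECASE)


def find_weasel_words(draft: str) -> list[str]:
    """Return every listed weasel phrase found in a draft status message."""
    return [match.group(0) for match in _WEASEL_RE.finditer(draft)]
```

A non-empty result means the draft should not be sent until the verification gate has been run and the wording replaced with the observed evidence.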
Evidence Table
| Claim Type | Required Evidence | Not Sufficient |
|---|---|---|
| “Tests pass” | Run test suite, show passing count, 0 failing | Previous run, “should pass” |
| “Bug is fixed” | Reproduce the original bug first, then show it no longer occurs | Code changed, assumed fixed |
| “Feature works” | Run the specific user scenario, show actual output | Code looks correct |
| “Build succeeds” | Run the build command, show 0 errors | Linter passing, logs look good |
| “Linting passes” | Run the linter, show 0 errors/warnings | Partial check, extrapolation |
| “Security scan clean” | Run the scan, show findings count | Assumed clean |
| “Task complete” | Run all verification steps defined in the task | Tests passing |
| “Regression test works” | Red-green cycle verified: write, run (pass), revert fix, run (must fail), restore, run (pass) | Test passes once |
| “Agent completed” | Check VCS diff, verify changes exist | Agent reports “success” |
| “Requirements met” | Re-read plan, create checklist, verify each item | Tests passing |
Red Flags
These situations signal you are about to make an unverified claim:
- Expressing satisfaction (“Great, that should do it!”) before running a check
- Planning the next step before verifying the current step is done
- Citing a test run from earlier in the session as current evidence
- Extrapolating from partial output (“the first 10 tests passed so all must pass”)
- Trusting your own code reading over running the actual code
- “I just changed X so Y must work now”
What Counts as Evidence
Evidence is a fresh run in this session, with the complete output read and the exit code confirmed. If the test suite was run 10 messages ago, that is NOT fresh evidence.
Key Patterns
Tests:
Run test command -> see "34/34 pass" -> then claim "All tests pass"
NOT: "Should pass now" / "Looks correct"
Regression tests (TDD Red-Green):
Write test -> Run (must pass) -> Revert fix -> Run (MUST FAIL) -> Restore fix -> Run (must pass again)
NOT: "I've written a regression test" without completing the red-green cycle
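The red-green cycle above can be expressed as a small driver. This is a sketch under stated assumptions: `run_test`, `revert_fix`, and `restore_fix` are hypothetical callables the caller supplies (for example, wrappers around a test command and a VCS stash), not functions provided by any tool named here.

```python
from typing import Callable


def verify_red_green(
    run_test: Callable[[], bool],
    revert_fix: Callable[[], None],
    restore_fix: Callable[[], None],
) -> bool:
    """Return True only if the test passes with the fix and fails without it."""
    if not run_test():  # with the fix in place: must pass
        return False
    revert_fix()
    try:
        failed_without_fix = not run_test()  # without the fix: MUST FAIL
    finally:
        restore_fix()  # always put the fix back, even if the run raised
    if not failed_without_fix:
        return False  # the test never exercised the bug
    return run_test()  # with the fix restored: must pass again
```

A test that passes in all three runs proves nothing about the fix, which is why the middle run has to fail for the cycle to count.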
Build:
Run build command -> see exit 0 -> then claim "Build passes"
NOT: "Linter passed" (linter does not verify compilation)
Requirements:
Re-read plan -> create checklist -> verify each item -> report gaps or completion
NOT: "Tests pass, phase complete"
Agent delegation:
Agent reports success -> check VCS diff -> verify changes exist -> report actual state
NOT: Trust agent report at face value
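For the agent-delegation pattern, one way to check that changes actually exist is to ask the VCS directly. The sketch below assumes the repository uses git and that the agent left its work uncommitted in the working tree; the helper names `parse_porcelain` and `changed_files` are hypothetical.

```python
import subprocess


def parse_porcelain(output: str) -> list[str]:
    """Extract paths from `git status --porcelain` (v1) output.

    Each line is two status characters, a space, then the path.
    """
    return [line[3:] for line in output.splitlines() if line.strip()]


def changed_files(repo_dir: str = ".") -> list[str]:
    """List paths git reports as modified, added, deleted, or untracked."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git itself fails, e.g. not a repository
    )
    return parse_porcelain(result.stdout)
```

An empty list after an agent reports success is a signal to investigate rather than to relay the report. If the agent commits its work, the working tree will be clean and the comparison has to be made against the starting commit instead.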
Rule Applies To
This rule applies to ALL of the following, not only exact phrases:
- Any variation of success or completion claims
- Any expression of satisfaction before running a check
- Any positive statement about the state of the work
- Any implication that work is done or correct
- Committing, creating PRs, marking tasks done, moving to the next task
- Delegating to agents and accepting their success reports
The spirit of the rule: no unverified claims, ever.
Integration
- Use at the end of every codi-tdd cycle (Verify RED, Verify GREEN).
- Use in codi-debugging Phase 4 before claiming a fix is complete.
- Use in codi-plan-execution before marking tasks done.
- Use in codi-branch-finish before presenting completion options.