SKILL.md

When to Activate

User asks to “test”, “QA”, or “validate” the project
User says “let’s check if everything works”
User has a QA document and wants help executing it
After a release or major feature merge that needs verification

Skip When

User wants to run tests, check pass/fail, or measure coverage — use codi-test-suite
User wants to audit the codi installation itself — use codi-dev-e2e-testing
User wants to generate new automated tests from scratch — use codi-tdd
User wants to investigate a specific bug — use codi-debugging

Phase Types

Every QA phase falls into one of two categories. The coding agent MUST identify and label each phase before starting it.

| Type | Label | Who Drives | Examples |
|------|-------|------------|----------|
| AGENT | `[AGENT]` | Coding agent executes commands and analyzes output | CLI commands, file checks, JSON validation, config verification |
| HUMAN | `[HUMAN]` | Human must perform the action (IDE, visual, interactive) | IDE integration, interactive prompts, visual/UX checks, git commit hooks |

Rules for AGENT phases: The coding agent runs commands, reads output, and diagnoses results autonomously. The human observes and confirms.

Rules for HUMAN phases: The coding agent describes exactly what the human should do and what to look for. The human performs the action and reports back. The agent analyzes the feedback.
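
One way to make the labeling rule concrete is to carry the phase type on the phase record itself, so a phase cannot be announced without a label. This is a minimal sketch; the `PhaseType` and `Phase` names and the example phases are illustrative assumptions, not part of any codi API.

```python
from dataclasses import dataclass
from enum import Enum


class PhaseType(Enum):
    AGENT = "AGENT"  # coding agent executes commands and analyzes output
    HUMAN = "HUMAN"  # human performs the action and reports back


@dataclass
class Phase:
    number: int
    name: str
    phase_type: PhaseType  # required field: a phase cannot exist unlabeled

    def label(self) -> str:
        """Return the labeled phase title, e.g. 'Phase 1: CLI commands [AGENT]'."""
        return f"Phase {self.number}: {self.name} [{self.phase_type.value}]"


# Hypothetical example phases, one of each type.
phases = [
    Phase(1, "CLI commands", PhaseType.AGENT),
    Phase(2, "IDE integration", PhaseType.HUMAN),
]
for phase in phases:
    print(phase.label())
```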

Core Workflow — One Phase at a Time

flowchart TD
    A[Start: Load or create QA plan] --> B[Present next phase to human]
    B --> C{Phase type?}
    C -->|AGENT| D[Agent executes checks]
    C -->|HUMAN| E[Agent describes steps for human]
    D --> F[Analyze results]
    E --> G[Human performs and reports]
    G --> F
    F --> H{Bug found?}
    H -->|Yes| I[Diagnose root cause from source code]
    I --> J[Propose fix plan to human]
    J --> K{Human approves fix?}
    K -->|Yes| L[Implement fix + tests]
    L --> M[Re-run failed check to verify]
    K -->|No| N[Log as known issue]
    H -->|No| O[Update QA report]
    M --> O
    N --> O
    O --> P{Human approves moving to next phase?}
    P -->|Yes| B
    P -->|No| Q[Address concerns]
    Q --> B

CRITICAL: Never jump ahead to the next phase without explicit human approval. Present results, wait for confirmation, then proceed.
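
The approval gate can be sketched as a loop that only advances on an explicit yes. The function and callback names below are hypothetical. Re-presenting the same phase after concerns are addressed is one reading of the flowchart, not the only possible one.

```python
from typing import Callable, Iterable


def run_qa(
    phases: Iterable[str],
    run_phase: Callable[[str], str],
    approve_next: Callable[[str, str], bool],
    address_concerns: Callable[[str], None],
) -> list[str]:
    """Run phases strictly one at a time, gating every transition on approval."""
    log: list[str] = []
    for phase in phases:
        while True:
            summary = run_phase(phase)
            log.append(f"{phase}: {summary}")
            if approve_next(phase, summary):
                break  # explicit approval: move on to the next phase
            # No approval: resolve the concerns, then present the phase again.
            address_concerns(phase)
    return log
```

In an interactive session `approve_next` would wrap a real confirmation from the human. It should never default to `True`.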

Process

Step 1: Discover and Classify Test Scope

[CODING AGENT] Before starting:

Check for existing QA documents (docs/qa/, QA report files, TESTING.md)
If a QA document exists, read it and extract phases
If not, analyze the project to build a phase list:
- Read package.json or equivalent for available commands
- Check README.md for documented features
- Scan CLI entry points, API routes, or UI components
Classify each phase as AGENT or HUMAN:

AGENT phases (the coding agent can execute these directly):
- CLI commands that produce deterministic output
- File existence and content checks
- JSON/YAML validation
- Config generation and drift detection
- Error handling and edge case testing
- Build and test suite execution

HUMAN phases (these require human interaction):
- IDE integration testing (loading rules, verifying agent behavior)
- Interactive wizard/prompt flows
- Visual/UX verification
- Git hook testing that requires real commits
- Anything requiring interactive terminal input

Present the full phase list with classifications to the human for approval before starting.
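
The discovery and classification steps above can be sketched as follows. The helper names and the keyword pattern are assumptions made for illustration. The classifier only suggests a label, and the human still approves the final list.

```python
import json
import re
from pathlib import Path

# Hypothetical keyword pattern distilled from the HUMAN categories listed above.
HUMAN_HINTS = re.compile(r"\b(ide|interactive|wizards?|prompts?|visual|ux|git hooks?)\b")


def find_qa_documents(root: Path) -> list[Path]:
    """Return files under docs/qa/ plus TESTING.md, when they exist."""
    qa_dir = root / "docs" / "qa"
    found = sorted(p for p in qa_dir.iterdir() if p.is_file()) if qa_dir.is_dir() else []
    testing = root / "TESTING.md"
    if testing.is_file():
        found.append(testing)
    return found


def list_npm_scripts(root: Path) -> dict[str, str]:
    """Return the scripts section of package.json, or {} when there is none."""
    manifest = root / "package.json"
    if not manifest.is_file():
        return {}
    return json.loads(manifest.read_text(encoding="utf-8")).get("scripts", {})


def classify_phase(description: str) -> str:
    """Suggest AGENT or HUMAN from the phase description."""
    return "HUMAN" if HUMAN_HINTS.search(description.lower()) else "AGENT"
```

Keyword matching is deliberately crude. Anything it gets wrong should be corrected when the phase list is presented for approval.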

Step 2: Prepare Environment

[CODING AGENT] Verify prerequisites:

Check runtime environment (Node version, Python version, etc.)
Verify dependencies are installed
Ensure a clean state if needed (build, re-init, etc.)
Confirm environment is ready with the human
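
A prerequisite check might look like the sketch below. It assumes each tool accepts a `--version` flag, and the function names are illustrative. The set of tools to check depends on the project.

```python
import shutil
import subprocess
import sys
from typing import Optional


def tool_version(command: str, flag: str = "--version") -> Optional[str]:
    """Return the first line a tool prints for its version flag, or None if it is not on PATH."""
    executable = shutil.which(command)
    if executable is None:
        return None
    completed = subprocess.run(
        [executable, flag], capture_output=True, text=True, timeout=30, check=False
    )
    # Some tools print their version to stderr, so fall back to it.
    output = (completed.stdout or completed.stderr).strip()
    return output.splitlines()[0] if output else ""


def environment_report(commands: list[str]) -> dict[str, str]:
    """Map each required tool to its reported version, or MISSING."""
    report = {"python": sys.version.split()[0]}
    for command in commands:
        version = tool_version(command)
        report[command] = "MISSING" if version is None else version
    return report
```

For a Node project, `environment_report(["node", "npm", "git"])` would give the human a short table to confirm before Phase 1 starts.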

Step 3: Execute One Phase

For AGENT Phases

The coding agent drives execution:

Announce: “Starting Phase N: [name] [AGENT]”
Read source code to understand expected behavior BEFORE running checks
Execute each check in the phase:
- Run the command
- Compare output against expected behavior from source code
- Diagnose: PASS / FAIL / WARN / SKIP
Present results summary to the human

Format for each check:

CHECK [N.M]: [description]
Command:  [what was run]
Expected: [from source code analysis]
Actual:   [command output]
Result:   PASS | FAIL | WARN | SKIP
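
The check format above can be produced mechanically. The following is a rough sketch in which `CheckResult` and `run_check` are hypothetical names. It treats a check as PASS only on exit code 0 with an exact output match, which is a simplification: real checks often need looser comparisons and the WARN and SKIP outcomes.

```python
import shlex
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    check_id: str
    description: str
    command: str
    expected: str
    actual: str
    result: str  # PASS | FAIL | WARN | SKIP

    def render(self) -> str:
        """Format the check using the layout shown above."""
        return (
            f"CHECK [{self.check_id}]: {self.description}\n"
            f"Command:  {self.command}\n"
            f"Expected: {self.expected}\n"
            f"Actual:   {self.actual}\n"
            f"Result:   {self.result}"
        )


def run_check(check_id: str, description: str, argv: list[str], expected: str) -> CheckResult:
    """Run one command and compare its stdout with the expected text."""
    completed = subprocess.run(argv, capture_output=True, text=True, timeout=60, check=False)
    actual = completed.stdout.strip()
    result = "PASS" if completed.returncode == 0 and actual == expected else "FAIL"
    return CheckResult(check_id, description, shlex.join(argv), expected, actual, result)


# Example check that runs the current interpreter so it works on any machine.
check = run_check("1.1", "Interpreter prints greeting", [sys.executable, "-c", "print('hello')"], "hello")
print(check.render())
```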

For HUMAN Phases

The coding agent provides instructions:

Announce: “Phase N: [name] [HUMAN] — requires your action”
Describe each step the human should perform:
- Exact actions to take
- What to observe
- What success looks like
- What failure looks like
Wait for human feedback before diagnosing

Step 4: Handle Bugs

When a check fails:

Diagnose: Read the relevant source code to identify the root cause
Classify: Is it a bug, a missing feature, a config issue, or user error?
Present diagnosis to the human with:
- What happened vs what should have happened
- Root cause in the source code (file path + line)
- Proposed fix approach
Wait for human decision:
- Fix now → enter plan mode, implement fix, add tests, verify
- Fix later → log as known issue in the QA report
- Not a bug → update the QA document to reflect correct behavior
After fix: Re-run the failed check to confirm the fix works
Update the QA report with the bug, fix, and version
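
The three human decisions map naturally onto three places in the report. The sketch below is one possible bookkeeping model. The `Decision`, `QAReport`, and `record_decision` names are assumptions, and a "fix now" is only recorded once the failed check has been re-run successfully.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    FIX_NOW = "fix now"
    FIX_LATER = "fix later"
    NOT_A_BUG = "not a bug"


@dataclass
class QAReport:
    bugs_fixed: list[str] = field(default_factory=list)
    known_issues: list[str] = field(default_factory=list)
    doc_corrections: list[str] = field(default_factory=list)


def record_decision(
    report: QAReport, failure: str, decision: Decision, recheck_passed: bool = False
) -> str:
    """Record the human's decision for a failed check and return a status note."""
    if decision is Decision.FIX_NOW:
        if not recheck_passed:
            # A fix does not count until the failed check passes again.
            return "fix not verified: re-run the failed check"
        report.bugs_fixed.append(failure)
        return "fixed and verified"
    if decision is Decision.FIX_LATER:
        report.known_issues.append(failure)
        return "logged as known issue"
    report.doc_corrections.append(failure)
    return "QA document updated"
```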

Step 5: Complete Phase and Get Approval

After all checks in a phase are done:

Present phase summary:
- Total checks, passed, failed, warnings, skipped
- Any bugs found and their status (fixed / deferred)
Update the QA report document with results
Ask human: “Phase N complete. Ready to proceed to Phase N+1?”
Only proceed when human confirms
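
The phase summary counts can be tallied directly from the per-check results. This is a small sketch: `summarize_phase` is a hypothetical helper, and the output wording is illustrative.

```python
from collections import Counter

STATUSES = ("PASS", "FAIL", "WARN", "SKIP")


def summarize_phase(results: list[str]) -> str:
    """Count check results by status, rejecting values outside the four allowed ones."""
    counts = Counter(results)
    unknown = set(counts) - set(STATUSES)
    if unknown:
        raise ValueError(f"unknown result values: {sorted(unknown)}")
    parts = ", ".join(f"{counts[status]} {status}" for status in STATUSES)
    return f"{len(results)} checks: {parts}"


print(summarize_phase(["PASS", "PASS", "FAIL", "SKIP"]))
# prints "4 checks: 2 PASS, 1 FAIL, 0 WARN, 1 SKIP"
```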

Step 6: Generate Final Report

After all phases (or when human decides to stop):

Save the QA report to the project docs directory. The report is a living document updated throughout the process, not generated only at the end.

Report structure:

# QA Progress Report
**Date**: [date]
**Document**: [filename]
**Category**: REPORT

## Summary
[High-level status and context]

## QA Phase Status
| Phase | Type | Status | Notes |
|-------|------|--------|-------|
| N: [Name] | AGENT/HUMAN | COMPLETED/PENDING/IN PROGRESS | [summary] |

## Phase Results Detail
### Phase N: [Name] (STATUS)
| Test | Result | Detail |
|------|--------|--------|
| [check description] | PASS/FAIL/WARN/SKIP | [details] |

## Remaining Human-Only Phases
[Instructions for phases that still need human execution]

## Bugs Found and Fixed During QA
| # | Bug | Fix | Version |
|---|-----|-----|---------|
| N | [description] | [fix applied] | [version] |

## Doc Corrections Found During QA
[Any documentation errors discovered during testing]
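
Because the report is updated after every phase, it can help to regenerate its tables from structured data rather than editing them by hand. The sketch below renders only the phase status table from the structure above. The function names are assumptions, and the other sections would follow the same pattern.

```python
def escape_cell(text: str) -> str:
    """Escape pipes and flatten newlines so a value stays inside one table cell."""
    return text.replace("|", "\\|").replace("\n", " ")


def render_phase_status(rows: list[tuple[str, str, str, str]]) -> str:
    """Render (phase, type, status, notes) rows as the QA Phase Status section."""
    lines = [
        "## QA Phase Status",
        "| Phase | Type | Status | Notes |",
        "|-------|------|--------|-------|",
    ]
    for row in rows:
        lines.append("| " + " | ".join(escape_cell(cell) for cell in row) + " |")
    return "\n".join(lines)


# Hypothetical example rows.
print(render_phase_status([
    ("1: CLI commands", "AGENT", "COMPLETED", "12 checks, all passed"),
    ("2: IDE integration", "HUMAN", "PENDING", "Waiting for human"),
]))
```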

Key Principles

One phase at a time — never batch multiple phases without human approval between them
Never guess — always read source code to understand expected behavior before diagnosing
Classify before executing — every phase must be labeled AGENT or HUMAN before starting
Fix or defer, never ignore — every failure gets a decision: fix now, fix later, or reclassify
Living document — update the QA report after every phase, not just at the end
Human controls pace — the agent proposes, the human approves progression
Be specific — give exact commands, exact expected outputs, exact file paths
Distinguish bug from misuse — a failed check might be a real bug OR a testing environment issue
Verify fixes — after fixing a bug, re-run the failed check to confirm
Track doc errors too — if a QA document references a wrong command name or non-existent feature, log the correction

Tester Roles

| Role | Responsibility |
|------|----------------|
| HUMAN | Approves phase progression, performs HUMAN-only checks, makes fix/defer decisions, provides judgment on UX/visual checks |
| CODING AGENT | Drives AGENT phases autonomously, reads source code for diagnosis, proposes fixes, maintains the QA report document, tracks all results |

Available Agents

For automated test generation from QA findings, delegate to this agent:

codi-test-generator — Convert QA findings into automated regression tests. Prompt at ${CLAUDE_SKILL_DIR}[[/agents/test-generator.md]]

Related Skills

codi-dev-e2e-testing — Full end-to-end validation of the codi installation
codi-test-suite — Run, measure coverage, or generate tests