README

What This Skill Does

codi-pr-review is a built-in skill that takes a GitHub pull request through a five-phase review: discover (gh pr view / diff / checks), review (delegated to a PR-aware reviewer agent), verify (two-pass: claims vs. reality, then self-critique), document (persists a structured markdown report under docs/ using the [PR_REVIEW] category tag), and publish (posts via gh pr review --request-changes|--approve|--comment). Output follows OWASP-aligned severity (Critical/High/Medium/Low), Conventional Comments labels, and the Google eng-practices spirit of balancing code health vs. forward progress.

Directory Structure

pr-review/
├── template.ts                          TypeScript template literal with SKILL.md content + frontmatter
├── index.ts                             Exports template and staticDir (required by skill-template-loader)
├── README.md                            This file — developer documentation
├── evals/
│   └── evals.json                       10 test cases (5 positive, 1 slash-command, 4 negative)
├── references/
│   ├── severity-mapping.md              OWASP × Conventional Comments × decorator table with worked examples
│   └── pr-review-template.md            The markdown skeleton the skill emits in Phase 4
└── agents/
    └── pr-reviewer.md                   PR-aware reviewer agent prompt (3-pass: find → verify claims → self-critique)

Workflow

  1. Phase 1 — Discover. The agent resolves the PR number (from $ARGUMENTS or gh pr list), then pulls metadata, file inventory, and CI state. Review scale is decided from LOC changed: inline for small PRs, delegate to pr-reviewer agent for medium/large PRs.
  2. Phase 2 — Review. Either inline or via agents/pr-reviewer.md. Every finding carries severity + Conventional Comments label + decorator + file:line citation + CWE for security findings.
  3. Phase 3 — Verify. Two passes. Pass 1 walks every claim in the PR description and greps the diff for the claimed symbols — defined-but-unused is flagged CRITICAL for security-sensitive claims. Pass 2 drops unverifiable or opinion-only findings.
  4. Phase 4 — Document. Writes docs/YYYYMMDD_HHMMSS_[PR_REVIEW]_pr-N-slug.md following the skeleton in references/pr-review-template.md. The skill does not commit the file.
  5. Phase 5 — Publish. Asks the user to confirm the verdict, then posts via gh pr review (preferred — attaches to merge semantics) or falls back to gh pr comment for contributors without review permission. Returns the review URL.

Configuration

No configurable values. The skill reads project rules from .${PROJECT_NAME}/rules/ (resolved at runtime via PROJECT_NAME constant) and project docs directory conventions (docs/ with [CATEGORY] tag) from the repo itself. If the host project uses a closed category set that does not include [PR_REVIEW], the skill instructs the user to add it or falls back to [REVIEW].

Output Conventions

Document file naming

docs/YYYYMMDD_HHMMSS_[PR_REVIEW]_pr-<N>-<slug>.md
  • Timestamp from date +"%Y%m%d_%H%M%S"
  • [PR_REVIEW] is the category tag
  • <slug> is a kebab-case summary of the PR title, max 6 words

Document structure

The markdown follows references/pr-review-template.md. Required sections:

  1. Header metadata (PR URL, branch, author, scope, reviewer)
  2. Recommendation with one-paragraph rationale
  3. Severity summary table
  4. Findings grouped by severity (CRITICAL → LOW), then by theme
  5. Claims vs. reality table (from Phase 3 Pass 1)
  6. Minimum fixes before merge (numbered, actionable)
  7. Follow-up work that can land in a separate PR
  8. Ship-worthy callouts (what’s already solid)

GitHub post shape

The skill posts one body-level review via gh pr review with the full markdown doc as body. Inline comments are optional — they require the gh-pr-review extension and are only used for CRITICAL / HIGH findings when the reviewer wants them anchored to specific lines.

Installed Requirements

  • gh CLI (GitHub CLI) — required for all phases. The skill expects the user to be authenticated (gh auth login).
  • Optional: gh-pr-review extension for inline threaded comments. Skill falls back to body-level review if absent.

No Node or Python scripts ship with this skill — it is SKILL.md + references + agent prompt only. This is a deliberate deviation from the skill-creator default (Python+TS helper scripts); there is no deterministic logic to extract beyond the gh CLI calls that live directly in SKILL.md.

Design Decisions

  • Separate from codi-code-review. Code review and PR review share primitives (severity, evidence, findings) but differ in orchestration: PR review must handle gh CLI, the PR description claim-check, the published doc, and the posted review. Keeping them separate avoids bloating codi-code-review with PR-only concerns, and lets pr-review call code-review internally if the host project later factors that out.
  • OWASP-aligned severity instead of the legacy Critical / Warning / Suggestion. OWASP’s scale maps cleanly to CVSS and CWE, which is what automated tooling and compliance review expect. Conventional Comments labels carry the author-intent signal that OWASP severity alone lacks.
  • Two-pass verification is non-negotiable. This is the single biggest lever for review precision. Claude Code’s own Code Review docs document the same pattern (verification bar + self-critique) and report large reductions in false positives in production usage.
  • gh pr review over gh pr comment as the default. Review objects attach to GitHub’s merge semantics (request-changes blocks merge). Plain comments do not. The skill keeps gh pr comment as a fallback for contributors without review permission.
  • No scripts. Anything the skill does can be expressed as a gh CLI call or a Write. Adding a Python/TypeScript wrapper would add a maintenance burden without deterministic payoff.

Adapting for Similar Use Cases

To build a GitLab or Bitbucket variant:

  1. Copy this directory to src/templates/skills/mr-review/ (or similar)
  2. Swap gh pr commands in template.ts for glab mr or bb pr
  3. Adjust the review-object semantics — GitLab’s --request-changes equivalent is glab mr approve / glab mr unapprove, Bitbucket uses bb pr approve
  4. Keep the severity / label / decorator taxonomy unchanged — these are VCS-agnostic
  5. Update evals to use the new CLI tool and re-run

Testing

Run the evals:

npm run test:skills -- pr-review     # if a harness exists

Manual validation:

# 1. Install the skill locally
npm run build
codi add skill codi-pr-review --template codi-pr-review

# 2. Validate frontmatter and schema
codi validate

# 3. Generate agent configs
codi generate

# 4. Exercise the skill on a real PR
# In a repo with an open PR:
# /codi-pr-review <pr-number>

Check that the skill:

  • Reads the PR via gh pr view and gh pr diff
  • Produces the document under docs/ with the correct filename convention
  • Asks for confirmation before posting to GitHub
  • Uses gh pr review --request-changes when findings include CRITICAL or blocking HIGH
  • codi-code-review — review code that is not yet in a PR (uncommitted, single file, local branch)
  • codi-security-scan — deep OWASP audit across the whole codebase, not scoped to one PR
  • codi-project-documentation — general docs (README, ADR, API) outside PR context
  • codi-branch-finish — feature-branch lifecycle; can trigger this skill before proposing the merge

SKILL.md

When to Activate

  • User asks to review a GitHub pull request by number, URL, or branch
  • User asks to audit or comment on a PR before merge
  • User wants to publish a structured review (approve / request changes / comment) on a PR
  • User asks to produce a PR-review document that can be shared with the team
  • User invokes the /codi-pr-review slash command, optionally with a PR number argument

Skip When

  • User wants a review of uncommitted / local changes (no PR yet) — use codi-code-review
  • User wants to fix issues already surfaced in a review — use codi-debugging
  • User wants a deep OWASP security audit across the codebase — use codi-security-scan
  • User wants general project docs (README, ADR, API) — use codi-project-documentation
  • User wants to create, rebase, or land a PR — use codi-branch-finish

Review Process — 5 Phases

The workflow is deliberately phased so each step is verifiable before the next. Do not skip phases.

Phase 1 — Discover

[CODING AGENT] Gather the PR surface area before reviewing anything.

  1. Resolve the PR identifier:
    • If the user passed $ARGUMENTS, treat it as the PR number
    • Otherwise run gh pr list --state open --limit 20 and ask the user to confirm
  2. Pull metadata and scope:
    gh pr view $PR_NUMBER                       # description, author, reviewers, labels
    gh pr diff $PR_NUMBER --name-only | wc -l   # total file count
    gh pr diff $PR_NUMBER --stat                # per-file additions/deletions
    gh pr checks $PR_NUMBER                     # CI status
  3. Record:
    • Target branch (usually main or development)
    • Author
    • Total additions / deletions / file count
    • CI state (green / red / pending)
    • Labels and milestone
  4. Decide review scale based on scope:
    • < 200 lines changed → inline review in main context
    • 200 – 1000 lines → delegate to a code-review agent (see Phase 2)
    • > 1000 lines → delegate, and warn the user the PR exceeds healthy review size

Phase 2 — Review

[CODING AGENT] Produce findings, not opinions. Prefer evidence over intuition.

For small PRs, read gh pr diff $PR_NUMBER yourself and generate findings directly.

For medium / large PRs, delegate to the code-review agent:

  • Agent prompt: ${CLAUDE_SKILL_DIR}[[/agents/pr-reviewer.md]]
  • Provide the agent with: PR number, PR description, file list, target branch, any specific focus areas the user mentioned

Finding taxonomy — every finding must include:

  1. Severity — one of CRITICAL / HIGH / MEDIUM / LOW (OWASP Risk Rating-aligned)
  2. Label — Conventional Comments label (issue, suggestion, nitpick, question, thought, chore, praise)
  3. Decoratorblocking (must fix), non-blocking (should consider), or if-minor (fix only if the change is small)
  4. Evidencefile_path:line_number citation, not inference from naming
  5. Rationale — why it matters (one sentence)
  6. Fix — minimal change or code snippet (skip if obvious)

Full severity / label / decorator table: ${CLAUDE_SKILL_DIR}[[/references/severity-mapping.md]]

Focus areas — apply these checks at minimum:

  • Security: secrets, authn/authz, input validation, injection, OWASP Top 10 — reference CWE where relevant
  • Correctness: logic errors, null/edge cases, state transitions, API contract drift
  • Reliability: error handling, timeouts, retries, race conditions, transactional boundaries
  • Performance: N+1 queries, unbounded lists, missing indexes, unnecessary re-renders
  • Testing: new behavior without tests, tests that assert the wrong thing
  • Architecture: file size limits, circular deps, wrong abstraction level
  • Project rules: whatever lives in .codi/rules/ — check every change against every rule that applies

Phase 3 — Verify (Two-Pass)

[CODING AGENT] Before writing the review document, critique your own findings. This cuts false positives.

Pass 1 — Claims vs. reality

Read the PR description. For every claim the author makes (e.g. “added rate limiting”, “webhook signature verification”, “GDPR purge removes all PII”), verify it against the diff:

  • Is the feature actually present?
  • Does a dead-code search (gh pr diff $PR_NUMBER | grep -n <symbol>) show call sites?
  • Do the tests cover the behavior the claim promises?

Any claim that is not supported by the diff becomes a CRITICAL or HIGH finding under “Claims vs. reality”, depending on whether it is security-relevant.

Pass 2 — Self-critique

Walk through your finding list and remove any finding that:

  • Lacks a file:line citation
  • Relies on a framework behavior you did not actually verify
  • Is a style preference, not a defect (downgrade to nitpick (non-blocking) or drop)
  • Contradicts an explicit project rule you did not read

Keep only findings you are confident about. A short list of real issues beats a long list of noise.

Phase 4 — Document

[CODING AGENT] Produce a single markdown file that stands alone.

Filename convention (matches the project-documentation naming rule):

docs/YYYYMMDD_HHMMSS_[PR_REVIEW]_pr-<N>-<slug>.md
  • Timestamp: date +"%Y%m%d_%H%M%S"
  • Slug: kebab-case summary of the PR title, max 6 words
  • Category tag: [PR_REVIEW] (add to the project’s allowed-category list if it uses a closed set)

Document structure — follow the template at ${CLAUDE_SKILL_DIR}[[/references/pr-review-template.md]]. Required sections:

  1. Header metadata (date, PR URL, branch, author, scope, reviewer)
  2. Recommendation (approve / request changes / comment) with one-paragraph rationale
  3. Severity summary table (counts per severity)
  4. Findings grouped by severity, then by theme (security / correctness / reliability / performance / tests / style)
  5. Claims vs. reality table from Phase 3 Pass 1
  6. Minimum fixes before merge (numbered, actionable)
  7. Follow-up work (can land in a later PR if tracked)
  8. What is ship-worthy already (explicitly call out the solid parts)

Write the file with the Write tool. Do not attempt to shorten — the document is the durable record.

Phase 5 — Publish

[CODING AGENT] Post the review to GitHub. Ask the user to confirm the verdict before posting — this is a shared-state action.

Pick the right gh command based on the verdict:

VerdictCommandWhen to use
Request changesgh pr review $N --request-changes --body-file <doc>Any CRITICAL or blocking HIGH finding
Approvegh pr review $N --approve --body-file <doc>No blocking findings; only nitpicks or questions
Commentgh pr review $N --comment --body-file <doc>Advisory review, author is not ready for approval decision, or you are not in the reviewer list
Plain commentgh pr comment $N --body-file <doc>Fallback if gh pr review is not available (external contributors, no review permission)

Prefer gh pr review over gh pr comment — a review object attaches to GitHub’s merge semantics (blocks merge on request-changes), a comment does not.

Inline comments (optional, for large reviews): if the gh-pr-review extension is installed, post individual findings as inline review comments so authors see them on the right lines. Fall back to the single-body review if the extension is missing.

After posting, return the review URL to the user (e.g. https://github.com/org/repo/pull/N#pullrequestreview-<id>).

Output Format

When the skill completes, emit a terse summary:

PR #<N> review complete
  doc:      <path-to-doc>
  verdict:  request-changes | approve | comment
  findings: <CRITICAL>/<HIGH>/<MEDIUM>/<LOW>
  posted:   <review-url>

Constraints

  • Do NOT post the review to GitHub without explicit user confirmation of the verdict
  • Do NOT include praise padding or sycophantic openers in the document
  • Do NOT use severity inflation — HIGH is not the default, CRITICAL is rare
  • Do NOT emit findings without a file:line citation
  • Do NOT mix types of documents — this skill produces a PR-review report, nothing else
  • Do NOT skip Phase 3 (verification) — unverified findings waste the author’s time
  • Do NOT paste the full diff back to the user — reference it by file and line

Receiving a PR Review (for the author side)

When feedback ARRIVES on YOUR PR, switch to codi-receiving-code-review — that skill enforces the iron law (external feedback is suggestions to evaluate, not orders to follow), the forbidden-phrase list, and the 4-step READ → VERIFY → DECIDE → RESPOND workflow.

Available Agents

  • codi-pr-reviewer — severity-ranked PR-aware review. Prompt at ${CLAUDE_SKILL_DIR}[[/agents/pr-reviewer.md]]
  • codi-code-review — review code without a PR (uncommitted diff, single file, local branch)
  • codi-security-scan — deep OWASP scan across the whole codebase, not just the PR
  • codi-project-documentation — general documentation (README, ADR, API) outside PR context
  • codi-branch-finish — complete a feature branch; triggers this skill before merge
  • codi-receiving-code-review — sibling skill for the consuming side: how to evaluate and respond to PR review feedback on YOUR work