README

What This Skill Does

codi-pr-review is a built-in skill that takes a GitHub pull request through a five-phase review: discover (gh pr view / diff / checks), review (delegated to a PR-aware reviewer agent), verify (two-pass: claims vs. reality, then self-critique), document (persists a structured markdown report under docs/ using the [PR_REVIEW] category tag), and publish (posts via gh pr review --request-changes|--approve|--comment). Output follows OWASP-aligned severity (Critical/High/Medium/Low), Conventional Comments labels, and the Google eng-practices spirit of balancing code health vs. forward progress.

Directory Structure

pr-review/
├── template.ts                          TypeScript template literal with SKILL.md content + frontmatter
├── index.ts                             Exports template and staticDir (required by skill-template-loader)
├── README.md                            This file — developer documentation
├── evals/
│   └── evals.json                       10 test cases (5 positive, 1 slash-command, 4 negative)
├── references/
│   ├── severity-mapping.md              OWASP × Conventional Comments × decorator table with worked examples
│   └── pr-review-template.md            The markdown skeleton the skill emits in Phase 4
└── agents/
    └── pr-reviewer.md                   PR-aware reviewer agent prompt (3-pass: find → verify claims → self-critique)

Workflow

Phase 1 — Discover. The agent resolves the PR number (from $ARGUMENTS or gh pr list), then pulls metadata, file inventory, and CI state. Review scale is decided from LOC changed: inline for small PRs, delegate to pr-reviewer agent for medium/large PRs.
Phase 2 — Review. Either inline or via agents/pr-reviewer.md. Every finding carries severity + Conventional Comments label + decorator + file:line citation + CWE for security findings.
Phase 3 — Verify. Two passes. Pass 1 walks every claim in the PR description and greps the diff for the claimed symbols — defined-but-unused is flagged CRITICAL for security-sensitive claims. Pass 2 drops unverifiable or opinion-only findings.
Phase 4 — Document. Writes docs/YYYYMMDD_HHMMSS_[PR_REVIEW]_pr-N-slug.md following the skeleton in references/pr-review-template.md. The skill does not commit the file.
Phase 5 — Publish. Asks the user to confirm the verdict, then posts via gh pr review (preferred — attaches to merge semantics) or falls back to gh pr comment for contributors without review permission. Returns the review URL.

Configuration

No configurable values. The skill reads project rules from .${PROJECT_NAME}/rules/ (resolved at runtime via PROJECT_NAME constant) and project docs directory conventions (docs/ with [CATEGORY] tag) from the repo itself. If the host project uses a closed category set that does not include [PR_REVIEW], the skill instructs the user to add it or falls back to [REVIEW].

Output Conventions

Document file naming

docs/YYYYMMDD_HHMMSS_[PR_REVIEW]_pr-<N>-<slug>.md

Timestamp from date +"%Y%m%d_%H%M%S"
[PR_REVIEW] is the category tag
<slug> is a kebab-case summary of the PR title, max 6 words

Document structure

The markdown follows references/pr-review-template.md. Required sections:

Header metadata (PR URL, branch, author, scope, reviewer)
Recommendation with one-paragraph rationale
Severity summary table
Findings grouped by severity (CRITICAL → LOW), then by theme
Claims vs. reality table (from Phase 3 Pass 1)
Minimum fixes before merge (numbered, actionable)
Follow-up work that can land in a separate PR
Ship-worthy callouts (what’s already solid)

GitHub post shape

The skill posts one body-level review via gh pr review with the full markdown doc as body. Inline comments are optional — they require the gh-pr-review extension and are only used for CRITICAL / HIGH findings when the reviewer wants them anchored to specific lines.

Installed Requirements

gh CLI (GitHub CLI) — required for all phases. The skill expects the user to be authenticated (gh auth login).
Optional: gh-pr-review extension for inline threaded comments. Skill falls back to body-level review if absent.

No Node or Python scripts ship with this skill — it is SKILL.md + references + agent prompt only. This is a deliberate deviation from the skill-creator default (Python+TS helper scripts); there is no deterministic logic to extract beyond the gh CLI calls that live directly in SKILL.md.

Design Decisions

Separate from codi-code-review. Code review and PR review share primitives (severity, evidence, findings) but differ in orchestration: PR review must handle gh CLI, the PR description claim-check, the published doc, and the posted review. Keeping them separate avoids bloating codi-code-review with PR-only concerns, and lets pr-review call code-review internally if the host project later factors that out.
OWASP-aligned severity instead of the legacy Critical / Warning / Suggestion. OWASP’s scale maps cleanly to CVSS and CWE, which is what automated tooling and compliance review expect. Conventional Comments labels carry the author-intent signal that OWASP severity alone lacks.
Two-pass verification is non-negotiable. This is the single biggest lever for review precision. Claude Code’s own Code Review docs document the same pattern (verification bar + self-critique) and report large reductions in false positives in production usage.
gh pr review over gh pr comment as the default. Review objects attach to GitHub’s merge semantics (request-changes blocks merge). Plain comments do not. The skill keeps gh pr comment as a fallback for contributors without review permission.
No scripts. Anything the skill does can be expressed as a gh CLI call or a Write. Adding a Python/TypeScript wrapper would add a maintenance burden without deterministic payoff.

Adapting for Similar Use Cases

To build a GitLab or Bitbucket variant:

Copy this directory to src/templates/skills/mr-review/ (or similar)
Swap gh pr commands in template.ts for glab mr or bb pr
Adjust the review-object semantics — GitLab’s --request-changes equivalent is glab mr approve / glab mr unapprove, Bitbucket uses bb pr approve
Keep the severity / label / decorator taxonomy unchanged — these are VCS-agnostic
Update evals to use the new CLI tool and re-run

Testing

Run the evals:

npm run test:skills -- pr-review     # if a harness exists

Manual validation:

# 1. Install the skill locally
npm run build
codi add skill codi-pr-review --template codi-pr-review

# 2. Validate frontmatter and schema
codi validate

# 3. Generate agent configs
codi generate

# 4. Exercise the skill on a real PR
# In a repo with an open PR:
# /codi-pr-review <pr-number>

Check that the skill:

Reads the PR via gh pr view and gh pr diff
Produces the document under docs/ with the correct filename convention
Asks for confirmation before posting to GitHub
Uses gh pr review --request-changes when findings include CRITICAL or blocking HIGH

codi-code-review — review code that is not yet in a PR (uncommitted, single file, local branch)
codi-security-scan — deep OWASP audit across the whole codebase, not scoped to one PR
codi-project-documentation — general docs (README, ADR, API) outside PR context
codi-branch-finish — feature-branch lifecycle; can trigger this skill before proposing the merge

SKILL.md

When to Activate

User asks to review a GitHub pull request by number, URL, or branch
User asks to audit or comment on a PR before merge
User wants to publish a structured review (approve / request changes / comment) on a PR
User asks to produce a PR-review document that can be shared with the team
User invokes the /codi-pr-review slash command, optionally with a PR number argument

Skip When

User wants a review of uncommitted / local changes (no PR yet) — use codi-code-review
User wants to fix issues already surfaced in a review — use codi-debugging
User wants a deep OWASP security audit across the codebase — use codi-security-scan
User wants general project docs (README, ADR, API) — use codi-project-documentation
User wants to create, rebase, or land a PR — use codi-branch-finish

Review Process — 5 Phases

The workflow is deliberately phased so each step is verifiable before the next. Do not skip phases.

Phase 1 — Discover

[CODING AGENT] Gather the PR surface area before reviewing anything.

Resolve the PR identifier:
- If the user passed $ARGUMENTS, treat it as the PR number
- Otherwise run gh pr list --state open --limit 20 and ask the user to confirm

Pull metadata and scope:

gh pr view $PR_NUMBER                       # description, author, reviewers, labels
gh pr diff $PR_NUMBER --name-only | wc -l   # total file count
gh pr diff $PR_NUMBER --stat                # per-file additions/deletions
gh pr checks $PR_NUMBER                     # CI status

Record:
- Target branch (usually main or development)
- Author
- Total additions / deletions / file count
- CI state (green / red / pending)
- Labels and milestone
Decide review scale based on scope:
- < 200 lines changed → inline review in main context
- 200 – 1000 lines → delegate to a code-review agent (see Phase 2)
- > 1000 lines → delegate, and warn the user the PR exceeds healthy review size

Phase 2 — Review

[CODING AGENT] Produce findings, not opinions. Prefer evidence over intuition.

For small PRs, read gh pr diff $PR_NUMBER yourself and generate findings directly.

For medium / large PRs, delegate to the code-review agent:

Agent prompt: ${CLAUDE_SKILL_DIR}[[/agents/pr-reviewer.md]]
Provide the agent with: PR number, PR description, file list, target branch, any specific focus areas the user mentioned

Finding taxonomy — every finding must include:

Severity — one of CRITICAL / HIGH / MEDIUM / LOW (OWASP Risk Rating-aligned)
Label — Conventional Comments label (issue, suggestion, nitpick, question, thought, chore, praise)
Decorator — blocking (must fix), non-blocking (should consider), or if-minor (fix only if the change is small)
Evidence — file_path:line_number citation, not inference from naming
Rationale — why it matters (one sentence)
Fix — minimal change or code snippet (skip if obvious)

Full severity / label / decorator table: ${CLAUDE_SKILL_DIR}[[/references/severity-mapping.md]]

Focus areas — apply these checks at minimum:

Security: secrets, authn/authz, input validation, injection, OWASP Top 10 — reference CWE where relevant
Correctness: logic errors, null/edge cases, state transitions, API contract drift
Reliability: error handling, timeouts, retries, race conditions, transactional boundaries
Performance: N+1 queries, unbounded lists, missing indexes, unnecessary re-renders
Testing: new behavior without tests, tests that assert the wrong thing
Architecture: file size limits, circular deps, wrong abstraction level
Project rules: whatever lives in .codi/rules/ — check every change against every rule that applies

Phase 3 — Verify (Two-Pass)

[CODING AGENT] Before writing the review document, critique your own findings. This cuts false positives.

Pass 1 — Claims vs. reality

Read the PR description. For every claim the author makes (e.g. “added rate limiting”, “webhook signature verification”, “GDPR purge removes all PII”), verify it against the diff:

Is the feature actually present?
Does a dead-code search (gh pr diff $PR_NUMBER | grep -n <symbol>) show call sites?
Do the tests cover the behavior the claim promises?

Any claim that is not supported by the diff becomes a CRITICAL or HIGH finding under “Claims vs. reality”, depending on whether it is security-relevant.

Pass 2 — Self-critique

Walk through your finding list and remove any finding that:

Lacks a file:line citation
Relies on a framework behavior you did not actually verify
Is a style preference, not a defect (downgrade to nitpick (non-blocking) or drop)
Contradicts an explicit project rule you did not read

Keep only findings you are confident about. A short list of real issues beats a long list of noise.

Phase 4 — Document

[CODING AGENT] Produce a single markdown file that stands alone.

Filename convention (matches the project-documentation naming rule):

docs/YYYYMMDD_HHMMSS_[PR_REVIEW]_pr-<N>-<slug>.md

Timestamp: date +"%Y%m%d_%H%M%S"
Slug: kebab-case summary of the PR title, max 6 words
Category tag: [PR_REVIEW] (add to the project’s allowed-category list if it uses a closed set)

Document structure — follow the template at ${CLAUDE_SKILL_DIR}[[/references/pr-review-template.md]]. Required sections:

Header metadata (date, PR URL, branch, author, scope, reviewer)
Recommendation (approve / request changes / comment) with one-paragraph rationale
Severity summary table (counts per severity)
Findings grouped by severity, then by theme (security / correctness / reliability / performance / tests / style)
Claims vs. reality table from Phase 3 Pass 1
Minimum fixes before merge (numbered, actionable)
Follow-up work (can land in a later PR if tracked)
What is ship-worthy already (explicitly call out the solid parts)

Write the file with the Write tool. Do not attempt to shorten — the document is the durable record.

Phase 5 — Publish

[CODING AGENT] Post the review to GitHub. Ask the user to confirm the verdict before posting — this is a shared-state action.

Pick the right gh command based on the verdict:

Verdict	Command	When to use
Request changes	`gh pr review $N --request-changes --body-file <doc>`	Any CRITICAL or blocking HIGH finding
Approve	`gh pr review $N --approve --body-file <doc>`	No blocking findings; only nitpicks or questions
Comment	`gh pr review $N --comment --body-file <doc>`	Advisory review, author is not ready for approval decision, or you are not in the reviewer list
Plain comment	`gh pr comment $N --body-file <doc>`	Fallback if `gh pr review` is not available (external contributors, no review permission)

Prefer gh pr review over gh pr comment — a review object attaches to GitHub’s merge semantics (blocks merge on request-changes), a comment does not.

Inline comments (optional, for large reviews): if the gh-pr-review extension is installed, post individual findings as inline review comments so authors see them on the right lines. Fall back to the single-body review if the extension is missing.

After posting, return the review URL to the user (e.g. https://github.com/org/repo/pull/N#pullrequestreview-<id>).

Output Format

When the skill completes, emit a terse summary:

PR #<N> review complete
  doc:      <path-to-doc>
  verdict:  request-changes | approve | comment
  findings: <CRITICAL>/<HIGH>/<MEDIUM>/<LOW>
  posted:   <review-url>

Constraints

Do NOT post the review to GitHub without explicit user confirmation of the verdict
Do NOT include praise padding or sycophantic openers in the document
Do NOT use severity inflation — HIGH is not the default, CRITICAL is rare
Do NOT emit findings without a file:line citation
Do NOT mix types of documents — this skill produces a PR-review report, nothing else
Do NOT skip Phase 3 (verification) — unverified findings waste the author’s time
Do NOT paste the full diff back to the user — reference it by file and line

Receiving a PR Review (for the author side)

When feedback ARRIVES on YOUR PR, switch to codi-receiving-code-review — that skill enforces the iron law (external feedback is suggestions to evaluate, not orders to follow), the forbidden-phrase list, and the 4-step READ → VERIFY → DECIDE → RESPOND workflow.

Available Agents

codi-pr-reviewer — severity-ranked PR-aware review. Prompt at ${CLAUDE_SKILL_DIR}[[/agents/pr-reviewer.md]]

codi-code-review — review code without a PR (uncommitted diff, single file, local branch)
codi-security-scan — deep OWASP scan across the whole codebase, not just the PR
codi-project-documentation — general documentation (README, ADR, API) outside PR context
codi-branch-finish — complete a feature branch; triggers this skill before merge
codi-receiving-code-review — sibling skill for the consuming side: how to evaluate and respond to PR review feedback on YOUR work