README
A local HTTP server skill that turns any HTML file or static website into a collaboratory between a human in a browser and a coding agent in a terminal.
The agent starts the server pointed at a directory, tells the user the URL, and the user opens it. Every page served gets a tiny inspector script injected into it. When the user hovers or clicks a DOM element, the inspector reports the full context — selector, attributes, text, computed styles, bounding rect, parent chain — back to the server. The agent then polls a handful of REST endpoints to know exactly what the user is looking at, what they just interacted with, and (when enabled) can even run JavaScript inside the page to drive the UI.
This skill is the HTML-side counterpart to content-factory: same Node
CJS + stdlib http stack, same workspace conventions under .codi_output/,
same start/stop shell script pattern.
Table of contents
- What this skill does
- Directory structure
- Workflow
- Configuration
- Server API reference
- Injected inspector client
- Design decisions
- Adapting for similar use cases
- Testing
What this skill does
- Serves a user-supplied directory over HTTP on `127.0.0.1:<random-port>`.
- Injects `<script src="/__inspect/inspector.js"></script>` before `</body>` on every HTML response.
- The injected inspector draws a hover outline, lets the user click to lock a selection, and streams selection events and user interactions to the server via `POST /__inspect/ingest`.
- The server exposes a REST API the coding agent polls to read:
  - What element is currently selected (full context)
  - A ring buffer of recent user interactions (click, input, navigation, scroll)
  - Current page URL, title, viewport
- Optional (on by default, disable with `--no-eval`): the agent can push JavaScript to the server, which the in-page inspector long-polls, runs in the page context, and returns the result to the agent.
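The injection step in the list above can be sketched as follows. This is a minimal illustration, not the skill's actual `lib/injector.cjs`: the function name `injectInspector` and the idempotency check are assumptions made for the example.

```javascript
// Sketch only: hypothetical injectInspector(), not the real lib/injector.cjs.
const INSPECTOR_TAG = '<script src="/__inspect/inspector.js"></script>';

function injectInspector(html) {
  // Skip pages that already carry the tag so repeated calls are idempotent.
  if (html.includes(INSPECTOR_TAG)) return html;
  // Locate the last closing body tag, case-insensitively.
  const matches = [...html.matchAll(/<\/body\s*>/gi)];
  // No closing body tag at all: append the tag to the end of the document.
  if (matches.length === 0) return html + INSPECTOR_TAG;
  const at = matches[matches.length - 1].index;
  return html.slice(0, at) + INSPECTOR_TAG + html.slice(at);
}
```

Targeting the last `</body>` match rather than the first is a choice made here to avoid injecting into markup that merely quotes a closing body tag earlier in the page.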
Directory structure
html-live-inspect/
├── README.md                        # this file
├── index.ts                         # exports template + staticDir
├── template.ts                      # SKILL.md body (wrapped in TS literal)
├── evals/
│   └── evals.json                   # trigger + behavior evals
├── scripts/
│   ├── package.json                 # zero-dep Node package manifest
│   ├── start-server.sh              # launcher: prints JSON {url,pid,apiBase}
│   ├── stop-server.sh               # graceful shutdown via pid file
│   ├── server.cjs                   # HTTP entrypoint + static + injection
│   ├── lib/
│   │   ├── http-utils.cjs           # sendJson, sendText, readBody, status helpers
│   │   ├── workspace.cjs            # pid/log/state files under .codi_output/
│   │   ├── selection-store.cjs      # latest selection + history
│   │   ├── event-log.cjs            # ring buffer (seq-based, default 500)
│   │   ├── eval-bridge.cjs          # long-poll queue for agent→browser eval
│   │   └── injector.cjs             # <script> tag injection into HTML
│   ├── routes/
│   │   ├── health-routes.cjs        # GET /api/health, GET /api/page
│   │   ├── selection-routes.cjs     # GET /api/selection[, /history]
│   │   ├── events-routes.cjs        # GET /api/events?since=N, DELETE /api/events
│   │   ├── dom-routes.cjs           # GET /api/dom?selector=...
│   │   ├── eval-routes.cjs          # POST /api/eval (+ internal pull/push)
│   │   └── ingest-routes.cjs        # POST /__inspect/ingest (client → server)
│   └── client/
│       └── inspector.js             # injected browser-side overlay + capture
└── tests/
    └── integration/
        └── server.test.js           # boot server, hit API, assert shapes
Workflow
- Agent invokes `scripts/start-server.sh --site-dir <path>`. The launcher picks a random port, writes pid/log files under `<project>/.codi_output/html-live-inspect/`, and prints a JSON line `{"url":"http://localhost:PORT","apiBase":"http://localhost:PORT/api","pid":1234}`.
- User opens the URL in a browser. The first HTML response has the inspector script injected. The browser loads `/__inspect/inspector.js` from the server.
- User hovers and clicks elements. The inspector draws an outline, captures a full context snapshot on click, and POSTs it to `/__inspect/ingest`.
- Agent polls `GET /api/selection` to know what the user is looking at, and `GET /api/events?since=N` to catch up on recent interactions.
- Agent drives the UI (if eval is enabled) by calling `POST /api/eval` with a JavaScript body. The server queues it; the inspector’s long-poll loop picks it up, runs it inside the page, and posts the result back.
- Agent shuts down the server with `scripts/stop-server.sh`. The workspace directory stays if a `--project-dir` was given, otherwise it is wiped from `/tmp/`.
Configuration
Environment variables honored by server.cjs:
| Variable | Default | Purpose |
|---|---|---|
| HLI_PORT | random high port | bind port |
| HLI_HOST | 127.0.0.1 | bind address |
| HLI_SITE_DIR | required | directory served as static root |
| HLI_WORKSPACE | /tmp/html-live-inspect-workspace | pid/log/state root |
| HLI_ALLOW_EVAL | 1 | 0 disables /api/eval and in-page eval |
| HLI_EVENT_BUFFER | 500 | ring buffer capacity for events |
| HLI_IDLE_TIMEOUT_MS | 1800000 | auto-shutdown when idle (30 min) |
| HLI_OWNER_PID | none | parent PID to watch for exit |
All of these are settable via flags on start-server.sh:
--site-dir, --host, --port, --workspace, --no-eval,
--idle-timeout, --foreground/--background.
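As a rough illustration of how the variables in the table might be turned into a config object, here is a minimal sketch. The function name `loadConfig`, the shape of the returned object, and the use of port `0` to request an OS-assigned port are assumptions for the example; `server.cjs` may resolve its settings differently.

```javascript
// Sketch only: hypothetical loadConfig(); names and defaults come from the table above.
function loadConfig(env) {
  if (!env.HLI_SITE_DIR) throw new Error('HLI_SITE_DIR is required');
  // Parse an integer setting, falling back when it is unset or malformed.
  const int = (value, fallback) => {
    const n = Number.parseInt(value, 10);
    return Number.isNaN(n) ? fallback : n;
  };
  return {
    port: int(env.HLI_PORT, 0), // assumption: 0 lets the OS pick a free port
    host: env.HLI_HOST || '127.0.0.1',
    siteDir: env.HLI_SITE_DIR,
    workspace: env.HLI_WORKSPACE || '/tmp/html-live-inspect-workspace',
    allowEval: env.HLI_ALLOW_EVAL !== '0', // only an explicit "0" disables eval
    eventBuffer: int(env.HLI_EVENT_BUFFER, 500),
    idleTimeoutMs: int(env.HLI_IDLE_TIMEOUT_MS, 1800000),
    ownerPid: env.HLI_OWNER_PID ? int(env.HLI_OWNER_PID, null) : null,
  };
}
```

In practice it would be called as `loadConfig(process.env)`; taking the environment as a parameter keeps the function easy to test.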
Server API reference
Base URL: http://127.0.0.1:<port>. All responses are JSON unless noted.
| Method | Path | Purpose |
|---|---|---|
| GET | /api/health | {status, uptimeMs, siteDir, allowEval} |
| GET | /api/page | {url, title, viewport, userAgent, lastUpdateMs} |
| GET | /api/selection | Full context of the currently locked element — null if none |
| GET | /api/selection/history?limit=N | Previous selections (max 50) |
| GET | /api/events?since=<seq>&limit=N | {events:[...], nextSeq, dropped} |
| DELETE | /api/events | Clear the event ring buffer |
| GET | /api/dom?selector=<css> | Ask the page to resolve the selector and return context |
| POST | /api/eval | Body {js, timeoutMs?} → {ok, result?, error?} (403 if eval disabled) |
| GET | /__inspect/inspector.js | The injected client script (served raw JS) |
| GET | /__inspect/eval-pull | Internal long-poll for the inspector — not for agent use |
| POST | /__inspect/eval-push | Internal result callback from the inspector |
| POST | /__inspect/ingest | Internal sink for selection/event updates from the inspector |
Selection payload shape:
{
"seq": 42,
"timestamp": 1713110400000,
"selector": "main > section:nth-of-type(2) > button.primary",
"tag": "button",
"id": "submit",
"classes": ["btn", "primary"],
"attributes": {"type": "submit", "aria-label": "Send"},
"text": "Send message",
"outerHTMLSnippet": "<button id=\"submit\" ...>...</button>",
"boundingRect": {"x": 120, "y": 480, "width": 140, "height": 44},
"computedStyles": {"display": "inline-block", "color": "rgb(255,255,255)", ...},
"parentChain": [
{"tag": "section", "id": "", "classes": ["hero"], "selector": "main > section:nth-of-type(2)"},
...
],
"childrenCount": 1,
"pageUrl": "http://localhost:1234/index.html",
"pageTitle": "Demo"
}
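An agent will usually want to condense this payload before replying to the user. The helper below is a hypothetical sketch (the name `summarizeSelection` and the output format are not part of the skill); it only reads fields shown in the payload above.

```javascript
// Sketch only: hypothetical helper that condenses a selection payload into one line.
function summarizeSelection(sel) {
  if (!sel) return 'No element is currently selected.';
  const classes = (sel.classes || []).map((c) => '.' + c).join('');
  const id = sel.id ? '#' + sel.id : '';
  // Keep the text short so the summary stays readable in chat.
  const text = (sel.text || '').trim().slice(0, 60);
  const { width, height } = sel.boundingRect || {};
  const size = width !== undefined && height !== undefined ? `, ${width}x${height}px` : '';
  return `<${sel.tag}${id}${classes}> "${text}"${size} at ${sel.selector}`;
}
```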
Injected inspector client
scripts/client/inspector.js is the brain of the browser side. It:
- Draws a dashed outline on `mouseover` via a single absolutely-positioned overlay `<div>`; does not touch the page’s own styles.
- Locks the selection on `click` (or `Alt+click` to unlock).
- Builds a stable CSS selector path using `:nth-of-type` indices.
- Snapshots a curated set of computed styles (box model, typography, color, flex/grid), not the full `getComputedStyle` object, which is huge.
- Captures interactions (`click`, `input`, `submit`, `scroll`, navigation) and POSTs them to `/__inspect/ingest` with a monotonic sequence number.
- Long-polls `/__inspect/eval-pull`; when a task arrives, runs it in a controlled `Function()` scope, captures the result (or error), and POSTs back to `/__inspect/eval-push`.
- Exposes `window.__HLI__` with helpers for debugging but nothing the page can exploit to exfiltrate data: the inspector is isolated per origin and only communicates with its own server.
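The `:nth-of-type` selector path mentioned in the list can be built by walking up the ancestor chain. The sketch below is an assumption about the general technique, not the inspector's actual code: `buildSelector` is a hypothetical name, and the real implementation evidently also appends classes (as in `button.primary` in the sample payload), which this version omits.

```javascript
// Sketch only: hypothetical buildSelector(); the real inspector may add ids or classes.
function buildSelector(el) {
  const parts = [];
  for (let node = el; node && node.tagName; node = node.parentElement) {
    const tag = node.tagName.toLowerCase();
    const parent = node.parentElement;
    if (!parent) {
      parts.unshift(tag); // reached the root element
      break;
    }
    // Only siblings with the same tag name count toward :nth-of-type.
    const sameTag = Array.from(parent.children).filter((c) => c.tagName === node.tagName);
    const suffix = sameTag.length > 1 ? `:nth-of-type(${sameTag.indexOf(node) + 1})` : '';
    parts.unshift(tag + suffix);
  }
  return parts.join(' > ');
}
```

The function relies only on `tagName`, `parentElement`, and `children`, so it works on real DOM elements in the browser and on plain stand-in objects in a Node test.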
Design decisions
- Zero npm deps: Node stdlib `http`/`fs`/`path` only. Matches `content-factory` and eliminates install friction.
- Single-process server, single shared state: the server is expected to run one inspected site at a time, so in-memory state is fine. Pid + log files let a second invocation detect and replace a stale instance.
- Long-poll rather than WebSocket for eval: one code path, no upgrade dance, works behind anything. The 25 s long-poll timeout keeps each request shorter than typical proxy and idle-connection timeouts (browser fetch itself has no default timeout).
- Injection via regex on response body: simple and resilient. For responses that are missing `</body>`, the injector appends to the end.
- `script-src` is NOT tightened: if a user’s HTML already has a strict CSP, the injected inspector may be blocked. Documented as a known limitation; the user can add `'self'` to their `script-src` or set the `--csp-relax` flag to rewrite their `Content-Security-Policy` header (off by default).
- Eval is opt-out, not opt-in: the user explicitly opted for this during skill design. Disable with `--no-eval` for demos to untrusted stakeholders.
- No auth on localhost: bound to `127.0.0.1` by default. Exposing to a LAN requires `--host 0.0.0.0` AND `--allow-remote`, and prints a loud warning.
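The stale-instance detection mentioned in the design decisions above could rest on a pid liveness probe. The sketch below is hypothetical (`isProcessAlive` is not a name taken from `lib/workspace.cjs`); it uses Node's `process.kill` with signal `0`, which tests for the existence of a process without delivering a signal.

```javascript
// Sketch only: hypothetical stale-instance check; the real workspace.cjs may differ.
function isProcessAlive(pid) {
  // Reject anything that is not a positive integer before touching process.kill.
  if (!Number.isInteger(pid) || pid <= 0) return false;
  try {
    process.kill(pid, 0); // signal 0: existence check only, nothing is delivered
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return err.code === 'EPERM';
  }
}
```

A launcher could read the pid file, call this helper, and treat a `false` result as permission to overwrite the stale pid file.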
Adapting for similar use cases
To build a variant (e.g. a PDF inspector, a canvas inspector):
- Duplicate the directory under `src/templates/skills/<new-name>/`.
- Replace `scripts/client/inspector.js` with the target-specific capture logic.
- Adjust `scripts/lib/injector.cjs` for the new content type; the regex that finds an injection point will differ.
- Re-export and register in `src/templates/skills/index.ts` and `src/core/scaffolder/skill-template-loader.ts`.
The server core (server.cjs, routes/, lib/http-utils.cjs,
lib/workspace.cjs, lib/event-log.cjs, lib/selection-store.cjs,
lib/eval-bridge.cjs) is reusable as-is.
Testing
- Evals: run the 6 cases in `evals/evals.json` after any change to the description. Two negatives are required to keep false triggers down.
- Integration test: `tests/integration/server.test.js` boots the server against a fixture directory, hits every API endpoint, asserts response shapes, and checks that HTML responses carry the injected script tag. Run with `npx vitest run src/templates/skills/html-live-inspect/tests/`.
- Manual smoke test: run `bash scripts/start-server.sh --site-dir ./fixtures`, then open the printed URL, click elements, and `curl $apiBase/selection`.
SKILL.md
Overview
HTML Live Inspect turns any local HTML file or static website into a shared workspace between the human (in a browser) and the coding agent (in a terminal). You start a tiny Node HTTP server pointed at a directory, give the user the URL, and every page the server returns is silently augmented with a DOM inspector. When the user hovers or clicks an element, a full context snapshot is pushed to the server. The agent reads a handful of REST endpoints to know exactly what the user is looking at and what they just did. When enabled, the agent can also run JavaScript inside the live page to drive the UI, highlight things, fill forms, or trigger actions — a true collaboratory.
This skill is the HTML twin of codi-content-factory: same Node CJS +
stdlib http stack, same start/stop shell scripts, same workspace layout
under .codi_output/.
When to Activate
- The user asks you to open a local HTML file or folder so you can see what they click.
- The user says “start html live inspect”, “open this html in collaboratory mode”, “let me show you in the browser”, “watch what I select”, “inspect this page with me”.
- The user wants you to drive a local web page (click, fill, read state) while they watch in a browser.
- The user is designing HTML and wants you to know which element they mean without pasting selectors.
Skip When
- The user wants to test a live remote URL: use `codi-webapp-testing` (Playwright-backed).
- The user wants to generate social cards or slides: use `codi-content-factory`.
- The user wants to build frontend components from scratch: use `codi-frontend-design`.
Skill assets
| Asset | Purpose |
|---|---|
| ${CLAUDE_SKILL_DIR}[[/scripts/start-server.sh]] | Start the server; prints JSON {url, apiBase, pid} |
| ${CLAUDE_SKILL_DIR}[[/scripts/stop-server.sh]] | Stop the server gracefully via pid file |
| ${CLAUDE_SKILL_DIR}[[/scripts/server.cjs]] | Node HTTP entrypoint (zero deps) |
| ${CLAUDE_SKILL_DIR}[[/scripts/client/inspector.js]] | Injected browser-side overlay and capture script |
| ${CLAUDE_SKILL_DIR}[[/scripts/routes/selection-routes.cjs]] | /api/selection handlers |
| ${CLAUDE_SKILL_DIR}[[/scripts/routes/events-routes.cjs]] | /api/events handlers |
| ${CLAUDE_SKILL_DIR}[[/scripts/routes/dom-routes.cjs]] | /api/dom handler |
| ${CLAUDE_SKILL_DIR}[[/scripts/routes/eval-routes.cjs]] | /api/eval handler (agent → page JS) |
| ${CLAUDE_SKILL_DIR}[[/scripts/routes/health-routes.cjs]] | /api/health and /api/page handlers |
Agent workflow
Step 1 — Start the server
Run:
bash ${CLAUDE_SKILL_DIR}[[/scripts/start-server.sh]] --site-dir <absolute-path-to-html-or-folder>
Flags:
| Flag | Default | Purpose |
|---|---|---|
| --site-dir <path> | required | File or directory to serve |
| --host <host> | 127.0.0.1 | Bind address (localhost only by default) |
| --port <port> | random high port | Fixed port |
| --workspace <dir> | /tmp/html-live-inspect-workspace | pid/log/state root |
| --no-eval | eval enabled | Disable the /api/eval endpoint and in-page executor |
| --idle-timeout <ms> | 1800000 | Auto-shutdown after idle (30 min) |
| --foreground | background | Run attached (for debugging) |
The launcher prints one JSON line on stdout:
{"url":"http://localhost:49234","apiBase":"http://localhost:49234/api","pid":12345,"siteDir":"/abs/path","allowEval":true}
Capture url and apiBase. Tell the user the URL and ask them to open
it in their browser.
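If the agent launches the script from Node rather than reading the output by eye, capturing `url` and `apiBase` amounts to finding and parsing that JSON line. The sketch below is a hypothetical helper (`parseLauncherOutput` is an illustrative name); it tolerates extra non-JSON lines on stdout, which is an assumption about robustness rather than documented launcher behavior.

```javascript
// Sketch only: hypothetical parser for the launcher's stdout.
function parseLauncherOutput(stdout) {
  for (const line of stdout.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('{')) continue; // ignore any non-JSON noise
    let info;
    try {
      info = JSON.parse(trimmed);
    } catch {
      continue; // not valid JSON after all, keep looking
    }
    if (typeof info.url === 'string' && typeof info.apiBase === 'string') return info;
  }
  throw new Error('launcher did not print a JSON line with url and apiBase');
}
```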
Step 2 — Wait for the user to select something
The inspector supports three selection modes:
- Plain click: replaces the single current selection. Read with `GET /api/selection`.
- Cmd/Ctrl + click: toggles membership in a multi-selection set. Each selected element gets an orange overlay, and a badge in the corner shows the count. Read the set with `GET /api/selections`.
- Alt + click: clears both the single selection and the entire multi-selection set.
After the user tells you they have selected, choose the right endpoint:
# Single selection (most common)
curl -s "$apiBase/selection"
# Multi-selection set — returns {count, selections: [...]}
curl -s "$apiBase/selections"
When count > 0, prefer /api/selections and apply operations to all
of them at once (see Step 5).
Step 3 — Read context and act
The selection payload contains everything you need to reason about the element the user picked:
- `selector`: stable CSS selector (use for further DOM queries via `/api/dom?selector=...` or `/api/eval`)
- `tag`, `id`, `classes`, `attributes`: semantic identity
- `text`, `outerHTMLSnippet`: content (truncated to 500 chars / 2 KB)
- `computedStyles`: curated subset (box model, typography, color, flex/grid), not the full raw style object
- `boundingRect`: x, y, width, height in viewport coordinates
- `parentChain`: up to 5 ancestors (tag, id, classes, selector)
- `childrenCount`: direct children count
- `pageUrl`, `pageTitle`: context
Step 4 — Catch up on recent interactions
curl -s "$apiBase/events?since=0"
Returns a ring buffer of recent events (clicks, inputs, form submits,
navigations, scroll positions). Each event has a monotonic seq — poll
with ?since=<last-seq> to get only new events. The response also has
dropped (events lost to buffer overflow) and nextSeq.
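The `since` / `nextSeq` / `dropped` contract can be illustrated with a small in-memory log. This is a sketch under assumptions, not the contents of `lib/event-log.cjs`: the name `createEventLog` is hypothetical, `nextSeq` is taken to mean "the value to pass as `since` on the next poll", and a plain array with `shift()` stands in for a fixed-size ring.

```javascript
// Sketch only: hypothetical createEventLog(); the real lib/event-log.cjs may differ.
function createEventLog(capacity = 500) {
  const events = [];
  let lastSeq = 0;
  return {
    push(event) {
      events.push({ ...event, seq: ++lastSeq });
      if (events.length > capacity) events.shift(); // evict the oldest entry
      return lastSeq;
    },
    read(since = 0, limit = capacity) {
      const oldest = events.length ? events[0].seq : lastSeq + 1;
      // Events with since < seq < oldest were evicted before this caller saw them.
      const dropped = Math.max(0, oldest - since - 1);
      const batch = events.filter((e) => e.seq > since).slice(0, limit);
      // Assumption: nextSeq is the cursor to send back as ?since= next time.
      const nextSeq = batch.length ? batch[batch.length - 1].seq : since;
      return { events: batch, nextSeq, dropped };
    },
  };
}
```

With a capacity of 3 and five pushed events, a first read with `since=0` returns events 3 to 5 and reports two dropped events; a follow-up read with the returned cursor returns nothing new.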
Step 5 — Drive the page (optional, if allowEval is true)
curl -s -X POST "$apiBase/eval" \
-H "Content-Type: application/json" \
-d '{"js":"document.querySelector(\"#submit\").click(); return true;"}'
The body’s js string is wrapped in a Function() and executed in the
page context. The return value (or thrown error) comes back as
{ok, result, error}. Use this to click buttons, fill inputs, toggle
state, read computed values the static snapshot does not capture, or
highlight elements for the user.
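The "wrapped in a `Function()`" behaviour explains why the `js` string needs an explicit `return`. A minimal sketch of that execution step follows; `runSnippet` is a hypothetical name, and the real inspector's scoping, timeout handling, and result serialization are not shown.

```javascript
// Sketch only: hypothetical runSnippet(); the real inspector's scoping and serialization may differ.
function runSnippet(js) {
  try {
    // new Function(body) compiles the string as a function body, so `return` is legal.
    const result = new Function(js)();
    return { ok: true, result };
  } catch (err) {
    // Both syntax errors (at compile time) and runtime errors end up here.
    return { ok: false, error: String(err && err.message ? err.message : err) };
  }
}
```

A snippet without `return` still runs, but its `result` is `undefined`, which is why the examples in this step end with `return ...;`.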
Powerful things to do with eval:
- Highlight the element you are talking about: `const el = document.querySelector(sel); el.style.outline = '3px solid magenta'; setTimeout(() => el.style.outline = '', 2000);`
- Scroll an element into view before the user looks: `document.querySelector(sel).scrollIntoView({behavior:'smooth', block:'center'});`
- Read live values not in the static snapshot: `return {value: document.querySelector('#name').value, scroll: window.scrollY};`
- Drive a form end-to-end on behalf of the user for a demo.
Eval times out after 10 seconds by default. Override with
{"js":"...", "timeoutMs": 30000}.
Applying an operation to the multi-selection set:
# Build the whole request body with jq so quotes inside the selectors are JSON-escaped correctly
payload=$(curl -s "$apiBase/selections" | jq -c '{js: ("const sels=" + (.selections | map(.selector) | tojson) + "; sels.forEach(s=>document.querySelectorAll(s).forEach(el=>el.style.color=\"red\")); return sels.length;")}')
curl -s -X POST "$apiBase/eval" -H "Content-Type: application/json" -d "$payload"
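When the request body is assembled in JavaScript instead of the shell, `JSON.stringify` can handle both layers of escaping: once to embed the selector array as a JavaScript literal, and once to serialize the request body. The helper name `buildEvalBody` is hypothetical and the sketch only produces the body string; sending it is left to the caller.

```javascript
// Sketch only: hypothetical buildEvalBody(); builds the POST /api/eval body for a list of selectors.
function buildEvalBody(selectors, timeoutMs) {
  // JSON.stringify of a string array is also a valid JavaScript array literal.
  const js =
    `const sels=${JSON.stringify(selectors)}; ` +
    "sels.forEach(s=>document.querySelectorAll(s).forEach(el=>el.style.color='red')); " +
    'return sels.length;';
  const body = { js };
  if (timeoutMs !== undefined) body.timeoutMs = timeoutMs;
  // Second stringify escapes the quotes inside js for transport.
  return JSON.stringify(body);
}
```

Selectors that contain double quotes, such as attribute selectors, survive this round trip, whereas pasting them into a hand-written JSON string would break the payload.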
Similarity / query selection (Option C pattern):
When the user clicks one element and says “apply to all like this”, do not ask them to click every one. Instead:
- Read `GET /api/selection` to get the clicked element’s tag + classes + parent context.
- Propose 2-3 `querySelectorAll` variants that might match what they want, e.g.:
  - all siblings with the same tag: `.hero > p`
  - all elements with the same class: `.btn.primary`
  - all descendants matching a pattern: `.card h3`
- Preview the match count for each with `POST /api/eval { js: "return window.__HLI__.previewQuery('.btn.primary');" }`.
- Highlight the best match set briefly to confirm visually: `POST /api/eval { js: "return window.__HLI__.highlight('.btn.primary', 2000);" }`.
- Get user confirmation in chat.
- Apply the operation with a single eval over the full `querySelectorAll`.
The inspector exposes these helpers on window.__HLI__:
- `previewQuery(selector)` → match count (or `-1` on invalid selector)
- `highlight(selectorOrList, ms)` → briefly outlines all matches in magenta
- `listSelections()` → array of selectors currently in the multi set
- `describe(selector)` → full snapshot for a CSS selector
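For the similarity pattern, the candidate selectors can be derived mechanically from the selection payload before previewing them. The sketch below is a hypothetical agent-side helper (`proposeQueries` is not part of the skill); it assumes `parentChain[0]` is the nearest ancestor and that class names need no CSS escaping.

```javascript
// Sketch only: hypothetical proposeQueries(); derives candidate selectors from a selection payload.
function proposeQueries(sel) {
  const candidates = [];
  const parent = (sel.parentChain || [])[0]; // assumption: nearest ancestor comes first
  // Siblings with the same tag under the nearest ancestor.
  if (parent && parent.selector) candidates.push(`${parent.selector} > ${sel.tag}`);
  // Every element sharing all of the clicked element's classes.
  if (sel.classes && sel.classes.length) candidates.push('.' + sel.classes.join('.'));
  // Fallback: every element with the same tag.
  candidates.push(sel.tag);
  return [...new Set(candidates)]; // drop duplicates, keep order
}
```

Each candidate would then go through `previewQuery` and `highlight` as described in the steps above, so the user confirms the match set before anything is applied.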
Step 6 — Stop the server
bash ${CLAUDE_SKILL_DIR}[[/scripts/stop-server.sh]]
The server also auto-shuts down after 30 minutes of idle.
Server API reference
All endpoints return JSON. Base URL is ${apiBase} (from the start
script output).
| Method | Path | Body | Response |
|---|---|---|---|
| GET | /api/health | — | {status, uptimeMs, siteDir, allowEval, version} |
| GET | /api/page | — | {url, title, viewport, userAgent, lastUpdateMs} |
| GET | /api/selection | — | Single current selection, or null |
| GET | /api/selection/history?limit=N | — | Previous single-click selections |
| GET | /api/selections | — | Multi-selection set: {count, selections:[...]} |
| DELETE | /api/selections | — | Clear the multi-selection set |
| GET | /api/events?since=<seq>&limit=N | — | {events, nextSeq, dropped} |
| DELETE | /api/events | — | {ok: true} |
| GET | /api/dom?selector=<css> | — | Selection-shape object for the first match, or null |
| POST | /api/eval | {js, timeoutMs?} | {ok, result, error} — 403 if disabled |
The ingest, pull, and push endpoints under /__inspect/* are internal —
they are used by the injected client, never by the agent.
Conventions
- Always serve an absolute path. Pass `--site-dir` with the full path so the server does not depend on where it was started.
- One inspected site at a time. If you need a second, stop the first (the workspace tracks a single active server via pid file).
- Bind to localhost only. Do not pass `--host 0.0.0.0` unless the user explicitly asks for LAN access, and tell them it is unauthenticated.
- Wait for user interaction. Do not poll `/api/selection` in a tight loop. Either wait for the user to say “I clicked” or poll once every few seconds.
- Cite the selector, not the description. When referring to what the user selected in your replies, include the exact `selector` from the payload so the user knows you are looking at the same element.
- Always echo the URL to the user. The user cannot interact with a server they do not know how to open.
Output contract
When activated, you must:
- Start the server and emit the `url` to the user in plain text.
- Confirm which directory is being served and whether eval is enabled.
- Explain in one line that the user can hover + click to select.
- Poll `/api/selection` once the user confirms they have selected, and summarize the element in 2-3 sentences (tag + role + text + key classes).
- Stop the server with the stop script when the user is done, or when you are handing off to another skill.
Constraints
- Do NOT start the server without an explicit `--site-dir` from the user.
- Do NOT bind to any interface other than `127.0.0.1` unless the user explicitly asks.
- Do NOT poll APIs in a busy loop; respect the server.
- Do NOT run `/api/eval` with code the user has not seen: show the JS first, get confirmation, then execute.
- Do NOT ask the user to click every element when a similarity query could match them all: propose a selector, preview the count, confirm, then apply (Option C pattern in Step 5).
- Do NOT leave the server running across unrelated tasks. Stop it when done.
- Do NOT use this skill to inspect remote production sites; use `codi-webapp-testing` (Playwright) for that.