Skill Verifier — Verification Toolkit Orchestrator

Version: 2.1.0

Purpose: Master workflow for verifying a SkillSlap skill. Detects the skill's render mode and routes to the correct verification pipeline: server-side sandbox execution for html_sandbox skills, or the manual 3-pass pipeline for terminal/output_render skills. References other toolkit skills by tag: classifier, scanner, tester.

1. Overview

SkillSlap skills have three render modes that determine how they execute and what verification evidence gets captured:

Render Mode	What it means	Verification method
`html_sandbox`	Skill has a self-contained HTML file run in a browser	System route — server executes Playwright, captures screenshots + video
`output_render`	Skill's agent output is HTML/SVG (skill itself is text)	Manual 3-pass pipeline
`terminal`	Text-based instructions, outputs to terminal	Manual 3-pass pipeline

Detect the render mode first (Section 3, Step 1), then choose a pipeline.

2. Prerequisites

SkillSlap API access (Bearer token)
Anthropic API key set in your SkillSlap profile (required for system verification)
The following toolkit skills (find via GET /api/skills?tag=toolkit):
- Skill Classifier (tags: classifier, toolkit)
- Malware Scanner (tags: scanner, toolkit)
- API Tester (tags: tester, toolkit) — optional, for API-type skills

3. Step 1: Fetch the Skill and Detect Render Mode

http

GET /api/skills/{id}
Authorization: Bearer <token>

Extract: title, description, content, tags, version, render_mode, content_checksum

Also fetch the skill's files to check for HTML:

http

GET /api/skills/{id}/files
Authorization: Bearer <token>

Determine the pipeline to use:

code

IF render_mode == "html_sandbox"
  OR any file has extension .html or mime_type "text/html":
    → Use Pipeline A: System Verification (Section 4)
ELSE:
    → Use Pipeline B: Manual 3-Pass Verification (Section 5)
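
The routing rule above can be sketched in Python. This is a minimal illustration, assuming the skill and its files arrive as dictionaries parsed from the two GET responses. The `filename` field is an assumed name (the rule only mentions a file extension and `mime_type`), and `choose_pipeline` is a hypothetical helper, not part of the SkillSlap API.

```python
def choose_pipeline(skill: dict, files: list) -> str:
    """Return "A" (system verification) or "B" (manual 3-pass verification)."""
    if skill.get("render_mode") == "html_sandbox":
        return "A"
    for f in files:
        # `filename` is an assumed field name; `mime_type` comes from the rule above.
        name = (f.get("filename") or "").lower()
        if name.endswith(".html") or f.get("mime_type") == "text/html":
            return "A"
    return "B"
```

Note that a skill marked `terminal` still routes to Pipeline A if any attached file is HTML, which matches the OR in the rule.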

4. Pipeline A — System Verification (html_sandbox skills)

Use this for skills with render_mode: "html_sandbox" or any attached .html file.

The system verification route handles everything server-side:

AI analysis (classify, malware scan, quality scoring) via your Anthropic API key
Playwright Chromium sandbox execution (isolated, no external network)
Screenshot capture at 0s / 1s / 3s
WebM video recording of the full execution
Thumbnail upload to storage (verification-screenshots/previews/{id}.png)
render_mode and preview_thumbnail_path updated on the skill automatically
Verification record created with full execution_trace + demo_execution_trace

You do not need to run any of this manually. Just POST to the system route:

http

POST /api/skills/{id}/verifications/system
Authorization: Bearer <token>
Content-Type: application/json

{}

Requirements:

You must be the skill owner
Your SkillSlap profile must have an Anthropic API key configured (PATCH /api/users/profile with { "anthropic_api_key": "sk-ant-..." })

Response (202 Accepted):

json

{
  "verification_id": "<uuid>",
  "status": "running",
  "message": "System verification started"
}

Poll for completion:

http

GET /api/skills/{id}/verifications/system/latest
Authorization: Bearer <token>

Wait until status is "passed" or "failed". On "passed":

The skill's render_mode is set to "html_sandbox"
preview_thumbnail_path points to the captured screenshot in storage
demo_execution_trace contains visual_output steps (screenshots) and a video_output step
The skill card shows the live sandbox iframe on hover + a screenshot thumbnail automatically

If the skill fails system verification:

Check execution_trace.steps for error steps and JS console errors
Fix the HTML (reduce complexity, remove external dependencies, fix JS errors)
Re-run system verification
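
The poll-then-inspect flow above might look like the following sketch. `fetch_latest` is a hypothetical callable that wraps `GET /api/skills/{id}/verifications/system/latest` and returns the parsed JSON. The polling interval and timeout are assumptions (the source does not specify them), and the step field name `type` is assumed to hold the step type names from Section 7.

```python
import time
from typing import Callable

TERMINAL_STATUSES = ("passed", "failed")

def wait_for_verification(
    fetch_latest: Callable[[], dict],
    interval_s: float = 5.0,      # assumed polling interval
    timeout_s: float = 600.0,     # assumed upper bound on total waiting
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until status is "passed" or "failed"; raise TimeoutError otherwise."""
    waited = 0.0
    while True:
        record = fetch_latest()
        if record.get("status") in TERMINAL_STATUSES:
            return record
        if waited >= timeout_s:
            raise TimeoutError("system verification did not finish in time")
        sleep(interval_s)
        waited += interval_s

def error_steps(record: dict) -> list:
    """Collect steps of type "error" from execution_trace.steps."""
    trace = record.get("execution_trace") or {}
    return [s for s in trace.get("steps", []) if s.get("type") == "error"]
```

Passing `sleep` as a parameter keeps the loop testable without real delays.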

5. Pipeline B — Manual 3-Pass Verification (terminal / output_render skills)

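Use this for text-based instruction skills and skills whose output is rendered externally. The three passes are classification, malware scan, and quality analysis (Steps 1 to 3). API testing (Step 4) is optional, and Step 5 submits the results.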

Step 1: Classify

Follow the Skill Classifier instructions to produce a SkillClassification:

json

{
  "type": "agent_instructions",
  "requirements": { "api_access": false },
  "risk_level": "low",
  "reasoning": "..."
}

Record the classification in your execution trace.

Step 2: Malware Scan

Follow the Malware Scanner instructions to produce a MalwareScanResult:

json

{
  "scan_passed": true,
  "risk_level": "safe",
  "findings": [],
  "summary": "No threats detected."
}

If the malware scan fails (risk_level is "high" or "critical"):

Stop the pipeline
Set verification status to failed
Include the malware findings in the security_scan field of your submission

Step 3: Quality Analysis

Score the skill across 5 dimensions (0.0–1.0 each):

Clarity — Instructions are clear and unambiguous
Completeness — Covers all steps, edge cases, prerequisites
Security — Free of security concerns
Executability — An agent/human can follow and produce a result
Quality — Professional formatting, well-structured

Overall Score Formula:

code

overall = security × 0.25 + clarity × 0.20 + completeness × 0.20 + executability × 0.20 + quality × 0.15
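
As a worked example of the formula, here is a small Python sketch. The weights are taken from the formula above; the function name and the input dictionary shape are illustrative assumptions.

```python
# Weights from the overall score formula; they sum to 1.0.
WEIGHTS = {
    "security": 0.25,
    "clarity": 0.20,
    "completeness": 0.20,
    "executability": 0.20,
    "quality": 0.15,
}

def overall_score(scores: dict) -> float:
    """Weighted sum of the five dimension scores, each expected in [0.0, 1.0]."""
    for name in WEIGHTS:
        value = scores[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return sum(scores[name] * weight for name, weight in WEIGHTS.items())
```

For scores of security 0.9, clarity 0.8, completeness 0.7, executability 0.9, and quality 0.8, the terms are 0.225 + 0.16 + 0.14 + 0.18 + 0.12, giving an overall score of 0.825.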

Step 4: API Testing (Optional)

If the classification has type api_workflow and requirements.api_access is true:

Follow the API Tester instructions
Parse HTTP examples, execute requests, validate responses

Step 5: Submit Results

http

POST /api/skills/{id}/verifications
Authorization: Bearer <token>
Content-Type: application/json

{
  "tier": "community",
  "verification_mode": "local",
  "execution_trace": {
    "version": "1.0",
    "started_at": "<iso>",
    "completed_at": "<iso>",
    "steps": [ ... ],
    "summary": "Verification passed with 85% score"
  },
  "agent_info": {
    "model_name": "<your-model>",
    "model_provider": "<your-provider>",
    "agent_name": "<your-agent-name>",
    "agent_version": "<your-version>"
  }
}

6. Pass/Fail Criteria (Pipeline B)

The verification passes if ALL of the following are true:

Malware scan passed (scan_passed: true)
Security score >= 0.5
No critical or high security findings
Overall weighted score >= 0.5
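
The four criteria can be combined into a single check. The sketch below assumes the scan result has the MalwareScanResult shape shown in Section 5, and that each finding carries a `severity` field; that field name is an assumption, since the source does not show the shape of a finding.

```python
BLOCKING_LEVELS = {"high", "critical"}

def verification_passes(scan: dict, security_score: float, overall: float) -> bool:
    """Return True only if every Pipeline B pass criterion holds."""
    # `severity` on each finding is an assumed field name.
    has_blocking_finding = any(
        f.get("severity") in BLOCKING_LEVELS for f in scan.get("findings", [])
    )
    return (
        scan.get("scan_passed") is True
        and scan.get("risk_level") not in BLOCKING_LEVELS
        and not has_blocking_finding
        and security_score >= 0.5
        and overall >= 0.5
    )
```

A skill with a high overall score still fails if its security score is below 0.5, because the criteria are joined with AND.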

7. Execution Trace Step Types

Build a structured trace with these step types:

Type	Description
`info`	Informational messages
`ai_prompt`	AI model prompt (include model, provider, preview)
`ai_response`	AI model response (include tokens, parse success)
`api_request`	HTTP request made
`api_response`	HTTP response received
`assertion`	Pass/fail check
`visual_output`	Screenshot (image_data_uri, width, height)
`video_output`	Video recording (video_data_uri, mime_type, duration_ms)
`error`	Error encountered

Each step must have a timestamp (ISO 8601).
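
One way to build such a trace is sketched below. The envelope keys (`version`, `started_at`, `completed_at`, `steps`, `summary`) follow the submission example in Section 5. The step field name `type` and any extra fields passed to `make_step` are assumptions, and both helper functions are hypothetical.

```python
from datetime import datetime, timezone

STEP_TYPES = {
    "info", "ai_prompt", "ai_response", "api_request", "api_response",
    "assertion", "visual_output", "video_output", "error",
}

def now_iso() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()

def make_step(step_type: str, **fields) -> dict:
    """Create a trace step with a timestamp; reject unknown step types."""
    if step_type not in STEP_TYPES:
        raise ValueError(f"unknown step type: {step_type}")
    return {"type": step_type, "timestamp": now_iso(), **fields}

def build_trace(steps: list, started_at: str, summary: str) -> dict:
    """Wrap steps in the execution_trace envelope used when submitting."""
    return {
        "version": "1.0",
        "started_at": started_at,
        "completed_at": now_iso(),
        "steps": steps,
        "summary": summary,
    }
```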

8. Verification Modes

When submitting (Pipeline B), specify verification_mode:

Mode	Description
`local`	Agent ran the skill locally on its own machine
`remote`	Agent ran the skill on a remote server
`sandboxed`	Agent ran the skill in a Docker sandbox
`system`	Platform-managed (system route only — use Pipeline A)

9. Error Handling

If any step fails, record an error step in the trace
If AI fails to respond, retry once before marking as failed
Always submit a verification result, even on failure — the trace is valuable
Include error_message in the verification for human review
For html_sandbox skills: if system verification fails, check JS errors and simplify the HTML
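
The retry-once rule for AI calls can be sketched as follows. `call_ai` is a hypothetical zero-argument callable standing in for whatever model call your agent makes, and the `type` and `message` field names on the error step are assumptions.

```python
from datetime import datetime, timezone
from typing import Callable, Optional

def call_with_one_retry(call_ai: Callable[[], str], steps: list) -> Optional[str]:
    """Try the AI call at most twice, recording an error step for each failure.

    Returns the response, or None if both attempts failed.
    """
    for attempt in (1, 2):
        try:
            return call_ai()
        except Exception as exc:
            steps.append({
                "type": "error",
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "message": f"AI call failed on attempt {attempt}: {exc}",
            })
    return None
```

When the function returns None, mark the verification as failed and still submit the trace, since the recorded error steps are what a human reviewer will read.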

10. Generating Playground Assets

Every skill must have a visual asset for its card in the Slap Stack feed. The card media priority is: thumbnail → audio → sandbox iframe → terminal trace → dark box. Your job after verifying is to ensure the skill has the richest possible asset at the highest priority.
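
To reason about which asset a card will show, the priority order can be written as a small function. This is a sketch of the order described above, not the platform's actual rendering code. `preview_thumbnail_path`, `has_audio`, and `render_mode` appear elsewhere in this document; `has_terminal_trace` is a hypothetical flag used here for illustration.

```python
def card_media(skill: dict) -> str:
    """Return the highest-priority media kind available for a skill card."""
    if skill.get("preview_thumbnail_path"):
        return "thumbnail"          # priority 1
    if skill.get("has_audio"):
        return "audio"              # priority 2
    if skill.get("render_mode") == "html_sandbox":
        return "sandbox_iframe"     # priority 3
    if skill.get("has_terminal_trace"):  # hypothetical flag
        return "terminal_trace"     # priority 4
    return "dark_box"               # fallback
```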

By skill type:

html_sandbox (canvas games, interactive tools, visualizations)
The live sandbox iframe appears on the card automatically via render_mode === 'html_sandbox' (priority 3). System verification (Section 4) also captures a screenshot → preview_thumbnail_path (priority 1), so these cards get both. No extra work needed after Pipeline A completes.

Audio skills (has_audio: true)
The SkillCardAudioVisualizer renders on the card automatically (priority 2). No extra work needed.

AI / text agent skills (terminal or output_render, invocation_type agent or user)
These need a ## Playground section added to their skill content. The playground is a self-contained HTML page (no external dependencies) that shows a pre-canned example of the skill in action: realistic input on the left, styled output on the right. This is generated once by the agent and baked into the skill. No live AI is needed on the card.

Agent workflow skills
Add a ## Playground section containing a self-contained HTML flowchart or step diagram showing the workflow visually (e.g. Red → Green → Refactor for TDD Workflow).

Context / rules skills
Add a ## Playground section containing a styled HTML summary card listing the key rules or conventions the skill enforces.

Generating a `## Playground` section for text skills

Step 1 — Pick a seed input. Choose a realistic, concrete input that exercises the skill's core capability. For a Code Reviewer: a short function with a real bug. For a PR Description Generator: a sample diff. Keep it small enough to render clearly.

Step 2 — Run the skill. Apply the skill to the seed input and capture the actual output.

Step 3 — Build the HTML. Wrap input + output in a self-contained dark-theme HTML page. Requirements:

No external CDN or network dependencies (all CSS/JS inline)
Renders well at 600×450px (the sandbox design size)
Dark background (#0d1117 or similar), readable contrast
Syntax highlighting via inline <style> (no Prism CDN) or <pre><code> blocks
Shows the skill title and a label like "Example Input / Example Output"
Must not throw JS errors or require user interaction to render
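
A page meeting these requirements could be generated with a short Python helper. This is a minimal sketch: `build_playground_html` is a hypothetical function, the colors other than #0d1117 are illustrative choices, and the layout is only one way to fit a 600×450px viewport. Input and output are HTML-escaped so that sample code containing `<` or `&` renders as text.

```python
from html import escape

def build_playground_html(title: str, example_input: str, example_output: str) -> str:
    """Return a self-contained dark-theme page: input on the left, output on the right."""
    return f"""<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>{escape(title)}</title>
<style>
  body {{ margin: 0; background: #0d1117; color: #e6edf3; font-family: monospace; }}
  h1 {{ font-size: 16px; margin: 12px 16px; }}
  .panes {{ display: flex; gap: 12px; padding: 0 16px; }}
  .pane {{ flex: 1; min-width: 0; }}
  .label {{ font-size: 12px; color: #8b949e; margin-bottom: 4px; }}
  pre {{ margin: 0; padding: 12px; background: #161b22; border-radius: 6px;
         height: 340px; overflow: auto; white-space: pre-wrap; }}
</style>
</head>
<body>
<h1>{escape(title)}</h1>
<div class="panes">
  <div class="pane"><div class="label">Example Input</div><pre><code>{escape(example_input)}</code></pre></div>
  <div class="pane"><div class="label">Example Output</div><pre><code>{escape(example_output)}</code></pre></div>
</div>
</body>
</html>"""
```

The page uses no JavaScript at all, which trivially satisfies the "no JS errors" and "no interaction required" requirements.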

Step 4 — Add to skill content. Append the section:

markdown

## Playground

<!-- Self-contained demo — no external dependencies -->
<!DOCTYPE html>
<html>
...
</html>

Step 5 — Update the skill via update_skill with the new content including ## Playground.

Step 6 — Capture the screenshot. For html_sandbox skills the system route does this automatically. For text skills with a ## Playground section, render the HTML locally, take a screenshot, and upload it via attach_demo_media with type: "image" and set preview_thumbnail_path to the stored path. This makes the card show your demo as its primary visual (priority 1, SkillCardPreview).

Quality bar for playground HTML

Requirement	Detail
Self-contained	Zero external requests — no CDN, no fonts, no images from URLs
Correct dimensions	Designed for 600×450px viewport
Dark theme	Background no lighter than `#1a1a2e`, text ≥ 60% contrast
No interaction required	Renders the demo state immediately on load
No JS errors	Clean console — errors break the sandbox iframe
Meaningful content	Shows actual input → output, not placeholder lorem ipsum
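
The "Self-contained" row can be partly checked before you upload. The following is a heuristic sketch only: it flags `src`/`href` attributes and CSS `url()` values that point at `http://`, `https://`, or protocol-relative `//` addresses, plus any CSS `@import`. It does not detect requests made from JavaScript (for example `fetch`), so a clean result is not proof that the page makes zero external requests.

```python
import re

# Heuristic patterns for references that would trigger a network request.
EXTERNAL_REF = re.compile(
    r"""(?:src|href)\s*=\s*["']?\s*(?:https?:)?//"""
    r"""|url\(\s*["']?\s*(?:https?:)?//"""
    r"""|@import""",
    re.IGNORECASE,
)

def has_external_reference(html: str) -> bool:
    """Return True if the HTML appears to load anything from an external URL."""
    return EXTERNAL_REF.search(html) is not None
```

Inline `data:` URIs are not flagged, so embedded images remain allowed.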
