Session Retrospective

A structured self-improvement protocol for AI agents: capture execution signals during a session, analyze patterns, and write actionable retrospective documents that improve future performance.

Overview

AI agents repeat the same mistakes across sessions when there is no mechanism to capture what went wrong and act on it. This skill defines a lightweight signal-capture protocol and a standardized retrospective format that together create institutional memory across sessions.

Inspired by the GEP (Governance Evolution Protocol) pattern from the AI agent capability evolution community.

Signal Types

During any agent session, six types of signals indicate improvement opportunities. Capture each one as it occurs rather than reconstructing it after the session ends.

Signal: `log_error`

Execution errors, exceptions, or tool failures during the session.

markdown

Signal: log_error
Description: vitest run failed with "Cannot find module '@/lib/stripe'"
Context: Ran tests before reading the mock setup in the test file
Root cause: Loaded the route file without checking for missing test prerequisites

Signal: `protocol_drift`

Behavior that deviated from stated guidelines, rules, or the user's stated workflow.

markdown

Signal: protocol_drift
Description: Started implementation without writing a failing test first
Context: Added tips_enabled to updateSkillSchema before writing the test
Root cause: Pressure to complete quickly overrode TDD protocol

Signal: `user_feature_request`

Explicit requests from the user for capabilities that don't exist yet.

markdown

Signal: user_feature_request
Description: User asked to disable tips for community-inspired skills
Context: tips_enabled existed in DB but not in the API or MCP tool
Impact: Required a Chunk 0 to fix 3 separate gaps before main work

Signal: `high_tool_usage`

A tool was called more times than necessary, which indicates an inefficient approach.

markdown

Signal: high_tool_usage
Description: Read() called 6 times on the same file during a single task
Context: First read was full file; subsequent reads were targeted ranges
Root cause: Did not use Grep first to identify the relevant section

Signal: `perf_bottleneck`

Operations that were slower than expected — timeouts, long waits, or rate limiting.

markdown

Signal: perf_bottleneck
Description: npx vitest run took 47s due to loading 3,847 test files in full
Context: Run triggered during a quick sanity check
Root cause: Ran the full suite for a sanity check instead of passing a file filter (vitest run <path>) to limit the run to the affected tests

Signal: `capability_gap`

Missing functionality the agent needed but didn't have.

markdown

Signal: capability_gap
Description: No way to verify a skill on dev environment from within a task
Context: Had to use curl to call /api/v1/skills/{id}/verify directly
Root cause: MCP verify_skill tool exists but requires knowing the Supabase URL
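
The six signal types above could be captured in a small in-session log. The following is a minimal sketch, not part of the skill itself: `SignalType`, `Signal`, and `SignalLog` are hypothetical names, and the field layout simply mirrors the examples above (the `user_feature_request` example ends with "Impact" rather than "Root cause", so that label is configurable).

```python
from dataclasses import dataclass
from enum import Enum


class SignalType(Enum):
    LOG_ERROR = "log_error"
    PROTOCOL_DRIFT = "protocol_drift"
    USER_FEATURE_REQUEST = "user_feature_request"
    HIGH_TOOL_USAGE = "high_tool_usage"
    PERF_BOTTLENECK = "perf_bottleneck"
    CAPABILITY_GAP = "capability_gap"


@dataclass
class Signal:
    type: SignalType
    description: str
    context: str
    analysis: str                       # root cause, or impact for feature requests
    analysis_label: str = "Root cause"  # hypothetical: label of the final field

    def to_text(self) -> str:
        """Render the signal in the four-line layout used in the examples."""
        return "\n".join([
            f"Signal: {self.type.value}",
            f"Description: {self.description}",
            f"Context: {self.context}",
            f"{self.analysis_label}: {self.analysis}",
        ])


class SignalLog:
    """Hypothetical in-session log: append each signal the moment it occurs."""

    def __init__(self) -> None:
        self.signals: list[Signal] = []

    def capture(self, signal: Signal) -> None:
        self.signals.append(signal)

    def count_by_type(self) -> dict[str, int]:
        counts: dict[str, int] = {}
        for s in self.signals:
            counts[s.type.value] = counts.get(s.type.value, 0) + 1
        return counts
```

Recording signals at the moment they happen keeps the later retrospective a matter of copying entries rather than recalling them.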

Retrospective File Format

Write retrospectives to: retrospectives/YYYY-MM-DD-description.md

For the description, use 2–4 words that summarize the session's main work (e.g., 2026-03-15-tips-api-fix.md).
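
As an illustration of the naming convention, the path could be built from a date and a short description. This is a sketch under assumptions: `retrospective_path` is a hypothetical helper, and lowercasing and hyphen-joining the words is inferred from the example filename rather than stated as a rule.

```python
import re
from datetime import date


def retrospective_path(day: date, description: str) -> str:
    """Build retrospectives/YYYY-MM-DD-description.md from a 2-4 word description."""
    # Assumption: words are lowercased and joined with hyphens, as in the example.
    words = re.findall(r"[a-z0-9]+", description.lower())
    if not 2 <= len(words) <= 4:
        raise ValueError("description should be 2-4 words")
    return f"retrospectives/{day.isoformat()}-{'-'.join(words)}.md"
```

For example, `retrospective_path(date(2026, 3, 15), "Tips API fix")` yields `retrospectives/2026-03-15-tips-api-fix.md`.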

markdown

# Session Retrospective: {Description}

**Date:** YYYY-MM-DD
**Session scope:** {1-sentence description of what was accomplished}
**Outcome:** success | partial | blocked

---

## Result

{What was delivered. Be specific: which files changed, what tests pass, what deployed.}

---

## Signals

### {Signal Type}: {Short title}

**Observed:** {What happened}
**Context:** {Where/when in the session}
**Root cause:** {Why it happened — the real reason, not the symptom}
**Proposed fix:** {Specific change to default behavior or process}

---

## Metrics

- Files modified: N
- Tests added: N
- Context tokens consumed (estimate): N
- Retries/corrections needed: N

---

## Parent Evolution ID

{ID of a previous retrospective this builds on, if applicable. "None" if first in chain.}
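
The template above can also be filled in programmatically. The sketch below is one possible approach, not a required implementation: `SignalEntry`, `Retrospective`, and their field names are hypothetical, chosen to mirror the template's sections.

```python
from dataclasses import dataclass, field


@dataclass
class SignalEntry:
    signal_type: str
    title: str
    observed: str
    context: str
    root_cause: str
    proposed_fix: str


@dataclass
class Retrospective:
    description: str
    date: str      # YYYY-MM-DD
    scope: str
    outcome: str   # "success", "partial", or "blocked"
    result: str
    signals: list[SignalEntry] = field(default_factory=list)
    files_modified: int = 0
    tests_added: int = 0
    context_tokens: int = 0
    retries: int = 0
    parent_id: str = "None"

    def render(self) -> str:
        """Return the retrospective as markdown in the template's section order."""
        if self.outcome not in {"success", "partial", "blocked"}:
            raise ValueError(f"unknown outcome: {self.outcome}")
        parts = [
            f"# Session Retrospective: {self.description}", "",
            f"**Date:** {self.date}",
            f"**Session scope:** {self.scope}",
            f"**Outcome:** {self.outcome}",
            "", "---", "", "## Result", "", self.result,
            "", "---", "", "## Signals", "",
        ]
        for s in self.signals:
            parts += [
                f"### {s.signal_type}: {s.title}", "",
                f"**Observed:** {s.observed}",
                f"**Context:** {s.context}",
                f"**Root cause:** {s.root_cause}",
                f"**Proposed fix:** {s.proposed_fix}",
                "",
            ]
        parts += [
            "---", "", "## Metrics", "",
            f"- Files modified: {self.files_modified}",
            f"- Tests added: {self.tests_added}",
            f"- Context tokens consumed (estimate): {self.context_tokens}",
            f"- Retries/corrections needed: {self.retries}",
            "", "---", "", "## Parent Evolution ID", "", self.parent_id, "",
        ]
        return "\n".join(parts)
```

Rejecting unknown outcome values at render time keeps the Outcome line limited to the three values the template allows.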

Writing a Retrospective

When to write one

Write a retrospective at the end of any session that:

- Encountered at least one error or correction
- Took more iterations than expected
- Involved a process that should work differently next time
- Produced behavior that the user had to redirect

Do not write retrospectives for trivial sessions (simple edits, lookups, single-file changes with no corrections).
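
The criteria above amount to a simple check at the end of a session. A minimal sketch follows, assuming the session tracks a few counters; `SessionStats` and its fields are hypothetical and not defined by this skill.

```python
from dataclasses import dataclass


@dataclass
class SessionStats:
    errors: int = 0
    corrections: int = 0
    user_redirects: int = 0
    iterations: int = 1
    expected_iterations: int = 1
    process_change_needed: bool = False


def should_write_retrospective(stats: SessionStats) -> bool:
    """True if any of the four criteria holds; trivial sessions return False."""
    return (
        stats.errors > 0
        or stats.corrections > 0
        or stats.iterations > stats.expected_iterations
        or stats.process_change_needed
        or stats.user_redirects > 0
    )
```

A session with all counters at their defaults, such as a single-file edit with no corrections, returns `False` and needs no retrospective.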

What makes a useful retrospective

Good root cause analysis:

"Root cause: I read the full 350-line route file instead of grepping for the specific function. This loaded 300 unnecessary lines."

Not:

"Root cause: I made a mistake."

Specific proposed fix:

"Proposed fix: For any file > 100 lines, always grep for the target symbol before reading. Default read limit: 60 lines."

Not:

"Proposed fix: Be more careful."

Honest metrics: Count the actual number of retries, errors, and corrections. Underreporting defeats the purpose.

Reading retrospectives

At the start of any complex session:

1. Run Glob(pattern="retrospectives/*.md") to find all retrospective files
2. Read the 3 most recent ones
3. Note any Proposed fix items that apply to today's task
4. Apply them proactively to avoid repeating past mistakes

Example Retrospective

markdown

# Session Retrospective: Tips API Fix

**Date:** 2026-03-15
**Session scope:** Add tips_enabled to updateSkillSchema, /api/tips validation, and MCP update_skill
**Outcome:** success

---

## Result

Modified 6 files, added 96 lines of tests. All 3 gaps closed:
- PUT /api/skills/[id] now accepts tips_enabled
- POST /api/tips returns 403 when tips_enabled=false
- MCP update_skill tool exposes tips_enabled parameter
Coverage held at 100% lines.

---

## Signals

### capability_gap: tips_enabled existed in DB but not surfaced anywhere

**Observed:** Column existed since DB migration but no API endpoint or MCP tool exposed it
**Context:** User asked to set tips_enabled=false for community-inspired skills
**Root cause:** Feature was added to DB schema without completing the API/MCP plumbing
**Proposed fix:** When adding a DB column that represents a setting, always check (1) API PUT schema, (2) API validation guards, (3) MCP tool parameters as a checklist

---

## Metrics

- Files modified: 6
- Tests added: 96 lines
- Context tokens consumed (estimate): ~8,000
- Retries/corrections needed: 0

---

## Parent Evolution ID

None

Integration with Memory

Retrospective files complement but do not replace memory:

- Memory stores facts and preferences that apply to all future sessions
- Retrospectives store session-specific analysis for review before similar future sessions

After writing a retrospective, update memory only if the proposed fix represents a durable change to default behavior — not if it's a one-time correction.
