librenotes/.wave/pipelines/supervise.yaml
Michael Czechowski fc24f9a8ab Add Wave general-purpose pipelines
ADR, changelog, code-review, debug, doc-sync, explain, feature,
hotfix, improve, onboard, plan, prototype, refactor, security-scan,
smoke-test, speckit-flow, supervise, test-gen, and more.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 17:02:36 +01:00


kind: WavePipeline
metadata:
  name: supervise
  description: "Review work quality and process quality, including claudit session transcripts"
input:
  source: cli
  example: "last pipeline run"
steps:
  - id: gather
    persona: supervisor
    workspace:
      mount:
        - source: ./
          target: /project
          mode: readonly
    exec:
      type: prompt
      source: |
        Gather evidence for supervision of: {{ input }}

        ## Smart Input Detection

        Determine what to inspect based on the input:

        - **Empty or "last pipeline run"**: Find the most recent pipeline run via `.wave/workspaces/` timestamps and recent git activity
        - **"current pr" or "PR #N"**: Inspect the current or specified pull request (`git log`, `gh pr view`)
        - **Branch name**: Inspect all commits on that branch vs main
        - **Free-form description**: Use `grep`/`git log` to find relevant recent work

        ## Evidence Collection

        1. **Git history**: Recent commits with diffs (`git log --stat`, `git diff`)
        2. **Session transcripts**: Check for claudit git notes (`git notes show <commit>` for each relevant commit). Summarize what happened in each session — tool calls, approach taken, detours, errors
        3. **Pipeline artifacts**: Scan `.wave/workspaces/` for the relevant pipeline run. List all output artifacts and their contents
        4. **Test state**: Run `go test ./...` to capture current test status
        5. **Branch/PR context**: Branch name, ahead/behind status, PR state if applicable

        ## Output

        Produce a comprehensive evidence bundle as structured JSON. Include all raw
        evidence — the evaluation step will interpret it.

        Be thorough in transcript analysis — the process quality evaluation depends
        heavily on understanding what the agent actually did vs what it should have done.
    output_artifacts:
      - name: evidence
        path: .wave/output/supervision-evidence.json
        type: json
    handover:
      contract:
        type: json_schema
        source: .wave/output/supervision-evidence.json
        schema_path: .wave/contracts/supervision-evidence.schema.json
      on_failure: retry
      max_retries: 2
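    # A hypothetical sketch of the evidence contract referenced just above. The
    # field names here are illustrative assumptions only; the actual schema
    # lives in .wave/contracts/supervision-evidence.schema.json.
    #
    # {
    #   "type": "object",
    #   "required": ["target", "commits", "transcripts", "test_status"],
    #   "properties": {
    #     "target":      { "type": "string" },
    #     "commits":     { "type": "array", "items": { "type": "object" } },
    #     "transcripts": { "type": "array", "items": { "type": "object" } },
    #     "artifacts":   { "type": "array" },
    #     "test_status": { "type": "string" }
    #   }
    # }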
  - id: evaluate
    persona: supervisor
    dependencies: [gather]
    memory:
      inject_artifacts:
        - step: gather
          artifact: evidence
          as: evidence
    workspace:
      mount:
        - source: ./
          target: /project
          mode: readonly
    exec:
      type: prompt
      source: |
        Evaluate the work quality based on the gathered evidence.

        The gathered evidence has been injected into your workspace. Read it first.

        ## Output Quality Assessment

        For each dimension, score as excellent/good/adequate/poor with specific findings:

        1. **Correctness**: Does the code do what was intended? Check logic, edge cases, error handling
        2. **Completeness**: Are all requirements addressed? Any gaps or TODOs left?
        3. **Test coverage**: Are the changes adequately tested? Run targeted tests if needed
        4. **Code quality**: Does it follow project conventions? Clean abstractions? Good naming?

        ## Process Quality Assessment

        Using the session transcripts from the evidence:

        1. **Efficiency**: Was the approach direct? Count unnecessary file reads, repeated searches, and abandoned approaches visible in the transcripts
        2. **Scope discipline**: Did the agent stay on task? Flag any scope creep — changes unrelated to the original goal
        3. **Tool usage**: Were the right tools used? (e.g., Read vs `cat` via Bash, Glob vs `find`)
        4. **Token economy**: Was the work concise or bloated? Excessive context gathering? Redundant operations?

        ## Synthesis

        - Overall score (excellent/good/adequate/poor)
        - Key strengths (what went well)
        - Key concerns (what needs attention)

        Produce the evaluation as a structured JSON result.
    output_artifacts:
      - name: evaluation
        path: .wave/output/supervision-evaluation.json
        type: json
    handover:
      contract:
        type: json_schema
        source: .wave/output/supervision-evaluation.json
        schema_path: .wave/contracts/supervision-evaluation.schema.json
      on_failure: retry
      max_retries: 2
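    # A hypothetical sketch of the evaluation contract referenced just above.
    # The dimensions mirror the prompt, but the field names are illustrative
    # assumptions; the actual schema lives in
    # .wave/contracts/supervision-evaluation.schema.json.
    #
    # {
    #   "type": "object",
    #   "required": ["output_quality", "process_quality", "synthesis"],
    #   "properties": {
    #     "output_quality": {
    #       "type": "object",
    #       "properties": {
    #         "correctness":   { "enum": ["excellent", "good", "adequate", "poor"] },
    #         "completeness":  { "enum": ["excellent", "good", "adequate", "poor"] },
    #         "test_coverage": { "enum": ["excellent", "good", "adequate", "poor"] },
    #         "code_quality":  { "enum": ["excellent", "good", "adequate", "poor"] }
    #       }
    #     },
    #     "process_quality": { "type": "object" },
    #     "synthesis": {
    #       "type": "object",
    #       "properties": {
    #         "overall":   { "enum": ["excellent", "good", "adequate", "poor"] },
    #         "strengths": { "type": "array", "items": { "type": "string" } },
    #         "concerns":  { "type": "array", "items": { "type": "string" } }
    #       }
    #     }
    #   }
    # }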
  - id: verdict
    persona: reviewer
    dependencies: [evaluate]
    memory:
      inject_artifacts:
        - step: gather
          artifact: evidence
          as: evidence
        - step: evaluate
          artifact: evaluation
          as: evaluation
    workspace:
      mount:
        - source: ./
          target: /project
          mode: readonly
    exec:
      type: prompt
      source: |
        Synthesize a final supervision verdict.

        The gathered evidence and evaluation have been injected into your workspace.
        Read them both before proceeding.

        ## Independent Verification

        1. Run the test suite: `go test ./...`
        2. Cross-check evaluation claims against the actual code
        3. Verify any specific concerns raised in the evaluation

        ## Verdict

        Issue one of:

        - **APPROVE**: The work is good quality and the process was efficient. Ship it.
        - **PARTIAL_APPROVE**: The output is acceptable, but the process had issues worth noting for improvement.
        - **REWORK**: Significant issues were found that must be addressed before the work is acceptable.

        ## Action Items (if REWORK or PARTIAL_APPROVE)

        For each issue requiring action:

        - Specific file and line references
        - What needs to change and why
        - Priority (must-fix vs should-fix)

        ## Lessons Learned

        What should be done differently next time? Process improvements, common pitfalls observed.

        Produce the verdict as a markdown report with clear sections:
        ## Verdict, ## Output Quality, ## Process Quality, ## Action Items, ## Lessons Learned
    output_artifacts:
      - name: verdict
        path: .wave/output/supervision-verdict.md
        type: markdown
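# Example invocations matching the smart input detection above. This assumes a
# CLI of the form `wave run <pipeline> "<input>"`; the exact command syntax is
# not defined in this file, only that input arrives via `input.source: cli`.
#
#   wave run supervise                        # inspect the last pipeline run
#   wave run supervise "PR #42"               # inspect a specific pull request
#   wave run supervise feature/login-fix      # inspect a branch vs main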