kind: WavePipeline
metadata:
  name: ingest
  description: "Ingest a web article into the Zettelkasten as bibliographic and permanent notes"
  release: true

input:
  source: cli
  examples:
    - "https://simonwillison.net/2026/Feb/7/software-factory/"
    - "https://langfuse.com/blog/2025-03-observability"
    - "https://arxiv.org/abs/2401.12345"

steps:
  - id: fetch
    persona: scout
    workspace:
      mount:
        - source: ./
          target: /project
          mode: readonly
    exec:
      type: prompt
      source: |
        Fetch and extract structured content from a web article.

        URL: {{ input }}

        ## Steps

        1. Use WebFetch to retrieve the article content
        2. Extract:
           - title: article title
           - author: author name (look in byline, meta tags, about section)
           - date: publication date
           - summary: 50-3000 character summary of the article
           - key_concepts: list of key concepts with name and description
           - notable_quotes: direct quotes with context
           - author_year_key: generate AuthorYear key (e.g., Willison2026)
        3. If the author name is unclear, use the domain name as author

        ## Output

        Write the result as JSON to output/source-extract.json matching the contract schema.
    output_artifacts:
      - name: source-extract
        path: output/source-extract.json
        type: json
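    # Illustrative sketch only: one plausible shape for output/source-extract.json,
    # built from the fields the fetch prompt asks for. The authoritative contract is
    # .wave/contracts/source-extract.schema.json and may differ. The nested key names
    # (name, description, quote, context) are assumptions, and every value in angle
    # brackets is a placeholder.
    #
    #   {
    #     "title": "<article title>",
    #     "author": "<author name, or the domain name if unclear>",
    #     "date": "<publication date>",
    #     "summary": "<50-3000 character summary>",
    #     "key_concepts": [
    #       { "name": "<concept>", "description": "<short description>" }
    #     ],
    #     "notable_quotes": [
    #       { "quote": "<direct quote>", "context": "<where it appears>" }
    #     ],
    #     "author_year_key": "Willison2026"
    #   }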
    handover:
      contract:
        type: json_schema
        source: output/source-extract.json
        schema_path: .wave/contracts/source-extract.schema.json
        on_failure: retry
        max_retries: 2

  - id: connect
    persona: navigator
    dependencies: [fetch]
    memory:
      inject_artifacts:
        - step: fetch
          artifact: source-extract
          as: source
    workspace:
      mount:
        - source: ./
          target: /project
          mode: readonly
    exec:
      type: prompt
      source: |
        Find connections between extracted source content and existing Zettelkasten notes.

        Read the source extract: cat artifacts/source

        ## Steps

        1. For each key concept in the source, search for related notes:
           - `notesium lines --filter="concept_name"`
           - Read the most relevant matches
        2. Identify the Folgezettel neighborhood where new notes belong:
           - What section does this content fit in?
           - What would be the parent note?
           - What Folgezettel address should new notes get?
        3. Check if the index note needs updating
        4. Determine link directions (should new note link to existing, or existing link to new?)

        ## Output

        Write the result as JSON to output/connections.json matching the contract schema.
        Include:
        - source_title: title of the source being connected
        - related_notes: list of related existing notes with filename, title,
          folgezettel_address, relationship explanation, and link_direction
        - suggested_placements: where new notes should go in the Folgezettel
          with address, parent_note, section, rationale, and concept
        - index_update_needed: boolean
        - suggested_index_entries: new entries if needed
        - timestamp: current ISO 8601 timestamp
    output_artifacts:
      - name: connections
        path: output/connections.json
        type: json
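    # Illustrative sketch only: one plausible shape for output/connections.json,
    # using the fields listed in the connect prompt. The authoritative contract is
    # .wave/contracts/connections.schema.json and may differ. The "relationship" key
    # name and the element type of suggested_index_entries are assumptions, and every
    # value in angle brackets is a placeholder.
    #
    #   {
    #     "source_title": "<title of the source>",
    #     "related_notes": [
    #       {
    #         "filename": "<existing note filename>",
    #         "title": "<existing note title>",
    #         "folgezettel_address": "<address of the existing note>",
    #         "relationship": "<why the notes are related>",
    #         "link_direction": "<new links to existing, or existing links to new>"
    #       }
    #     ],
    #     "suggested_placements": [
    #       {
    #         "address": "<Folgezettel address for the new note>",
    #         "parent_note": "<parent note>",
    #         "section": "<section>",
    #         "rationale": "<why it belongs there>",
    #         "concept": "<key concept name>"
    #       }
    #     ],
    #     "index_update_needed": false,
    #     "suggested_index_entries": [],
    #     "timestamp": "<current ISO 8601 timestamp>"
    #   }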
    handover:
      contract:
        type: json_schema
        source: output/connections.json
        schema_path: .wave/contracts/connections.schema.json
        on_failure: retry
        max_retries: 2

  - id: create
    persona: scribe
    dependencies: [connect]
    memory:
      inject_artifacts:
        - step: fetch
          artifact: source-extract
          as: source
        - step: connect
          artifact: connections
          as: connections
    workspace:
      mount:
        - source: ./
          target: /project
          mode: readwrite
    exec:
      type: prompt
      source: |
        Create Zettelkasten notes from an ingested web source.

        Read the artifacts:
        cat artifacts/source
        cat artifacts/connections

        ## Steps

        1. **Create the bibliographic note**:
           - Use `notesium new` for the filename
           - Title: `# AuthorYear` using the author_year_key from the source extract
           - Content: source URL, author, date, summary, key quotes
           - One sentence per line

        2. **Create permanent notes** for key ideas that warrant standalone Zettel:
           - Use `notesium new` for each
           - Use the Folgezettel address from suggested_placements
           - Title: `# {address} {Concept-Name}`
           - Write in your own words: transform, don't copy
           - Add contextual links to related notes (explain *why* the connection exists)
           - Link back to the bibliographic note

        3. **Update existing notes** if bidirectional links are suggested:
           - Add links from existing notes to the new permanent notes
           - Include contextual explanation for each link

        4. **Update the index note** if index_update_needed is true:
           - Add new keyword → entry point mappings

        5. **Commit all changes**:
           - `git add *.md`
           - `git commit -m "ingest: {AuthorYear key in lowercase}"`

        6. **Write summary** to output/ingest-summary.md:
           - Bibliographic note created (filename, title)
           - Permanent notes created (filename, title, Folgezettel address)
           - Links added to existing notes
           - Index updates made
    output_artifacts:
      - name: ingest-summary
        path: output/ingest-summary.md
        type: markdown