Recall

Recall ranks captured sources by semantic similarity to your query:

khiipd recall "Inca knot record system"
khiipd recall "building a second brain with AI" --limit 5

How it works

At v0.1.x, recall uses the bundled MiniLM-L6 ONNX model. For each capture it embeds a per-source embed-text composition of the typed payload: a deterministic projection of the structured fields that matter for that source (title, body, author, key entities) rather than a raw text dump. Queries are embedded with the same model, and recall returns the top-k matches by cosine similarity (per ADR-0009 §C7).
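The two steps above (compose an embed text, then rank by cosine similarity) can be sketched roughly as follows. This is an illustration under assumptions, not khiipd's actual code: `EMBED_FIELDS`, `compose_embed_text`, `cosine`, and `top_k` are hypothetical names, the field order is made up, and the vectors are toy two-dimensional stand-ins for real MiniLM-L6 embeddings.

```python
import math
from typing import Sequence

# Hypothetical field order; the real per-source projection is not shown in this page.
EMBED_FIELDS = ("title", "author", "body")


def compose_embed_text(payload: dict[str, str]) -> str:
    """Join the selected fields in a fixed order, skipping empty ones.

    The same payload always yields the same text, which is what
    "deterministic projection" means in this sketch.
    """
    parts = (payload.get(field, "").strip() for field in EMBED_FIELDS)
    return "\n".join(part for part in parts if part)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    corpus: dict[str, Sequence[float]],
    k: int = 10,
) -> list[tuple[str, float]]:
    """Return the k (source_id, score) pairs most similar to the query, best first."""
    scored = [(source_id, cosine(query_vec, vec)) for source_id, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In this sketch, `k` plays the role of `--limit`: `top_k(query_vec, corpus, k=5)` keeps only the five best-scoring sources.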

This runs locally, at zero LLM cost, and works offline after the one-time model fetch on first use. Quality is whatever MiniLM-L6 gives you: good enough for “find that thing I captured about X,” and the embedder is pluggable if you need more (see “Pluggable embedders” below).

Tuning

  • --limit N — how many results to return (default 10).
  • Recall is by meaning, not keyword — “how LLMs actually work” can surface a captured talk transcript and an article that never use those exact words.

Pluggable embedders (roadmap)

The embedder sits behind a Protocol so it can be swapped without touching the rest of the substrate. Planned for v0.5+:

  • Local LLM via Ollama (better quality, still free + local)
  • BYOK (OpenAI / Anthropic / Gemini) for best-quality embeddings
  • BM25 keyword fallback for exact-term recall

Switching embedders requires a re-embed of the corpus (a backfill); the captured payloads themselves don’t change.
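As a rough illustration of the idea, here is what an embedder behind a `typing.Protocol` and a re-embed backfill could look like. Every name here (`Embedder`, `CharCountEmbedder`, `backfill`) is hypothetical and the toy embedder is deliberately trivial; khiipd's real interface may differ.

```python
from typing import Protocol, Sequence


class Embedder(Protocol):
    """Hypothetical interface: anything with a name and an embed() method fits."""

    name: str

    def embed(self, texts: Sequence[str]) -> list[list[float]]:
        """Return one vector per input text, in the same order."""
        ...


class CharCountEmbedder:
    """Toy stand-in for a real model: [text length, number of spaces]."""

    name = "toy-charcount"

    def embed(self, texts: Sequence[str]) -> list[list[float]]:
        return [[float(len(t)), float(t.count(" "))] for t in texts]


def backfill(embed_texts: dict[str, str], embedder: Embedder) -> dict[str, list[float]]:
    """Recompute every vector with the given embedder.

    Only the vector index is rebuilt; the input texts (and the payloads
    they were composed from) are read, never modified.
    """
    source_ids = list(embed_texts)
    vectors = embedder.embed([embed_texts[sid] for sid in source_ids])
    return dict(zip(source_ids, vectors))
```

Because the protocol is structural, `CharCountEmbedder` needs no base class; swapping in another embedder in this sketch means passing a different object to `backfill` and replacing the old index with the result.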