
Khiip

Capture, structure, and preserve online sources — as plain Markdown your tools, agents, and archives can actually read. Self-hosted. AGPL. Filesystem-canonical.

The substrate, not the destination

Khiip captures a URL, X thread, Reddit post, YouTube video, web article, or PDF — stores it permanently in your own filesystem as Markdown + typed payloads — and lets you (or your agents) recall it later by meaning, structure, or time.

It is the best open substrate for the pattern that won: plain Markdown files + grep + MCP. Other tools — Obsidian, Logseq, LangChain agents, your own scripts — sit on top and consume what Khiip captures.

Multi-platform capture

X, Reddit, Wikipedia + generic web, YouTube, and PDF today. Each source emits a Pydantic-typed payload at the extractor boundary.

Local-first & portable

Markdown + YAML frontmatter is the canonical tier. Raw Source-tier bytes are preserved as insurance against upstream rot. No vendor lock-in.
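Because the canonical tier is plain Markdown with YAML frontmatter, a captured file can be read with a few lines of standard-library code. The sketch below is only illustrative: the field names (`source`, `url`, `captured_at`) and the `split_frontmatter` helper are hypothetical, not Khiip's actual schema or API, and the parser handles only flat `key: value` pairs rather than full YAML.

```python
def split_frontmatter(text):
    """Split a Markdown document into (metadata dict, body string).

    Handles only flat `key: value` frontmatter; this is a simplification,
    not a full YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at all
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Closing delimiter found: everything after it is the body.
            body = "\n".join(lines[i + 1:])
            return meta, body.lstrip("\n")
        if ":" in line:
            # Split on the first colon only, so URLs keep their "://".
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return {}, text  # opening delimiter without a closing one


# Hypothetical captured file; the field names are assumptions.
doc = """---
source: web
url: https://example.com/post
captured_at: 2024-01-01T00:00:00Z
---
# Example title

Body text.
"""

meta, body = split_frontmatter(doc)
print(meta["url"])
print(body)
```

For nested or typed frontmatter, a real YAML parser (for example PyYAML's `yaml.safe_load`) would be the safer choice; the point here is only that nothing beyond a text file is needed to get at the data.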

Recall by meaning

Bundled MiniLM-L6 embeddings do cosine top-k recall out of the box — zero LLM cost, works offline after the first model fetch.
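Cosine top-k recall amounts to scoring every stored vector against the query vector and keeping the k highest scores. The following is a minimal sketch of that idea using toy 3-dimensional vectors in place of real MiniLM-L6 embeddings; the `cosine` and `top_k` helpers and the file names are hypothetical and do not reflect Khiip's internals.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, corpus, k=2):
    """Return the k (doc_id, score) pairs most similar to the query vector."""
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]


# Toy vectors standing in for embeddings; real ones would come from the model.
corpus = {
    "note-a.md": [1.0, 0.0, 0.0],
    "note-b.md": [0.0, 1.0, 0.0],
    "note-c.md": [0.9, 0.1, 0.0],
}

results = top_k([1.0, 0.0, 0.0], corpus, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

A linear scan like this is fine for a personal archive; larger collections would typically swap in a vector index, which is outside the scope of this sketch.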

Works with your agents

REST + an MCP server expose the substrate to Claude Desktop, Cursor, or any MCP-aware client. Capture and recall without writing HTTP code.

Start here