The knowledge base

How FTS5 search, BM25 ranking, and chunking let you recall anything without re-reading it.

context-mode keeps a single local knowledge base so you can recall anything you have already seen without reading it back into context. It is an SQLite FTS5 full-text index, queried on demand through ctx_search, with results ranked by BM25. A search returns the relevant passage for a few hundred tokens; re-reading the source file costs thousands.

What lives in it

The knowledge base holds two kinds of content side by side.

  • Content you index explicitly: through ctx_index, through ctx_fetch_and_index for web pages, and through ctx_batch_execute, whose output is captured automatically.
  • Auto-captured session memory: the events that hooks record as you work, so a later search can recall prior decisions and errors.

Because both share one store, a single query can surface a passage from a fetched doc and a relevant past decision together.

How search ranks results

A query runs through two matchers in parallel, then the results are merged.

Two matchers, one pass

A Porter-stemming matcher handles word variants, so a search for running matches run and runs. A trigram substring matcher handles partial and in-word matches that stemming misses. Running both covers far more phrasings than either alone.
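To illustrate the substring half of this pair, here is a minimal sketch of trigram matching. It is not context-mode's code (the real matching happens inside the FTS5 index); the function names trigrams and isTrigramCandidate are hypothetical, and the check only identifies candidate passages, since shared trigrams do not by themselves prove a contiguous substring.

```typescript
// Split a string into overlapping three-character windows, lowercased.
function trigrams(text: string): string[] {
  const s = text.toLowerCase();
  const grams: string[] = [];
  for (let i = 0; i + 3 <= s.length; i++) {
    grams.push(s.slice(i, i + 3));
  }
  return grams;
}

// Hypothetical helper: a passage is a candidate when it contains every
// trigram of the query. Queries shorter than three characters produce no
// trigrams, so this sketch reports no match for them.
function isTrigramCandidate(query: string, passage: string): boolean {
  const queryGrams = trigrams(query);
  if (queryGrams.length === 0) return false;
  const passageGrams = trigrams(passage);
  return queryGrams.every((g) => passageGrams.indexOf(g) !== -1);
}

// "nfig" sits inside "reconfigure", so the passage is a candidate even
// though no stemmer would relate the two strings.
const inWordHit = isTrigramCandidate("nfig", "reconfigure");
```

A stemmer reduces whole words to a shared root, so it cannot see a fragment like "nfig"; the trigram windows can, which is why the two matchers complement each other.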

Reciprocal Rank Fusion

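The two ranked lists are combined with Reciprocal Rank Fusion, which scores each passage by its position in each list rather than by raw match scores. A passage that ranks well in either matcher is rewarded, and neither matcher dominates the merged order.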
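The fusion step can be sketched as follows. This is an illustration of the general Reciprocal Rank Fusion technique, not context-mode's implementation: the constant 60 is the value commonly used with RRF and is an assumption here, and the chunk identifiers are made up.

```typescript
// Assumed smoothing constant; the source does not state which value is used.
const RRF_K = 60;

// Merge several ranked lists of passage ids. Each list contributes
// 1 / (k + rank) to a passage's score, with rank starting at 1.
function reciprocalRankFusion(rankings: string[][], k: number = RRF_K): string[] {
  const scores: Record<string, number> = {};
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores[id] = (scores[id] || 0) + 1 / (k + index + 1);
    });
  }
  // Highest fused score first.
  return Object.keys(scores).sort((a, b) => scores[b] - scores[a]);
}

// Hypothetical ranked output from the two matchers.
const stemmedHits = ["chunk-a", "chunk-b", "chunk-c"];
const trigramHits = ["chunk-b", "chunk-d", "chunk-a"];
const fused = reciprocalRankFusion([stemmedHits, trigramHits]);
```

Because only rank positions enter the sum, the two matchers do not need comparable score scales.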

Proximity rerank

For a multi-term query, a proximity rerank boosts passages where your terms appear close together, so a tight match outranks one where the words are scattered across a long document.
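One way to picture a proximity signal is the smallest window of tokens that contains every query term. The sketch below is an assumption about how such a boost could be computed, not the formula context-mode uses; tokenize, minSpan, and proximityBoost are hypothetical names.

```typescript
// Lowercase and split on non-word characters.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\W+/).filter((t) => t.length > 0);
}

// Length, in tokens, of the smallest window containing every term.
// Returns Infinity when at least one term never appears.
function minSpan(tokens: string[], terms: string[]): number {
  const lastSeen: Record<string, number> = {};
  let best = Infinity;
  tokens.forEach((token, pos) => {
    if (terms.indexOf(token) === -1) return;
    lastSeen[token] = pos;
    // Collect the most recent position of each term seen so far.
    const positions: number[] = [];
    for (const term of terms) {
      const p = lastSeen[term];
      if (p !== undefined) positions.push(p);
    }
    if (positions.length === terms.length) {
      best = Math.min(best, pos - Math.min(...positions) + 1);
    }
  });
  return best;
}

// Hypothetical boost: 1 when the terms are adjacent, approaching 0 as they
// spread apart, and 0 when a term is missing.
function proximityBoost(tokens: string[], terms: string[]): number {
  return terms.length / minSpan(tokens, terms);
}

const queryTerms = ["fts5", "index"];
const tight = proximityBoost(tokenize("rebuild the fts5 index nightly"), queryTerms);
const scattered = proximityBoost(
  tokenize("FTS5 is a full text engine that maintains an index"),
  queryTerms
);
```

Here tight is larger than scattered, so the passage with adjacent terms would move up after the rerank.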

Typo tolerance

Near-miss terms are corrected with Levenshtein distance, so a small typo still finds the passage you meant.
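For reference, Levenshtein distance counts the single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below pairs the standard dynamic-programming calculation with a hypothetical correctTerm helper; the vocabulary list and the maximum distance of 2 are assumptions for illustration, not documented settings.

```typescript
// Classic two-row dynamic-programming Levenshtein distance.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: replace a term with the closest vocabulary word
// when one lies within maxDistance edits; otherwise keep the term as typed.
function correctTerm(term: string, vocabulary: string[], maxDistance: number = 2): string {
  let best = term;
  let bestDistance = maxDistance + 1;
  for (const word of vocabulary) {
    const distance = levenshtein(term, word);
    if (distance < bestDistance) {
      best = word;
      bestDistance = distance;
    }
  }
  return best;
}

// Made-up vocabulary standing in for the terms present in the index.
const vocabulary = ["chunking", "ranking", "snippet"];
const corrected = correctTerm("chunkng", vocabulary);
```

With this vocabulary, the typo "chunkng" is one edit away from "chunking", so the corrected term is the one that gets searched.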

Snippets, not truncation

Results come back as windows extracted around each match — the lines on either side of the hit — rather than the first N characters of the source. You get the relevant passage with enough surrounding context to use it, and nothing more.
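The difference from truncation is easiest to see in code. The following is a minimal sketch of window extraction, assuming a line-based window of one line on either side of each hit; the function name extractSnippets and the default window size are assumptions, not the tool's actual parameters.

```typescript
// Return one snippet per match: the matching line plus `contextLines`
// lines on either side. Hits already covered by the previous window are
// skipped so the same lines are not returned twice.
function extractSnippets(source: string, term: string, contextLines: number = 1): string[] {
  const lines = source.split("\n");
  const needle = term.toLowerCase();
  const snippets: string[] = [];
  let lastEnd = -1;
  lines.forEach((line, i) => {
    if (i <= lastEnd || line.toLowerCase().indexOf(needle) === -1) return;
    const start = Math.max(0, i - contextLines, lastEnd + 1);
    const end = Math.min(lines.length - 1, i + contextLines);
    snippets.push(lines.slice(start, end + 1).join("\n"));
    lastEnd = end;
  });
  return snippets;
}

// Made-up source text: the match is on the fourth line, so taking only the
// first lines of the file would miss it.
const sourceText = [
  "Overview",
  "Installation notes",
  "Schema details",
  "The index is rebuilt on startup",
  "Cache settings",
  "Changelog",
].join("\n");
const snippets = extractSnippets(sourceText, "rebuilt");
```

The single snippet returned covers "Schema details" through "Cache settings", leaving the rest of the file out of context.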

Batch several questions in one search
ctx_search({
  queries: [
    "how is the FTS5 index rebuilt",
    "default BM25 ranking weights",
    "where fetched pages are cached",
  ],
})

How content is chunked

Before indexing, content is split so that each chunk is a coherent unit and search lands you in the right place.

  • Markdown is split by headings, with code blocks kept intact so a fenced example is never cut in half.
  • JSON is split by key paths, so a deeply nested value stays attached to the keys that locate it.
  • Plain text is split by lines.
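The first two rules can be sketched as below. This is an illustration under simplifying assumptions, not context-mode's chunker: chunkMarkdown and chunkJson are hypothetical names, fences are detected as lines starting with three backticks, and the JSON sketch emits one chunk per leaf value labeled with its key path.

```typescript
// Split Markdown at heading lines, but never inside a fenced code block.
function chunkMarkdown(markdown: string): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  let inFence = false;
  for (const line of markdown.split("\n")) {
    if (/^\s*`{3}/.test(line)) inFence = !inFence;
    const startsSection = !inFence && /^#{1,6}\s/.test(line);
    if (startsSection && current.length > 0) {
      chunks.push(current.join("\n"));
      current = [];
    }
    current.push(line);
  }
  if (current.length > 0) chunks.push(current.join("\n"));
  return chunks;
}

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Flatten JSON into "path: value" chunks so every leaf keeps the keys
// that locate it.
function chunkJson(value: Json, path: string = "$"): string[] {
  if (value === null || typeof value !== "object") {
    return [`${path}: ${JSON.stringify(value)}`];
  }
  const chunks: string[] = [];
  if (Array.isArray(value)) {
    value.forEach((item, i) => {
      chunks.push(...chunkJson(item, `${path}[${i}]`));
    });
  } else {
    for (const key of Object.keys(value)) {
      chunks.push(...chunkJson(value[key], `${path}.${key}`));
    }
  }
  return chunks;
}

// Made-up config: the nested value stays attached to "$.db.fts.tokenizer".
const jsonChunks = chunkJson({ db: { fts: { tokenizer: "porter" } }, ttl: 24 });
```

In the Markdown sketch, a line such as "# comment" inside a fenced block is not treated as a heading, which is what keeps a code example in one piece.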

Fetched content is cached with a TTL — 24 hours by default, configurable via the ttl parameter on ctx_fetch_and_index — so re-fetching a recently indexed URL serves the stored copy instead of hitting the network again.
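The freshness check behind a TTL cache can be sketched like this. Everything here is illustrative: CachedPage, fetchWithCache, and the in-memory cache object are hypothetical, the millisecond unit is a choice made for this sketch rather than the unit of the ttl parameter, and the injected fetchPage function stands in for the real network call.

```typescript
interface CachedPage {
  content: string;
  fetchedAt: number; // timestamp in milliseconds
}

// 24 hours, matching the default described above (unit assumed for the sketch).
const DEFAULT_TTL_MS = 24 * 60 * 60 * 1000;

// Serve the stored copy while it is younger than the TTL; otherwise fetch
// again and overwrite the stored entry.
function fetchWithCache(
  url: string,
  cache: Record<string, CachedPage>,
  fetchPage: (url: string) => string,
  now: number,
  ttlMs: number = DEFAULT_TTL_MS
): string {
  const cached = cache[url];
  if (cached !== undefined && now - cached.fetchedAt < ttlMs) {
    return cached.content;
  }
  const content = fetchPage(url);
  cache[url] = { content, fetchedAt: now };
  return content;
}
```

Passing the clock in as now keeps the sketch deterministic; a second call one hour after the first returns the stored copy without invoking fetchPage.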
