Citations

A citation is a verifiable link between a claim in one place and its source in another. The claim might be a sentence in a collection's distilled context; the source might be a passage in a saved web page, a note, or an external URL. Sombra stores the exact text on both ends, the offsets, and the timestamps — so a reader can trace any assertion back to where it came from, and you can see when something has drifted.

There are three ways into the system:

  • Authoring in the app — select text on either side, capture it, then click a suggested passage on the other side to commit. No syntax to learn.
  • Verification — when an AI assistant writes a distilled context, Sombra automatically scans the new content and surfaces which substantive claims are anchored in your sources, which are partially matched, and which have no source at all.
  • Via MCP — your AI assistant can author, verify, and manage citations programmatically with eight citation tools.

Authoring in the app

The flow has two halves and you can start from either side.

1. Capture a claim. Select text in a collection context or a note's body. A floating button appears above the selection — click Capture as claim. A persistent toast pins to the bottom of the screen showing what you captured, so you can keep working without losing your place.

2. Find the source. Navigate to whichever saved artifact backs the claim. The page enters a focused "capture mode" — existing citation chrome is dimmed and Sombra highlights candidate passages it thinks could anchor the claim. Each candidate has a small match-strength meter on hover. Click one to commit the citation. The candidate disappears, the toast clears, and the new citation paints on both ends.

If no candidate looks right, just select source text manually and click Commit citation. Either way, the server only commits citations that anchor verbatim — selections that don't match the underlying source loudly fail rather than silently land on the wrong text. The error toast explains what happened and keeps your captured claim around so you can adjust.

Verification — finding claims that aren't grounded

When an AI assistant writes a distilled context for a collection, every substantive claim ought to trace back to one of the saved sources. Sombra can check this for you. Every time the distill is written, a server-side scan walks the new context, picks out the substantive sentences, and tries to anchor each one in a source artifact — no LLM, just sentence segmentation and the same word-coverage matcher the citation tools use.
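
To make the "no LLM" point concrete, here is a minimal sketch of sentence segmentation plus a word-coverage score. It is an illustration only: the function names, the tokenisation rule, and the choice to score against individual source sentences are assumptions, not Sombra's actual matcher.

```python
from __future__ import annotations

import re


def split_sentences(text: str) -> list[str]:
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def words(text: str) -> list[str]:
    """Lowercased alphanumeric tokens (an assumed tokenisation)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def coverage(claim: str, passage: str) -> float:
    """Fraction of distinct claim words that also occur in the passage."""
    claim_words = set(words(claim))
    if not claim_words:
        return 0.0
    return len(claim_words & set(words(passage))) / len(claim_words)


def best_match(claim: str, sources: dict[str, str]) -> tuple[str | None, float]:
    """Return (source_id, score) for the source sentence that best covers the claim."""
    best_id, best_score = None, 0.0
    for source_id, body in sources.items():
        for sentence in split_sentences(body):
            score = coverage(claim, sentence)
            if score > best_score:
                best_id, best_score = source_id, score
    return best_id, best_score
```

A scan in this style would call best_match once per substantive sentence of the distill and keep the score for the bucketing step.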

Each substantive claim ends up in one of three buckets:

  • Anchored: a high-confidence match (score ≥ 0.7). This is the same threshold the MCP commit_citations tool uses, so accepting it is as safe as letting your agent commit. Underlined with a dashed indigo line.
  • Loose: a partial match (0.3 ≤ score < 0.7). Some of the words are in the source, but the matcher isn't confident enough to commit blindly. Underlined with a dotted amber line. These are the ones to review one by one.
  • Uncited: no source scores 0.3 or higher. The sentence looks like a factual claim but nothing in this collection supports it. Underlined with a wavy red line. Either find a source, rewrite the sentence, or strip it from the distill.
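
The bucket boundaries can be sketched as a small function. The thresholds come from the list above; the function and constant names are hypothetical and only illustrate how the boundary cases fall.

```python
ANCHORED_THRESHOLD = 0.7  # commit threshold described above
LOOSE_THRESHOLD = 0.3     # below this, the claim counts as uncited


def verdict(score: float) -> str:
    """Map a matcher score to one of the three buckets."""
    if score >= ANCHORED_THRESHOLD:
        return "anchored"
    if score >= LOOSE_THRESHOLD:
        return "loose"
    return "uncited"
```

Note that a score of exactly 0.7 lands in anchored and a score of exactly 0.3 lands in loose.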

The review strip

Above the rendered distill, a header strip summarises the proposed batch. The headline shows the count and freshness (84 proposed · scanned 2m ago). Below it is a score-distribution histogram: each dot is one proposed citation, positioned by matcher score and coloured by verdict. The vertical line at 0.7 is the commit threshold. Hover a dot to see the claim excerpt; click it to scroll that claim's mark in the document into view.

Three inline phrases give you the actions:

  • Accept N anchored: a confirmation dialog appears, showing the count and the first few claim previews. Confirm to commit the whole bucket as active citations in a single transaction. An 8-second undo affordance then appears in the status line; click it to roll back the batch.
  • Review N between: enters filter mode for the loose bucket, the claims scoring between 0.3 and 0.7. The other verdicts dim in both the histogram and the rendered document so you can scan only the loose marks. Click each one to inspect its hover-card, then accept or dismiss it individually.
  • N uncited: the same filter-mode pattern for unsourced claims. The hover-card for an uncited mark omits the Accept action and surfaces guidance to edit the distill or save a supporting source.

The strip collapses to a one-line summary via the caret on the left. On narrow screens it becomes a view-only segmented bar — curation requires the desktop layout.

Scanning collections that pre-date the feature

Distilled contexts written before the verifier shipped don't have a proposed batch yet, because the background scan only fires on new context writes. If you open a collection with a distill and the strip shows 0 proposed · scan now, click scan now to run the scan synchronously and persist the result. Future context writes refresh the batch automatically.

The cost grows with the number of source artifacts in the collection (a few seconds for typical research collections). Working-document collections like living specs will produce mostly uncited claims because the distill is project-internal status, not a synthesis of external research. Keeping the scan opt-in for these older collections keeps that noise off collections you didn't ask to verify.

Proposed citations behave like regular citations

A proposed citation is just a citation in the :proposed state, persisted the same way human-authored or agent-authored citations are. The hover-card, the Remove control, the drift-detection path — all work identically. Accepting a proposed citation flips it to :active; dismissing flips it to :invalidated. The audit trail is in Datomic history either way.

Drift detection

Citations don't break silently. The server resolves each citation's current location at read time and labels it:

  • Coincident — the cited text still appears exactly where it was when the citation was authored. The normal state.
  • Drifted — the text has changed but a similar passage is found nearby. The popover shows the original excerpt struck through alongside the new location, so you can see what shifted.
  • Stranded — the cited text is no longer findable. The citation surfaces in a stranded list at the bottom of the parent so you can supersede or invalidate it.

External citations to content-addressed URLs (like a GitHub blob pinned to a commit SHA) can never drift by construction, so the renderer skips the drift block for them.
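
The three labels can be illustrated with a small resolver. This is a sketch under stated assumptions, not Sombra's resolution logic: the function name, the 200-character search window, the 0.8 similarity cut-off, and the use of Python's difflib for fuzzy comparison are all choices made for the example.

```python
from difflib import SequenceMatcher


def resolve_citation(body, cited, start, window=200, min_ratio=0.8):
    """Label a citation against the current body text.

    body   -- current text of the source
    cited  -- exact text stored when the citation was authored
    start  -- stored start offset of the cited text
    Returns (label, offset) where offset is None for stranded citations.
    """
    end = start + len(cited)
    # Coincident: the stored text is still exactly at the stored offset.
    if body[start:end] == cited:
        return ("coincident", start)

    # Otherwise slide a window over the nearby region and keep the most
    # similar candidate of the same length.
    lo = max(0, start - window)
    hi = min(len(body), end + window)
    region = body[lo:hi]
    best_ratio, best_pos = 0.0, None
    for offset in range(max(1, len(region) - len(cited) + 1)):
        candidate = region[offset:offset + len(cited)]
        ratio = SequenceMatcher(None, cited, candidate).ratio()
        if ratio > best_ratio:
            best_ratio, best_pos = ratio, lo + offset

    if best_pos is not None and best_ratio >= min_ratio:
        return ("drifted", best_pos)
    return ("stranded", None)
```

In this sketch, text that has merely shifted position is reported as drifted at its new offset, and a content-addressed external source would simply never be passed through the resolver.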

Removing a citation

Hover any citation highlight: the popover footer has a small Remove button. The first click arms it (the button becomes a destructive-styled "Confirm remove"); the second click invalidates the citation. Invalidation is reversible at the data layer: the citation stays in history with state invalidated rather than being deleted.

Removal works from either side of a citation: hover on the claim mark in a context, or on the source mark on the artifact body.

Where citations live

Citations are stored as their own entities, not embedded in markdown. That means your distilled context document stays clean and portable — you can copy the markdown out, export it, render it anywhere, and the citation chrome is added by the renderer rather than baked into the text. Sombra splices <cite> tags into the rendered markdown right before display.
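
As a rough picture of render-time splicing, the sketch below wraps character ranges in cite tags without touching the stored text. The function name, the (start, end, id) tuple shape, and the data-id attribute are hypothetical; the sketch also assumes the ranges do not overlap.

```python
def splice_cites(text, citations):
    """Return a copy of text with each (start, end, citation_id) range wrapped
    in a <cite> tag. Ranges are assumed to be non-overlapping."""
    out = text
    # Work from the last range to the first so earlier offsets stay valid
    # while tags are being inserted.
    for start, end, cid in sorted(citations, key=lambda c: c[0], reverse=True):
        out = (
            out[:start]
            + f'<cite data-id="{cid}">'
            + out[start:end]
            + "</cite>"
            + out[end:]
        )
    return out
```

Because the function returns a new string, the stored markdown stays exactly as authored and only the rendered copy carries the tags.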

The source can be:

  • Another Sombra artifact — a saved web page, a note. Common case.
  • An external content-addressed URL — currently GitHub blob URLs with a commit SHA. These can't drift; the source-of-truth lives outside Sombra.
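
One way to recognise a pinned GitHub blob URL is to check that the ref segment is a full 40-character hexadecimal commit SHA. The sketch below is an assumption about what such a check could look like, not Sombra's actual rule; the function name is hypothetical, and abbreviated SHAs are deliberately rejected.

```python
import re
from urllib.parse import urlparse

FULL_SHA = re.compile(r"^[0-9a-f]{40}$")


def is_pinned_github_blob(url: str) -> bool:
    """True for URLs shaped like github.com/<owner>/<repo>/blob/<40-hex-sha>/<path>."""
    parsed = urlparse(url)
    if parsed.netloc != "github.com":
        return False
    parts = parsed.path.strip("/").split("/")
    # Expected segments: owner, repo, "blob", ref, then at least one path segment.
    if len(parts) < 5 or parts[2] != "blob":
        return False
    return bool(FULL_SHA.match(parts[3]))
```

A URL that points at a branch name such as main fails the check, because the content behind a branch can change.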

Citations via MCP

If you're working with an AI assistant connected to Sombra, it has the same authoring surface plus a few power-tools:

  • cite_text — propose a batch of citations against a collection or artifact, returning ranges and confidence scores you can inspect before committing.
  • commit_citations — persist a validated batch in one transaction.
  • check_citations — get a drift report for every active citation on a parent.
  • find_citations — reverse lookup: what cites this source?
  • supersede_citation — replace a citation with a new range, keeping the old one in history.
  • set_citation_state — flip a citation between active, invalidated, and deferred.
  • repair_citations — re-anchor drifted citations to their current location.
  • verify_collection_context — run the mechanical claim scan on a collection's distilled context and return the structured per-sentence verdicts (anchored / loose / uncited). Read-only — no persistence; for an agent driving a verification workflow.

Full tool descriptions are at sombra.so/mcp. Citations from MCP and the in-app authoring flow share the same data substrate — they show up identically on the page, with the same hover-card chrome and drift detection.
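
For orientation, MCP tool invocations travel as JSON-RPC 2.0 requests with the method tools/call. The sketch below builds such a request for verify_collection_context. The argument name collection_id and its value are placeholders invented for the example; the real argument schema is whatever the tool descriptions specify, and in practice an MCP client library builds and sends this message for you.

```python
import json


def tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Serialise an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
    )


# "collection_id" is a hypothetical argument name used only for illustration.
request = tool_call(1, "verify_collection_context", {"collection_id": "example-collection"})
print(request)
```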