You Don't Know What Your Coding Assistant Actually Reads

Claude Code's web fetching pipeline sends pages through a summarisation model before the main model sees them. The output is an interpretation, not the original — and you can't tell the difference.

Sombra Team · March 15, 2026

Claude Code doesn't read web pages. It sends them through a smaller model that summarises them, and the model you're paying for — Sonnet, Opus — only sees that summary. You have no way of knowing what was kept, what was dropped, and what was rewritten.

What the pipeline actually does

The pipeline was reverse-engineered last year from Claude Code's minified bundle (by Mikhail Shilkov and Giuseppe Gurgone):

  1. Fetches the HTML
  2. Converts to markdown with vanilla Turndown — no Readability, no content extraction, no nav/footer stripping
  3. Truncates to 100k characters
  4. Sends the markdown to Haiku 3.5 (a small, cheap model) with an empty system prompt
  5. Haiku summarises and returns a condensed answer
  6. Claude — the model you're actually talking to — only sees Haiku's summary

They discovered a fast path for around 80 trusted documentation sites (docs.python.org, react.dev): if the server returns Content-Type: text/markdown and the body is under 100k characters, the Haiku step is skipped. But HTML from those same trusted sites still goes through the full pipeline. Non-trusted sites get a stricter summarisation prompt with a 125-character maximum on quotes.
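The dispatch logic described above can be sketched as follows. This is a reconstruction of the reverse-engineered behaviour, not Claude Code's actual code: the function names, the stub bodies, and the two-host set (the real list has around 80 entries) are all illustrative.

```python
from urllib.parse import urlparse

TRUSTED_DOCS_HOSTS = {"docs.python.org", "react.dev"}  # ~80 hosts in the real list
MAX_CHARS = 100_000

def html_to_markdown(html: str) -> str:
    """Stand-in for the Turndown conversion step."""
    return html

def summarise_with_haiku(markdown: str) -> str:
    """Stand-in for the Haiku 3.5 summarisation call."""
    return "[haiku summary of %d chars]" % len(markdown)

def webfetch(url: str, content_type: str, body: str) -> str:
    host = urlparse(url).hostname or ""
    # Fast path: a trusted docs host serving markdown under the size cap
    # bypasses summarisation and reaches the main model verbatim.
    if (host in TRUSTED_DOCS_HOSTS
            and content_type.startswith("text/markdown")
            and len(body) <= MAX_CHARS):
        return body
    # Everything else, including HTML from trusted hosts: markdown
    # conversion, 100k truncation, then the summarisation pass.
    markdown = html_to_markdown(body)[:MAX_CHARS]
    return summarise_with_haiku(markdown)
```

Note that the content type, not the site, decides the path: HTML from docs.python.org still falls through to the summarisation branch.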

What actually comes back

We tested WebFetch against several technical pages and compared the results to direct content extraction. The results aren't consistent — and that's the problem.

A Clojure concurrency tutorial with one large code listing came back as 19 separate blocks — Haiku split and reorganised the content, adding section headings that don't exist on the original page. A Pedestal rate-limiting post with four code blocks came back with three — one quietly dropped. An htmx example with one code block came back as two — Haiku extracted inline JavaScript into a separate block.

None of this is documented. You don't know which blocks are verbatim and which are Haiku's interpretation. You don't know if a block was dropped or reorganised. The output looks like source material but it's been through a lossy, opaque intermediary.

The behaviour may also vary by plan tier: nothing in the documentation confirms which model runs the summarisation pass or whether the trusted-site list differs between plans. This is a black box: non-deterministic, dependent on the open web, and enforced by your assistant. Why Anthropic does this is clear enough, and the default behaviour makes sense. But the gap between a safe default and the requirements of a skilled, intentional user leaves important signal on the table.

Two fixable problems

Vanilla Turndown is the wrong tool. Every nav link, cookie banner, footer, and sidebar gets converted to markdown and either bloats the Haiku pass or leaks noise into summaries. Readability — what Firefox Reader View uses — is battle-tested for exactly this. Better extraction upstream means fewer tokens into Haiku and cleaner summaries out.
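To illustrate why extraction upstream matters, here is a minimal stdlib sketch that simply drops obvious boilerplate elements before any markdown conversion. Real Readability implementations go much further, scoring nodes by text and link density; this only demonstrates the principle that stripping navigation chrome first means fewer junk tokens downstream.

```python
from html.parser import HTMLParser

# Elements whose contents are almost never the page's main content.
BOILERPLATE = {"nav", "footer", "aside", "script", "style"}

class ContentExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.depth = 0      # nesting depth inside boilerplate elements
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in BOILERPLATE:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in BOILERPLATE and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside every boilerplate element.
        if self.depth == 0 and data.strip():
            self.chunks.append(data.strip())

def extract_text(html: str) -> str:
    parser = ContentExtractor()
    parser.feed(html)
    return "\n".join(parser.chunks)
```

Running this over a page with a nav bar and footer keeps only the article text; a density-scoring extractor like Readability would also handle boilerplate that isn't wrapped in semantic tags.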

The summarisation step itself is the deeper issue. For technical content, an interpretation is the wrong output. When a developer asks their coding assistant to read a migration guide, they need the configuration examples, the function signatures, the exact CLI commands. A reorganised summary — even a good one — introduces doubt about whether the code you're implementing against is what the page actually said.

These are fixable engineering decisions. Swap the extraction library. Give Claude a path to raw content for technical pages. Let users control whether summarisation happens. But Claude Code's repository is closed source, so the community can't contribute fixes; in the meantime, you work around it.

Workarounds

The simplest fix: tell Claude Code to use a proper content extraction tool. Add a line to your CLAUDE.md pointing it at something like r11y or Readability via a bash command. The markdown lands directly in context — no intermediary model, no summarisation step. This still relies on the open web being, well, open, but for getting purer signal this is a good start.
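A CLAUDE.md entry along these lines does the job. The wording is hypothetical, and it assumes you have readability-lxml installed (its `Document(...).summary()` call returns the cleaned main-content HTML, which the model handles fine):

```markdown
## Fetching web pages

Do not use WebFetch for technical pages. Fetch the raw page and extract
the main content instead, substituting the target URL:

    curl -sL "$URL" | python3 -c 'import sys; from readability import Document; print(Document(sys.stdin.read()).summary())'
```

Because the command runs in bash, its output lands directly in the main model's context with no intermediary summarisation pass.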

The broader problem

WebFetch works fine for skimming a news article. It's unreliable for the task a coding assistant needs most: reading technical content accurately and faithfully. Haiku, while a capable model, can make decisions that degrade the quality of the context it passes along, and its output is neither deterministic nor fully repeatable across sessions.

There's a second issue the pipeline can't solve: it requires the page to be live at fetch time. Pages go offline, change, get paywalled, return different content to different user agents. Your agent's context is only as stable as whatever URL it happened to hit in that moment.

This is why we built Sombra's content extraction the way we did — capture the content once, preserve code blocks verbatim, store it as clean markdown that doesn't change when the source page does. When your agent reads from a Sombra collection, it gets the actual content, not an intermediary model's interpretation of it. The distillation step is deliberate and user-controlled, not a hidden pipeline decision you can't opt out of. Crucially, you can see exactly what your agent received, edit it, and track every change.

Summaries of documentation produce code based on summaries. Verbatim content produces code based on documentation.

Sombra is a research library for AI agents. Save web pages, organise into collections, distil what matters, serve it through MCP. Start free.

Your agent should read what you read.

Sombra saves web pages as clean markdown, preserves code blocks verbatim, and serves them to any MCP client. No intermediary models, no summarisation.

Start free →