Why markdown for LLMs — Serve Markdown to LLMs

Agents can read HTML. The reason they’d rather read markdown is that HTML is mostly overhead for them.

1. Tokens

An HTML document is wrapped in <html>, <head>, <body>, navigation menus, CSS-in-head, inline JavaScript, tracking pixels, sidebars, cookie banners, footers, and layout divs. The prose — the actual article — is usually a fraction of the bytes. When an agent stuffs that page into an LLM context, every token of wrapper is a token it can’t spend on your content, or on reasoning.

Markdown strips all of that. <h1>My article</h1> becomes # My article. A <ul> of three items becomes three lines prefixed with -. Token counts typically drop by 40–80% for the same semantic content, sometimes more on heavy sites.

Fewer tokens means:

Lower API costs for the agent, which pays per token
More headroom in the context window
Less model attention wasted on ignoring chrome
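
To make the size difference concrete, here is a minimal sketch of the conversion described above, using only Python's standard library. The converter, the SKIP_TAGS set, and the sample page are all hypothetical and only handle h1, p, and li; a production converter covers far more. Character counts stand in for token counts here, which is only a rough proxy because real counts depend on the tokenizer.

```python
from html.parser import HTMLParser

# Tags whose entire contents are treated as page chrome (an assumption for this sketch).
SKIP_TAGS = {"head", "script", "style", "nav", "aside", "footer"}
BLOCK_PREFIX = {"h1": "# ", "p": "", "li": "- "}


class TinyMarkdown(HTMLParser):
    """Toy converter: keeps h1, p and li text, drops everything inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a skipped element
        self.prefix = None    # markdown prefix of the block being read, or None
        self.buffer = []      # text pieces of the current block
        self.lines = []       # finished markdown lines

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in BLOCK_PREFIX:
            self.prefix = BLOCK_PREFIX[tag]
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag in BLOCK_PREFIX and self.prefix is not None:
            self.lines.append(self.prefix + "".join(self.buffer).strip())
            self.prefix = None

    def handle_data(self, data):
        # Only collect text that sits inside a kept block and outside any chrome.
        if self.skip_depth == 0 and self.prefix is not None:
            self.buffer.append(data)


def html_to_markdown(html: str) -> str:
    parser = TinyMarkdown()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)


page = """<html><head><title>My article</title><style>body{margin:0}</style></head>
<body><nav><a href="/">Home</a><a href="/blog">Blog</a></nav>
<h1>My article</h1>
<p>Agents can read HTML.</p>
<ul><li>one</li><li>two</li><li>three</li></ul>
<footer>Subscribe to our newsletter</footer>
<script>track()</script></body></html>"""

markdown = html_to_markdown(page)
print(markdown)
print(len(page), "characters of HTML ->", len(markdown), "characters of markdown")
```

Even on this tiny page, most of the characters belong to the wrapper rather than the article, and the gap widens as navigation, scripts, and styles grow.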

2. Retrieval quality

RAG pipelines and agent memories often embed the fetched content. When the source is HTML, embeddings include the noise — “subscribe to our newsletter,” cookie text, related-article rails. That pollutes the vector and makes retrieval noisier.

Markdown arrives with only the prose and the semantic structure (headings, lists, code blocks). Embeddings of markdown content are tighter and match queries about the actual article, not the layout.
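
The dilution effect can be illustrated with a toy example. The sketch below uses bag-of-words vectors and cosine similarity as a crude stand-in for a learned embedding model, so the absolute numbers mean little; the point is only that appending chrome text to the same article lowers its similarity to a query about the article. The sample strings are made up for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical article text, page chrome, and query.
article = "Markdown keeps headings, lists and code blocks, so agents read fewer tokens."
chrome = "Subscribe to our newsletter. We use cookies. Accept all cookies. Related articles. Share this page."
query = "why do agents read markdown"

clean_score = cosine(bag_of_words(query), bag_of_words(article))
noisy_score = cosine(bag_of_words(query), bag_of_words(chrome + " " + article))
print(f"article only: {clean_score:.3f}  article plus chrome: {noisy_score:.3f}")
```

The overlap with the query is identical in both cases; only the extra chrome words change, and they pull the vector away from the query.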

3. Latency

Fewer bytes to transfer and less parsing on the agent side. Content-Length on a markdown response is often 10–30% of the equivalent HTML. On a slow connection or a constrained agent (a phone, an edge function), the time-to-first-token drops noticeably.
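
As a back-of-the-envelope illustration, the sketch below estimates raw transfer time for both representations. The page size and link speed are hypothetical, the markdown size is set to 20% of the HTML (inside the range quoted above), and the model ignores round-trip latency, TLS setup, and compression.

```python
def transfer_seconds(size_bytes: int, bits_per_second: float) -> float:
    """Time to move a payload over a link, ignoring latency, TLS and compression."""
    return size_bytes * 8 / bits_per_second


html_bytes = 200_000                     # hypothetical HTML page size
markdown_bytes = html_bytes * 20 // 100  # assumed 20% of the HTML size
link_bps = 1_000_000                     # hypothetical 1 Mbit/s connection

html_time = transfer_seconds(html_bytes, link_bps)
markdown_time = transfer_seconds(markdown_bytes, link_bps)
print(f"HTML: {html_time:.2f} s, markdown: {markdown_time:.2f} s")
```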

What HTML-to-markdown converters miss

You could let the agent fetch your HTML and run a client-side converter. In practice, that is what agents do today. But:

Most converters are lossy on tables, code blocks with syntax highlighting, and custom components.
You have no control over what they strip.
You pay for the HTML bytes the agent ends up throwing away.

Serving markdown directly skips that conversion step.
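
One way a server might decide which representation to send is to inspect the request's Accept header. The sketch below is an assumption about how that could look, not the convention's reference behavior: the function names are hypothetical, and the matching is deliberately naive (it ignores q-values and wildcards).

```python
def wants_markdown(accept_header: str) -> bool:
    """True if any media range in the Accept header is exactly text/markdown.

    Naive on purpose: q-values and wildcards such as */* are ignored.
    """
    for item in accept_header.split(","):
        media_type = item.split(";", 1)[0].strip().lower()
        if media_type == "text/markdown":
            return True
    return False


def choose_representation(accept_header: str, html: str, markdown: str) -> tuple[str, str]:
    """Return (content_type, body) for a request with the given Accept header."""
    if wants_markdown(accept_header):
        return "text/markdown; charset=utf-8", markdown
    return "text/html; charset=utf-8", html


# Hypothetical request from an agent that prefers markdown but accepts HTML.
content_type, body = choose_representation(
    "text/markdown, text/html;q=0.8", "<h1>My article</h1>", "# My article"
)
print(content_type)
print(body)
```

Because the response now depends on the Accept header, a real server would also send Vary: Accept so that caches keep the two representations apart.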

What’s next

See the Accept: text/markdown convention for how agents actually ask for markdown, and generating the markdown for how to produce it.