Why markdown for LLMs
Fewer tokens, higher retrieval signal, faster responses. The three reasons an agent prefers markdown over the HTML you'd send a browser.
Agents can read HTML. The reason they’d rather read markdown is that HTML is mostly overhead for them.
1. Tokens
An HTML document is wrapped in <html>, <head>, <body>, navigation
menus, CSS-in-head, inline JavaScript, tracking pixels, sidebars, cookie
banners, footers, and layout divs. The prose — the actual article —
is usually a fraction of the bytes. When an agent stuffs that page into
an LLM context, every token of wrapper is a token it can’t spend on your
content, or on reasoning.
Markdown strips all of that. <h1>My article</h1> becomes # My article.
A <ul> of three items becomes three lines prefixed with -. A typical
conversion yields 40–80% fewer tokens for the same semantic content,
sometimes more for heavy sites.
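To make the size difference concrete, here is a minimal sketch using only Python's standard library. The TinyMarkdown class, the sample page, and the list of tags to drop are all made up for illustration, and character count stands in as a rough proxy for token count. A real converter handles far more than headings, paragraphs, and list items.

```python
from html.parser import HTMLParser


class TinyMarkdown(HTMLParser):
    """Toy converter: keeps <h1>, <p>, and <li> text and drops page chrome."""

    # Hypothetical list of elements treated as chrome for this sketch.
    SKIP = {"head", "script", "style", "nav", "footer"}
    PREFIXES = {"h1": "# ", "li": "- ", "p": ""}

    def __init__(self):
        super().__init__()
        self.lines = []
        self.prefix = None   # markdown prefix for the element being read
        self.skip_depth = 0  # greater than 0 while inside chrome

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.PREFIXES:
            self.prefix = self.PREFIXES[tag]

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.PREFIXES:
            self.prefix = None

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0 and self.prefix is not None:
            self.lines.append(self.prefix + text)


def to_markdown(html):
    parser = TinyMarkdown()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)


# Made-up sample page: a short article wrapped in typical chrome.
HTML = """<html><head><title>My article</title>
<style>body{margin:0;font-family:sans-serif}</style>
<script>window.analytics=window.analytics||[];</script></head>
<body><nav><ul><li><a href="/">Home</a></li><li><a href="/blog">Blog</a></li></ul></nav>
<div class="layout"><div class="content">
<h1>My article</h1>
<ul><li>First point</li><li>Second point</li><li>Third point</li></ul>
</div></div>
<footer>Subscribe to our newsletter</footer></body></html>"""

markdown = to_markdown(HTML)
print(markdown)
print(len(HTML), "characters of HTML vs", len(markdown), "characters of markdown")
```

The exact ratio depends on the page and on the tokenizer, so treat the printed numbers as an illustration of the direction, not a benchmark.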
Fewer tokens means:
- Lower API costs for the agent (they pay per token)
- More headroom in the context window
- Less model attention wasted on ignoring chrome
2. Retrieval quality
RAG pipelines and agent memories often embed the fetched content. When the source is HTML, embeddings include the noise — “subscribe to our newsletter,” cookie text, related-article rails. That pollutes the vector and makes retrieval noisier.
Markdown arrives with only the prose and the semantic structure (headings, lists, code blocks). Embeddings of markdown content are tighter and match queries about the actual article, not the layout.
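The dilution effect can be illustrated with a toy example. The sketch below uses word-count vectors and cosine similarity as a stand-in for a real embedding model, which is a deliberate simplification; the article text, chrome text, and query are invented for the example.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Word-count vector: a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Invented sample texts.
article = "Markdown keeps headings, lists, and code blocks so agents read less markup."
chrome = (
    "Subscribe to our newsletter. We use cookies. Accept all cookies. "
    "Related articles. Share this page."
)
query = "why do agents read markdown"

clean_score = cosine(bag_of_words(query), bag_of_words(article))
noisy_score = cosine(bag_of_words(query), bag_of_words(article + " " + chrome))

print("article only:    ", round(clean_score, 3))
print("article + chrome:", round(noisy_score, 3))
```

The chrome shares no words with the query, so it adds nothing to the match and only lengthens the document vector, which lowers the similarity score. Real embedding models behave less mechanically, but the same pressure applies.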
3. Latency
Fewer bytes to transfer, less parsing on the agent side. The Content-Length
of a markdown response is often 10–30% of the equivalent HTML. On a
slow connection or a constrained agent (a phone, an edge function), the
time-to-first-token drops noticeably.
What HTML-to-markdown converters miss
You could let the agent fetch your HTML and run a client-side converter. In practice, that is what many agents do today. But:
- Most converters are lossy on tables, code blocks with syntax highlighting, and custom components.
- You have no control over what they strip.
- You pay for the HTML bytes the agent ends up throwing away.
Serving markdown directly skips that conversion step.
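As a rough sketch of what serving markdown directly could look like on the server side, the snippet below picks a representation based on the request's Accept header. The function names wants_markdown and choose_body are hypothetical, and the parsing is simplified: it only checks whether text/markdown is listed with a non-zero q-value and does not rank it against other media types.

```python
def wants_markdown(accept_header):
    """Return True if text/markdown is listed with a non-zero q-value."""
    for part in accept_header.split(","):
        media_type, *params = [p.strip() for p in part.split(";")]
        if media_type.lower() != "text/markdown":
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not wanted"
        return q > 0
    return False


def choose_body(accept_header, html_body, markdown_body):
    """Return a (content_type, body) pair for the response."""
    if wants_markdown(accept_header):
        return "text/markdown; charset=utf-8", markdown_body
    return "text/html; charset=utf-8", html_body


content_type, body = choose_body(
    "text/markdown, text/html;q=0.8",
    "<h1>My article</h1>",
    "# My article",
)
print(content_type)
print(body)
```

Browsers do not ask for text/markdown, so they keep receiving HTML, while an agent that does ask gets the smaller response without running a converter.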
What’s next
See the Accept: text/markdown convention
for how agents actually ask for markdown, and
generating the markdown for how to
produce it.