# Generating the Markdown

Content negotiation is only useful if you have Markdown to negotiate.
There are three ways to produce it, in rough order of fidelity:
Markdown as the source of truth, write-time dual rendering, and
runtime HTML-to-Markdown conversion. Pick one based on where your
content lives.

## 1. Markdown is the source

Your content is authored in Markdown or MDX. You render it to HTML for
browsers, and serve the original (or lightly processed) Markdown to
agents.

**When this fits:** blogs, docs sites, anything driven by a static
site generator. Hugo, Astro, Next.js with MDX, Eleventy, and Jekyll
all keep content as Markdown files already.

**How:**
- Build step emits `page.html` and `page.md` side by side.
- Runtime chooses which to serve based on `Accept`.
- Strip front-matter if you don't want it in the Markdown response.
- Consider stripping MDX-specific components (e.g., `<Alert>`) that
  don't render as pure Markdown.
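
The "runtime chooses which to serve based on `Accept`" step can be sketched as below. This is a minimal illustration, not a full negotiation implementation: the helper names `explicit_quality` and `pick_variant` are hypothetical, and it assumes Markdown should be served only when the client lists `text/markdown` by name, so a browser's `*/*` wildcard still gets HTML.

```python
def explicit_quality(accept_header: str, media_type: str) -> float:
    """Return the q-value the header gives media_type, or 0.0 if it is not listed by name."""
    for part in accept_header.split(","):
        fields = [field.strip() for field in part.split(";")]
        if fields[0].lower() != media_type:
            continue
        q = 1.0  # q defaults to 1 when the parameter is omitted
        for param in fields[1:]:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        return q
    return 0.0


def pick_variant(accept_header: str) -> str:
    """Choose between the two build outputs; wildcards never select Markdown."""
    markdown_q = explicit_quality(accept_header, "text/markdown")
    html_q = explicit_quality(accept_header, "text/html")
    if markdown_q > 0 and markdown_q >= html_q:
        return "page.md"
    return "page.html"
```

A typical browser header such as `text/html,application/xhtml+xml,*/*;q=0.8` resolves to `page.html`, while an agent sending `text/markdown` resolves to `page.md`.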

This is the highest-fidelity approach because the Markdown is never
reconstructed from HTML: there is no round-trip to lose detail in.

## 2. Database or CMS content, dual-rendered at write-time

Your content is in a CMS or a database. Content is authored in
rich-text or block-based editors, stored as HTML or structured JSON.

**How:**
- When content is saved, run the HTML (or the HTML rendered from your
  structured JSON) through a converter and store both representations.
- Accept the storage cost: the content table grows by the size of the
  Markdown variant.
- Serve the cached Markdown at request time — no runtime conversion.

Converters:
- JavaScript: [`turndown`](https://github.com/mixmark-io/turndown), [`html-to-md`](https://github.com/stonehank/html-to-md)
- Python: [`html2text`](https://github.com/Alir3z4/html2text), [`markdownify`](https://github.com/matthewwithanm/python-markdownify)
- PHP: [`league/html-to-markdown`](https://github.com/thephpleague/html-to-markdown)
- Ruby: [`reverse_markdown`](https://github.com/xijo/reverse_markdown)
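
The write-time flow above can be sketched with the standard-library `sqlite3` module. Everything named here is an assumption for illustration: the `content` table, its columns, and the `save_content` and `load_markdown` helpers are hypothetical, and `to_markdown` stands in for whichever converter you choose (it is passed in as a plain callable rather than tied to a specific library).

```python
import sqlite3
from typing import Callable, Optional

# Hypothetical schema: one row per page, both representations side by side.
SCHEMA = """
CREATE TABLE IF NOT EXISTS content (
    slug     TEXT PRIMARY KEY,
    html     TEXT NOT NULL,
    markdown TEXT NOT NULL
)
"""


def save_content(
    conn: sqlite3.Connection,
    slug: str,
    html: str,
    to_markdown: Callable[[str], str],
) -> None:
    """Convert once at save time and store both representations."""
    markdown = to_markdown(html)
    conn.execute(
        "INSERT OR REPLACE INTO content (slug, html, markdown) VALUES (?, ?, ?)",
        (slug, html, markdown),
    )
    conn.commit()


def load_markdown(conn: sqlite3.Connection, slug: str) -> Optional[str]:
    """Read the stored Markdown; no conversion happens at request time."""
    row = conn.execute(
        "SELECT markdown FROM content WHERE slug = ?", (slug,)
    ).fetchone()
    return row[0] if row else None
```

Because the conversion runs inside the save path, a converter failure surfaces when the author saves, not when an agent requests the page.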

## 3. Runtime HTML-to-Markdown conversion

Your content is dynamic, not stored, and you can't touch the render
pipeline. You negotiate `Accept` at the edge or proxy, fetch the HTML,
and convert on the fly.

**How:**
- A reverse proxy (Worker, Lambda, middleware) intercepts requests.
- On `Accept: text/markdown`, it fetches the HTML from origin.
- Runs an HTML-to-Markdown converter.
- Returns the Markdown with correct headers.
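
The four steps above might look like the following framework-neutral sketch. The names are hypothetical: `handle_request` is not any platform's API, `fetch_origin_html` and `to_markdown` are callables you would supply, and `cache` is a plain dictionary standing in for a real cache with expiry. The `Accept` check is deliberately simplified to a substring match and ignores q-values.

```python
from typing import Callable, Dict, Tuple

# (status code, response headers, body)
Response = Tuple[int, Dict[str, str], str]


def handle_request(
    path: str,
    accept_header: str,
    fetch_origin_html: Callable[[str], str],
    to_markdown: Callable[[str], str],
    cache: Dict[str, str],
) -> Response:
    """Serve Markdown when asked for it, otherwise pass the origin HTML through."""
    # Vary: Accept tells downstream caches that the body depends on the Accept header.
    if "text/markdown" not in accept_header.lower():
        html = fetch_origin_html(path)
        return 200, {"Content-Type": "text/html; charset=utf-8", "Vary": "Accept"}, html

    # Convert on first request only; this toy cache never expires entries.
    if path not in cache:
        cache[path] = to_markdown(fetch_origin_html(path))

    headers = {"Content-Type": "text/markdown; charset=utf-8", "Vary": "Accept"}
    return 200, headers, cache[path]
```

Here "correct headers" is taken to mean a `text/markdown` content type with an explicit charset, plus `Vary: Accept` so that a shared cache does not hand the Markdown variant to a browser.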

The [Roots `post-content-to-markdown`](https://github.com/roots/post-content-to-markdown)
plugin is this approach for WordPress — it converts a post's content
to Markdown on request.

**Cloudflare's
[Markdown for Agents](/guides/cloudflare-markdown-for-agents)** is the
managed version of this approach. If you need it self-hosted, a
Cloudflare Worker running `turndown` can do it in roughly 20 lines.

**Tradeoffs:**
- Runtime cost per request (unless you cache aggressively).
- Lossy on complex content — custom components, embedded widgets,
  interactive elements don't translate.
- No fidelity guarantees: converters read markup, not styling, so the
  same content may convert differently after a template change that
  renames classes or adds wrapper elements.

## What to strip from the Markdown

Regardless of approach, the Markdown representation should be *just
the content*. Strip:

- Site navigation and footer chrome
- Related-content sidebars and widgets
- Share buttons, social metadata
- Cookie banners and consent UI
- Advertisements
- Newsletter signup forms
- Comment threads (unless integral to the content)

If your HTML renders the content inside a specific container (`<main>`,
`<article>`, `.post-body`), scope the conversion to that container.

## Preserve what matters

- **Headings** and their hierarchy (`#` through `######`)
- **Links** with their text and target
- **Code blocks** with language hints (the triple-backtick fence)
- **Lists** (ordered and unordered)
- **Emphasis** (`*italic*`, `**bold**`)
- **Tables** (GFM syntax is widely supported)
- **Images** — include the alt text and URL

## What *not* to generate

- **Front-matter** in the response body (YAML `---` blocks). Useful for
  SSG input, noise for agents. Strip before serving.
- **HTML fallbacks inside Markdown** (e.g., `<div class="callout">`)
  unless your Markdown flavor really needs them. GitHub-flavored
  Markdown renders most things natively.
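
Stripping front-matter before serving can be a single regular expression. This is a minimal sketch with a hypothetical helper name, `strip_front_matter`; it assumes YAML front-matter delimited by `---` lines at the very top of the file, and it would also remove a block that merely looks like front-matter because the document opens with a `---` horizontal rule.

```python
import re

# A leading block that opens with '---' on the first line and closes
# with the next line consisting of '---'.
FRONT_MATTER = re.compile(
    r"\A---[ \t]*\r?\n(?:.*?\r?\n)?---[ \t]*(?:\r?\n|\Z)",
    re.DOTALL,
)


def strip_front_matter(markdown: str) -> str:
    """Remove a leading YAML front-matter block, if present."""
    return FRONT_MATTER.sub("", markdown, count=1).lstrip("\n")
```

A `---` that appears later in the document, for example as a horizontal rule between sections, is left alone because the pattern is anchored to the start of the text.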