
Two engines, one content type, one inevitable bug

The drafts surface and the public site render the same markdown. They had no business producing different HTML for it. They did anyway.

The split started for a defensible reason. The public site is Astro, which has its own remark/rehype pipeline baked in. The drafts surface is a Cloudflare Worker, where I’d reached for marked because it was small and well-behaved on the edge. Two engines, same input, different runtimes. Fine.

The trouble was the bidi-isolate logic. Hebrew posts with inline English fragments need each Latin run wrapped in <bdi> so the punctuation around them doesn’t get pulled into the surrounding RTL paragraph direction. I wrote the rule once as a rehype plugin (for Astro) and once as a marked text-renderer override (for the Worker). Same partition logic, two integrations. Fine, I told myself, the partition is in shared/.
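
For illustration, here is a minimal sketch of what a shared partition step like that could look like. The names (`Segment`, `partitionLatinRuns`, `LATIN_RUN_PATTERN`) and the exact definition of a "Latin run" are assumptions made for this example, not the code that lives in shared/. In this sketch, surrounding punctuation is deliberately left outside the run so that it stays with the RTL paragraph.

```typescript
// Hypothetical shape for one slice of the input text.
type Segment = { text: string; latin: boolean };

// Assumed definition of a Latin run: starts on an ASCII letter, may continue
// through letters, digits, spaces, apostrophes, dots and hyphens, and must end
// on a letter or digit, so trailing spaces and punctuation are left outside.
const LATIN_RUN_PATTERN = "[A-Za-z](?:[A-Za-z0-9'’ .\\-]*[A-Za-z0-9])?";

function partitionLatinRuns(input: string): Segment[] {
  const segments: Segment[] = [];
  const re = new RegExp(LATIN_RUN_PATTERN, "g"); // fresh regex, so no shared lastIndex
  let cursor = 0;
  let match: RegExpExecArray | null;
  while ((match = re.exec(input)) !== null) {
    // Everything between the previous run and this one is non-Latin text.
    if (match.index > cursor) {
      segments.push({ text: input.slice(cursor, match.index), latin: false });
    }
    segments.push({ text: match[0], latin: true });
    cursor = match.index + match[0].length;
  }
  // Trailing non-Latin text after the last run.
  if (cursor < input.length) {
    segments.push({ text: input.slice(cursor), latin: false });
  }
  return segments;
}
```

Each integration then only has to decide how to emit a `latin: true` segment as a <bdi> element. That emitting step is exactly where the two engines were free to diverge.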

Then a published draft started showing this in its prose:

ואז קיבלתי הודעה אחת: &quot;Can&#39;t we actually fix these errors?&quot;

Those are HTML entity references rendering as literal characters: the markup contained &amp;quot;, so the browser dutifully displayed &quot; as text. The marked text renderer was being handed text that the inline lexer had already escaped once, and my override was escaping it again. It was a bug that simply could not exist on the Astro side, because the Astro side wasn’t going through marked.
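
The failure mode can be reproduced without marked at all. The sketch below is a dependency-free model of it: `escapeHtml` is a hand-written stand-in rather than marked's own escaper, and the two calls stand in for "the lexer escaped once" and "my override escaped again."

```typescript
// Minimal HTML escaper, written out so the sketch has no dependencies.
// The ampersand must be replaced first, or the other entities get re-escaped.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

const rawText = `"Can't we actually fix these errors?"`;

// First pass: what an upstream stage that pre-escapes would hand over.
const escapedOnce = escapeHtml(rawText);

// Second pass: what a renderer override emits if it escapes its input again.
const escapedTwice = escapeHtml(escapedOnce);

console.log(escapedOnce);  // correct markup: the browser shows plain quotes
console.log(escapedTwice); // broken markup: the browser shows the entities as text
```

The second pass turns every `&` produced by the first into `&amp;`, which is why the page displays the entity names instead of the characters they stand for.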

I almost wrote an unescapeHtml band-aid. Then I stopped. The bug wasn’t the double-escape — the bug was that I had two implementations of “render markdown for this blog.” One of them was always going to drift.

Convergence was the fix. The Worker now runs the same unified pipeline Astro uses: remark-parse → remark-gfm → remark-smartypants → remark-rehype → rehype-raw → @shikijs/rehype → rehype-isolate-ltr → rehype-stringify. Same plugin list. Same output. The marked-specific text renderer is gone, and so is the double-escape.
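
To show why the tree-based version has no room for a double-escape, here is a rough, dependency-free sketch of what a plugin in the rehype-isolate-ltr slot could look like. Everything in it is an assumption for illustration: the node types are simplified local stand-ins for the real hast types, and `rehypeIsolateLtrSketch`, `splitTextNode`, and the skip list are hypothetical names and choices, not the plugin's actual source.

```typescript
// Simplified stand-ins for hast nodes (real hast also has comments, doctypes, etc.).
type HText = { type: "text"; value: string };
type HElement = {
  type: "element";
  tagName: string;
  properties: Record<string, unknown>;
  children: HNode[];
};
type HNode = HText | HElement;
type HRoot = { type: "root"; children: HNode[] };

// Assumed Latin-run rule, inlined so this block stands on its own.
const LATIN_SOURCE = "[A-Za-z](?:[A-Za-z0-9'’ .\\-]*[A-Za-z0-9])?";

// Assumption: text inside these elements is left untouched.
const SKIPPED_TAGS = new Set(["code", "pre", "bdi"]);

// Split one text node into plain text nodes and <bdi> elements.
function splitTextNode(value: string): HNode[] {
  const out: HNode[] = [];
  const re = new RegExp(LATIN_SOURCE, "g");
  let cursor = 0;
  let m: RegExpExecArray | null;
  while ((m = re.exec(value)) !== null) {
    if (m.index > cursor) out.push({ type: "text", value: value.slice(cursor, m.index) });
    out.push({
      type: "element",
      tagName: "bdi",
      properties: {},
      children: [{ type: "text", value: m[0] }], // raw text, no escaping here
    });
    cursor = m.index + m[0].length;
  }
  if (cursor < value.length) out.push({ type: "text", value: value.slice(cursor) });
  return out;
}

// Walk the tree depth-first, rewriting text children in place.
function isolateChildren(parent: HRoot | HElement): void {
  const next: HNode[] = [];
  for (const child of parent.children) {
    if (child.type === "text") {
      next.push(...splitTextNode(child.value));
    } else {
      if (!SKIPPED_TAGS.has(child.tagName)) isolateChildren(child);
      next.push(child);
    }
  }
  parent.children = next;
}

// unified plugin shape: a function that returns a tree transformer.
function rehypeIsolateLtrSketch() {
  return (tree: HRoot): void => isolateChildren(tree);
}
```

The transformer only ever handles raw text values and never builds an HTML string. Escaping happens once, when the tree is serialized at the end of the pipeline, so there is no second place for it to happen.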

Two Workers-specific gotchas surfaced on the way:

Runtime Wasm compilation is disallowed. Shiki’s default highlighter boots an Oniguruma Wasm engine, and Workers refuse to compile Wasm dynamically:

WebAssembly.instantiate(): Wasm code generation disallowed by embedder

The standard getSingletonHighlighter() path is unusable on the edge for that reason. @shikijs/rehype/core exposes the lower-level entrypoint, and shiki ships a pure-JS regex engine that’s a drop-in replacement:

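import { createHighlighterCore } from 'shiki/core';
import { createJavaScriptRegexEngine } from 'shiki/engine/javascript';
import rehypeShikiFromHighlighter from '@shikijs/rehype/core';

// Build the highlighter by hand: only the listed themes and grammars get bundled.
const highlighter = await createHighlighterCore({
  themes: [import('shiki/themes/github-dark.mjs')],
  langs: [import('shiki/langs/typescript.mjs') /* ... */],
  // Pure-JS regex engine instead of the Oniguruma Wasm build.
  engine: createJavaScriptRegexEngine(),
});

// The instance is then handed to the plugin in the pipeline, e.g.
// .use(rehypeShikiFromHighlighter, highlighter, { theme: 'github-dark' })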

Bundled languages are not free. The first build came in at 9.9 MB. Shiki defaults to Object.keys(bundledLanguages) — every grammar in the package — and esbuild dutifully shipped all of them. Listing the half-dozen langs the blog actually uses dropped the bundle to 1.7 MB, ~330 KB gzipped. That’s well under the Workers Paid 10 MB ceiling, but it’s a reminder that “import the library” can mean “import every language file the maintainer wrote.”

The shape of the fix is the part I want to remember. Two engines for one content type is a duplication I tolerated because the surfaces were genuinely different runtimes and the immediate cost looked small. The cost showed up later, as a class of bug that could only exist on one side. Convergence didn’t just close the bug — it closed a category.