Engineering

Rendering 100,000+ Documents at 60fps on Mobile

Jan 13, 2026 · 8 min read
react-native · performance · virtualization · sqlite

The Problem

DocNative ships over 100,000 documentation pages. MDN JavaScript alone runs to thousands of entries, some exceeding 100KB of raw HTML. These pages can be dense: syntax-highlighted code blocks, rendered LaTeX math, inline SVGs, tables with complex column spans. Many elements require bespoke native renderers.

Feed a 50KB page with 40 code blocks and a dozen math expressions to react-native-render-html, and the result is predictable: parsing blocks the JavaScript thread for 800ms or more, the entire page must render before first contentful paint, and the full DOM tree sits in memory at once.

The library does excellent work for its intended use case: small HTML snippets. Documentation pages are not small snippets.

The Insight

Parsing is expensive. We decided to do it once, at build time.

Instead of shipping raw HTML and parsing on the phone, we ship pre-chunked JSON. The build pipeline walks each document, splits it into semantic blocks (headings, paragraphs, code blocks, tables), and stores the result as a compressed array. The mobile app receives predictable pieces and renders them in a virtualized list.

The phone never constructs the full HTML tree. It decompresses an array and hands chunks to a recycling list. First contentful paint dropped from 800ms to under 200ms.

Build-Time Chunking

The chunker classifies HTML elements into three categories. Block content tags like h1, p, pre, and table become individual chunks. Wrapper tags like section and article get flattened; their children are processed directly. Skip tags like script and style are discarded.

The algorithm walks the DOM recursively, collecting chunks as it goes. Divs receive special handling: if a div contains block-level children, we flatten it and process those children. If it contains only inline content, we keep it as a single chunk. A depth limit of 10 prevents stack overflow on pathological nesting.
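The walk described above can be sketched in TypeScript. The real pipeline operates on a parsed HTML DOM; here a minimal node shape stands in for it, and the tag sets, function name, and node interface are illustrative rather than the actual source:

```typescript
// Minimal sketch of the build-time chunker. The HtmlNode shape and tag
// sets are assumptions standing in for the real parsed-HTML walk.
interface HtmlNode {
  tag: string;
  children: HtmlNode[];
  html: string; // outerHTML of this node
}

const BLOCK_TAGS = new Set(["h1", "h2", "h3", "p", "pre", "table", "ul", "ol", "blockquote"]);
const WRAPPER_TAGS = new Set(["section", "article", "main", "body"]);
const SKIP_TAGS = new Set(["script", "style"]);
const MAX_DEPTH = 10; // guard against pathological nesting

function collectChunks(node: HtmlNode, out: string[] = [], depth = 0): string[] {
  if (depth > MAX_DEPTH || SKIP_TAGS.has(node.tag)) return out;

  if (BLOCK_TAGS.has(node.tag)) {
    out.push(node.html); // block content becomes one chunk
  } else if (WRAPPER_TAGS.has(node.tag)) {
    for (const child of node.children) collectChunks(child, out, depth + 1);
  } else if (node.tag === "div") {
    // Flatten divs with block-level children; keep inline-only divs whole.
    const hasBlockChildren = node.children.some(
      (c) => BLOCK_TAGS.has(c.tag) || c.tag === "div",
    );
    if (hasBlockChildren) {
      for (const child of node.children) collectChunks(child, out, depth + 1);
    } else {
      out.push(node.html);
    }
  }
  return out;
}
```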

Each chunk stores three things: the tag name (for height estimation), the full HTML string, and an array of anchor IDs found within that element. The anchor IDs enable scroll-to navigation. Given any ID, we find the chunk that contains it and scroll the list to that index.

models/schema.ts

```typescript
interface HtmlChunk {
  tag: string;      // "h1", "p", "pre", "table"
  html: string;     // Full outerHTML of this block
  ids: string[];    // Anchor IDs for scroll-to navigation
}
```
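The scroll-to lookup reduces to a linear scan over the `ids` arrays. A sketch (the function name is an assumption; a Map built once per document would also work):

```typescript
// Find the list index of the chunk containing a given anchor ID.
// Returns -1 when no chunk carries the ID.
function chunkIndexForAnchor(
  chunks: { ids: string[] }[],
  anchorId: string,
): number {
  return chunks.findIndex((chunk) => chunk.ids.includes(anchorId));
}
```

The resulting index is then handed to the list's scroll-to-index call.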

The chunk array is serialized to JSON, compressed with zstd at level 5, and stored as a blob in SQLite. The raw_text_fts column holds extracted plaintext for full-text search, kept uncompressed so FTS5 can index it directly.

Runtime Loading

On device, loading a document requires one SQLite query and one native decompression call. The query fetches the compressed blob by docroot and slug. The react-native-zstd module calls the C zstd library directly; there is no JavaScript decoding overhead.
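The load path can be sketched with the database and decompressor behind small interfaces, since the exact APIs of the SQLite and react-native-zstd bindings vary. The table and column names here are assumptions:

```typescript
// Sketch of the runtime load path: one query, one decompression, one parse.
// Database and Zstd stand in for the real native bindings; the schema
// names (documents, chunks_blob) are illustrative.
interface HtmlChunk { tag: string; html: string; ids: string[] }

interface Database {
  getBlob(sql: string, params: string[]): Promise<Uint8Array | null>;
}

interface Zstd {
  decompress(blob: Uint8Array): Promise<string>;
}

async function loadDocument(
  db: Database,
  zstd: Zstd,
  docroot: string,
  slug: string,
): Promise<HtmlChunk[]> {
  const blob = await db.getBlob(
    "SELECT chunks_blob FROM documents WHERE docroot = ? AND slug = ?",
    [docroot, slug],
  );
  if (!blob) throw new Error(`Document not found: ${docroot}/${slug}`);
  const json = await zstd.decompress(blob); // native call, no JS decoding loop
  return JSON.parse(json) as HtmlChunk[];
}
```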

We cache the last 50 documents in memory using a simple Map. JavaScript Maps maintain insertion order, so eviction means deleting the first key. Concurrent requests for the same document share a single database query through an in-flight request map. Ten simultaneous navigations to the same page trigger one read, not ten.
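The 50-document cache can be sketched as a Map that exploits insertion order. The class and method names are illustrative; re-inserting on write keeps recently written keys fresh:

```typescript
// Sketch of the in-memory document cache. JavaScript Maps iterate in
// insertion order, so the oldest entry is always the first key.
class DocCache<T> {
  private cache = new Map<string, T>();

  constructor(private maxSize = 50) {}

  get(key: string): T | undefined {
    return this.cache.get(key);
  }

  set(key: string, value: T): void {
    // Delete-then-set moves an existing key to the end of iteration order.
    this.cache.delete(key);
    this.cache.set(key, value);
    if (this.cache.size > this.maxSize) {
      // Evict the oldest entry: the first key in insertion order.
      const oldest = this.cache.keys().next().value;
      if (oldest !== undefined) this.cache.delete(oldest);
    }
  }
}
```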

DocReader.ts

```typescript
const inFlight = this.inFlightRequests.get(cacheKey);
if (inFlight) return inFlight;

const requestPromise = this.fetchChunks(docroot, slug);
this.inFlightRequests.set(cacheKey, requestPromise);

try {
  return await requestPromise;
} finally {
  this.inFlightRequests.delete(cacheKey);
}
```

The finally block ensures cleanup even when the query fails. Without it, a failed request would leave a rejected promise in the map forever.

Virtualized Rendering

The chunk array feeds directly into LegendList, a virtualized list built for variable-height content. We configure it with a draw distance of 1000 pixels, meaning it renders content one screen above and below the viewport. The recycleItems flag enables aggressive view recycling during scroll.

LegendList needs height estimates to position items before they render. The chunk's tag field provides this. Headings get 40-60 pixels depending on level. Code blocks get 120. Tables get 200. Paragraphs default to 80. These are estimates, not measurements. The list measures actual heights after render and adjusts scroll position to compensate. The estimates just need to be close enough to avoid jarring jumps.

DocViewer.tsx

```typescript
const getEstimatedItemSize = useCallback((_index: number, item: HtmlChunk) => {
  switch (item.tag) {
    case 'h1': return 60;
    case 'h2': return 50;
    case 'pre': return 120;
    case 'table': return 200;
    default: return 80;
  }
}, []);
```

Each chunk renders through a memoized component with a custom equality check. The comparator looks at three props: html content, content width, and chunk index. Scroll events do not trigger re-renders; only actual content changes or find-in-page highlights cause updates. Instead of one bloated HTML renderer per page, we render a LegendList of small, independent HTML renderers, each parsing and rendering as little as possible, only when necessary, to keep the app responsive.
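The comparator is an ordinary pure function passed as the second argument to React.memo. A sketch, with the prop names as assumptions:

```typescript
// Custom equality check for the chunk renderer. Returning true tells
// React.memo to skip the re-render. Prop names are illustrative.
interface ChunkProps {
  html: string;
  contentWidth: number;
  index: number;
}

function areChunkPropsEqual(prev: ChunkProps, next: ChunkProps): boolean {
  return (
    prev.html === next.html &&
    prev.contentWidth === next.contentWidth &&
    prev.index === next.index
  );
}

// Usage: const ChunkRenderer = memo(ChunkRendererImpl, areChunkPropsEqual);
```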

Tip

Fine-grained Zustand selectors return primitives, not objects. Returning an object from a selector creates a new reference on every call, which defeats memoization and causes infinite re-render loops.
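The reference-identity trap is plain JavaScript, independent of any store library. A demonstration with a stand-in state shape:

```typescript
// Why object-returning selectors defeat memoization: every call builds a
// fresh object, so a strict-equality check always sees a "change".
// The DocState shape is a stand-in for the real store.
interface DocState { highlightIndex: number; query: string }

const state: DocState = { highlightIndex: 3, query: "map" };

// Bad: new object reference on every call.
const selectBoth = (s: DocState) => ({ index: s.highlightIndex, query: s.query });

// Good: primitives compare stably with ===.
const selectIndex = (s: DocState) => s.highlightIndex;
```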

Results

On a Pixel 4a running Android 13:

| Metric | Before | After |
| --- | --- | --- |
| First contentful paint | 800-1200ms | 150-200ms |
| Scroll frame rate | 15-30fps | 58-60fps |
| Memory per document | 8-15MB | 2-4MB |

iOS numbers are better. The native zstd binding runs faster, and UIKit's text rendering outperforms Android's.

The Pipeline

Build time: HTML enters the cleaning pipeline, gets chunked into semantic blocks, serialized to JSON, compressed with zstd, and written to a SQLite blob.

Runtime: SQLite query fetches the blob, native zstd decompresses it, JSON.parse returns the array, and LegendList renders visible chunks while recycling off-screen views.

The phone never parses the full HTML. It never builds a DOM tree. It never holds an entire document in memory. Parse once at build time, decompress and render at runtime. That is the entire trick.

What This Unlocks

The architecture handles any large-document use case: offline API references, ebook readers, technical wikis. The chunk format could encode richer metadata for smarter height prediction. Progressive streaming could load documents with hundreds of sections in stages.

This is the rendering engine behind DocNative. The app ships with docsets including MDN JavaScript, Python, React, TypeScript, and Go, all processed through this chunking pipeline and rendered through LegendList virtualization. You can read the Array.prototype.map reference on a flight or dig through Go concurrency patterns on the subway. The pages load in under 50ms and scroll at 60fps because the phone only parses what it needs to. It decompresses an array and renders what fits on screen. Everything else waits in SQLite until you scroll to it.

Read Docs Anywhere

Download DocNative and access documentation offline on iOS and Android.

Download iOS