The Backend a Mobile App Needs
A mobile documentation app needs more backend than you might expect. Users download docsets, search across them, verify subscriptions, and hit rate limits. The server must respond quickly from anywhere in the world.
The conventional approach involves AWS Lambda for compute, S3 for storage, RDS or DynamoDB for the database, ElastiCache for caching, Pinecone for vector search, and OpenAI for embeddings. These services connect through IAM roles and VPC configurations. Each has its own billing console, its own failure modes, and its own SDK. A Terraform state file grows by the month.
We went a different direction. The DocNative API runs on Cloudflare Workers with D1 for the database, R2 for object storage, Vectorize for semantic search, Workers AI for embeddings, KV for caching, Durable Objects for rate limiting, and Pages for the marketing site. Eight services under one account.
Beyond the Services
Cloudflare provides infrastructure that other providers charge extra for or leave out entirely.
Domain registration runs at cost. The docnative.app domain costs $9.15 per year with WHOIS privacy included. The global edge network spans 300+ points of presence, routing requests to the nearest data center automatically. SSL certificates generate and renew without intervention. DDoS protection runs at layers 3, 4, and 7 without configuration.
Observability comes from a single line in the config file. Logs, metrics, and traces appear in the dashboard without a Datadog contract or agent installation.
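For Workers, that line lives in wrangler.toml. A minimal sketch, assuming the current Workers Logs syntax:

[observability]
enabled = true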
These capabilities ship with the account. They are not premium add-ons.
The Stack
Eight services handle different parts of the backend.
| Service | Purpose |
|---|---|
| Workers | API backend using the Hono framework |
| D1 | SQLite database for docset metadata and manifests |
| R2 | Object storage for compressed documentation bundles |
| Vectorize | Vector database for semantic search |
| Workers AI | Embedding generation |
| Durable Objects | Per-user rate limiting with SQLite state |
| KV | Short-lived cache for subscription verification |
| Pages | Static Nuxt marketing site |
The list looks complex. In practice, it simplifies operations compared to maintaining separate AWS services with network hops between them.
Service Bindings
Service bindings make this architecture practical.
Traditional microservices communicate over HTTP. Service A calls Service B's endpoint, waits for a response, and handles timeouts and retries. Each hop adds latency and introduces failure modes.
Cloudflare service bindings skip the network entirely. A Worker accesses D1, R2, KV, and Vectorize through typed bindings injected at runtime. There are no credentials to rotate, no endpoints to configure, and no network round trips within the data center.
// Bindings the Workers runtime injects at request time.
interface Env {
  DB: D1Database;                        // D1: docset metadata and manifests
  AI: Ai;                                // Workers AI: embedding generation
  RATE_LIMITER: DurableObjectNamespace;  // Durable Objects: per-user rate limits
  VECTORIZE: VectorizeIndex;             // Vectorize: semantic search index
  REVENUECAT_CACHE: KVNamespace;         // KV: subscription verification cache
}
The DB binding is a D1 database. Calling env.DB.prepare(sql).all() returns results directly. The VECTORIZE binding is a vector index. Calling env.VECTORIZE.query(vector, { topK: 10 }) returns matches. No SDK initialization, no connection pooling.
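A route handler shows how little ceremony is involved. A sketch using Hono, with the route and table name as illustrative stand-ins rather than the shipped code:

import { Hono } from 'hono';

const app = new Hono<{ Bindings: Env }>();

// Read docset metadata straight from the bound D1 database:
// no client setup, no credentials, no connection management.
app.get('/docsets', async (c) => {
  const { results } = await c.env.DB.prepare(
    'SELECT slug, title, version FROM docsets'
  ).all();
  return c.json(results);
});

export default app;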
The wrangler.toml file declares all bindings in one place. Running wrangler deploy resolves them automatically.
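A trimmed sketch of those declarations, with names and IDs as placeholders (SQLite-backed Durable Objects also need a migrations entry, omitted here):

name = "docnative-api"
main = "src/index.ts"
compatibility_date = "2025-01-01"

[[d1_databases]]
binding = "DB"
database_name = "docnative"
database_id = "..."

[ai]
binding = "AI"

[[durable_objects.bindings]]
name = "RATE_LIMITER"
class_name = "RateLimiter"

[[vectorize]]
binding = "VECTORIZE"
index_name = "docs"

[[kv_namespaces]]
binding = "REVENUECAT_CACHE"
id = "..."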
Rate Limiting with Durable Objects
Users get daily download limits. Free tier allows 3 downloads per day. Pro tier allows 100. The rate limiter needs strong consistency. When a user hits the limit, the next request must fail immediately.
KV would not work here because it is eventually consistent. Two requests arriving at the same time might both succeed before the counter updates. D1 could handle it, but concurrent writes from multiple Workers risk contention.
Durable Objects solve this problem cleanly. Each instance runs on a single thread with guaranteed sequential execution. No distributed locking is required. SQLite comes built in.
this.sql.exec(`
INSERT INTO rate_limits (date, count) VALUES (?, 1)
ON CONFLICT(date) DO UPDATE SET count = count + 1
`, today);
One SQL statement handles both first access and subsequent increments. The ON CONFLICT clause makes it atomic. The Durable Object guarantees that no two requests execute this statement simultaneously for the same user.
The pattern assigns one Durable Object per user-action combination. The Worker computes a deterministic ID from the user and action type, retrieves the corresponding stub, and calls its fetch method. The Durable Object checks the count, increments if allowed, and returns the result.
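On the Worker side the pattern is a few lines. A sketch, with the ID scheme and route path as illustrative choices:

// Derive a stable ID so the same user+action pair always reaches the same instance.
const id = env.RATE_LIMITER.idFromName(`${userId}:download`);
const stub = env.RATE_LIMITER.get(id);

// The Durable Object runs the atomic upsert shown above and reports the outcome.
const res = await stub.fetch('https://rate-limiter/increment');
if (res.status === 429) {
  return c.json({ error: 'Daily download limit reached' }, 429);
}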
Counters reset at UTC midnight. The date column serves as the primary key. Old dates accumulate until a cleanup job removes them.
Presigned URLs for Mobile Downloads
Documentation bundles live in R2. Users download them directly rather than through the API. Proxying multi-megabyte files through a Worker would add latency and cost.
R2 supports S3-compatible presigned URLs. The API generates a signed URL valid for one hour, and the mobile app downloads directly from Cloudflare's edge.
import { AwsClient } from 'aws4fetch';

// Credentials come from an R2 API token with read access to the bucket.
const client = new AwsClient({
  accessKeyId: env.R2_ACCESS_KEY,
  secretAccessKey: env.R2_SECRET_KEY,
  service: 's3',
  region: 'auto',
});

// X-Amz-Expires caps the signed URL's lifetime at one hour.
const r2Url = `https://${accountId}.r2.cloudflarestorage.com/${bucket}/${key}?X-Amz-Expires=3600`;
const signed = await client.sign(
  new Request(r2Url, { method: 'GET' }),
  { aws: { signQuery: true } }
);
// signed.url is the standalone presigned URL handed back to the client.
The aws4fetch library handles SigV4 signing. Setting signQuery: true puts the signature in query parameters rather than headers, producing a standalone URL that the client can fetch without additional authentication.
The mobile app calls /download/{slug}, receives a presigned URL, and downloads directly from R2. The API never touches the file bytes.
Vector Search
Documentation search should understand intent. A query for "iterate array" should return Array.prototype.forEach even when the page title contains neither word.
Cloudflare Vectorize stores embeddings and Workers AI generates them. Both run on Cloudflare infrastructure.
The ingestion pipeline chunks documentation into paragraphs, generates embeddings with Google's EmbeddingGEMMA 300M model, and upserts vectors to the index. Each vector carries metadata including the docset, page slug, and section title.
// Generate embeddings for a batch of chunk texts in one call.
const response = await env.AI.run('@cf/google/embeddinggemma-300m', {
  text: texts,
});
const vectors = response.data; // embedding models return one vector per input under `data`

// Upsert one vector per chunk, carrying enough metadata to assemble results.
await env.VECTORIZE.upsert(
  vectors.map((embedding, i) => ({
    id: chunks[i].id,
    values: embedding,
    metadata: { docroot, slug: chunks[i].slug, title: chunks[i].title },
  }))
);
Search queries follow the same path: embed the query, search the index, fetch snippets from D1, return results. The entire flow typically completes in under 100ms.
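Roughly, with placeholder table and column names:

// Embed the user's query with the same model used at ingestion time.
const q = await env.AI.run('@cf/google/embeddinggemma-300m', { text: [query] });

// Find the nearest chunks, then hydrate display snippets from D1.
const { matches } = await env.VECTORIZE.query(q.data[0], { topK: 10 });
const ids = matches.map((m) => m.id);
const { results } = await env.DB.prepare(
  `SELECT id, slug, title, snippet FROM chunks WHERE id IN (${ids.map(() => '?').join(',')})`
).bind(...ids).all();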
Change detection prevents redundant work. Each chunk stores a content hash. During re-ingestion, unchanged chunks skip the embedding step. Only new or modified content hits Workers AI.
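The hash itself can come from the Web Crypto API that Workers expose. A sketch from inside the per-chunk ingestion loop, with the stored column name as a placeholder:

// Hash the chunk text; skip the Workers AI call when the stored hash matches.
const bytes = new TextEncoder().encode(chunk.text);
const digest = await crypto.subtle.digest('SHA-256', bytes);
const hash = [...new Uint8Array(digest)]
  .map((b) => b.toString(16).padStart(2, '0'))
  .join('');
if (hash === stored?.content_hash) continue; // unchanged: no embedding needed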
Caching Subscriptions with KV
The API verifies subscriptions through RevenueCat. Hitting RevenueCat's API on every request would add latency and risk rate limiting.
KV caches entitlements for five minutes. The cache key is the customer ID, and the value indicates whether the user has an active subscription.
The five-minute TTL balances freshness against API load. A user who cancels waits at most five minutes before losing access. A user who subscribes gains access immediately because the subscription call invalidates the cache.
Cache invalidation happens on writes. Granting a trial, upgrading to pro, or revoking access each delete the cached entry. The next read fetches fresh data from RevenueCat.
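Sketched out, with the value format and the RevenueCat helper as stand-ins:

// Read path: serve from KV when a fresh entry exists.
const cached = await env.REVENUECAT_CACHE.get(customerId);
if (cached !== null) return cached === 'active';

// Cache miss: ask RevenueCat, then cache the answer for five minutes.
const active = await fetchEntitlement(customerId); // hypothetical helper around RevenueCat's API
await env.REVENUECAT_CACHE.put(customerId, active ? 'active' : 'inactive', {
  expirationTtl: 300,
});
return active;

// Write path: entitlement changes delete the entry so the next read refetches.
await env.REVENUECAT_CACHE.delete(customerId);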
The Data Flow
Documentation moves through the system in stages.
A TypeScript pipeline scrapes source documentation, cleans the HTML, chunks it into blocks, compresses the result with gzip, and uploads to R2. Each docset becomes a .db.gz file containing a SQLite database.
The /ingest endpoint receives a manifest listing all chunks for a docset. It generates embeddings for new or changed chunks, upserts vectors to Vectorize, and writes metadata to D1.
The mobile app fetches the manifest from /manifest, downloads bundles using presigned R2 URLs, and stores them locally. Search queries hit the API, which queries Vectorize and returns matches with snippets from D1.
Updates follow the same path. New documentation versions upload to R2, trigger re-ingestion, and appear in the manifest. The mobile app checks for updates periodically and downloads changed bundles.
The Tradeoffs
This architecture trades some flexibility for operational simplicity.
D1 is SQLite. There are no stored procedures, no advanced query planner, no PostGIS. But the access pattern fits SQLite well: the build pipeline writes docset metadata, and users only read it. No concurrent writes and no complex transactions mean Postgres would be overkill.
Service bindings only work on Cloudflare. Moving to AWS would require rewriting the data access layer. We accept this tradeoff because the operational simplicity is worth it.
Cloudflare's developer tools are younger than AWS's. Wrangler occasionally surprises us. D1 migrations require manual SQL files. The ecosystem improves monthly, but gaps remain.
Secrets management is still maturing. Cloudflare Secrets is in beta. We use Doppler to manage secrets and sync them to Workers through a deployment script. When Secrets reaches general availability, we will consolidate.
The gains justify the costs. Workers have no cold starts, so every request responds quickly. Service bindings eliminate network hops between services. Edge execution provides consistent latency worldwide. A single bill simplifies vendor management.
Deployment
Deploying the API takes one command: wrangler deploy. The wrangler.toml file declares every binding, route, and compatibility flag. Cloudflare provisions resources automatically.
Secrets sync from Doppler before each deployment. A justfile recipe filters and transforms environment variables, then pipes them to wrangler secret bulk. The API picks up new secrets on the next deploy.
D1 migrations run separately. Each migration is a numbered SQL file executed in order against the remote database.
The marketing site deploys through Cloudflare Pages. Pushing to main triggers a build of the Nuxt static site and deploys it to the edge. Pull requests generate preview deployments automatically.
What This Enables
DocNative ships documentation for MDN, Python, React, Swift, TypeScript, and Go. The API serves manifest requests, generates presigned download URLs, and handles semantic search. Rate limiting prevents abuse. Subscription verification gates premium features.
Everything runs on Cloudflare. One account manages domains, DNS, compute, storage, databases, vector search, and static hosting. The marketing site serves from the same edge network as the API. Documentation downloads stream from R2 buckets in the same account.
When you read the Array.prototype.map documentation offline or search through SwiftUI references on the subway, the content comes from a bundle that downloaded once and works forever. Behind that experience, eight Cloudflare services coordinate to keep everything fast and reliable. The phone never waits for a cold start. The API never routes through a distant region. The download never proxies through a Worker.
The infrastructure stays out of the way.
We're excited to announce that in the coming weeks, we'll be shipping offline documentation for all Cloudflare services—including Workers, D1, R2, Vectorize, Workers AI, Durable Objects, KV, and Pages. Read Cloudflare's developer docs offline, anywhere.