
Hono in Production: Patterns from a 20-Route API on Cloudflare Workers

Jan 27, 2026 | 17 min read
Tags: hono, cloudflare, typescript, api

Beyond Hello World

The DocNative API handles documentation downloads, semantic search, subscription verification, and rate limiting for a production mobile app. It runs on Cloudflare Workers, uses Hono as its framework, and touches six different Cloudflare services.

Most Hono content shows a three-line server and stops there. This post shows what happens when you scale that to 20+ routes, integrate with D1, R2, KV, Durable Objects, Vectorize, and Workers AI, and ship to production.

The patterns that emerge are not in the documentation. They come from shipping.

Why Hono

Express dominated Node.js APIs for a decade. It still works. But Express was designed for a different era: long-running Node processes, callback-based middleware, and loose typing bolted on after the fact.

Hono was built for the edge from the start. It uses Web Standard APIs (Request, Response, fetch). It has no Node dependencies. It runs on Workers, Deno, Bun, and Node without modification. TypeScript is first-class, not an afterthought.

The middleware typing is the real differentiator. Express middleware signatures are loose. You extend req with arbitrary properties and hope downstream handlers know what to expect. Hono's context object is fully typed. When middleware sets a value, handlers downstream see the correct type with no casting.

tRPC was the other contender. It excels at end-to-end TypeScript applications where the client and server share types. But DocNative has a React Native mobile client and a Vue admin panel. We needed standard REST endpoints with OpenAPI documentation, not RPC procedures. Hono with @hono/zod-openapi gave us type safety without locking us into a specific client.

OpenAPI-First Development

Routes in this codebase are not handlers with documentation sprinkled on top. The OpenAPI schema is the source of truth. Handlers implement the schema.

Each route starts with createRoute():

typescript: routes/manifest.ts
const manifestRoute = createRoute({
  method: 'get',
  path: '/',
  operationId: 'getManifest',
  description: 'Get all docsets with metadata. Requires X-Customer-ID header.',
  request: {
    headers: z.object({
      'x-customer-id': CustomerIdHeader
    })
  },
  responses: {
    200: {
      content: { 'application/json': { schema: ManifestSchema } },
      description: 'Manifest with all docsets'
    },
    500: {
      content: { 'application/json': { schema: ErrorSchema } },
      description: 'Database error'
    }
  },
  tags: ['Manifest']
});

The Zod schemas define validation rules and OpenAPI examples in one place. ManifestSchema is not just a type. It validates incoming data, documents the response shape, and generates example values for the API docs.

The handler registers with app.openapi():

typescript: routes/manifest.ts
app.openapi(manifestRoute, async (c) => {
  const auth = c.get('auth');
  const docsets = await c.env.DB.prepare('SELECT * FROM docsets WHERE visible = 1').all();

  return c.json({
    database_version: version,
    docsets: docsets.results,
    updated_at: timestamp
  }, 200);
});

c.req.valid('query'), c.req.valid('param'), and c.req.valid('header') return typed data. The validation happens before the handler runs. If it fails, the client gets a structured error response. No manual parsing, no try-catch around JSON.parse.
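The ordering is easier to see in isolation. Below is a minimal, dependency-free sketch of the validate-then-handle flow. It is not how @hono/zod-openapi is implemented: `validateHeaders`, `handleManifest`, the error shape, and the "non-empty string" rule for the customer ID are all assumptions made for illustration.

```typescript
// Simplified model: validation runs first, and the handler only ever sees typed data.
// All names here are illustrative and not part of Hono or Zod.
type ValidationResult<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: string; field: string; message: string } };

interface ManifestHeaders {
  customerId: string;
}

function validateHeaders(
  headers: Record<string, string | undefined>
): ValidationResult<ManifestHeaders> {
  const value = headers['x-customer-id'];
  if (value === undefined || value.trim() === '') {
    return {
      ok: false,
      error: { code: 'VALIDATION_ERROR', field: 'x-customer-id', message: 'Header is required' }
    };
  }
  return { ok: true, data: { customerId: value } };
}

interface HttpResult {
  status: number;
  body: unknown;
}

function handleManifest(headers: Record<string, string | undefined>): HttpResult {
  const result = validateHeaders(headers);
  if (!result.ok) {
    // Invalid input short-circuits here; the handler logic below never runs.
    return { status: 400, body: result.error };
  }
  // result.data is typed as ManifestHeaders: no casting, no manual parsing.
  return { status: 200, body: { customer: result.data.customerId } };
}
```

The discriminated union does the same job as the typed return value of c.req.valid(): once the check passes, the compiler knows the data is well-formed.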

The Schema-to-Types Pipeline

Zod schemas define the API contract. A build script extracts the OpenAPI document and generates TypeScript types for the admin panel.

The script imports every route module, mounts them on a minimal OpenAPIHono instance, and calls getOpenAPIDocument():

typescript: scripts/export-openapi.ts
const app = new OpenAPIHono();
app.route('/manifest', manifestRoutes);
app.route('/download', downloadRoutes);
app.route('/search', searchRoutes);
// ... 17 more routes

const openAPIDocument = app.getOpenAPIDocument(openAPIConfig);
writeFileSync('openapi.json', JSON.stringify(openAPIDocument, null, 2));

const ast = await openapiTS(openAPIDocument);
writeFileSync('../docnative-admin/app/types/api.d.ts', astToString(ast));
writeFileSync('../src/types/api.d.ts', astToString(ast));

The output is a 130KB TypeScript declaration file. The app and admin panel import it and get fully typed API responses:

typescript: admin/api-client.ts
import type { components } from './types/api.d.ts';
type Manifest = components['schemas']['Manifest'];

One source of truth. Zod schema changes flow to OpenAPI JSON, then to TypeScript types. The mobile client uses the OpenAPI spec for code generation. The admin panel uses the generated types directly. Nobody writes API types by hand.
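A consumer of the generated file might look like the sketch below. The inline `components` interface is a hand-written stand-in for the generated declaration file, the field types are assumptions based on the manifest handler, and `docsetSlugs` is a hypothetical helper.

```typescript
// Stand-in for the generated api.d.ts; the real file comes from the build script.
// Field names follow the manifest handler; their types are assumptions.
interface components {
  schemas: {
    Docset: { slug: string; visible: number };
    Manifest: {
      database_version: string;
      docsets: components['schemas']['Docset'][];
      updated_at: string;
    };
  };
}

type Manifest = components['schemas']['Manifest'];
type Docset = components['schemas']['Docset'];

// Hypothetical helper: every property access is checked against the schema,
// so renaming a field in the Zod schema breaks this at compile time.
function docsetSlugs(manifest: Manifest): string[] {
  return manifest.docsets.map((docset: Docset) => docset.slug);
}
```

If the schema renames `slug`, the next type generation turns this function into a compile error instead of a runtime bug.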

Edge Validation with API Shield

The same OpenAPI schema that generates types also validates requests at the edge before the Worker runs.

Cloudflare API Shield accepts an OpenAPI specification and rejects malformed requests at the edge. A request with an invalid path, wrong method, or missing required parameter never reaches the Worker. The validation happens in Cloudflare's network, not in your code.

Upload the schema via wrangler:

bash: terminal
wrangler api-gateway upload openapi.json

Or through the dashboard under Security > API Shield > Schema Validation. Once uploaded, Cloudflare compares every incoming request against the schema. Requests that violate the contract get a 400 response before consuming Worker CPU time.

This matters for a few reasons. Malformed requests from bots, scanners, and misconfigured clients hit the API constantly. Each one spins up a Worker instance, runs through middleware, hits validation, and returns an error. That costs money and adds noise to logs. Edge validation eliminates the entire category.

The schema also documents what endpoints exist. Requests to undefined paths get rejected. A scanner probing for /admin, /wp-admin, or /.env never reaches the Worker. The attack surface shrinks to exactly what the API exposes.
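Conceptually, this is an allowlist lookup against the paths in the OpenAPI document. The sketch below models that idea only. It is not Cloudflare's implementation, `isDeclared` is a hypothetical name, and the sample paths (including `/download/{slug}` and the methods) are assumptions.

```typescript
// Conceptual model of an OpenAPI path allowlist; not Cloudflare's implementation.
type PathsObject = Record<string, Record<string, unknown>>;

// Convert an OpenAPI path template such as /download/{slug} into a RegExp.
function templateToRegExp(template: string): RegExp {
  const pattern = template
    .split('/')
    .map((segment) =>
      /^\{[^}]+\}$/.test(segment)
        ? '[^/]+' // a path parameter matches exactly one segment
        : segment.replace(/[.*+?^${}()|[\]\\]/g, '\\$&') // escape literal segments
    )
    .join('/');
  return new RegExp(`^${pattern}$`);
}

// True only when both the path and the method are declared in the schema.
function isDeclared(paths: PathsObject, method: string, path: string): boolean {
  const verb = method.toLowerCase();
  return Object.keys(paths).some(
    (template) =>
      Object.prototype.hasOwnProperty.call(paths[template], verb) &&
      templateToRegExp(template).test(path)
  );
}

// Hypothetical excerpt of the paths section of openapi.json
const paths: PathsObject = {
  '/manifest': { get: {} },
  '/download/{slug}': { get: {} },
  '/search': { post: {} }
};
```

With this excerpt, `isDeclared(paths, 'GET', '/download/mdn')` is true, while a probe for `/wp-admin` or a POST to `/manifest` is false.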

The workflow stays simple. Change a route, regenerate the schema, upload it. The same openapi.json that generates TypeScript types also configures edge validation. One artifact, three purposes: documentation, type generation, and security.

Type-Safe Context Variables

Hono's context object carries request data, environment bindings, and custom variables set by middleware. The typing for custom variables uses module augmentation:

typescript: middleware/auth.ts
declare module 'hono' {
  interface ContextVariableMap {
    auth: AuthInfo;
  }
}

interface AuthInfo {
  customerId: string;
  tier: 'free' | 'pro';
}

This declaration extends Hono's built-in types. Now c.get('auth') returns AuthInfo without casting. c.set('auth', value) requires the value to match the interface.
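The mechanism behind this is a lookup from key to value type. Here is a simplified, dependency-free model of it. `TypedContext` and the `requestId` variable are hypothetical; Hono's real Context does considerably more.

```typescript
// Simplified model of a typed variable map; not Hono's actual Context class.
interface AuthInfo {
  customerId: string;
  tier: 'free' | 'pro';
}

interface VariableMap {
  auth: AuthInfo;
  requestId: string; // hypothetical second variable
}

class TypedContext {
  private readonly values = new Map<keyof VariableMap, unknown>();

  // The value type is looked up from the key, so mismatches fail at compile time.
  set<K extends keyof VariableMap>(key: K, value: VariableMap[K]): void {
    this.values.set(key, value);
  }

  // The cast is safe because set() is the only way values enter the map.
  get<K extends keyof VariableMap>(key: K): VariableMap[K] {
    return this.values.get(key) as VariableMap[K];
  }
}

const ctx = new TypedContext();
ctx.set('auth', { customerId: 'cust_1', tier: 'pro' });
// ctx.set('auth', { customerId: 'cust_1', tier: 'gold' }); // compile error: invalid tier

const tier = ctx.get('auth').tier; // inferred as 'free' | 'pro'
```

Module augmentation plays the role of `VariableMap` here: each middleware file adds its own keys to one shared interface.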

The auth middleware sets the variable:

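typescript: middleware/auth.ts
import type { MiddlewareHandler } from 'hono';
import { ApiError } from '../lib/errors';

export function auth(): MiddlewareHandler<{ Bindings: Env }> {
  return async (c, next) => {
    // c.req.header() returns string | undefined, so reject requests without it.
    // The 401 status and error code are assumptions.
    const customerId = c.req.header('X-Customer-ID');
    if (!customerId) {
      throw new ApiError('Missing X-Customer-ID header', 401, 'UNAUTHORIZED');
    }

    const entitlements = await getEntitlements(c.env, customerId);

    c.set('auth', { customerId, tier: entitlements.isPro ? 'pro' : 'free' });
    return next();
  };
}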

Route handlers access it with full type inference. The compiler knows auth.tier is 'free' | 'pro', not string. No runtime type guards needed.

Middleware Composition

Global middleware applies to all routes. Path-specific middleware applies selectively. Order matters.

typescript: index.ts
// Global middleware
app.use('*', errorHandler());
app.use('*', secureHeaders());
app.use('*', cors({ origin: ['https://docnative.app', 'https://admin.example.com'] }));

// Public routes (no auth)
app.route('/', publicRoutes);
app.route('/contact', contactRoutes);

// Authenticated routes
app.use('/manifest', auth());
app.use('/download/*', auth());
app.use('/download/*', rateLimit('download'));
app.use('/search', auth());
app.use('/search', requirePro());
app.use('/search', rateLimit('ai_search'));

// Mount route handlers after middleware
app.route('/manifest', manifestRoutes);
app.route('/download', downloadRoutes);
app.route('/search', searchRoutes);

Middleware dependencies are implicit in the order. requirePro() reads from c.get('auth'), so it must run after auth(). The rate limiter needs the customer ID, so it runs after authentication.
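A toy dispatcher makes the ordering dependency visible. This sketch is synchronous for brevity and is not Hono's implementation; `run`, `Ctx`, and the hard-coded customer are hypothetical.

```typescript
// Toy onion-style dispatcher (synchronous for brevity); not Hono's implementation.
interface Ctx {
  vars: { auth?: { customerId: string; tier: 'free' | 'pro' } };
  log: string[];
}
type Middleware = (ctx: Ctx, next: () => void) => void;

function run(ctx: Ctx, stack: Middleware[]): void {
  const dispatch = (index: number): void => {
    const middleware = stack[index];
    if (middleware) middleware(ctx, () => dispatch(index + 1));
  };
  dispatch(0);
}

const auth: Middleware = (ctx, next) => {
  ctx.vars.auth = { customerId: 'cust_1', tier: 'pro' };
  ctx.log.push('auth');
  next();
};

const requirePro: Middleware = (ctx, next) => {
  // Depends on auth having run earlier in the stack.
  if (ctx.vars.auth?.tier !== 'pro') {
    ctx.log.push('rejected');
    return; // short-circuit: later middleware never runs
  }
  ctx.log.push('requirePro');
  next();
};

const correct: Ctx = { vars: {}, log: [] };
run(correct, [auth, requirePro]); // log: auth, requirePro

const reversed: Ctx = { vars: {}, log: [] };
run(reversed, [requirePro, auth]); // log: rejected
```

In the reversed stack a paying customer is rejected, because requirePro runs before anything has set the auth variable. Nothing in the types catches this; only registration order does.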

Each route file exports its own OpenAPIHono instance. The main app mounts them with .route(). This keeps route files small and independently testable.

Error Handling Architecture

Custom error classes encode their HTTP contract:

typescript: lib/errors.ts
export class ApiError extends Error {
  constructor(
    message: string,
    public readonly statusCode: number,
    public readonly code?: string,
    public readonly detail?: string
  ) {
    super(message);
  }

  toJSON() {
    return {
      error: this.message,
      ...(this.code && { code: this.code }),
      ...(this.detail && { detail: this.detail })
    };
  }
}

export class RateLimitError extends ApiError {
  constructor(public readonly limit: number, public readonly current: number) {
    super('Rate limit exceeded', 429, 'RATE_LIMIT_EXCEEDED');
  }

  override toJSON() {
    return {
      error: this.message,
      code: this.code,
      limit: this.limit,
      current: this.current,
      remaining: 0,
      retry_after: SECONDS_UNTIL_RESET
    };
  }
}

A global error handler catches everything:

typescript: middleware/error-handler.ts
export function errorHandler(): MiddlewareHandler<{ Bindings: Env }> {
  return async (c, next) => {
    try {
      await next();
    } catch (error) {
      if (error instanceof ApiError) {
        if (error.statusCode >= 500) {
          captureException(error);
          c.executionCtx.waitUntil(flushSentry());
        }
        return c.json(error.toJSON(), error.statusCode);
      }

      logger.error('Unhandled error:', error);
      return c.json({ error: 'INTERNAL_ERROR' }, 500);
    }
  };
}

The Sentry call uses waitUntil() to avoid blocking the response. Error reporting happens after the client receives their error. 4xx errors are client problems; they get logged but not reported to Sentry. 5xx errors go to Sentry for investigation.
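The mapping from error to response can be tested without a framework. The sketch below uses a condensed copy of ApiError; `NotFoundError` and `toHttpError` are hypothetical names added for illustration.

```typescript
// Condensed copy of ApiError, plus a hypothetical mapping helper.
class ApiError extends Error {
  constructor(
    message: string,
    public readonly statusCode: number,
    public readonly code?: string
  ) {
    super(message);
  }

  toJSON(): Record<string, unknown> {
    return this.code ? { error: this.message, code: this.code } : { error: this.message };
  }
}

// Hypothetical subclass: the HTTP contract lives in the class, not at the throw site.
class NotFoundError extends ApiError {
  constructor(resource: string) {
    super(`${resource} not found`, 404, 'NOT_FOUND');
  }
}

interface HttpError {
  status: number;
  body: Record<string, unknown>;
}

function toHttpError(error: unknown): HttpError {
  if (error instanceof ApiError) {
    return { status: error.statusCode, body: error.toJSON() };
  }
  // Unknown errors never leak their message to the client.
  return { status: 500, body: { error: 'INTERNAL_ERROR' } };
}
```

Route code then throws `new NotFoundError('Docset')` and never mentions a status code.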

Cloudflare Service Bindings

Workers access Cloudflare services through typed bindings. No SDK initialization, no connection pooling, no credentials in environment variables (except for R2 presigning).

typescript: types/env.ts
export interface Env {
  DB: D1Database;
  AI: Ai;
  RATE_LIMITER: DurableObjectNamespace;
  VECTORIZE: VectorizeIndex;
  REVENUECAT_CACHE: KVNamespace;

  R2_ACCESS_KEY: string;
  R2_SECRET_KEY: string;
  ACCOUNT_ID: string;
}

D1 queries look like prepared statements. The binding handles connection management:

typescript: routes/download.ts
const result = await c.env.DB.prepare('SELECT * FROM docsets WHERE slug = ?')
  .bind(slug)
  .first<DbDocset>();

KV caches RevenueCat entitlements with a short TTL. The cache key is the customer ID:

typescript: services/revenuecat.ts
const cached = await env.REVENUECAT_CACHE.get(cacheKey, 'json');
if (cached) return cached;

const fresh = await fetchFromRevenueCat(customerId);
await env.REVENUECAT_CACHE.put(cacheKey, JSON.stringify(fresh), {
  expirationTtl: CACHE_TTL_SECONDS
});
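The same read-through pattern can be wrapped in a generic helper and exercised against an in-memory store. This is a sketch under assumptions: `KvLike` covers only the two calls used above, `MemoryKv` is a test double rather than Workers KV, and `cached` is a hypothetical name.

```typescript
// Minimal subset of a KV-like store; MemoryKv is a test double, not Workers KV.
interface KvLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options: { expirationTtl: number }): Promise<void>;
}

class MemoryKv implements KvLike {
  private readonly entries = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= Date.now()) return null;
    return entry.value;
  }

  async put(key: string, value: string, options: { expirationTtl: number }): Promise<void> {
    this.entries.set(key, { value, expiresAt: Date.now() + options.expirationTtl * 1000 });
  }
}

// Hypothetical generic wrapper: return the cached value, or load and store it.
async function cached<T>(
  kv: KvLike,
  key: string,
  ttlSeconds: number,
  load: () => Promise<T>
): Promise<T> {
  const hit = await kv.get(key);
  if (hit !== null) return JSON.parse(hit) as T;

  const fresh = await load();
  await kv.put(key, JSON.stringify(fresh), { expirationTtl: ttlSeconds });
  return fresh;
}
```

Two lookups for the same customer within the TTL call the loader once. A short TTL bounds how long a cancelled subscription keeps its Pro access.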

Vectorize stores embeddings for semantic search. Workers AI generates them:

typescript: services/vectorize.ts
const embedding = await env.AI.run('@cf/google/embeddinggemma-300m', { text: query });
const results = await env.VECTORIZE.query(embedding.data[0], { topK: 10 });
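Conceptually, a vector query ranks stored embeddings by similarity to the query embedding. The brute-force sketch below illustrates that idea with cosine similarity. It says nothing about how Vectorize works internally (a production index would not scan every vector), and the IDs and three-dimensional vectors are made up.

```typescript
// Brute-force nearest-neighbour search; illustrative only.
interface StoredVector {
  id: string;
  values: number[];
}

// Assumes both vectors have the same length and are non-zero.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function topK(
  query: number[],
  vectors: StoredVector[],
  k: number
): { id: string; score: number }[] {
  return vectors
    .map((vector) => ({ id: vector.id, score: cosineSimilarity(query, vector.values) }))
    .sort((left, right) => right.score - left.score) // highest similarity first
    .slice(0, k);
}

const index: StoredVector[] = [
  { id: 'array-map', values: [1, 0, 0] },
  { id: 'css-grid', values: [0, 1, 0] },
  { id: 'array-filter', values: [0.9, 0.1, 0] }
];
const matches = topK([1, 0, 0], index, 2);
```

Here `matches` contains array-map and array-filter, in that order. The unrelated css-grid vector is orthogonal to the query and drops out.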

Rate Limiting with Durable Objects

KV is eventually consistent. Two requests hitting different edge locations might both read a counter as 0, increment it, and write back 1. Users could exceed their limits.

Durable Objects provide strong consistency. Each instance runs on a single thread with sequential execution. No distributed locking required.

The rate limiter uses SQLite storage inside the Durable Object:

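typescript: durable-objects/rate-limiter.ts
import { DurableObject } from 'cloudflare:workers';

export class RateLimiter extends DurableObject<Env> {
  private sql: SqlStorage;

  constructor(ctx: DurableObjectState, env: Env) {
    // The base class constructor takes both the state and the environment
    super(ctx, env);
    this.sql = ctx.storage.sql;
    this.sql.exec(`
      CREATE TABLE IF NOT EXISTS rate_limits (
        date TEXT PRIMARY KEY,
        count INTEGER NOT NULL DEFAULT 0
      )
    `);
  }

  private checkAndIncrement(limit: number) {
    // UTC date such as '2026-01-27'; a new day starts a new row
    const today = new Date().toISOString().split('T')[0];
    const row = this.sql.exec('SELECT count FROM rate_limits WHERE date = ?', today).toArray()[0];
    // SQL values are loosely typed, so coerce to a number before comparing
    const current = Number(row?.count ?? 0);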

    if (current >= limit) {
      return { allowed: false, current, limit };
    }

    this.sql.exec(
      `INSERT INTO rate_limits (date, count) VALUES (?, 1)
       ON CONFLICT(date) DO UPDATE SET count = count + 1`,
      today
    );

    return { allowed: true, current: current + 1, limit };
  }
}

The middleware gets a stub for the user's Durable Object and sends it a request:

typescript: middleware/rate-limit.ts
const id = c.env.RATE_LIMITER.idFromName(`download:${auth.customerId}`);
const stub = c.env.RATE_LIMITER.get(id);
const response = await stub.fetch(new Request('https://rate-limiter/check', {
  method: 'POST',
  body: JSON.stringify({ limit })
}));

Each user-action combination gets its own Durable Object. The naming convention download:customer123 partitions the namespace. Daily counters reset at UTC midnight because the row key is the UTC date from toISOString(): the first request of a new day finds no row and starts counting from zero.
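The date arithmetic is small enough to test on its own. The helpers below are hypothetical and not from the original codebase. `secondsUntilUtcMidnight` shows one way a value like SECONDS_UNTIL_RESET could be derived, assuming the reset really is tied to UTC midnight.

```typescript
// Hypothetical helpers; the names are not from the original codebase.

// UTC calendar date used as the counter key, e.g. '2026-01-27'.
function utcDateKey(now: Date): string {
  return now.toISOString().slice(0, 10);
}

// Seconds remaining until the next UTC midnight, rounded up.
function secondsUntilUtcMidnight(now: Date): number {
  // Date.UTC normalizes day overflow, so Jan 31 + 1 becomes Feb 1.
  const nextMidnight = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() + 1);
  return Math.ceil((nextMidnight - now.getTime()) / 1000);
}

// One Durable Object per action and customer, e.g. 'download:customer123'.
function limiterName(action: string, customerId: string): string {
  return `${action}:${customerId}`;
}
```

A request at 23:59:30 UTC gets a retry hint of 30 seconds, and the request after midnight maps to a different date key.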

The middleware fails open. If the Durable Object throws, the request proceeds. Availability beats strict enforcement:

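typescript: middleware/rate-limit.ts
try {
  const response = await stub.fetch(...);
  // stub.fetch() resolves to a Response, so parse the JSON body before reading fields
  const result = await response.json<{ allowed: boolean; limit: number; current: number }>();
  if (!result.allowed) throw new RateLimitError(result.limit, result.current);
} catch (error) {
  if (error instanceof RateLimitError) throw error;
  logger.warn('Rate limiter error, failing open:', error);
}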

Fire-and-Forget Async

Workers have a waitUntil() method on the execution context. It queues async work that continues after the response is sent. The runtime waits for these promises before terminating the Worker, but the client does not wait.

Every route uses this for analytics:

typescript: routes/download.ts
app.openapi(downloadRoute, async (c) => {
  const presignedUrl = await generatePresignedUrl(tarballUrl, config);

  c.executionCtx.waitUntil(recordDownload(c.env, {
    slug,
    tier: auth.tier,
    platform: detectPlatform(c.req.header('User-Agent'))
  }));

  return c.json({ url: presignedUrl }, 200);
});

The download response returns immediately. The analytics write happens in the background. If it fails, the user never knows. The error gets logged, but the API stays fast.
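The ordering can be checked with a small test double. `FakeExecutionContext` is hypothetical: it models only the waitUntil() contract described above, and `recordDownload` here is a stand-in for the real analytics write.

```typescript
// Test double for the waitUntil() contract; the real ExecutionContext comes from the runtime.
class FakeExecutionContext {
  private readonly pending: Promise<unknown>[] = [];

  waitUntil(promise: Promise<unknown>): void {
    // In this fake, rejections are swallowed so background failures cannot affect the response.
    this.pending.push(promise.catch(() => undefined));
  }

  // Resolves once all queued background work has settled.
  async drain(): Promise<void> {
    await Promise.all(this.pending);
  }
}

const events: string[] = [];

// Hypothetical stand-in for recordDownload().
async function recordDownload(slug: string): Promise<void> {
  await Promise.resolve(); // simulate an async write
  events.push(`recorded:${slug}`);
}

function handleDownload(ctx: FakeExecutionContext, slug: string): { url: string } {
  ctx.waitUntil(recordDownload(slug)); // queued, not awaited
  events.push('responded');
  return { url: `https://example.com/${slug}.tar.gz` };
}
```

Calling `handleDownload` records `responded` first. The `recorded:` event appears only after `drain()`, which mirrors the client getting its response before the analytics write finishes.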

Sentry flushing uses the same pattern. Capturing an exception is synchronous, but sending it to Sentry is async:

typescript: middleware/error-handler.ts
captureException(error);
c.executionCtx.waitUntil(flushSentry());

R2 Presigned URLs

Documentation bundles live in R2. Downloads go directly from R2 to the mobile app. The API never proxies the bytes.

R2 supports S3-compatible presigned URLs. The aws4fetch library handles SigV4 signing:

typescript: services/r2-signer.ts
const client = new AwsClient({
  accessKeyId: config.r2AccessKey,
  secretAccessKey: config.r2SecretKey
});

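const url = new URL(
  `https://${config.accountId}.r2.cloudflarestorage.com/${config.bucket}/${objectKey}`
);
// Set the expiry explicitly: without X-Amz-Expires, aws4fetch signs S3 query URLs for 86400 seconds
url.searchParams.set('X-Amz-Expires', '3600');

const signed = await client.sign(new Request(url, { method: 'GET' }), {
  aws: { signQuery: true }
});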

signQuery: true puts the signature in query parameters instead of headers. The result is a standalone URL the mobile app can fetch without additional authentication. URLs expire after one hour.

Testing

The @cloudflare/vitest-pool-workers package runs tests in a simulated Workers environment. Bindings work like production:

typescript: tests/manifest.test.ts
import { env } from 'cloudflare:test';

it('returns manifest for authenticated user', async () => {
  const response = await app.request('/manifest', {
    headers: { 'X-Customer-ID': 'test_user' }
  }, env);

  expect(response.status).toBe(200);
  const body = await response.json();
  expect(body.docsets).toBeInstanceOf(Array);
});

Local development uses wrangler dev. It binds to real Cloudflare services or local emulators depending on configuration. The D1 binding can point to a local SQLite file for fast iteration.

Deployment

One command: wrangler deploy. The wrangler.toml declares all bindings, routes, and compatibility flags. Cloudflare provisions resources automatically.

Secrets sync from Doppler before deployment:

bash: terminal
doppler secrets --json | \
  jaq -c 'with_entries(.value = .value.computed)' | \
  wrangler secret bulk

D1 migrations are numbered SQL files. Each runs once against the remote database:

bash: terminal
wrangler d1 execute docnative-db --remote --file=migrations/0001_initial.sql
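"Runs once" implies something has to track which files were already applied. The sketch below is one hypothetical way to pick the pending files in order. `pendingMigrations` is not part of wrangler, and every file name other than 0001_initial.sql is invented.

```typescript
// Hypothetical helper: decide which numbered migration files still need to run.
function pendingMigrations(files: string[], applied: string[]): string[] {
  const done = new Set(applied);
  // parseInt stops at the first non-digit: '0002_add_tiers.sql' -> 2
  const prefix = (name: string): number => parseInt(name, 10);
  return files
    .filter((name) => /^\d+_.+\.sql$/.test(name) && !done.has(name)) // numbered .sql files only
    .sort((left, right) => prefix(left) - prefix(right)); // numeric order, not string order
}

const pending = pendingMigrations(
  ['0003_add_search.sql', '0001_initial.sql', '0002_add_tiers.sql', 'README.md'],
  ['0001_initial.sql']
);
```

Here `pending` holds the second and third migrations in numeric order. Each entry would then be passed to the execute command with --file.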

Workers have effectively no cold starts. Isolates start in milliseconds, and Cloudflare can begin loading a Worker during the TLS handshake, so the first request to a new deployment is about as fast as the thousandth.

What This Enables

DocNative ships documentation for MDN, Swift, Python, React, and a dozen other sources. The API serves manifest requests, generates presigned download URLs, handles semantic search across all documentation, verifies subscriptions, and enforces rate limits. Everything runs on Cloudflare. One account manages compute, storage, databases, vector search, caching, and rate limiting.

When a user searches for "array methods" on their phone, the query embeds via Workers AI, searches Vectorize, fetches snippets from D1, and returns results. Latency is under 200ms from anywhere in the world. When they download the MDN JavaScript docs, they get a presigned R2 URL that streams directly to their device. The API never touches the bytes.

The framework stays out of the way. Hono's job is to route requests, validate input, and let TypeScript catch mistakes before they ship. Twenty routes with six service integrations, and the entire codebase fits in a developer's head.

That is the point.
