The System Design Mental Model
How to read a system — inputs, outputs, components, and the three dimensions of scale
System design is the practice of choosing how the components of a system fit together to meet requirements. It is not about knowing the "right" answer — it is about making trade-offs you can reason about. Every design decision is a bet: this approach is faster to build but harder to scale; this one handles more load but costs more. Your job is to make those bets consciously, not accidentally.
The three dimensions of scale
- Read scale — More users reading data than writing it. Many web apps are roughly 80–90% reads. Solutions: caching, read replicas, CDN for static assets. A URL shortener is the extreme case: billions of reads against a small write set.
- Write scale — High write throughput overwhelming the database. Solutions: write-optimised databases, batching, queues, sharding. Click-tracking analytics is a write-scale problem — never write one row per redirect synchronously.
- Compute scale — CPU-intensive operations (image processing, AI inference, PDF generation) blocking web servers. Solutions: background jobs, worker queues, dedicated compute. Generating short codes is cheap; generating thumbnails is not.
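To make the write-scale advice concrete, here is a minimal sketch of batching: click events are buffered in memory and written in bulk instead of one row per redirect. The names `ClickBatcher`, `flush_fn`, and `max_batch` are hypothetical, and the list `writes` stands in for a real bulk insert into a database.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ClickBatcher:
    """Buffers click events in memory and hands them off in bulk."""

    flush_fn: Callable[[List[dict]], None]  # hypothetical bulk-write function
    max_batch: int = 500                    # assumed threshold; tune per workload
    _buffer: List[dict] = field(default_factory=list, init=False)

    def record(self, short_code: str, ts: float) -> None:
        # Appending to a list is cheap, so the redirect path stays fast.
        self._buffer.append({"code": short_code, "ts": ts})
        if len(self._buffer) >= self.max_batch:
            self.flush()

    def flush(self) -> None:
        # One bulk write replaces many single-row writes.
        if not self._buffer:
            return
        batch, self._buffer = self._buffer, []
        self.flush_fn(batch)


# Usage: `writes` stands in for the database; each entry is one bulk write.
writes: List[List[dict]] = []
batcher = ClickBatcher(flush_fn=writes.append, max_batch=3)
for i in range(7):
    batcher.record("abc123", float(i))
batcher.flush()  # flush the remainder, e.g. on shutdown or on a timer

print([len(batch) for batch in writes])  # [3, 3, 1]
```

Seven events become three writes. The trade-off is durability: events still in the buffer are lost if the process crashes before a flush, which is usually acceptable for analytics and not for the short-code mappings themselves.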
The components of any web system
- Client — Browser, mobile app, or API consumer. Sends requests, receives responses. Treat everything it sends as untrusted: validate and authorise on the server.
- DNS / CDN — Routes traffic; serves static assets from edge nodes close to the user. For a URL shortener, the redirect response itself can be cached at the edge, so an edge hit needs no database lookup at all.
- Load balancer — Distributes requests across multiple server instances. Required for horizontal scaling. Often also terminates TLS so app servers handle plain HTTP internally.
- Application server(s) — Runs your code — handles HTTP requests, business logic, API responses. Must be stateless to scale horizontally.
- Database — Persists data. The most common bottleneck. PostgreSQL is the default choice for new products — ACID, flexible, well-understood failure modes.
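- Cache — In-memory store (Redis) for frequently-read data. With a high hit rate it can take most reads off the database; a 70–90% reduction is typical of read-heavy workloads, not a guarantee. For a URL shortener, a cache hit means the database never sees the redirect request.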
- Queue / message broker — Decouples producers and consumers. Enables async processing — click analytics, email sending, webhooks. Never block a redirect on analytics writes.
- Object storage — Files, images, PDFs, videos. S3 (or Supabase Storage) instead of the database or server filesystem. Scales independently of your compute.
How to approach any system design question
- Clarify requirements — What does the system do? Read-heavy or write-heavy? Expected users at launch vs peak? Any latency constraints? A URL shortener with 100M redirects/day is a very different problem from one with 1,000.
- Estimate scale — Back-of-envelope: 100M redirects/day ÷ 86,400 seconds ≈ 1,160 req/sec sustained; assuming peaks of 10x, that is roughly 12,000 req/sec. A single well-provisioned server can handle this, but only if redirect lookups are cached.
- Start with the simplest viable architecture — One server, one database, one CDN. Identify where the bottleneck will appear first under load, and add complexity only there. The URL shortener bottleneck is the redirect lookup — solve that first.
- Reason about trade-offs out loud — Name what you are giving up with every choice. Caching improves speed but introduces stale redirect risk. Queues decouple analytics but add infrastructure. Every decision has a cost.
- Define failure modes explicitly — What breaks if the cache is unavailable? What breaks if the database is unavailable? A resilient design degrades gracefully — the redirect still works without analytics if the queue is down.
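The back-of-envelope estimate from the steps above is simple enough to script. A minimal sketch, assuming traffic is spread evenly over the day and that peaks run at 10x the average (the function names and the `peak_factor` parameter are illustrative):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def sustained_rps(requests_per_day: int) -> float:
    """Average requests per second if traffic were spread evenly over the day."""
    return requests_per_day / SECONDS_PER_DAY


def peak_rps(requests_per_day: int, peak_factor: float = 10.0) -> float:
    """Peak requests per second, assuming peaks are peak_factor times the average."""
    return sustained_rps(requests_per_day) * peak_factor


avg = sustained_rps(100_000_000)
peak = peak_rps(100_000_000)
print(f"sustained: {avg:,.0f} req/sec, peak: {peak:,.0f} req/sec")
# sustained: 1,157 req/sec, peak: 11,574 req/sec
```

Rounded, these are the ~1,160 and ~12,000 req/sec figures quoted above. Running the same functions for 1,000 redirects/day gives about 0.01 req/sec, which is why the two cases call for very different designs.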
The over-engineering trap
A URL shortener for 1,000 users/day does not need a queue, a CDN, or a read replica. The cost of added complexity — debugging, ops overhead, mental model — is real. Add components only when you have evidence the simpler approach is breaking.
Try this
Draw the initial architecture for a URL shortener on paper or in Excalidraw. It takes a long URL, returns a short code; visiting the short code redirects to the long URL. Label: the database schema (just two tables), the read path (redirect), the write path (shorten), and the CDN layer. Annotate which component will become the bottleneck first at 100M redirects/day. Keep this diagram — you will add to it in every lesson.