Design URL Shortener: System Design Interview Guide
A URL shortener at TinyURL scale handles 10 billion redirects per month with 100 million new short links created daily, all at single-digit millisecond latency.
Designing a URL shortener (TinyURL, bit.ly) is the canonical warm-up system design interview. It looks simple but every detail matters: how you generate short IDs without collisions, how you shard a hot-read workload, how you cache, and how you do analytics without slowing down redirects.
Asked at: Meta, Google, Amazon, Microsoft, and almost every other company that runs system design interviews, usually as a warm-up. Often the first question in a phone screen.
Why this question is asked
"Design a URL shortener" tests fundamentals: ID generation strategies (random vs counter vs hash, each base62-encoded), database choice for a read-heavy workload, caching, and the often-overlooked analytics path. Interviewers use it to gauge whether you can handle a known problem cleanly before they ask something harder.
Requirements
Always clarify these in the first 5 minutes of the interview. Do not start drawing boxes until both lists are agreed.
Functional requirements
- User submits a long URL and gets back a short URL
- Hitting the short URL redirects to the long URL
- Short URLs are 7 to 10 characters, easy to share
- Users can optionally pick a custom alias
- Click analytics: total clicks, unique clicks, geo breakdown
- Links can expire after a TTL
- API access for batch creation
Non-functional requirements
- Redirect latency under 10 ms at the 99th percentile
- 99.99% availability for the redirect path
- 100M new URLs created per day
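- 10B redirects per month. Against 100M creates per day (about 3B per month) that is roughly 3:1 on average, approaching 100:1 at peak (100K redirects per second against about 1,200 creates per second)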
- Short URLs must be globally unique with no collisions
- Analytics writes must not slow down redirects
Back-of-envelope scale estimates
Show your math. Pulling numbers from thin air signals you have not thought about the load.
- URLs created per day: 100M. Public bit.ly scale plus growth. Used to size the ID space.
- Redirects per second (peak): 100K. 10B per month is ~3,800 average per second, with a 25x peak factor for marketing campaigns.
- ID space requirement (10 years): 365 billion. 100M per day times 365 days times 10 years. Base62 with 7 characters gives 3.5 trillion. Plenty of room.
- Storage per URL: 500 bytes. Long URL plus metadata. 100M per day equals 18 TB per year. Modest.
- Cache hit rate target: >90%. Pareto distribution: a small fraction of URLs get most of the clicks. Caching the top 1% in memory gets you a 90% hit rate.
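The arithmetic behind these estimates can be checked in a few lines of Python. This is only a sketch: every input comes from the figures above, and the 30-day month is an assumption made for the per-second conversion.

```python
# Back-of-envelope check. Inputs are the figures quoted in the estimates;
# the 30-day month is an assumption used only for the per-second conversion.
SECONDS_PER_MONTH = 30 * 24 * 3600

urls_per_day = 100_000_000
redirects_per_month = 10_000_000_000
peak_factor = 25
bytes_per_url = 500

avg_redirects_per_sec = redirects_per_month / SECONDS_PER_MONTH
peak_redirects_per_sec = avg_redirects_per_sec * peak_factor
ids_needed_10_years = urls_per_day * 365 * 10
id_space_7_chars = 62 ** 7
storage_per_year_tb = urls_per_day * 365 * bytes_per_url / 1e12

print(f"avg redirects/s:     {avg_redirects_per_sec:,.0f}")   # 3,858
print(f"peak redirects/s:    {peak_redirects_per_sec:,.0f}")  # 96,451
print(f"IDs needed (10 yr):  {ids_needed_10_years:,}")        # 365,000,000,000
print(f"7-char base62 space: {id_space_7_chars:,}")           # 3,521,614,606,208
print(f"storage per year:    {storage_per_year_tb:.2f} TB")   # 18.25 TB
```

Rounding these results gives the headline numbers: roughly 3,800 redirects per second on average, about 100K at peak, and an ID space almost ten times larger than ten years of growth requires.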
High-level architecture
Two paths.
Create path: client POSTs the long URL to a Create Service, which generates a short ID (more on this in the deep dive), writes the mapping to a sharded SQL store keyed by short_id, and returns the short URL.
Redirect path: client GETs the short URL, the Redirect Service looks up the long URL (cache first, DB on miss), issues a 301 or 302 redirect, and asynchronously emits a click event to Kafka. Analytics consumers (Flink or batch jobs) aggregate the click stream into per-URL daily counters.
The cache is Redis with the short_id as the key, populated on miss with a 24-hour TTL. The SQL store is sharded by short_id hash, with read replicas for the redirect path.
In a real interview, sketch this on the whiteboard before diving into any single box.
Core components
Walk through each service. The interviewer wants to hear what each one owns, not just the names.
Create Service
Validates the input URL, generates a unique short ID, writes the mapping to the database, and returns the short URL. Rate-limited per IP and per API key.
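As a rough illustration of the validation step, the sketch below accepts only absolute http or https URLs. The function name is_valid_long_url and the 2,048-character limit are assumptions for this example, not something the design prescribes; a production service would likely add checks such as blocklists and loop detection against its own domain.

```python
from urllib.parse import urlsplit

MAX_URL_LENGTH = 2048  # assumed limit for this sketch


def is_valid_long_url(url: str) -> bool:
    """Return True for absolute http(s) URLs with a host, False otherwise."""
    if not url or len(url) > MAX_URL_LENGTH:
        return False
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. unbalanced IPv6 brackets.
        return False
    return parts.scheme in ("http", "https") and bool(host)
```

Rejecting other schemes up front keeps inputs such as javascript: or file: URLs out of the store entirely.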
ID Generator
Produces the next short ID. Options include a counter-based approach (Snowflake-like with a node ID), random base62 strings with collision retry, or a pre-allocated batch from a counter shard. See the deep dive for trade-offs.
Redirect Service
Looks up the long URL by short_id, issues an HTTP 301 or 302 redirect, and fires a click event. Heavily cached. This is the hottest read path.
URL Cache
Redis with short_id as key and long URL as value. 24-hour TTL. Populated on miss. Hit rate ~90%+ because click distribution is heavily skewed.
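The populate-on-miss behavior is the cache-aside pattern. The sketch below uses an in-memory dictionary as a stand-in for Redis so that it runs without a server; TTLCache, lookup_long_url, and the db_lookup callback are hypothetical names for this example.

```python
import time
from typing import Callable, Optional

DAY_SECONDS = 24 * 3600


class TTLCache:
    """In-memory stand-in for Redis: string keys, string values, one TTL for all."""

    def __init__(self, ttl_seconds: float = DAY_SECONDS,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # short_id -> (long_url, expires_at)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._entries[key] = (value, self._clock() + self._ttl)


def lookup_long_url(short_id: str, cache: TTLCache,
                    db_lookup: Callable[[str], Optional[str]]) -> Optional[str]:
    """Cache first, database on miss, then populate the cache."""
    cached = cache.get(short_id)
    if cached is not None:
        return cached
    long_url = db_lookup(short_id)
    if long_url is not None:
        cache.set(short_id, long_url)
    return long_url
```

Unknown short IDs are not cached in this sketch, so repeated lookups for a nonexistent ID always reach the database; caching negative results with a short TTL is one way to close that gap.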
Click Analytics Pipeline
Kafka topic for click events. Flink job aggregates per URL per day (total clicks, unique IPs, geo histogram). Results stored in a separate analytics SQL table.
Database
Sharded SQL (MySQL or PostgreSQL) keyed by short_id hash. Each shard has a primary plus 2 read replicas. The redirect path reads from replicas; the create path writes to primary.
Custom Alias Service
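When the user requests a specific short alias, this service attempts to write it as the short_id. Because the table is sharded by short_id hash, the alias maps to exactly one shard, so the availability check and the insert both happen there. Race conditions are handled by the uniqueness constraint on short_id: the losing writer gets a constraint violation and the alias is reported as taken.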
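Relying on the uniqueness constraint rather than a separate check-then-insert can be sketched with SQLite standing in for a single shard. The claim_alias function and the trimmed-down urls table are assumptions for illustration only.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory table with short_id as the primary key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE urls ("
        " short_id TEXT PRIMARY KEY,"
        " long_url TEXT NOT NULL,"
        " is_custom_alias INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def claim_alias(conn: sqlite3.Connection, alias: str, long_url: str) -> bool:
    """Try to insert the alias; the primary key decides who wins a race."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO urls (short_id, long_url, is_custom_alias)"
                " VALUES (?, ?, 1)",
                (alias, long_url),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # alias already taken
```

Inserting first and treating the constraint violation as "taken" avoids the window between a separate availability check and the write.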
Data model
Pick the right store per table. Justify each choice with the access pattern, not by reflex.
urls
Columns: short_id (PK), long_url, user_id, created_at, expires_at, is_custom_alias
Sharded by short_id hash. Primary key is enforced unique to prevent collisions on custom aliases. The redirect path reads only by short_id.
click_events
Columns: short_id (PK partition), clicked_at (clustering), ip_hash, user_agent, country_code
Append-only event log in Kafka, eventually persisted to BigQuery or S3 for ad-hoc queries.
click_counters
Columns: short_id (PK), date (clustering), total_clicks, unique_clicks, country_breakdown JSON
Daily-aggregated counters maintained by Flink. This is what the analytics UI reads, not the raw events.
Deep dives
These are the conversations the interviewer is steering you toward. Practice each one until you can talk through it without notes.
ID generation strategies and why base62 wins
Three main options, all encoded as base62.
- Random: generate a 7-character base62 string, check the DB for a collision, and retry on conflict. Simple, but the collision rate grows with the fraction of the ID space already in use (each check is still a single indexed lookup); with hundreds of billions of rows in a 3.5-trillion ID space you start seeing meaningful retries.
- Counter-based: a global monotonic counter incremented per request, then converted to base62. Fast and collision-free, but requires a coordinator (a single point of failure) or a Snowflake-style distributed counter with node IDs.
- Hash-based: hash the long URL and take the first 7 base62 characters. Deterministic, so the same long URL always maps to the same short ID, which is usually a feature, not a bug. The catch is that different long URLs can collide on the truncated hash, so a collision check is still needed.
The production answer is usually a hybrid: a Snowflake-style ID generator that gives each app server a node ID and a local counter, encoded as base62. No coordinator on the hot path.
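A minimal sketch of the node-ID-plus-local-counter idea follows. The bit layout (10 bits of node ID, 31 bits of counter) is an assumption chosen so that every ID fits in 7 base62 characters; it is not taken from any particular production system, and NodeIdGenerator is a hypothetical name. The counter lives in memory only, so a real generator would also need a timestamp component or durable state to survive restarts.

```python
import itertools
import string

# 0-9, a-z, A-Z: 62 symbols. The ordering is an arbitrary choice for this sketch.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_encode(n: int) -> str:
    """Encode a non-negative integer as a base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, remainder = divmod(n, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


class NodeIdGenerator:
    """Hypothetical generator: node ID in the high bits, local counter in the low bits."""

    def __init__(self, node_id: int, node_bits: int = 10, counter_bits: int = 31):
        if not 0 <= node_id < (1 << node_bits):
            raise ValueError("node_id out of range")
        self._prefix = node_id << counter_bits
        self._limit = 1 << counter_bits
        self._counter = itertools.count()  # in-memory only; resets on restart

    def next_short_id(self) -> str:
        sequence = next(self._counter)
        if sequence >= self._limit:
            raise OverflowError("local counter exhausted")
        return base62_encode(self._prefix | sequence)
```

Because the node ID occupies its own bits, two servers can never produce the same integer, so no database check or coordinator is needed when an ID is issued. The IDs are sequential per node, so they remain guessable.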
Sharding strategy for the URL table
Reads dominate writes, so shard for the read path. Shard by short_id hash so that any redirect hits exactly one shard. The number of shards is set by the per-shard QPS budget (e.g., at 50K QPS per shard, the 100K projected peak needs at least 2 shards even with no cache in front, and more for headroom). Each shard runs a primary plus 2 read replicas; the redirect path reads from replicas. Eventual consistency on a brand-new write is acceptable: if a user clicks their own new link before replication catches up, a retry roughly 100 ms later will typically succeed. Custom aliases need no cross-shard check: the alias is the short_id, so it hashes to exactly one shard, and the primary-key constraint on that shard rejects duplicates. A Bloom filter or a dedicated custom_aliases table only becomes necessary if aliases are stored under a different shard key.
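Routing a short_id to its shard can be sketched as below. The function names and the choice of SHA-256 are assumptions for illustration; Python's built-in hash() is deliberately avoided because it is randomized per process for strings and would route the same ID differently on different servers.

```python
import hashlib
import math


def shard_for(short_id: str, num_shards: int) -> int:
    """Map a short_id to a shard index using a hash that is stable across processes."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(short_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shards_needed(peak_qps: float, per_shard_qps: float) -> int:
    """Smallest shard count that keeps each shard within its QPS budget."""
    return math.ceil(peak_qps / per_shard_qps)
```

Plain modulo routing remaps most keys when the shard count changes, so a design that expects to reshard often may prefer consistent hashing or a fixed number of virtual shards mapped onto physical ones.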
Caching strategy and hit rate
Click distribution follows Pareto: a small percentage of URLs get most of the clicks. A Redis cluster (or Memcached) with the top ~1% of URLs by recent clicks holds the working set. On miss, the redirect path reads from the SQL replica and populates the cache with a 24-hour TTL. To prevent cache stampede on a sudden viral link, use the request coalescing pattern: only one request per missing short_id queries the DB; concurrent requests for the same key wait on the in-flight fetch. This is a standard Redis lock + read-through pattern.
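Request coalescing within a single process can be sketched with a lock and an event per in-flight key. This is an assumption-laden illustration: SingleFlight and its do method are hypothetical names, and coalescing across several redirect servers would additionally need a shared lock (for example in Redis), which is not shown here.

```python
import threading
from typing import Callable, Optional


class _Call:
    """State shared between the one fetching thread and the threads waiting on it."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None


class SingleFlight:
    """Ensure only one concurrent fetch per key; other callers wait for its result."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> _Call

    def do(self, key: str, fetch: Callable[[str], Optional[str]]) -> Optional[str]:
        with self._lock:
            call = self._in_flight.get(key)
            is_leader = call is None
            if is_leader:
                call = _Call()
                self._in_flight[key] = call
        if is_leader:
            try:
                call.result = fetch(key)
            except BaseException as exc:
                call.error = exc
            finally:
                with self._lock:
                    del self._in_flight[key]
                call.done.set()  # wake every waiting follower
        else:
            call.done.wait()
        if call.error is not None:
            raise call.error
        return call.result
```

A redirect handler would wrap its database lookup as something like flights.do(short_id, db_lookup), so a burst of misses for one viral link turns into a single query per server.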
Click analytics without slowing down redirects
Counting clicks synchronously on the redirect path would roughly double the latency (one DB write per click). Instead, the redirect handler fires the click event to a Kafka topic asynchronously and returns the redirect immediately. A Flink job consumes the stream and updates per-URL daily counters. The analytics UI reads from the daily counter table, not from raw events. Real-time use cases (dashboards that need second-level freshness) read from the Flink streaming state directly. This gives the redirect path sub-10ms latency while still supporting rich analytics.
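The fire-and-forget shape of this path can be shown with standard-library pieces. In the sketch below a bounded in-process queue stands in for the Kafka topic and a background thread stands in for the Flink job; ClickPipeline and its methods are hypothetical names, and a real deployment would use an actual Kafka producer client.

```python
import queue
import threading
from collections import Counter


class ClickPipeline:
    """Queue stands in for Kafka; the worker thread stands in for the Flink job."""

    def __init__(self, max_pending: int = 10_000):
        self._events = queue.Queue(maxsize=max_pending)
        self.daily_clicks = Counter()  # (short_id, day) -> total clicks
        self._worker = threading.Thread(target=self._aggregate, daemon=True)
        self._worker.start()

    def record_click(self, short_id: str, day: str) -> bool:
        """Called on the redirect path: never blocks, drops the event if the queue is full."""
        try:
            self._events.put_nowait((short_id, day))
            return True
        except queue.Full:
            return False

    def _aggregate(self) -> None:
        while True:
            event = self._events.get()
            if event is None:  # shutdown sentinel
                return
            self.daily_clicks[event] += 1

    def close(self) -> None:
        """Flush pending events and stop the worker."""
        self._events.put(None)
        self._worker.join()
```

The key property is that record_click returns immediately whether or not the event was accepted, so a slow or unavailable analytics backend costs some lost click events rather than slower redirects.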
Trade-offs to discuss
Every senior interviewer expects you to surface at least 3 of these. Pick the decisions, state the alternatives, and justify your choice.
Counter-based vs random IDs
Counter-based is collision-free but predictable (you can iterate /a1, /a2, ...). Random is unpredictable but needs collision checking. The standard fix is a Snowflake-style ID: counter plus node ID plus timestamp, base62-encoded. Unpredictable enough for most use cases, collision-free, and no coordinator on the hot path.
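The random side of this trade-off can be sketched as follows. A Python set stands in for the database's uniqueness check, and the function names and retry limit are assumptions for illustration.

```python
import secrets
import string

ALPHABET = string.digits + string.ascii_letters  # 62 symbols
ID_LENGTH = 7


def random_short_id() -> str:
    """Draw an unpredictable 7-character base62 ID."""
    return "".join(secrets.choice(ALPHABET) for _ in range(ID_LENGTH))


def allocate_short_id(existing: set, max_attempts: int = 5) -> str:
    """Retry on collision; `existing` stands in for the DB uniqueness check."""
    for _ in range(max_attempts):
        candidate = random_short_id()
        if candidate not in existing:
            existing.add(candidate)
            return candidate
    raise RuntimeError("could not find a free short ID")


def collision_probability(ids_in_use: float, length: int = ID_LENGTH) -> float:
    """Chance that one random draw hits an ID that is already taken."""
    return ids_in_use / 62 ** length
```

With the 365 billion IDs projected over ten years, collision_probability(365e9) comes out at a little over 0.10, so about one create in ten would need a retry by then. Moving to 8 characters cuts that by a factor of 62.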
301 vs 302 for the redirect
301 (permanent) lets browsers cache the redirect, which means future clicks bypass your server. Great for latency, bad for analytics. 302 (temporary) hits your server every time. Use 301 when latency and server offload matter most, and 302 when every click must be counted. bit.ly uses 301.
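The choice comes down to the status code and caching headers on the response. The sketch below is framework-agnostic; build_redirect and the specific Cache-Control values are assumptions for illustration rather than settings any particular service is known to use.

```python
from http import HTTPStatus
from typing import Dict, Tuple


def build_redirect(long_url: str, track_every_click: bool) -> Tuple[int, Dict[str, str]]:
    """Return (status code, headers) for a redirect response."""
    if track_every_click:
        # 302 plus no-store: the browser should come back to us on every click.
        status = HTTPStatus.FOUND
        cache_control = "no-store"
    else:
        # 301 plus a cache lifetime: repeat clicks may be served from the browser cache.
        status = HTTPStatus.MOVED_PERMANENTLY
        cache_control = "public, max-age=86400"
    return int(status), {"Location": long_url, "Cache-Control": cache_control}
```

A middle ground is to return 301 with a short max-age, which still offloads rapid repeat clicks while letting most traffic reach the analytics path.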
SQL vs NoSQL for the URL table
The workload is simple key-value (short_id to long_url). NoSQL (DynamoDB, Cassandra) gives you easier sharding and write throughput. SQL gives you the uniqueness constraint for custom aliases. Both work; the choice depends on team familiarity and on how much you rely on that constraint. Many large-scale designs lean toward NoSQL.
Sync vs async click counting
Sync is simple but doubles redirect latency. Async (Kafka plus Flink) keeps redirects fast and lets you scale analytics independently. Almost everyone picks async.
Allow URL deletion vs append-only
Allowing deletion lets users clean up. Append-only is simpler and helps detect abuse (you have a full history). Most services soft-delete: a flag hides the URL but the row stays.
How real URL shorteners do it
bit.ly publishes a fair amount about its architecture: a sharded MySQL backend with Redis caching, a Kafka-based analytics pipeline, and a custom ID generator. The redirect service is a thin Go service in front of the cache. TinyURL is older and less well-documented. Major social platforms (Twitter's t.co, LinkedIn's lnkd.in) run their own shorteners that are integrated with their main URL graph and used for click tracking on shared content.