Design Flipkart: System Design Interview Guide
During Big Billion Days, Flipkart traffic runs 5-10x its normal load, and its Aerospike fleet alone serves ~90 million queries per second at sub-millisecond latency across search, pricing, ads, and inventory.
A complete system design walkthrough of an e-commerce platform like Flipkart, built for India's Big Billion Days. We cover the read-heavy catalog and search path, the write-heavy checkout path, how inventory stays consistent without overselling during flash sales, cart reservation with TTLs, the order-management state machine (Flipkart's open-source Flux), and how to absorb a thundering herd with virtual waiting rooms, rate limits, and queues.
Asked at: Flipkart, Amazon, Walmart Global Tech, Myntra, Meesho, Swiggy, Zomato, PhonePe, and most product-company interviews in India. "Design an e-commerce platform / flash sale / Big Billion Days" is one of the most common SDE2/SDE3 prompts because it exercises caching, consistency, and high-write contention in a single question.
Why this question is asked
E-commerce is the canonical interview problem because a single question forces you to reconcile two opposing workloads: a massively read-heavy catalog (which wants aggressive caching and eventual consistency) and a write-heavy, contention-prone checkout (which demands strong consistency on inventory so you never sell the same unit twice). The flash-sale angle adds a thundering-herd dimension — millions of users hitting one SKU in the same second — which separates candidates who can recite CAP from candidates who can actually shape load. Interviewers use it to probe whether you understand idempotency, distributed transactions vs. sagas, cache invalidation, and graceful degradation under 10x traffic.
Requirements
Always clarify these in the first 5 minutes of the interview. Do not start drawing boxes until both lists are agreed.
Functional requirements
- Browse and search a product catalog of 100M+ listings with filters (category, brand, price, rating), facets, and autocomplete
- View a product detail page (PDP) with price, ratings, seller, and live availability
- Add items to cart; cart persists across sessions and devices
- Place an order: address, payment, and a confirmed order id, with no double-charging on retries
- Decrement inventory atomically so a unit is never sold twice (no oversell), even under flash-sale contention
- Run time-boxed flash sales / deals where a limited-stock SKU opens at a fixed time
- Track order lifecycle: created -> payment authorized -> packed -> shipped -> delivered, with cancellations and refunds
- Show personalized recommendations and category/deal pages
- Let sellers manage listings, price, and stock; reflect stock changes in near real-time
Non-functional requirements
- Catalog read path p99 < 150ms; search results p99 < 300ms
- Withstand 5-10x baseline traffic during Big Billion Days without manual intervention
- Strong consistency on the inventory ledger (never oversell); eventual consistency acceptable for catalog, reviews, recommendations
- Checkout must be idempotent: a retried/duplicate request produces exactly one order and one charge
- High availability (target 99.95%+) for browse/search; checkout may degrade gracefully (queue) rather than fail
- Horizontal scalability — every tier scales independently (search, cart, inventory, orders)
- Durability: orders, payments, and inventory mutations are never lost (replicated, persisted before ack)
- India-first latency: serve from in-country data centers / CDN edge close to Tier-1 and Tier-2 cities
Back-of-envelope scale estimates
Show your math. Pulling numbers from thin air signals you have not thought about the load.
Registered users / monthly actives
~500M registered, ~150-200M MAU (est.)
Flipkart is one of India's two largest e-commerce platforms; public figures put the user base in the hundreds of millions. Treat exact MAU as an estimate for sizing.
Catalog size
100M+ active listings
A horizontal marketplace with millions of sellers easily reaches 9-figure SKU counts across categories. Drives the search index size and catalog cache footprint.
Baseline vs. BBD traffic
5-10x spike during Big Billion Days
Stated directly in Flipkart engineering interviews/case studies — sale traffic is 5x to 10x normal. This multiplier is the central capacity-planning input.
Read:write ratio on catalog
~100:1 to 1000:1 reads:writes
A PDP is viewed millions of times between price/stock changes. Justifies heavy caching and a separate read path from the write path.
Aerospike fleet QPS
~90M QPS aggregate, sub-ms latency
Per Aerospike's published Flipkart case study: ~90M QPS across three DCs and 50+ use cases (search bar, recommendations, ads, pricing, inventory) at sub-millisecond latency, 200+ clusters.
Flash-sale contention on one SKU
100K-1M+ requests/sec on a single hot key
A limited-stock deal (e.g., a phone) draws the whole audience to one product+inventory key in the same second — the classic thundering-herd / hot-key problem to design around.
Checkout write throughput (peak)
Tens of thousands of order writes/sec
Even a small fraction of browsers converting at peak generates a large, contention-heavy write load on the order and inventory stores.
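The arithmetic behind the last two estimates can be written down in a few lines. This is a minimal sketch: the user counts, conversion fraction, and time windows are hypothetical inputs chosen only to show the method, not Flipkart figures.

```python
def peak_rate(active_users: int, fraction_acting: float, window_seconds: float) -> float:
    """Average requests per second if a fraction of users act within a window."""
    return active_users * fraction_acting / window_seconds


# Hypothetical inputs, chosen only to illustrate the arithmetic.
# 3% of 40M active shoppers place an order within one minute of a sale opening.
order_writes_per_sec = peak_rate(40_000_000, 0.03, 60)

# 5M users all tap the same deal within a 10-second window.
hot_sku_hits_per_sec = peak_rate(5_000_000, 1.0, 10)
```

With these assumed inputs the results are about 20,000 order writes/sec and about 500,000 requests/sec on one key, which is the order of magnitude the estimates above describe. In the interview, state your own inputs out loud and let the interviewer adjust them.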
High-level architecture
Start by splitting the system along the read/write seam, because that split drives every other decision. The read path serves browse, search, and product detail pages; it is enormous in volume but tolerant of staleness, so it leans on a CDN, multiple cache layers, and a search index. The write path serves cart, inventory reservation, checkout, and order management; it is smaller in volume but unforgiving: it must be strongly consistent and idempotent.
On the read path, a request from a browser or app first hits a CDN edge (static assets, images, and cacheable PDP fragments). Dynamic requests pass through an API gateway / load balancer into stateless service tiers. Catalog reads are served from a product service backed by a fast key-value store (Flipkart famously runs Aerospike here, serving the search bar, pricing, recommendations, ads, and inventory reads at ~90M QPS aggregate with sub-millisecond latency). Search and faceted filtering hit a dedicated inverted-index cluster (Elasticsearch/Solr-class), kept in sync from the catalog via a change stream. The source of truth for product data lives in a durable store and is denormalized into these read-optimized projections (classic CQRS).
On the write path, adding to cart writes to a cart service (a low-latency KV store keyed by user). Checkout is where the interesting consistency work happens: the system reserves inventory against a strongly-consistent inventory ledger (an atomic decrement with a reservation TTL), creates an order in a "pending" state, then drives the order through a distributed workflow (payment authorization, inventory commit, packing, shipping) using a saga / state machine. Flipkart open-sourced exactly this: Flux, a state-machine orchestration framework that models order fulfillment as states (Order Created, Confirmed, Packed, Shipped) with event-driven transitions, retries, and replay. Each step is idempotent (keyed by an idempotency key derived from the order/saga id), and failed steps trigger compensating actions rather than a single big ACID transaction across services.
For Big Billion Days, the architecture adds shock absorbers in front of the write path. Because traffic runs 5-10x normal and flash sales concentrate millions of users on a single SKU, you put a virtual waiting room / admission control in front of checkout, rate-limit per user and per SKU, and front the hot inventory key with an in-memory counter and a queue so the database sees a smooth, bounded write rate instead of a spike. Flipkart runs all of this on its own Flipkart Cloud Platform (FCP) over Kubernetes, bursting overflow workloads to GCP during peaks, with chaos testing and autoscaling so individual microservices scale independently based on demand.
In a real interview, sketch this on the whiteboard before diving into any single box.
Core components
Walk through each service. The interviewer wants to hear what each one owns, not just the names.
API Gateway / Load Balancer
Entry point for all client traffic. Terminates TLS, authenticates, routes to service tiers, and enforces global and per-user rate limits. During sales it also hosts admission control (virtual waiting room tokens) so the backend never sees more than it can serve. Stateless and horizontally scaled behind GSLB/DNS for geo-routing within India.
Catalog / Product Service
Serves product detail data (title, price, attributes, seller, availability summary). Backed by a fast KV store (Aerospike-class) holding the read projection. The durable source of truth is a separate transactional store; updates flow into the KV cache and search index via a change stream. Read-only and aggressively cached — this is the 100:1+ read side.
Search & Discovery Service
Inverted-index cluster (Elasticsearch/Solr-class) powering keyword search, facets, filters, sort, and autocomplete over 100M+ listings. Kept in near-real-time sync with the catalog through CDC/Kafka. Personalization and ranking signals are layered on top; recommendations are precomputed offline (the Flipkart Data Platform / 35PB Hadoop side) and served from a low-latency store.
Cart Service
Stores per-user carts in a low-latency KV store keyed by user id, replicated for durability and readable from any of the user's devices. Holds item references and quantities but does NOT hold inventory: adding to cart is a soft intent, not a reservation. Cart TTLs keep abandoned carts from leaking storage.
Inventory Service (the consistency core)
Owns the strongly-consistent stock ledger per SKU per warehouse. Exposes atomic reserve/commit/release operations with a reservation TTL. A reserve decrements available stock and creates a short-lived hold; commit finalizes on payment success; release returns stock on timeout/cancel. This is where overselling is prevented — every other component treats its number as advisory until reserve succeeds.
Order Management Service (OMS)
Creates orders and drives them through their lifecycle with a state machine / saga orchestrator (the role Flipkart's Flux plays). In this design the lifecycle is Order Created -> Payment Authorized -> Inventory Committed -> Packed -> Shipped -> Delivered. Each transition is idempotent and durably persisted; failures trigger compensating actions (refund, release inventory) instead of leaving partial state.
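The lifecycle above can be sketched as an explicit transition table. This is a minimal illustration, not Flux's API: the names `OrderState`, `TRANSITIONS`, and `advance` are hypothetical, and the set of states from which cancellation is allowed is an assumption.

```python
from enum import Enum


class OrderState(Enum):
    CREATED = "created"
    PAYMENT_AUTHORIZED = "payment_authorized"
    INVENTORY_COMMITTED = "inventory_committed"
    PACKED = "packed"
    SHIPPED = "shipped"
    DELIVERED = "delivered"
    CANCELLED = "cancelled"


# Allowed transitions. Which states permit cancellation is an assumption.
TRANSITIONS = {
    OrderState.CREATED: {OrderState.PAYMENT_AUTHORIZED, OrderState.CANCELLED},
    OrderState.PAYMENT_AUTHORIZED: {OrderState.INVENTORY_COMMITTED, OrderState.CANCELLED},
    OrderState.INVENTORY_COMMITTED: {OrderState.PACKED, OrderState.CANCELLED},
    OrderState.PACKED: {OrderState.SHIPPED, OrderState.CANCELLED},
    OrderState.SHIPPED: {OrderState.DELIVERED},
    OrderState.DELIVERED: set(),   # terminal
    OrderState.CANCELLED: set(),   # terminal
}


class InvalidTransition(Exception):
    """Raised when an event tries to move an order along an edge that does not exist."""


def advance(current: OrderState, target: OrderState) -> OrderState:
    """Return the new state, treating a replayed transition as a no-op."""
    if target == current:
        return current  # idempotent replay: the order is already in this state
    if target not in TRANSITIONS[current]:
        raise InvalidTransition(f"{current.value} -> {target.value}")
    return target
```

Rejecting unknown edges is what stops a late or duplicated event (for example a second "shipped" after "delivered") from corrupting the order.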
Payment Service
Integrates UPI, cards, net banking, wallets, EMI, and cash-on-delivery. Uses idempotency keys so a retried checkout never double-charges, and a two-phase pattern (authorize then capture) so money is only captured once inventory is committed. Calls external PSPs/banks through their APIs and receives asynchronous status updates via webhooks; a reconciliation job handles async settlement.
Flash-Sale Admission & Hot-Key Layer
In front of the inventory service for deal SKUs: an in-memory atomic counter (Redis/Aerospike) acts as a fast gate so the database isn't hit by the full herd; a FIFO queue smooths bursts into a bounded write rate; per-user and per-SKU rate limits and bot detection shed abusive load. Users beyond capacity get a 'sold out / try again' fast path instead of a hung request.
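The per-user and per-SKU rate limits mentioned above are commonly built on a token bucket. Here is a minimal single-process sketch, assuming hypothetical class names `TokenBucket` and `PerKeyLimiter`; it is not thread-safe, and a real deployment would keep the bucket state in a shared store rather than in one process.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Refills at rate_per_sec up to capacity; each request spends one token."""

    def __init__(self, rate_per_sec: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerKeyLimiter:
    """One bucket per key; the key can be a user id or a SKU id."""

    def __init__(self, rate_per_sec: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate_per_sec
        self._capacity = capacity
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, key: str) -> bool:
        bucket = self._buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(self._rate, self._capacity, self._clock)
            self._buckets[key] = bucket
        return bucket.allow()
```

The injectable `clock` exists only to make the limiter easy to test; the capacity sets how large a burst a single user can make before being throttled.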
CDN & Edge
Caches images, static assets, and cacheable PDP fragments at edge locations close to users. Offloads the bulk of bytes from origin and is the first line of defense against a traffic spike — most of a sale page is static.
Event Bus (Kafka)
Backbone for change streams and async workflows: catalog/price/stock changes propagate to caches and search, order events drive notifications and analytics, and inventory mutations are logged for audit. Decouples producers from consumers so a slow downstream never blocks checkout.
Notification Service
Sends order confirmations, shipping updates, and deal alerts over email, SMS, push, and in-app channels. Driven off the event bus, rate-limited, and deduplicated so a retry storm doesn't spam users.
Data model
Pick the right store per table. Justify each choice with the access pattern, not by reflex.
products
Columns: product_id (PK), title, brand, category_id, description, attributes (jsonb), default_price, status
Source-of-truth catalog row. Mutated rarely relative to reads. Denormalized into the product KV cache and the search index via CDC. Keep price out of the heavily-cached blob if it changes often, or version it.
listings
Columns: listing_id (PK), product_id (FK), seller_id, price, mrp, warehouse_id, is_active
A marketplace has many sellers per product. The buy-box / lowest-price selection happens over listings. Separating listings from products keeps the catalog row stable while prices churn per seller.
inventory
Columns: sku_id (PK), warehouse_id (PK), available_qty, reserved_qty, version
The consistency core. available_qty is the only number that matters for oversell. Use an optimistic version column (compare-and-swap) or a serialized atomic decrement. Composite key by warehouse so geo-routing can ship from the nearest stock.
inventory_reservations
Columns: reservation_id (PK), sku_id, user_id, qty, status (held/committed/released), expires_at
Short-lived holds created at checkout start. A background sweeper (or TTL) releases expired holds back to available_qty. This is what lets you reserve before payment without permanently losing stock to abandoners.
carts
Columns: user_id (PK), items (jsonb: listing_id, qty), updated_at, ttl
Stored in a KV store, not a relational table at scale. Holds intent only; no stock is reserved here. TTL evicts abandoned carts.
orders
Columns: order_id (PK), user_id, status, total_amount, idempotency_key (UNIQUE), created_at
idempotency_key is UNIQUE so a duplicate/retried checkout maps to the same order instead of creating a second one. status is driven by the OMS state machine.
order_items
Columns: order_id (FK), sku_id, listing_id, qty, unit_price, reservation_id
Line items snapshot the price at purchase time (never re-read current price) and link to the reservation that guaranteed the stock.
payments
Columns: payment_id (PK), order_id (FK), provider, provider_ref, status (authorized/captured/refunded), idempotency_key (UNIQUE), amount
Authorize-then-capture two-phase flow. idempotency_key prevents double-charge on retry. Reconciliation job matches provider webhooks against this table.
Deep dives
These are the conversations the interviewer is steering you toward. Practice each one until you can talk through it without notes.
Preventing oversell: how to atomically decrement inventory under contention
The whole interview hinges on this. Never read-then-write in application code: two requests read available_qty=1, both think they can sell, both decrement, and that's the oversell bug. Two correct patterns. (1) Atomic conditional update: UPDATE inventory SET available_qty = available_qty - 1, version = version + 1 WHERE sku_id = ? AND warehouse_id = ? AND available_qty >= 1. The check and the decrement happen in one statement, and the affected-row count tells you if you won. If the application must read the row first, add AND version = ? to make it a true compare-and-swap and retry when zero rows are updated; this optimistic approach works well when contention is moderate. (2) For a single hot SKU under flash-sale contention, route all decrements for that key to one serialization point: an atomic counter in Redis/Aerospike (DECR returns the new value; if it goes below zero, reject the request and increment the counter back) or a single-partition queue so writes are serialized. The database then sees a bounded, ordered stream instead of a stampede. Pair either with a reservation TTL so stock held for a checkout that never pays is returned automatically. State clearly that you accept rejecting a request ('sold out') over overselling: in e-commerce, oversell is a refund, an angry customer, and sometimes a legal/SLA problem; under-serving is just a retry.
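Pattern (1) can be tried end to end with SQLite from the Python standard library. This is a minimal sketch, assuming the `inventory` columns from the data model; the helper names `make_inventory_db` and `try_reserve` and the sample SKU and warehouse ids are hypothetical.

```python
import sqlite3


def make_inventory_db(stock: int) -> sqlite3.Connection:
    """In-memory inventory table seeded with one SKU in one warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE inventory (
               sku_id TEXT NOT NULL,
               warehouse_id TEXT NOT NULL,
               available_qty INTEGER NOT NULL,
               reserved_qty INTEGER NOT NULL DEFAULT 0,
               version INTEGER NOT NULL DEFAULT 0,
               PRIMARY KEY (sku_id, warehouse_id))"""
    )
    conn.execute(
        "INSERT INTO inventory (sku_id, warehouse_id, available_qty) VALUES (?, ?, ?)",
        ("phone-1", "wh-1", stock),
    )
    conn.commit()
    return conn


def try_reserve(conn: sqlite3.Connection, sku_id: str, warehouse_id: str, qty: int = 1) -> bool:
    """Check and decrement in one statement; True only if this caller got the stock."""
    cur = conn.execute(
        """UPDATE inventory
              SET available_qty = available_qty - ?,
                  reserved_qty = reserved_qty + ?,
                  version = version + 1
            WHERE sku_id = ? AND warehouse_id = ? AND available_qty >= ?""",
        (qty, qty, sku_id, warehouse_id, qty),
    )
    conn.commit()
    return cur.rowcount == 1  # zero rows updated means the stock was not there
```

The application never reads `available_qty` before writing, so there is no window in which two callers can both see the last unit. The same statement shape carries over to a production relational store; only the driver and connection handling change.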
The thundering herd: absorbing a flash sale on one SKU
When a deal opens at noon, millions of users hit the same product+inventory key in the same second. You cannot let that reach the inventory database. Layer the defenses: (1) Admission control / virtual waiting room — issue queue tokens at the gateway; only N users at a time get a checkout slot, the rest see a waiting page with their position. (2) Per-user and per-SKU rate limits plus bot detection to shed scripted load. (3) Front the hot key with an in-memory counter so 99% of 'is there stock' checks are answered without touching the DB, and short-circuit to 'sold out' the instant the counter hits zero. (4) Smooth the remaining writes through a FIFO queue into a bounded consumer rate. (5) Cache the PDP itself hard at the CDN so the read flood doesn't melt origin either. The mental model: convert a spike into a queue, answer the impossible-to-satisfy 99% cheaply and instantly, and let only the winners through to the expensive consistent path.
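Defenses (3) and (4) can be illustrated in a single process. This is a sketch under stated assumptions: a lock-protected counter stands in for the Redis/Aerospike atomic counter, `queue.Queue` stands in for the FIFO in front of the database writer, and the names `HotSkuGate` and `simulate_sale` are hypothetical.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


class HotSkuGate:
    """In-memory gate for one deal SKU: only winners reach the write queue."""

    def __init__(self, stock: int) -> None:
        self._remaining = stock
        self._lock = threading.Lock()
        self._sold_out = stock <= 0
        self.winners = queue.Queue()  # drained by a bounded-rate DB writer

    def try_claim(self, user_id: str) -> bool:
        if self._sold_out:
            return False  # fast path: answered without taking the lock
        with self._lock:
            if self._remaining <= 0:
                return False
            self._remaining -= 1
            if self._remaining == 0:
                self._sold_out = True
        self.winners.put(user_id)
        return True


def simulate_sale(stock: int, requests: int, workers: int = 32):
    """Fire many concurrent claims at one gate; return (wins, queued writes)."""
    gate = HotSkuGate(stock)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(gate.try_claim, (f"user-{i}" for i in range(requests))))
    return sum(results), gate.winners.qsize()
```

However many requests arrive, the number of entries handed to the consistent write path never exceeds the stock, which is the property the layered design is after.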
Idempotent checkout: exactly one order, exactly one charge
Networks retry. Users double-click. Mobile apps replay requests after a flaky connection. Without idempotency you create duplicate orders and double-charge customers. The fix: the client generates an idempotency key per checkout attempt and sends it on every retry. The order table has a UNIQUE constraint on idempotency_key — the first insert wins, the retry hits the constraint and you return the already-created order instead of making a new one. Carry the same key (or a derived one) into the payment call so the PSP also dedupes. For the multi-step order workflow, derive each step's idempotency key from the saga id + step name (Flipkart's Flux pattern), so if the orchestrator crashes and replays mid-flight, re-executing a step is a no-op rather than a second inventory decrement or a second capture.
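The UNIQUE-constraint approach can be sketched with SQLite. This is a minimal illustration, assuming a trimmed-down `orders` table; the helpers `make_orders_db`, `place_order`, and `step_key` are hypothetical names, and the amount is stored as an integer in the smallest currency unit.

```python
import sqlite3
import uuid


def make_orders_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE orders (
               order_id TEXT PRIMARY KEY,
               user_id TEXT NOT NULL,
               status TEXT NOT NULL,
               total_amount INTEGER NOT NULL,
               idempotency_key TEXT NOT NULL UNIQUE)"""
    )
    conn.commit()
    return conn


def place_order(conn: sqlite3.Connection, user_id: str,
                total_amount: int, idempotency_key: str) -> str:
    """Create the order once; a retry with the same key returns the same order id."""
    order_id = str(uuid.uuid4())
    try:
        conn.execute(
            "INSERT INTO orders (order_id, user_id, status, total_amount, idempotency_key) "
            "VALUES (?, ?, 'pending', ?, ?)",
            (order_id, user_id, total_amount, idempotency_key),
        )
        conn.commit()
        return order_id
    except sqlite3.IntegrityError:
        # The key already exists: this is a retry, so return the first order.
        conn.rollback()
        row = conn.execute(
            "SELECT order_id FROM orders WHERE idempotency_key = ?", (idempotency_key,)
        ).fetchone()
        return row[0]


def step_key(saga_id: str, step_name: str) -> str:
    """Per-step idempotency key derived from the saga id and the step name."""
    return f"{saga_id}:{step_name}"
```

Letting the database constraint decide the winner avoids a check-then-insert race in application code: two simultaneous retries cannot both pass a "does it exist?" check.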
Distributed transactions: saga / state machine instead of 2PC
Checkout touches inventory, payment, and orders: three services, three data stores. A single ACID transaction (or two-phase commit) across them doesn't scale and creates locks that kill you at BBD load. Use a saga: a sequence of local transactions, each with a compensating action. Reserve inventory -> create order (pending) -> authorize payment -> commit inventory -> capture payment, so the pending order exists before payment is attempted, as in the high-level flow. If payment fails after reserve, the compensations cancel the pending order and release the reservation. Flipkart built and open-sourced Flux for exactly this: order fulfillment modeled as a state machine (Order Created -> Confirmed -> Packed -> Shipped) with event-driven transitions, durable persistence, retries, and replay. Orchestration (a central coordinator drives the steps) is usually preferred over choreography here because order flows are complex and you want one place to see and recover state. Be explicit that sagas give you eventual consistency with no isolation: design for intermediate states being visible (e.g., an order that's 'pending payment') and make every step idempotent so replay is safe.
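An orchestrated saga reduces to a loop that runs steps in order and, on failure, runs the compensations of the completed steps in reverse. The sketch below is illustrative only and is not Flux's API: `Step`, `run_saga`, and `build_checkout_steps` are hypothetical names, the step order follows the high-level architecture section (reserve, create pending order, authorize, commit, capture), and the steps only append to a log instead of calling real services.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    name: str
    action: Callable[[dict], None]
    compensate: Callable[[dict], None]


def run_saga(steps: List[Step], ctx: dict) -> bool:
    """Run steps in order; on failure, undo the completed ones in reverse."""
    done: List[Step] = []
    for step in steps:
        try:
            step.action(ctx)
            done.append(step)
        except Exception:
            ctx["failed_step"] = step.name
            for prev in reversed(done):
                prev.compensate(ctx)
            return False
    return True


def build_checkout_steps(log: List[str], fail_at: Optional[str] = None) -> List[Step]:
    """Checkout steps that only record what ran; fail_at simulates a failing step."""
    def action(name: str) -> Callable[[dict], None]:
        def run(ctx: dict) -> None:
            if name == fail_at:
                raise RuntimeError(f"{name} failed")
            log.append(name)
        return run

    def undo(name: str) -> Callable[[dict], None]:
        return lambda ctx: log.append(f"undo:{name}")

    names = ["reserve_inventory", "create_order", "authorize_payment",
             "commit_inventory", "capture_payment"]
    return [Step(n, action(n), undo(n)) for n in names]
```

A production orchestrator would also persist the saga's progress after every step so that a crashed coordinator can resume or compensate on restart; this sketch keeps everything in memory.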
Read path: catalog caching, CQRS, and cache invalidation
The catalog is read ~100-1000x more than it's written, so separate the read model from the write model (CQRS). The source of truth is a transactional store; reads are served from a denormalized KV cache (Aerospike-class) and a search index. Updates flow one way: write to source of truth -> emit a change event on Kafka -> update KV cache and reindex search. The hard part is cache invalidation, especially for price and stock. Strategies: (1) version every product so a stale read can be detected; (2) for stock, do NOT show the exact DB number on the PDP under a flash sale — show 'in stock / few left / sold out' buckets so a slightly stale cache is harmless and you avoid hammering inventory for every page view; (3) use short TTLs plus event-driven invalidation (TTL is your safety net if an invalidation event is missed). Also guard against cache stampede: when a hot key expires, thousands of requests miss simultaneously — use request coalescing (single-flight) or a slightly randomized TTL so they don't all expire at once.
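The two stampede defenses named above, request coalescing and a randomized TTL, fit in one small class. This is a single-process sketch with hypothetical names (`SingleFlightCache`, `fetch_concurrently`); per-key locks are never cleaned up here, which a long-running service would need to handle.

```python
import random
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class SingleFlightCache:
    """Cache where concurrent misses on one key trigger a single load."""

    def __init__(self, loader, ttl_seconds: float, jitter: float = 0.1,
                 clock=time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._jitter = jitter          # +/- fraction applied to each TTL
        self._clock = clock
        self._entries = {}             # key -> (value, expires_at)
        self._key_locks = {}           # key -> lock held while loading
        self._guard = threading.Lock()

    def _fresh(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry
        return None

    def get(self, key):
        entry = self._fresh(key)
        if entry is not None:
            return entry[0]
        with self._guard:
            key_lock = self._key_locks.setdefault(key, threading.Lock())
        with key_lock:
            # Re-check: another thread may have loaded the value while we waited.
            entry = self._fresh(key)
            if entry is not None:
                return entry[0]
            value = self._loader(key)
            ttl = self._ttl * (1 + random.uniform(-self._jitter, self._jitter))
            self._entries[key] = (value, self._clock() + ttl)
            return value


def fetch_concurrently(cache: SingleFlightCache, key: str, n: int):
    """Issue n simultaneous reads of the same key."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(cache.get, [key] * n))
```

With a slow loader and many simultaneous readers of a cold key, the origin sees one load rather than one per reader; the jitter keeps keys that were filled together from expiring together.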
Cart vs. reservation: why add-to-cart must NOT reserve stock
A common wrong answer is to decrement inventory when a user adds to cart. At BBD scale that's catastrophic: millions of carts would lock up all stock, most of which is never bought, and you'd show 'sold out' to people willing to pay. Cart is intent, not commitment. Stock is only reserved at the start of checkout, with a short TTL (a few minutes), and only committed on successful payment. This keeps inventory liquid. The tradeoff is honesty: between cart and checkout the price or availability can change, so you re-validate price and re-check availability at checkout and surface any change to the user before charging. This is also why the PDP availability is a fuzzy bucket and the cart shows 'we'll confirm availability at checkout' rather than a hard guarantee.
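The reserve / commit / release cycle with a TTL can be sketched as a small in-memory ledger. This is an illustration under assumptions: `ReservationLedger` is a hypothetical name, the 300-second default TTL is an arbitrary stand-in for "a few minutes", and the class is single-threaded, whereas the real ledger lives in the strongly-consistent inventory store.

```python
import time
import uuid
from typing import Callable, Dict, Optional, Tuple


class ReservationLedger:
    """Stock for one SKU with short-lived holds that expire back into stock."""

    def __init__(self, stock: int, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.available = stock
        self.ttl = ttl_seconds
        self.clock = clock
        self.holds: Dict[str, Tuple[int, float]] = {}  # id -> (qty, expires_at)

    def sweep(self) -> int:
        """Release every expired hold; return how many were released."""
        now = self.clock()
        expired = [rid for rid, (_, exp) in self.holds.items() if exp <= now]
        for rid in expired:
            self.release(rid)
        return len(expired)

    def reserve(self, qty: int = 1) -> Optional[str]:
        """Called at checkout start, never at add-to-cart."""
        self.sweep()
        if self.available < qty:
            return None
        self.available -= qty
        rid = str(uuid.uuid4())
        self.holds[rid] = (qty, self.clock() + self.ttl)
        return rid

    def commit(self, rid: str) -> bool:
        """Called on payment success; fails if the hold has already expired."""
        self.sweep()
        return self.holds.pop(rid, None) is not None

    def release(self, rid: str) -> bool:
        """Called on cancel or timeout; returns the held stock."""
        hold = self.holds.pop(rid, None)
        if hold is None:
            return False
        self.available += hold[0]
        return True
```

A commit that returns False is the case the text warns about: the shopper took longer than the TTL, the unit went back into stock, and checkout has to re-validate before charging.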
Scaling for Big Billion Days: capacity, autoscaling, and graceful degradation
BBD traffic runs 5-10x baseline, so you can't just turn on autoscaling and hope. Flipkart runs on its own Flipkart Cloud Platform (FCP) over Kubernetes and bursts to GCP for peaks; microservices scale independently so the search tier and the checkout tier flex on their own curves. Practices to mention: (1) load-test against synthetic BBD traffic and pre-warm caches and pre-scale ahead of the sale start (cold autoscaling can't react fast enough to an instantaneous spike). (2) Define a degradation ladder: under extreme load, shed non-critical features first (recommendations, reviews, 'people also viewed') to protect the money path (browse -> cart -> checkout). (3) Use circuit breakers and backpressure so a slow downstream (a struggling payment provider) trips fast and the system queues rather than cascades into failure. (4) Run chaos testing in production (kill pods, simulate partial failures) so failover is proven, not hoped for. The thesis: at 10x you don't keep everything up — you choose what to sacrifice in advance.
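Practice (3) is often implemented as a small circuit breaker around each downstream call. The sketch below is a minimal single-threaded version with hypothetical names (`CircuitBreaker`, `CircuitOpenError`) and assumed defaults (three failures, 30-second reset); it is not tied to any particular library.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without being tried."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("downstream is failing; rejecting fast")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

When the breaker is open, the caller can queue the checkout or show a retry message immediately instead of holding a thread on a payment provider that is already struggling.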
Trade-offs to discuss
Every senior interviewer expects you to surface at least 3 of these. Pick the decisions, state the alternatives, and justify your choice.
Strong consistency on inventory, eventual consistency on catalog
Overselling a unit is a real-money, real-customer failure, so the inventory ledger gets strong consistency and serialized decrements. The catalog is read-dominated and tolerant of staleness, so it gets cached aggressively and updated eventually. Applying strong consistency everywhere would not scale; applying eventual consistency to inventory would oversell. Pick consistency per-domain, not globally.
Saga / state machine over two-phase commit for checkout
2PC across inventory, payment, and order services creates distributed locks that collapse under BBD write contention. Sagas trade away isolation (intermediate states are visible) for availability and scale, with compensating actions for rollback. The cost is more application-level complexity and the need to make every step idempotent.
Reserve stock at checkout, not at add-to-cart
Reserving at cart locks up inventory that's mostly never purchased, showing false 'sold out' to real buyers. Reserving at checkout with a short TTL keeps stock liquid. The tradeoff is that price/availability can shift between cart and checkout, so you must re-validate at checkout and accept occasionally telling a user the item just went out of stock.
Virtual waiting room / queue instead of pure autoscaling for flash sales
Autoscaling reacts in tens of seconds; a flash sale spikes in one second, and the bottleneck is a single hot inventory key that can't be scaled horizontally anyway. Admission control converts the spike into an orderly queue and answers the 99% who can't win instantly and cheaply. The cost is a worse experience for some users (waiting page) — but it's a controlled, fair degradation instead of a meltdown.
Fuzzy availability buckets on the PDP ('few left') instead of exact stock counts
Showing the exact live count would force every page view to read the consistent inventory store, defeating the cache and creating a hot key on reads, not just writes. Buckets make a slightly stale cache harmless. The tradeoff is precision — you can't promise a specific quantity until checkout reserves it.
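The bucketing itself is a one-line decision. A minimal sketch, assuming a hypothetical function name and an arbitrary 'few left' threshold of 10 units:

```python
def availability_bucket(available_qty: int, few_left_threshold: int = 10) -> str:
    """Map an exact (possibly stale) stock count to a coarse PDP label."""
    if available_qty <= 0:
        return "sold out"
    if available_qty <= few_left_threshold:
        return "few left"
    return "in stock"
```

A cached count of 5,000 and a true count of 4,200 both render as 'in stock', so that staleness is invisible to the shopper; near a bucket boundary the label can still lag briefly, and checkout remains the only place that makes a hard promise.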
Authorize-then-capture payments with idempotency keys
Capturing money before inventory is committed risks charging for stock you can't fulfill; idempotency keys prevent double-charge on retries. The two-phase flow adds latency and a reconciliation job for async settlement, but it's the only safe way to keep money and stock in agreement across a distributed checkout.
Separate read store (KV + search index) from write store (transactional DB) — CQRS
One store can't be both a sub-millisecond 90M-QPS read cache and a strongly-consistent transactional ledger. Splitting them lets each scale on its own terms. The cost is the synchronization machinery (CDC/Kafka) and the eventual-consistency window between a write and when reads see it — acceptable for catalog, designed-around for stock.
How Flipkart actually does it
Flipkart runs much of its low-latency serving layer on Aerospike: per Aerospike's published case study, ~90 million QPS aggregate across three data centers and 50+ use cases — the homepage search bar, recommendations, ads, pricing, and inventory — at sub-millisecond latency, on 200+ production clusters managed by a team of fewer than ten engineers via the Aerospike Kubernetes Operator. The platform sits on Flipkart Cloud Platform (FCP), Flipkart's internal Kubernetes-based cloud over private India data centers, with hybrid bursting to GCP during peak events; microservices scale independently and the team runs continuous chaos testing (killing pods, simulating partial failures) to prove failover. For order orchestration, Flipkart open-sourced Flux, a state-machine framework that models fulfillment as states and event-driven transitions with retries and replay — a real, inspectable implementation of the saga pattern this page describes. Big Billion Days traffic is publicly described as 5-10x normal load, and the Flipkart Data Platform runs an 800+ node, 35PB+ Hadoop cluster powering the offline recommendation and analytics side. Note that some specific QPS and percentage figures come from vendor case studies and engineering talks rather than first-party SLAs; treat them as directionally accurate orders of magnitude, not contractual numbers.
Sources
- Aerospike — Inside Flipkart's Journey to 90 Million QPS with Aerospike and Kubernetes
- Aerospike — Flipkart customer story
- Flipkart Flux — Traditional State Machine: Order Fulfilment (GitHub wiki)
- RealOps — Scaling Flipkart for Big Billion Days (interview with a Flipkart Senior Engineering Manager)
- Flipkart's Big Billion Days: How Backend Systems Fight Cart Wars and Flash Sales (CodeKerdos)
- Flash Sale System Design: Architecture, Scale, and Oversell — Ajit Singh