PhonePe System Design Interview: UPI at National Scale
PhonePe handles close to half of every UPI payment in India. In January 2025 it processed more than 8 billion UPI transactions in a single month, about 48 percent of all UPI volume and just over 50 percent by value, and its event pipeline alone moves roughly 100 billion events a day.
Designing PhonePe is the India payments problem at national scale. You have to move real money over the NPCI UPI rails without ever creating or losing a rupee, make every step idempotent so a retry never double charges, and reconcile against the bank when a callback arrives late. On top of that correctness core, PhonePe is a study in scaling: a shared-nothing sharded MySQL ledger, an Aerospike layer serving real-time reads and fraud checks at very high throughput, a Kafka backbone carrying about 100 billion events a day, and its own on-premises data centers. The interview is as much about horizontal scale and availability as it is about money.
Asked at: PhonePe, Paytm, Razorpay, Cred, Google Pay, Amazon Pay, and most India fintech and FAANG-India teams, typically for SDE2 and above. It is the standard UPI payments and high-scale infrastructure interview in the Indian market. The PhonePe variant leans harder on the scaling and data-store questions, because PhonePe is the largest UPI player and has published a lot about how it scales.
Why this question is asked
Payments is the one domain where eventual consistency is not an acceptable answer. The interviewer wants to see that you understand money movement as a distributed transaction across systems you do not control, meaning the payer's bank, NPCI, and the payee's bank. The network will fail in the middle, and the only acceptable outcomes are fully done or fully reversed, with a customer who can see exactly what happened. The PhonePe framing adds a second axis: national scale. You are expected to talk about how the transactional store is sharded so no single database is a bottleneck, how a fast in-memory layer serves balance checks and fraud lookups in under a millisecond during a festival peak, how an event backbone carries on the order of 100 billion events a day without the read load hurting writes, and why a company might run its own data centers instead of the public cloud. You earn the offer by combining idempotency, a strict ledger, and reconciliation with a credible horizontal-scale story. You lose it by drawing one database box and moving on.
Requirements
Always clarify these in the first 5 minutes of the interview. Do not start drawing boxes until both lists are agreed.
Functional requirements
- User links a bank account, creates a UPI ID (VPA) like name@ybl, and sets a UPI PIN
- User pays a person or merchant by VPA, QR code, phone number, or bank account (push or pay)
- User approves a collect request raised by a merchant or another user (pull or collect) with the UPI PIN
- Scan-and-pay at a merchant QR for person-to-merchant payments
- User can check balance, see transaction history, and get an instant status for every payment
- Value-added flows such as recharges, bill payments, and merchant checkout run on top of the payment core
- Support and dispute handling for a payment that is stuck, failed, or debited-but-not-credited
- Adjacent products such as insurance, lending, and stockbroking run on the same identity and payment rails
Non-functional requirements
- Every payment is idempotent, so a retried request never causes a second debit
- Strong consistency and durability on money movement and the ledger, so a rupee is never created or lost
- Correctly handle NPCI callbacks that arrive late or out of order, and reconcile the true status of every pending payment
- Sub-millisecond reads on the hot real-time paths, such as balance checks and fraud lookups, even at festival peak
- Very high availability on the payment path, with non-critical features degraded first under stress
- Horizontal scale to national UPI volume, with no single database or queue acting as a bottleneck
- Real-time fraud and risk decisioning inline with the payment, not after the fact
Back-of-envelope scale estimates
Show your math. Pulling numbers from thin air signals you have not thought about the load.
UPI market position
~48% by volume, 50%+ by value
NPCI data reported for January 2025: PhonePe held about 48 percent of UPI transaction volume and just over 50 percent by value, the number one position by a wide margin. This is the most robustly verifiable figure about PhonePe and the anchor for the scale story.
Monthly UPI transactions
8B+ / month (Jan 2025)
PhonePe processed more than 8 billion UPI transactions in January 2025 out of roughly 17 billion across all of UPI that month. January has 31 days of 86,400 seconds each, about 2.68 million seconds, so 8 billion transactions works out to roughly 3,000 transactions per second on average, with festival and month-start peaks several times higher.
Registered users
500M+
PhonePe crossed 500 million lifetime registered users in November 2023, described as roughly one in three Indians. This drives identity, VPA, and account-linking scale.
Merchants and coverage
47M+ merchants, 98.61% of pin codes
PhonePe's IPO filing reported about 47.19 million registered merchants covering 98.61 percent of India's pin codes as of September 2025. This drives the person-to-merchant payment path and the QR acceptance network.
Event backbone volume
~100B events/day on Kafka
PhonePe's engineering blog states its Kafka pipeline carries about 100 billion events per day. This is what feeds fraud, reconciliation, analytics, and downstream flows, and it is why the read and write paths are separated.
Real-time read throughput
500,000+ QPS, sub-ms reads
PhonePe engineers, quoted by Aerospike, report more than 500,000 queries per second on real-time transactional workloads with sub-millisecond reads. Note that this is queries per second on the fast read layer, not payment transactions per second, which PhonePe does not publish. Do not confuse the two in an interview.
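The arithmetic behind these estimates can be written out as a short script. This is a back-of-envelope sketch, not a published PhonePe calculation: the monthly transaction count and the daily event count come from the figures above, while `peak_multiplier` is an assumed placeholder for "several times higher".

```python
# Back-of-envelope rates derived from the figures quoted in this section.
SECONDS_PER_DAY = 24 * 60 * 60

monthly_upi_txns = 8_000_000_000   # "more than 8 billion" in January 2025
days_in_month = 31                 # January
avg_tps = monthly_upi_txns / (days_in_month * SECONDS_PER_DAY)

peak_multiplier = 4                # ASSUMPTION: stands in for "several times higher"
peak_tps = avg_tps * peak_multiplier

events_per_day = 100_000_000_000   # about 100 billion Kafka events per day
avg_events_per_sec = events_per_day / SECONDS_PER_DAY

print(f"average payments/sec: {avg_tps:,.0f}")            # 2,987
print(f"assumed peak payments/sec: {peak_tps:,.0f}")      # 11,947
print(f"average events/sec: {avg_events_per_sec:,.0f}")   # 1,157,407
```

The gap between roughly 3,000 payments per second and more than a million events per second is a useful talking point: the event backbone is sized for a very different load than the payment path.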
High-level architecture
Split PhonePe into the money-movement core and the scaling infrastructure around it, because the interview rewards both. One honesty note first: the UPI PSP flow, the pending or DEEMED state, and reconciliation described below are the standard pattern that any UPI app must implement, not internals PhonePe has published. The data-store and infrastructure details, meaning sharded MySQL, Aerospike, the Kafka backbone, the in-house platforms, and the on-premises data centers, are things PhonePe has actually written about on its engineering blog, and those are called out as such.
The money-movement core is a UPI PSP flow. When a user pays, the app talks to PhonePe as a Payment Service Provider, which talks to NPCI, which routes to the payer bank and the payee bank. There are two shapes. In the pay or intent flow, the payer initiates and approves with a UPI PIN, and the debit and credit happen over NPCI. In the collect flow, a request is pushed to the payer, who approves it. In both cases the final status comes back asynchronously through an NPCI callback, and it can be delayed or arrive out of order. So the payment is written first as a pending record with an idempotency key, and only moved to success or failure when the true status is known. A double-entry ledger records every debit and matching credit so the books always balance, and every state change is appended to an immutable event log for audit and dispute handling.
The scaling infrastructure is what makes PhonePe distinctive. The primary transactional store is sharded MySQL, run as a strict shared-nothing architecture. PhonePe uses a common sharding library across all its MySQL databases, avoids scatter-gather queries in the user path, and keeps zero local data on the service containers, so a service and its data are decoupled and each can scale on its own. A fast layer built on Aerospike serves the reads that must be quick, such as balance checks, session data, a feature store, and fraud lookups, at more than 500,000 queries per second with sub-millisecond latency, replicated active-active across sites. An event backbone built on Kafka carries about 100 billion events a day, split into separate read and write clusters per data center so that scaling the read side never slows down writes, fed by an in-house two-tier ingestion path.
On top of these sit PhonePe's own platforms: a Payments Orchestrator that models any money flow, a Risk and Decisioning Engine that aggregates across the billions of daily events in real time to score fraud inline, and a Fulfillment Engine for the steps around a payment. All of this runs in PhonePe's own on-premises data centers, orchestrated with Mesos and Marathon and fronted by an edge stack of NGINX and Traefik with AnyCast routing and circuit breakers.
In a real interview, sketch this on the whiteboard before diving into any single box.
Core components
Walk through each service. The interviewer wants to hear what each one owns, not just the names.
API edge and gateway
The entry point for every app request. PhonePe has described an edge stack that uses Mesos and Marathon as the data-center orchestration layer, NGINX and Traefik as edge routers, AnyCast with a routing component for traffic steering, and Hystrix-style circuit breaking to shed load. It authenticates the request, applies rate limits, and routes to the right service.
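To make "circuit breaking to shed load" concrete, here is a minimal sketch of the general pattern. It is not PhonePe's implementation or the Hystrix API: the class name, the thresholds, and the injectable `clock` are all illustrative assumptions.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without being tried."""


class CircuitBreaker:
    """Hypothetical minimal breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold: int = 3, reset_after_s: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                # Open: fail fast so the struggling dependency gets no more load.
                raise CircuitOpenError("dependency unavailable, shedding load")
            # Otherwise half-open: let one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            half_open_failed = self.opened_at is not None
            if half_open_failed or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

In a gateway, each downstream dependency would typically get its own breaker, so a failing non-critical feature is cut off quickly while the payment path keeps its capacity.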
Payments Orchestrator
PhonePe's in-house platform, built on a flexible framework, that models any money flow as a sequence of steps. It coordinates the debit and credit legs, writes the pending record, applies the idempotency key, and drives the payment through its state machine. Keeping this generic is what lets PhonePe add new payment types and products without rebuilding the core.
Ledger and transaction store on sharded MySQL
The strongly consistent core, held in sharded MySQL under a shared-nothing design. A common sharding library spreads users and transactions across shards so no single database is a bottleneck, the user path avoids scatter-gather queries, and services hold no local data. The ledger is double-entry, so every debit has a matching credit and the books always balance.
UPI switch and NPCI connector
The component that speaks the UPI protocol to NPCI and the banks. It initiates the pay or collect flow, carries the UPI PIN authorization step, and processes the asynchronous callback that reports the true outcome. Note that in UPI the PIN is captured and encrypted by NPCI's common library inside the app and verified by the payer's bank, so the PSP relays it but never checks it itself. The connector has to treat every callback as possibly duplicated or late, which is why idempotency and reconciliation live close to it. This flow is the standard UPI PSP pattern rather than a PhonePe-published internal.
Aerospike real-time store
PhonePe's low-latency layer, used for real-time transactions, balance and session reads, a feature store, and fraud lookups. PhonePe engineers report running many Aerospike clusters per site with over a trillion records, replicated active-active across sites, serving more than 500,000 queries per second with sub-millisecond reads. It replaced heavier workloads and cut the server footprint sharply.
Risk and Decisioning Engine
PhonePe's in-house fraud and risk platform, built on a generic entity store that does high-velocity real-time aggregation across the billions of events the system produces each day. It scores a payment for fraud inline, before the money moves, so a suspicious transaction can be held or blocked rather than reversed afterward.
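One common shape for "real-time aggregation" feeding an inline decision is a per-entity velocity counter. The sketch below is illustrative only: the class, the window size, and the thresholds in `decide` are hypothetical and are not taken from PhonePe's engine. It also assumes timestamps arrive in order for a given entity.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class VelocityCounter:
    """Counts events per entity inside a sliding time window."""

    def __init__(self, window_s: float) -> None:
        self.window_s = window_s
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def record(self, entity_id: str, ts: float) -> int:
        """Record one event at time ts and return the count inside the window."""
        timestamps = self._events[entity_id]
        timestamps.append(ts)
        # Drop timestamps that have fallen out of the window.
        while timestamps and timestamps[0] <= ts - self.window_s:
            timestamps.popleft()
        return len(timestamps)


def decide(payments_in_window: int, amount_paise: int) -> str:
    """Toy inline decision. The thresholds are ASSUMPTIONS for illustration."""
    if payments_in_window > 10:
        return "BLOCK"
    if payments_in_window > 5 or amount_paise > 5_000_000:
        return "HOLD"
    return "ALLOW"
```

The decision runs before the debit is initiated, so a "HOLD" or "BLOCK" stops the payment rather than triggering a reversal afterward. In a real deployment the counters would live in the shared low-latency store, not in process memory.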
Kafka event backbone
The asynchronous spine of the system, carrying about 100 billion events a day. PhonePe moved from a single Kafka cluster to separate write and read clusters per data center, so heavy read consumers never slow down the write path, and uses an in-house two-tier ingestion model with a disk buffer for durability and a fast direct path for latency-critical producers.
Reconciliation service
The safety net for money movement. It polls NPCI for the true status of any payment still pending past a threshold, matches every callback against the pending record, and drives the transaction to a final state. It must be idempotent, because a duplicate or late callback should never mark a payment paid twice. This is the standard UPI reconciliation pattern.
Fulfillment Engine and product services
The use-case-agnostic platform that handles the steps around a payment, with pre-checkout, checkout, and post-checkout stages, plus the services for recharges, bill payments, insurance, lending, and stockbroking that reuse the same identity and rails.
Data model
Pick the right store per table. Justify each choice with the access pattern, not by reflex.
accounts
Columns: account_id (PK), user_id, bank_ref, vpa, status, shard_key
The linked bank account and its VPA (UPI ID). Sharded by user or account id via the common sharding library. The vpa is the routable address other users pay to.
transactions
Columns: txn_id (PK), payer_account_id, payee_vpa, amount_paise, flow (pay|collect), state, idempotency_key, npci_ref, created_at
The payment record. Written first as pending with an idempotency key, then moved to success or failure once NPCI confirms. Sharded by payer. Strong consistency required. The idempotency_key is indexed so a retried request maps to the same transaction.
ledger_entries
Columns: entry_id (PK), txn_id (FK), account_id, direction (debit|credit), amount_paise, balance_after, created_at
Double-entry accounting. Every transaction produces matching debit and credit rows so the books always balance. Append-only, never updated in place, so the full money history is auditable.
txn_events
Columns: event_id (PK), txn_id (FK), from_state, to_state, source (app|npci|recon|risk), created_at
An immutable log of every state change on a transaction, so you can reconstruct exactly what happened and when. This is what a dispute or a reconciliation run reads.
risk_signals
Columns: entity_id (PK), entity_type (user|device|vpa|merchant), features (jsonb), score, updated_at
Held in the Aerospike layer for sub-millisecond reads. Real-time aggregated features per entity that the Risk and Decisioning Engine reads inline to score a payment before it clears.
idempotency_keys
Columns: idempotency_key (PK), txn_id, request_hash, status, created_at
Maps a client request to a single transaction. A retried pay request with the same key returns the existing transaction rather than creating a new debit. Central to never double charging.
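The double-entry rule behind ledger_entries can be shown with a small in-memory sketch. It is a simplification under stated assumptions: the `Ledger` class is hypothetical, balance_after and timestamps are omitted, and a real ledger would write both rows in one database transaction.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LedgerEntry:
    txn_id: str
    account_id: str
    direction: str      # "debit" or "credit"
    amount_paise: int   # integer paise, never floats, to avoid rounding drift


class Ledger:
    """Hypothetical append-only, double-entry ledger held in memory."""

    def __init__(self) -> None:
        self._entries: List[LedgerEntry] = []

    def post(self, txn_id: str, payer: str, payee: str, amount_paise: int) -> None:
        if amount_paise <= 0:
            raise ValueError("amount must be a positive number of paise")
        # Both legs are appended together; entries are never updated in place.
        self._entries.append(LedgerEntry(txn_id, payer, "debit", amount_paise))
        self._entries.append(LedgerEntry(txn_id, payee, "credit", amount_paise))

    def balance(self, account_id: str) -> int:
        """Net movement for an account: credits minus debits."""
        total = 0
        for e in self._entries:
            if e.account_id == account_id:
                total += e.amount_paise if e.direction == "credit" else -e.amount_paise
        return total

    def is_balanced(self) -> bool:
        """The invariant: total debits equal total credits across the ledger."""
        debits = sum(e.amount_paise for e in self._entries if e.direction == "debit")
        credits = sum(e.amount_paise for e in self._entries if e.direction == "credit")
        return debits == credits
```

The `is_balanced` check is the kind of invariant a periodic audit job can run. If it ever fails, a rupee has been created or lost somewhere.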
Deep dives
These are the conversations the interviewer is steering you toward. Practice each one until you can talk through it without notes.
The UPI PSP flow and why every step must be idempotent
A UPI payment is a distributed transaction across parties PhonePe does not control: the payer bank, NPCI, and the payee bank. PhonePe acts as the Payment Service Provider. In the pay flow the payer initiates and approves with a UPI PIN; in the collect flow a request is pushed to the payer to approve. In both, the debit and credit happen at the banks over NPCI, and the final status returns asynchronously through a callback that can be delayed, duplicated, or out of order. That is why the payment is written first as a pending record tied to an idempotency key. If the app retries because it did not get a response, the same key maps to the same transaction, so there is no second debit. The status only moves to success or failure once the true outcome is known. This end-to-end flow is the standard UPI pattern that every PSP implements, rather than a PhonePe-published internal, but it is the correctness backbone the interviewer is looking for.
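A minimal sketch of the idempotency-key behavior described above, assuming an in-memory dictionary in place of the idempotency_keys table. The class and exception names are hypothetical. In a real store the uniqueness would be enforced by a unique index, so two concurrent retries cannot both insert.

```python
import hashlib
import json
import uuid
from typing import Dict, Tuple


class IdempotencyConflict(Exception):
    """The same key was reused with a different request payload."""


class PaymentStore:
    """Hypothetical store mapping each idempotency key to exactly one transaction."""

    def __init__(self) -> None:
        self._by_key: Dict[str, Tuple[str, dict]] = {}

    def create_payment(self, idempotency_key: str, payer_account_id: str,
                       payee_vpa: str, amount_paise: int) -> dict:
        request = {
            "payer_account_id": payer_account_id,
            "payee_vpa": payee_vpa,
            "amount_paise": amount_paise,
        }
        # Hash the canonical request so key reuse with a different body is detected.
        request_hash = hashlib.sha256(
            json.dumps(request, sort_keys=True).encode("utf-8")
        ).hexdigest()

        existing = self._by_key.get(idempotency_key)
        if existing is not None:
            stored_hash, txn = existing
            if stored_hash != request_hash:
                raise IdempotencyConflict(idempotency_key)
            return txn  # retry: return the original transaction, no new debit

        txn = {"txn_id": uuid.uuid4().hex, "state": "PENDING", **request}
        self._by_key[idempotency_key] = (request_hash, txn)
        return txn
```

The client generates the key once per payment attempt and reuses it on every retry, so a lost response leads to the same pending transaction being returned instead of a second debit.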
Sharded shared-nothing MySQL as the transactional core
PhonePe has written that its primary transactional store is sharded MySQL, run as a strict shared-nothing architecture. Three choices matter. First, a common sharding library is used across all MySQL databases, so sharding logic is consistent and not re-invented per service. Second, the user path avoids scatter-gather queries, meaning a single request does not fan out to every shard and wait for the slowest, which is what keeps latency predictable at scale. Third, services store zero local data, so the data is decoupled from the service container and both can be scaled independently. The payoff is that no single database is a bottleneck and the system grows by adding shards. The cost is that cross-shard operations, such as a payment where payer and payee live on different shards, need care, and analytics that would want to scan across shards have to be served elsewhere, which is part of why the event backbone exists.
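The "no scatter-gather on the user path" rule can be illustrated with a simple hash-routed store. This is a sketch under assumptions, not PhonePe's sharding library: the function names and hash-modulo placement are hypothetical, and `shards_touched` exists only to show that a lookup hits one shard.

```python
import hashlib
from typing import Dict, List


def shard_for(shard_key: str, num_shards: int) -> int:
    """Stable shard index for a key. Uses SHA-256 rather than the built-in
    hash(), which is randomized per process for strings."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedTransactions:
    """Hypothetical store where each payer's rows live on exactly one shard."""

    def __init__(self, num_shards: int) -> None:
        self.num_shards = num_shards
        self.shards: List[Dict[str, List[dict]]] = [dict() for _ in range(num_shards)]
        self.shards_touched = 0  # instrumentation for the example only

    def _shard(self, payer_id: str) -> Dict[str, List[dict]]:
        self.shards_touched += 1
        return self.shards[shard_for(payer_id, self.num_shards)]

    def insert(self, payer_id: str, txn: dict) -> None:
        self._shard(payer_id).setdefault(payer_id, []).append(txn)

    def history(self, payer_id: str) -> List[dict]:
        # One key, one shard: no fan-out and no waiting on the slowest shard.
        return list(self._shard(payer_id).get(payer_id, []))
```

Plain modulo placement moves most keys when the shard count changes, so production systems usually map keys to a fixed set of logical shards and then map logical shards to physical databases. That detail is left out here.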
Aerospike for real-time reads and inline fraud at scale
Not every read can wait for a sharded SQL query. Balance checks, session lookups, feature-store reads, and fraud signals have to return in well under a millisecond, and they run at very high volume during a festival peak. PhonePe uses Aerospike for this layer. Engineers have reported running many Aerospike clusters per site holding over a trillion records, replicated active-active across sites with strong consistency where it is required, and serving more than 500,000 queries per second with sub-millisecond reads. Aerospike replaced heavier workloads and cut the server footprint substantially. The interview point is the split: keep the durable, auditable money state in the sharded SQL ledger, and put the hot, high-throughput reads that feed the live payment decision in a fast key-value layer next to it.
A Kafka backbone at 100 billion events a day
Behind the synchronous payment path, PhonePe runs an event backbone on Kafka that carries about 100 billion events a day. Two design decisions stand out. First, PhonePe moved from a single Kafka cluster to separate write and read clusters per data center, so heavy read consumers, such as analytics and reconciliation jobs, cannot slow down the write path that the live system depends on. Second, ingestion is a two-tier model: a client library writes to a local disk-backed buffer for durability, an ingestor forwards to Kafka, and latency-critical producers get a faster direct path. This backbone is how fraud scoring, reconciliation, notifications, and downstream products all get the payment events without loading the transactional store.
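The two-tier ingestion idea can be sketched with a local file as the disk buffer and a plain callable standing in for the Kafka producer. Everything here is a hypothetical illustration: the class name, the file format, and the in-memory forwarded offset are assumptions, and no real Kafka client is involved.

```python
import json
import os
import tempfile
from typing import Callable


class BufferedProducer:
    """Tier 1 appends events to local disk; a forwarder later ships them on."""

    def __init__(self, buffer_path: str, send: Callable[[dict], None]) -> None:
        self.buffer_path = buffer_path
        self.send = send               # stand-in for publishing to Kafka
        self._forwarded_offset = 0     # a real forwarder would persist this

    def emit(self, event: dict) -> None:
        """Durable path: the event survives a crash once fsync returns."""
        with open(self.buffer_path, "ab") as f:
            f.write(json.dumps(event).encode("utf-8") + b"\n")
            f.flush()
            os.fsync(f.fileno())

    def emit_direct(self, event: dict) -> None:
        """Fast path for latency-critical producers: skip the disk buffer."""
        self.send(event)

    def forward(self) -> int:
        """Ship buffered events that have not been forwarded yet."""
        if not os.path.exists(self.buffer_path):
            return 0
        sent = 0
        with open(self.buffer_path, "rb") as f:
            f.seek(self._forwarded_offset)
            while True:
                line = f.readline()
                if not line:
                    break
                self.send(json.loads(line))
                self._forwarded_offset = f.tell()
                sent += 1
        return sent


# Usage: two buffered events are delivered once; a second pass finds nothing new.
with tempfile.TemporaryDirectory() as tmp:
    delivered = []
    producer = BufferedProducer(os.path.join(tmp, "events.log"), delivered.append)
    producer.emit({"txn_id": "t1", "state": "PENDING"})
    producer.emit({"txn_id": "t1", "state": "SUCCESS"})
    first_pass = producer.forward()
    second_pass = producer.forward()
```

Because the offset advances only after a successful send, a crash between the two steps would replay an event. Delivery is therefore at-least-once, which is one more reason downstream consumers need to be idempotent.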
The pending or DEEMED state and reconciliation
The hardest part of a UPI payment is the case where the outcome is uncertain. NPCI can mark a transaction deemed, meaning the result is not yet known, not a clean success or failure. The system must not show failed and let the user retry, because the money may in fact have moved, and a blind retry could double debit. So the transaction sits in a pending state, and a reconciliation service takes over: it polls NPCI for the true status, matches the callback against the pending record when it arrives, and drives the transaction to a final state. Every step is idempotent, so a duplicate or late callback never marks a payment paid twice, and the ledger is only finalized when the truth is known. This reconciliation behavior is the standard UPI pattern rather than a PhonePe-specific published design, but handling it correctly is what separates a real payments answer from a naive one.
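The pending-state handling can be sketched as a small state machine. This is an illustration of the general pattern, not NPCI's API or PhonePe's code: `fetch_status` is a hypothetical stand-in for a status check against NPCI, and the state names and threshold are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

TERMINAL_STATES = {"SUCCESS", "FAILURE"}


@dataclass
class Txn:
    txn_id: str
    state: str = "PENDING"
    created_at: float = 0.0
    events: List[Tuple[str, str, str]] = field(default_factory=list)


def apply_outcome(txn: Txn, outcome: str, source: str) -> bool:
    """Apply a reported outcome idempotently. Returns True only if state changed."""
    if outcome not in TERMINAL_STATES:
        return False  # e.g. DEEMED: the truth is still unknown, stay pending
    if txn.state in TERMINAL_STATES:
        # Duplicate or late callback: never finalize twice. A real system would
        # also flag a conflicting terminal outcome for manual review.
        return False
    txn.events.append((txn.state, outcome, source))  # append-only audit trail
    txn.state = outcome
    return True


def reconcile(txns: List[Txn], fetch_status: Callable[[str], str],
              now: float, pending_threshold_s: float = 60.0) -> List[str]:
    """Poll for the true status of payments that have been pending too long."""
    resolved = []
    for txn in txns:
        if txn.state == "PENDING" and now - txn.created_at >= pending_threshold_s:
            if apply_outcome(txn, fetch_status(txn.txn_id), source="recon"):
                resolved.append(txn.txn_id)
    return resolved
```

Callbacks and the reconciliation poll both go through `apply_outcome`, so whichever arrives first finalizes the payment and the other becomes a no-op.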
Running your own data centers instead of the public cloud
PhonePe runs its own on-premises data centers, reported as three sites including Mumbai and Bangalore plus a hybrid environment, orchestrated with Mesos and Marathon. This is a real differentiator worth discussing. The reasons a payments company at this scale might choose on-premises are cost at very high and steady volume, control over latency and hardware for sub-millisecond workloads, data residency and regulatory comfort, and predictable performance during national peak events. The costs are large upfront capital, the need to build and run the orchestration, networking, and failover that a cloud would otherwise provide, and slower elasticity than the cloud gives. The honest framing is that on-premises at national scale is a deliberate trade PhonePe made, not the default choice for a smaller product.
Trade-offs to discuss
Every senior interviewer expects you to surface at least 3 of these. Pick the decisions, state the alternatives, and justify your choice.
Sharded MySQL versus a single large database versus a NoSQL store
A single large SQL database is simplest and gives easy transactions, but it becomes a hard ceiling at national UPI volume. A NoSQL store scales writes easily but makes the strong consistency and relational integrity that money needs harder to guarantee. PhonePe chose sharded MySQL with a shared-nothing design: it keeps SQL consistency per shard while scaling horizontally by adding shards. The cost is that cross-shard work and any query that wants to span shards need deliberate handling, which is why scatter-gather is banned on the user path and analytics run off the event backbone.
Own on-premises data centers versus the public cloud
The public cloud gives fast elasticity and no capital outlay, which is right for most products. At PhonePe's scale and steadiness of volume, running its own data centers can be cheaper over time, gives tighter control over latency and hardware for sub-millisecond workloads, and helps with data residency. The costs are heavy upfront investment and having to build the orchestration, networking, and failover the cloud would otherwise handle, plus slower elasticity. It is a trade that only makes sense at very large, sustained scale.
Aerospike fast layer versus a cache over the primary database
A simple read-through cache over the sharded SQL store is easy to add, but under a festival peak a cache miss storm can hammer the database, and a cache does not give the durable, replicated, strongly-consistent-where-needed store that balance and fraud reads want. A purpose-built low-latency store like Aerospike serves those reads at very high throughput with sub-millisecond latency and its own replication. The cost is another system to operate and keep consistent with the source of truth, justified by the volume of hot reads on the payment path.
Separate read and write Kafka clusters versus one cluster
One Kafka cluster is simpler to run, but heavy read consumers such as analytics and reconciliation can steal capacity from the write path that the live payment flow depends on. Splitting into write and read clusters isolates the two so a spike in downstream reading never slows down producers. The cost is more clusters to operate and the need to replicate data from write to read side, which PhonePe accepted at 100 billion events a day.
Strong consistency on money versus eventual consistency elsewhere
The ledger and payment state cannot be eventually consistent, because that is where double charges, lost money, and stuck transactions come from, so they live in the strongly consistent sharded SQL core with idempotency and double-entry accounting. Reads that tolerate slight staleness, such as transaction history views or aggregated features, can be served from the fast layer or the event backbone. Splitting the system this way avoids paying for strong consistency where it is not needed while never giving it up on money.
Inline synchronous fraud check versus scoring after the payment
Scoring fraud after a payment clears is simpler and keeps the payment path fast, but it means a fraudulent transaction has already moved money and must be clawed back. Scoring inline, before the money moves, can block or hold a suspicious payment, which is why PhonePe built a real-time risk engine reading from the sub-millisecond Aerospike layer. The cost is that the fraud check is now on the critical path and must itself be extremely fast and highly available, which is exactly why it reads from the fast store rather than the SQL ledger.
How PhonePe actually does it
Several parts of this are documented on PhonePe's own engineering blog. PhonePe has written that its primary transactional store is sharded MySQL under a shared-nothing architecture, with a common sharding library, no scatter-gather queries on the user path, and no local data on service containers. It has described in-house platforms including a Payments Orchestrator, a Risk and Decisioning Engine that aggregates across its event stream in real time, and a Fulfillment Engine, along with an edge stack built on Mesos and Marathon with NGINX and Traefik. Its Kafka backbone carries about 100 billion events a day and is split into separate write and read clusters per data center with a two-tier ingestion path.
The Aerospike figures, meaning many clusters per site, over a trillion records, active-active replication across sites, and more than 500,000 queries per second with sub-millisecond reads, come from PhonePe engineers quoted in Aerospike case studies, so they are attributed engineer statements rather than audited numbers, and different write-ups give slightly different snapshots. The market position, meaning about 48 percent of UPI by volume and just over 50 percent by value in January 2025, comes from NPCI data reported in the press. The user and merchant figures, meaning 500 million registered users in November 2023 and about 47.19 million merchants across 98.61 percent of pin codes in the 2025 IPO filing, come from PhonePe press releases and its filing.
Two honesty notes for the interview. First, PhonePe does not publish a payment transactions-per-second figure, so the 500,000 number is queries per second on the fast read layer, not payments per second. Second, the NPCI PSP flow, the pending or DEEMED state, and reconciliation are the standard UPI pattern that every app implements, not internals PhonePe has published, so present them as the correct design and reserve the phrase "PhonePe does this" for the data-store and infrastructure work it has actually written about.
Sources
- PhonePe Engineering, The Kafka Edge: managing 100 billion daily events, the dual write and read cluster design and two-tier ingestion
- PhonePe Engineering, Living on the Edge: the Mesos and Marathon edge and orchestration stack
- PhonePe Engineering, Chronicling our Technology Journey: sharded MySQL shared-nothing design and the in-house Orchestrator, Risk, and Fulfillment platforms
- Aerospike, How PhonePe runs real-time transactions with Aerospike: clusters per site, over a trillion records, 500,000+ QPS, sub-millisecond reads (engineer-attributed)
- PhonePe Press, PhonePe crosses 500 million lifetime registered users (November 2023)
- PhonePe Press, PhonePe hits USD 1 Trillion annualised TPV runrate (March 2023)
- Entrackr, PhonePe hits 8 billion UPI transactions in January 2025, with NPCI market-share data