Groww System Design Interview: Mass-Market Investing at Scale
Groww overtook Zerodha in October 2023 to become India's largest stockbroker by NSE active clients, reaching about 13 million active clients and more than 40 million registered users, most of them first-time investors on a mobile app. To carry that growth, it migrated its core databases from MySQL to a distributed, cell-based architecture built for roughly 10x scale.
Designing Groww is the mass-market investing problem. Groww became India's number one broker by active clients largely by making investing simple on a mobile app for first-time investors, and by running several products (mutual funds, systematic investment plans, stocks, futures and options, and UPI payments) behind one app. The engineering that Groww has published is distinctive: a move from MySQL to a distributed, cell-based database for horizontal scale, a high-performance trading terminal that streams market data as compact binary messages, and serious reliability engineering around the daily market open. This walkthrough centers on those published parts and the mass-market scale, and it states plainly that the exchange-connectivity plumbing is the standard broker pattern rather than a Groww-published design.
Asked at: Groww, Zerodha, Upstox, Angel One, and other fintech and trading teams. The general forms of the question (design a stock or investing app, a real-time market-data system, or a highly reliable low-downtime service) show up at most product companies in SDE2 and SDE3 rounds. Groww is a good question because it is the broker with the largest active-client base in India, reached quickly, which foregrounds horizontal database scale, mobile-first mass usage, and market-open reliability.
Why this question is asked
A broker has to do two hard things: stream live market data to many users with low latency, and take orders to the exchange with a strict lifecycle where money and positions must reconcile. Groww adds a third dimension that it has published about: growing to the largest active-client base in India in a short time, which forces the data layer to scale horizontally rather than sit on one big database, and forces real discipline around the market open, when a mass of mobile users all act in the same minutes. Interviewers use Groww to check whether you can design a real-time terminal and an order path, reason about scaling a transactional data store with a distributed, cell-based design, and think about reliability as an engineering practice, with service-level objectives and a readiness process, rather than as an afterthought. It rewards candidates who can separate the standard broker pattern from Groww's specific published choices.
Requirements
Always clarify these in the first 5 minutes of the interview. Do not start drawing boxes until both lists are agreed.
Functional requirements
- User sees live prices for stocks and other instruments, updating in real time on a mobile app or web terminal
- User places an order (market or limit, buy or sell), routed to the exchange, and sees its status change live
- User invests in mutual funds and sets up systematic investment plans that run on a schedule
- User sees holdings, positions, and profit and loss update as trades execute and prices move
- User adds money and pays through UPI and other methods, with funds and margin checked before an order
- Onboarding and account opening for large numbers of first-time investors
- A simple mobile experience for the mass market, plus a high-performance terminal for active traders
- End-of-day reconciliation of trades and funds across products
Non-functional requirements
- Stream live prices to a very large mobile user base with low latency and low bandwidth
- Absorb the market-open surge at 9:15, when a mass of mobile users connect and act within minutes
- Orders, funds, and positions must always reconcile, even under load or partial failure
- Very high availability during market hours, since downtime during trading is severe
- Scale the transactional data layer horizontally to roughly 10x, without a single database as the ceiling
- Run several products (mutual funds, systematic plans, stocks, futures and options, UPI) with isolated failure surfaces
- Operate cost-efficiently, since a mass-market, lower-fee model depends on low cost per user
Back-of-envelope scale estimates
Show your math. Pulling numbers from thin air signals you have not thought about the load.
NSE active clients
~13M, #1 (FY25)
Groww reported about 13 million NSE active clients and roughly 26 percent market share at the end of FY25, the largest in India, having overtaken Zerodha in October 2023. This is the active base the live systems serve.
Registered users
40.2M (June 2025)
Groww reported about 40.2 million registered users across 98 percent of India's pin codes as of its 2025 IPO. The mass-market, largely first-time-investor base is what drives the mobile-first, simple-UX design.
Growth rate
15% -> 26% NSE share in ~2 years
Groww's NSE active-client share grew from about 15 percent in March 2023 to about 26 percent at the end of FY25, roughly two years later. The speed of growth is why the data layer had to be re-architected for scale.
Database scale target
~10x via cell-based architecture
Groww migrated from MySQL to a distributed SQL, cell-based architecture targeting roughly 10x scale, because the old setup had no native sharding and no clean path to grow. This is the headline data-layer number.
Revenue
~3,902 crore rupees (FY25)
Groww reported revenue from operations of about 3,902 crore rupees in FY25, up from about 2,609 crore in FY24, with most of it from broking. It listed publicly in November 2025. A sizing signal for the business.
Operations team
4-5 person DevOps
Groww reported running its cloud infrastructure with a DevOps team of only four to five people by leaning on managed services on Google Cloud. It shows how much the design favors managed, automated infrastructure.
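The figures above can be turned into a few derived numbers, which is the kind of arithmetic worth showing on the whiteboard. The sketch below uses only the reported figures from this section. The derived values are rough approximations, since the registered-user count and the active-client count are from slightly different dates, and the variable names are illustrative.

```python
# Reported figures from the estimates above.
nse_active_clients = 13_000_000   # Groww NSE active clients, end of FY25
nse_share = 0.26                  # Groww share of NSE active clients, end of FY25
registered_users = 40_200_000     # registered users, June 2025
revenue_fy25_crore = 3_902        # revenue from operations, FY25
revenue_fy24_crore = 2_609        # revenue from operations, FY24

# Implied size of the whole NSE active-client base.
implied_nse_total = nse_active_clients / nse_share

# Rough fraction of registered users who are NSE-active.
active_fraction = nse_active_clients / registered_users

# Year-on-year revenue growth.
revenue_growth = revenue_fy25_crore / revenue_fy24_crore - 1

# Revenue per active client in rupees (1 crore = 10,000,000).
revenue_per_active_client = revenue_fy25_crore * 10_000_000 / nse_active_clients

print(f"implied NSE active base: {implied_nse_total:,.0f}")
print(f"active / registered:     {active_fraction:.1%}")
print(f"revenue growth FY24-25:  {revenue_growth:.1%}")
print(f"revenue per active user: {revenue_per_active_client:,.0f} rupees")
```

Per-second request rates and peak concurrency are not published, so any such number in an interview has to be stated as an assumption layered on top of these.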
High-level architecture
Design Groww in three parts: the mass-market app and its data layer, the real-time and order path, and the reliability engineering around the market open. Groww runs on Google Cloud and has published a fair amount about all three, while the exchange-connectivity plumbing is the standard broker pattern and is described as such. Note that Groww's published real-time and scaling story is different from a colocated Go-ticker broker, so it is treated on its own terms.
The app and data layer carry the mass-market load. Most of Groww's users are first-time investors on a mobile app, so the experience is deliberately simple, and several products (mutual funds, systematic investment plans, stocks, futures and options, and UPI payments) sit behind one app, each run as its own system with its own failure surface.
The most distinctive published choice is the data layer. Groww migrated its core databases from MySQL to CockroachDB, a distributed SQL store, to power a cell-based architecture aimed at roughly 10x scale. The reasons it gave are concrete: MySQL had no native sharding, so scaling meant custom application logic and error-prone manual movement of users between database instances, and there was no clean multi-cloud path. Distributed SQL gives horizontal scale while keeping SQL and transactions, and a cell-based design partitions users into self-contained cells so the system grows by adding cells. Groww built the migration with change-data-capture, using Debezium, to move large datasets with minimal downtime. The rest of the stack it has published includes Java and Spring Boot services, Go services, Kafka for event-driven messaging with delayed queues and dead-letter handling, and BigQuery as the analytics warehouse, all on Google Kubernetes Engine with autoscaling and cost-optimized preemptible machines.
The real-time and order path serves live prices and trades. Groww built a high-performance trading terminal it calls 915. It streams market data through a custom publish-subscribe framework that sends compact binary-encoded messages over sockets, which the client decodes, cutting both bandwidth and parsing time compared with sending text, and it syncs across browser tabs with the BroadcastChannel API. The terminal front end uses React with a lightweight state store and integrates charting with a custom order-placement layer. The order path itself, taking an order to the exchange through an order management system and processing fills, is the standard broker flow and is not something Groww has published in detail, so it is described as the general pattern.
The reliability layer is a published strength. Because a mass of mobile users all act at the 9:15 open, Groww treats market-open readiness as an engineering practice. It runs observability on the Loki, Grafana, Tempo, and Mimir stack, sets per-service service-level objectives on error rate and p95 and p99 latency evaluated every minute, and monitors each app route across mobile and web. It holds a production-readiness call before the open each day where every product domain has a named owner, and it runs three independent detection layers: automated alerting, anomaly detection, and human monitoring. Over one ten-day window it reported more than 13,000 alerts, most auto-resolving within minutes.
In a real interview, sketch this on the whiteboard before diving into any single box.
Core components
Walk through each service. The interviewer wants to hear what each one owns, not just the names.
Mobile app and product services
The simple, mass-market mobile experience for first-time investors, backed by separate services for each product: mutual funds, systematic investment plans, stocks, futures and options, and UPI payments. Groww runs these as distinct systems with isolated failure surfaces, so a problem in one product does not take down the others.
Distributed data layer (CockroachDB, cell-based)
Groww's core transactional store, migrated from MySQL to CockroachDB to power a cell-based architecture for roughly 10x scale. Distributed SQL keeps transactions and SQL while scaling horizontally, and cells partition users into self-contained units so the system grows by adding cells rather than by resharding a single database.
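The "grow by adding cells" idea can be sketched as a small directory that pins each user to a cell once and never moves them when capacity is added. This is a minimal illustration, not Groww's implementation: the `CellDirectory` class, the least-loaded placement rule, and the per-cell capacity are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CellDirectory:
    """Hypothetical user-to-cell directory for a cell-based layout."""
    cells: list[str]
    capacity_per_cell: int
    assignment: dict[str, str] = field(default_factory=dict)

    def _load(self, cell_id: str) -> int:
        # Number of users currently pinned to this cell.
        return sum(1 for c in self.assignment.values() if c == cell_id)

    def assign(self, user_id: str) -> str:
        # A user keeps their cell for life, so lookups are stable.
        if user_id in self.assignment:
            return self.assignment[user_id]
        open_cells = [c for c in self.cells if self._load(c) < self.capacity_per_cell]
        if not open_cells:
            raise RuntimeError("all cells are full; add a cell")
        # Assumed placement rule: least-loaded cell with spare capacity.
        cell_id = min(open_cells, key=self._load)
        self.assignment[user_id] = cell_id
        return cell_id

    def add_cell(self, cell_id: str) -> None:
        # Growth adds capacity without moving any existing user.
        self.cells.append(cell_id)


directory = CellDirectory(cells=["cell-1", "cell-2"], capacity_per_cell=2)
for user in ["u1", "u2", "u3", "u4"]:
    directory.assign(user)
before = dict(directory.assignment)
directory.add_cell("cell-3")
new_user_cell = directory.assign("u5")
```

The point of the directory is the contrast with hash-based sharding: adding `cell-3` changes nothing for `u1` through `u4`, so there is no resharding step.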
Real-time market-data pipeline
The streaming layer behind the 915 terminal. A custom publish-subscribe framework sends compact binary-encoded market messages over sockets, decoded on the client to save bandwidth and parsing time, with cross-tab synchronization via the BroadcastChannel API. This is Groww's published real-time approach.
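The subscription side of such a pipeline can be sketched as a hub that tracks which clients want which instrument and delivers each message only to them. This is an in-memory illustration under stated assumptions: `TickHub`, its per-client `outbox`, and the method names are hypothetical, and a real deployment would write to sockets instead of lists.

```python
from collections import defaultdict


class TickHub:
    """Hypothetical publish-subscribe hub keyed by instrument token."""

    def __init__(self) -> None:
        self._subscribers: dict[int, set[str]] = defaultdict(set)
        # Stand-in for per-client socket writes.
        self.outbox: dict[str, list[bytes]] = defaultdict(list)

    def subscribe(self, client_id: str, token: int) -> None:
        self._subscribers[token].add(client_id)

    def unsubscribe(self, client_id: str, token: int) -> None:
        self._subscribers[token].discard(client_id)

    def publish(self, token: int, payload: bytes) -> int:
        # Deliver only to clients subscribed to this instrument.
        clients = self._subscribers.get(token, ())
        for client_id in clients:
            self.outbox[client_id].append(payload)
        return len(clients)


hub = TickHub()
hub.subscribe("alice", 101)
hub.subscribe("bob", 101)
hub.subscribe("bob", 202)
delivered_101 = hub.publish(101, b"tick-101")
delivered_202 = hub.publish(202, b"tick-202")
delivered_303 = hub.publish(303, b"tick-303")  # nobody subscribed
```

Fan-out cost then scales with the number of interested clients per instrument rather than with the total number of connected clients.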
Trading terminal (915)
Groww's high-performance web terminal for active traders, built with React and a lightweight state store, integrating charting with a custom order-placement layer over the real-time data pipeline. It sits alongside the simple mobile app, serving the power-user end of the spectrum.
Order management and exchange connectivity
The path that takes an order to the exchange and processes acknowledgements and fills, updating the order, position, and funds. As a broker Groww runs this, but it has not published its internals, so it is described as the standard broker pattern rather than a Groww-specific design, and it is distinct from a colocated Go-ticker approach.
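Since the order path is the general broker pattern, the strict lifecycle is usually modelled as a state machine that rejects illegal transitions. The sketch below is one plausible version: the state names follow the lifecycle listed in the data model, while the exact transition table is an assumption, not a published Groww design.

```python
from enum import Enum


class OrderState(Enum):
    PLACED = "placed"
    SENT = "sent_to_exchange"
    OPEN = "open"
    PARTIALLY_FILLED = "partially_filled"
    FILLED = "filled"
    REJECTED = "rejected"
    CANCELLED = "cancelled"


# Assumed transition table; terminal states have no outgoing edges.
ALLOWED = {
    OrderState.PLACED: {OrderState.SENT, OrderState.REJECTED},
    OrderState.SENT: {OrderState.OPEN, OrderState.REJECTED},
    OrderState.OPEN: {OrderState.PARTIALLY_FILLED, OrderState.FILLED, OrderState.CANCELLED},
    OrderState.PARTIALLY_FILLED: {
        OrderState.PARTIALLY_FILLED, OrderState.FILLED, OrderState.CANCELLED,
    },
    OrderState.FILLED: set(),
    OrderState.REJECTED: set(),
    OrderState.CANCELLED: set(),
}


def advance(current: OrderState, target: OrderState) -> OrderState:
    """Return the new state, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target


state = OrderState.PLACED
for step in (OrderState.SENT, OrderState.OPEN,
             OrderState.PARTIALLY_FILLED, OrderState.FILLED):
    state = advance(state, step)
```

Enforcing transitions in one place makes duplicate or out-of-order exchange messages fail loudly instead of silently corrupting positions and funds.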
Event backbone (Kafka)
The asynchronous messaging spine. Groww has published work on Kafka, including a production-grade delayed-message system and retry using delayed queues and dead-letter queues, which is how events flow between services for orders, payments, and downstream processing.
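The retry pattern with delayed queues and a dead-letter queue can be illustrated without a broker. The sketch below simulates it in memory with a priority queue and a simulated clock. It does not use any Kafka API, and the delay tiers, function name, and handler signature are assumptions chosen for the example.

```python
import heapq
from typing import Callable

RETRY_DELAYS_S = [5, 30, 120]  # hypothetical delay tiers, one per retry


def process_with_retries(
    messages: list[str],
    handler: Callable[[str, int], None],
    retry_delays: list[int] = RETRY_DELAYS_S,
) -> tuple[list[tuple[str, int]], list[str]]:
    """Return (done, dead_letter); done holds (message, simulated_time)."""
    # Heap entries: (due_at, sequence, attempt, message).
    queue = [(0, seq, 0, msg) for seq, msg in enumerate(messages)]
    heapq.heapify(queue)
    seq = len(messages)
    done: list[tuple[str, int]] = []
    dead_letter: list[str] = []
    while queue:
        due_at, _, attempt, msg = heapq.heappop(queue)
        try:
            handler(msg, attempt)
            done.append((msg, due_at))
        except Exception:
            if attempt < len(retry_delays):
                # Re-enqueue on the next delay tier.
                next_due = due_at + retry_delays[attempt]
                heapq.heappush(queue, (next_due, seq, attempt + 1, msg))
                seq += 1
            else:
                # Retries exhausted: park the message for inspection.
                dead_letter.append(msg)
    return done, dead_letter


def demo_handler(msg: str, attempt: int) -> None:
    if msg == "poison":
        raise RuntimeError("always fails")
    if msg == "flaky" and attempt < 2:
        raise RuntimeError("transient failure")


done, dead_letter = process_with_retries(["ok", "flaky", "poison"], demo_handler)
```

In a Kafka deployment each tier would typically be its own topic with consumers that wait until a message is due, so a failing message never blocks the main partition.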
Reliability and observability platform
The market-open reliability layer. Observability on the Loki, Grafana, Tempo, and Mimir stack, per-service objectives on error rate and p95 and p99 latency evaluated every minute, per-route monitoring across mobile and web, a daily pre-open readiness call with named owners, and three detection layers: alerting, anomaly detection, and humans.
Cloud infrastructure
Runs on Google Cloud, orchestrated with Google Kubernetes Engine and autoscaling to absorb morning and evening spikes, cost-optimized with preemptible machines at a fraction of standard cost, with BigQuery as the warehouse, all in an India region for data residency and managed by a small DevOps team.
Data model
Pick the right store per table. Justify each choice with the access pattern, not by reflex.
users
Fields: user_id (PK), cell_id, kyc_status, segment_permissions, created_at
The account, assigned to a cell in the cell-based architecture. cell_id is how the distributed data layer partitions users. Segment permissions decide which products the user can access.
orders
Fields: order_id (PK), user_id, instrument_token, side (buy|sell), type (market|limit), qty, price, state, exchange_order_id
The order and its lifecycle: placed, sent to exchange, open, partially or fully filled, rejected, cancelled. Strong consistency required. Held in the distributed SQL store, partitioned with the user's cell.
trades
Fields: trade_id (PK), order_id (FK), qty, price, executed_at
Executions that fill an order. One order can produce several trades. Drives updates to positions and funds, reconciled at end of day.
holdings
Fields: user_id, instrument_token, net_qty, avg_price, updated_at
The user's current holdings, derived from trades. Read constantly on the portfolio screen, so it is served fast and recomputed as trades execute and prices move.
sip_mandates
Fields: sip_id (PK), user_id, fund_id, amount_paise, frequency, next_run_date, status
A systematic investment plan: a recurring mutual-fund investment that runs on a schedule. Part of the mutual-funds product, which runs as its own system. Reliable scheduled execution is the key requirement.
ticks
Fields: instrument_token, last_price, market_depth, ts
Live market data, held in memory and pushed to clients as compact binary messages over sockets, not written to the durable store at full fidelity. The highest-volume data in the system.
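Because holdings are derived from trades, it helps to show how net_qty and avg_price move as executions arrive. The sketch below assumes a weighted-average cost rule for buys, sells that leave the average unchanged, and no short positions. The `Holding` class and `apply_trade` function are illustrative names, and the side is taken from the parent order.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Holding:
    net_qty: int = 0
    avg_price: Decimal = Decimal("0")


def apply_trade(holding: Holding, side: str, qty: int, price: Decimal) -> Holding:
    """Fold one execution into a holding (assumed weighted-average rule)."""
    if side == "buy":
        total_cost = holding.avg_price * holding.net_qty + price * qty
        holding.net_qty += qty
        holding.avg_price = total_cost / holding.net_qty
    elif side == "sell":
        if qty > holding.net_qty:
            raise ValueError("cannot sell more than is held")
        holding.net_qty -= qty
        if holding.net_qty == 0:
            holding.avg_price = Decimal("0")
    else:
        raise ValueError(f"unknown side: {side}")
    return holding


h = Holding()
apply_trade(h, "buy", 10, Decimal("100"))
apply_trade(h, "buy", 10, Decimal("110"))
apply_trade(h, "sell", 5, Decimal("120"))
```

Using Decimal rather than float keeps money arithmetic exact, which matters when holdings have to reconcile against trades at end of day.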
Deep dives
These are the conversations the interviewer is steering you toward. Practice each one until you can talk through it without notes.
Migrating from MySQL to a distributed, cell-based database
This is Groww's most distinctive published engineering decision. As it grew to the largest active-client base in India, its MySQL setup became a ceiling: MySQL has no native sharding, so scaling meant custom application-level sharding logic and manually moving users between database instances, which is error-prone, and there was no clean path to grow across clouds. Groww migrated to CockroachDB, a distributed SQL database, to power a cell-based architecture aimed at roughly 10x scale. Distributed SQL matters because it keeps the transactions and relational guarantees that money and orders need while scaling horizontally across nodes, rather than forcing a move to a weaker NoSQL model. The cell-based design partitions users into self-contained cells, so the system grows by adding cells rather than by resharding one giant database, and a problem in one cell is contained. The migration itself is a system design problem: Groww used change-data-capture with Debezium to stream data from the old databases to the new one and cut over with minimal downtime and high data integrity. The interview lesson is when and how to move off a single relational database, and that distributed SQL plus cells is a way to scale writes while keeping SQL.
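The change-data-capture step can be sketched as a consumer that applies change events to the new store idempotently, so a replayed event does no harm. The events below follow the general shape of a Debezium envelope (`op` with `before` and `after` row images), but the `seq` ordering field, the `apply_change` function, and the dictionary target are simplifications assumed for the example.

```python
def apply_change(target: dict, applied_seq: dict, event: dict,
                 key: str = "order_id") -> bool:
    """Apply one change event; return False if it was already applied."""
    op = event["op"]  # "c" create, "u" update, "d" delete, "r" snapshot read
    row = event["before"] if op == "d" else event["after"]
    pk = row[key]
    # Idempotency guard: skip anything at or before the last applied position.
    if event["seq"] <= applied_seq.get(pk, -1):
        return False
    if op == "d":
        target.pop(pk, None)
    else:
        target[pk] = dict(row)  # upsert the latest row image
    applied_seq[pk] = event["seq"]
    return True


events = [
    {"seq": 1, "op": "c", "before": None, "after": {"order_id": 7, "state": "placed"}},
    {"seq": 2, "op": "u", "before": {"order_id": 7, "state": "placed"},
     "after": {"order_id": 7, "state": "open"}},
    {"seq": 1, "op": "c", "before": None, "after": {"order_id": 7, "state": "placed"}},
    {"seq": 3, "op": "c", "before": None, "after": {"order_id": 8, "state": "placed"}},
    {"seq": 4, "op": "d", "before": {"order_id": 8, "state": "placed"}, "after": None},
]

new_store: dict = {}
positions: dict = {}
results = [apply_change(new_store, positions, e) for e in events]
```

The replayed create for order 7 is ignored, so the row keeps its later state. That property is what allows the stream to be restarted safely during a long migration.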
Streaming market data as compact binary messages
Groww's published real-time approach centers on its 915 terminal and a custom publish-subscribe framework. The core idea is to send market data as compact binary-encoded messages over sockets rather than as text like JSON, and to decode them on the client. Binary encoding cuts both the bandwidth used and the time spent parsing each message, which matters when a large mobile user base is receiving frequent price and option-Greek updates. The framework is publish-subscribe, so a client subscribes to the instruments it cares about and receives just those updates, and Groww uses the browser BroadcastChannel API to synchronize data across multiple open tabs so each tab does not open its own connection. This is a different real-time story from a colocated ticker: the emphasis is on an efficient client-facing streaming protocol and terminal for a mass mobile audience. The design point is that at high fan-out, the encoding and subscription model of your push protocol, not just the server, decides whether streaming stays affordable.
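The size difference between binary and text encoding is easy to demonstrate. The sketch below packs one tick into a fixed binary layout with Python's `struct` module and compares it with compact JSON. The field set and byte layout are hypothetical, chosen only to illustrate the technique, and are not Groww's wire format.

```python
import json
import struct

# Hypothetical layout, little-endian without padding:
# uint32 token, int64 last price, int64 best bid, int64 best ask (all in paise),
# uint64 timestamp in milliseconds.
TICK = struct.Struct("<IqqqQ")
FIELDS = ("instrument_token", "last_price_paise", "best_bid_paise",
          "best_ask_paise", "ts_ms")


def encode_tick(tick: dict) -> bytes:
    return TICK.pack(*(tick[name] for name in FIELDS))


def decode_tick(buf: bytes) -> dict:
    return dict(zip(FIELDS, TICK.unpack(buf)))


tick = {
    "instrument_token": 12345,
    "last_price_paise": 24567550,
    "best_bid_paise": 24567500,
    "best_ask_paise": 24567600,
    "ts_ms": 1718000000000,
}

binary_msg = encode_tick(tick)
json_msg = json.dumps(tick, separators=(",", ":")).encode("utf-8")

print(len(binary_msg), "bytes binary vs", len(json_msg), "bytes JSON")
```

Most of the JSON size is field names repeated in every message. A fixed layout moves that schema knowledge into the client decoder, which is the added client complexity this design accepts.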
Engineering for the market open
A broker's load spikes hard at the 9:15 market open, when a mass of mobile users connect and act within minutes, and Groww treats readiness for that as an explicit engineering practice rather than hoping autoscaling copes. It runs observability on the Loki, Grafana, Tempo, and Mimir stack, and sets per-service service-level objectives on error rate and on p95 and p99 latency, evaluated every minute, so a 30-second problem counts as a breached minute rather than being averaged away. It monitors each app route across iOS, Android, and web with its own objective, request rate, error counts, and latency. Operationally, it runs a production-readiness call before the open each day where every product domain, mutual funds, systematic plans, UPI, futures and options, has a named owner accountable for its health, and it uses three independent detection layers: automated alerting, anomaly detection, and human monitoring. Over one ten-day window it reported more than 13,000 alerts, most auto-resolving within minutes, and it rejected a proposal to delay alerts by five minutes because metric-collection latency already adds time and more delay is too risky for live trading. The lesson is that reliability at a known daily peak is a process and a measurement discipline, not just infrastructure.
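The effect of minute-granularity evaluation can be shown with a small calculation. The sketch below buckets requests by minute and flags any minute whose error rate or p99 latency exceeds a threshold. The thresholds, the nearest-rank percentile, and the function names are assumptions for illustration, not Groww's actual objectives.

```python
import math
from collections import defaultdict


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def breached_minutes(requests, max_error_rate=0.01, max_p99_ms=500.0) -> list[int]:
    """requests: iterable of (ts_seconds, latency_ms, is_error) tuples."""
    buckets = defaultdict(list)
    for ts, latency_ms, is_error in requests:
        buckets[int(ts // 60)].append((latency_ms, is_error))
    breached = []
    for minute in sorted(buckets):
        rows = buckets[minute]
        error_rate = sum(1 for _, err in rows if err) / len(rows)
        p99 = percentile([lat for lat, _ in rows], 99)
        if error_rate > max_error_rate or p99 > max_p99_ms:
            breached.append(minute)
    return breached


# One request per second for an hour, with a 30-second burst of errors.
traffic = [(t, 100.0, 130 <= t < 160) for t in range(3600)]
hourly_error_rate = sum(1 for _, _, err in traffic if err) / len(traffic)
bad_minutes = breached_minutes(traffic)
```

Averaged over the hour the error rate stays under the assumed 1 percent threshold, while the per-minute evaluation flags the minute that contains the burst.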
Running many products behind one app
Groww started as a direct mutual-fund platform and then added stocks, futures and options, and UPI payments, and it runs each of these as its own system rather than as one monolith. Mutual funds and systematic investment plans, equities trading, and payments have different flows, different external dependencies, and different failure modes, so isolating them means a problem in one, say a mutual-fund processing delay, does not take down stock trading or payments. This shows up directly in the market-open readiness process, where each product domain is owned and checked separately. The design point is that product breadth is best served by separate, well-bounded systems behind a unified app, so each can scale, fail, and be operated independently, which is also what lets a mass-market app keep adding products without the whole thing becoming fragile.
Cost-efficient infrastructure for a mass-market model
A mass-market, lower-fee investing model only works if the cost per user stays low, so Groww's infrastructure choices lean hard on managed services and cost optimization. It runs on Google Cloud with Google Kubernetes Engine and autoscaling to expand for the morning and evening market spikes and shrink afterward, uses preemptible machines that cost a fraction of standard instances for suitable workloads, and centralizes analytics in BigQuery, all in an India region for data residency. Notably, it reported running this with a DevOps team of only four to five people, which is possible precisely because it offloads so much to managed services and automation, and it has published a cost-optimization effort that cut cloud spend substantially. The interview framing is that at mass-market scale with thin margins, cost efficiency is a design requirement, and managed, autoscaled, preemptible-heavy infrastructure operated by a small team is how Groww meets it.
Trade-offs to discuss
Every senior interviewer expects you to surface at least 3 of these. Pick the decisions, state the alternatives, and justify your choice.
Distributed SQL with cells versus a single large MySQL database
A single MySQL database is simple, but it has no native sharding, so scaling it meant custom application sharding and manually moving users between instances, which is error-prone and has a ceiling. Migrating to distributed SQL with a cell-based design gives horizontal scale while keeping transactions and SQL, and contains problems within a cell. The cost is running a distributed database and executing a careful migration, which Groww did with change-data-capture, accepted because the old setup could not carry the growth to the largest active-client base in India.
Binary publish-subscribe messages versus JSON over the socket
JSON is easy to work with, but at the fan-out of a mass mobile user base receiving frequent price and Greek updates, its size and parsing cost are too high. Groww sends compact binary-encoded messages decoded on the client, cutting bandwidth and parse time, and uses BroadcastChannel to share data across tabs. The cost is a more complex client that must decode the binary format, accepted because it is what keeps high-fan-out streaming affordable and fast.
Per-service, minute-granularity SLOs versus coarse monitoring
Coarse, averaged monitoring is cheaper and simpler, but it hides short, sharp problems that matter enormously during live trading, where a 30-second outage at the open is severe. Groww sets per-service objectives on error rate and p95 and p99 latency evaluated every minute, so brief spikes are caught, and it accepts the extra alerting volume, more than 13,000 alerts in a ten-day window, because most auto-resolve and the sensitivity is worth it for a trading platform.
Separate systems per product versus one monolith
One monolith across mutual funds, stocks, and payments is simpler to build initially, but it couples very different flows and failure modes, so one product's problem can sink the others. Groww runs each product as its own system with an isolated failure surface and a named owner in the readiness process. The cost is more services to build and coordinate, accepted because isolation is what lets a mass-market app keep adding products without becoming fragile.
Managed cloud and preemptible machines versus self-managed, on-demand infrastructure
Self-managing infrastructure gives maximum control, and running everything on on-demand instances is the most reliable, but both are expensive in money and people for a thin-margin mass-market business. Groww leans on managed Google Cloud services, autoscaling, and cheaper preemptible machines, run by a small DevOps team. The cost is tolerating preemptible capacity being reclaimed and depending on a provider's managed services, accepted because it keeps cost per user low, which the business model needs.
How Groww actually does it
Groww runs an active engineering blog, so much of this is documented directly, while the exchange-connectivity and order-management internals are the standard broker pattern rather than a Groww-published design.
Groww became India's largest stockbroker by NSE active clients in October 2023, overtaking Zerodha, and reported about 13 million NSE active clients and roughly 26 percent market share at the end of FY25, up from about 15 percent share in March 2023, with about 40.2 million registered users across 98 percent of pin codes as of its 2025 IPO. It reported revenue from operations of about 3,902 crore rupees in FY25, mostly from broking, and listed publicly in November 2025.
On engineering, Groww published that it migrated its core databases from MySQL to CockroachDB to power a cell-based architecture targeting roughly 10x scale, citing MySQL's lack of native sharding and error-prone manual user movement, and that it used Debezium change-data-capture for the migration. It published its 915 trading terminal, which streams market data as compact binary-encoded messages over sockets, decoded on the client, with BroadcastChannel cross-tab sync and a React front end. It published its market-open reliability practice: observability on the Loki, Grafana, Tempo, and Mimir stack, per-service service-level objectives on error rate and p95 and p99 latency evaluated every minute, per-route monitoring across mobile and web, a daily pre-open readiness call with named product owners, three detection layers, and more than 13,000 alerts in a ten-day window. It runs on Google Cloud with Google Kubernetes Engine autoscaling and preemptible machines and BigQuery, managed by a four-to-five-person DevOps team, and uses Kafka with delayed queues and dead-letter handling, along with Java and Spring Boot and Go services.
Three accuracy notes for the interview.
- First, Groww's colocation and order-management specifics are not published, so present the exchange path as the general broker pattern and do not attribute a colocated Go-ticker design to it, which is a different broker's published story.
- Second, exact orders per day, systematic-plan transaction volume, and absolute peak concurrency are not published, so treat any such number as an assumption.
- Third, the finer details of the CockroachDB migration beyond the cell-based, 10x, and change-data-capture summary should be treated as reported rather than exhaustively verified.
Sources
- Groww Engineering, Before the Bell Rings: how Groww prepares for market open, covering LGTM observability, per-service SLOs, the readiness call, and 13,000-plus alerts in ten days
- Groww Engineering, How Groww migrated from MySQL to CockroachDB to power its cell-based architecture: distributed SQL, 10x scale, Debezium change-data-capture
- Groww Engineering, Building 915: inside Groww's high-performance trading terminal, the binary publish-subscribe socket pipeline and React front end
- Google Cloud, Groww customer case study: Google Kubernetes Engine autoscaling, preemptible machines, BigQuery, and a small DevOps team
- TechCrunch, Groww raises nearly 750 million dollars in its IPO: FY25 revenue and profit, 40.2 million users, and market share
- Business Standard, NSE active clients up in FY25 with Groww cementing its lead: FY25 market-share figures
- Groww Engineering, implementing retry using delayed queues and dead-letter queues in Kafka: the event-driven backbone