Delhivery System Design Interview: Logistics Network at Scale
Delhivery moved more than 100 million parcels in a single peak month across a hub-and-spoke network of sort centers and delivery hubs. Its hardest problem is not the trucks but the addresses: by Delhivery's account, 20 to 30 percent of written Indian addresses carry the wrong pin code, so it built machine learning that reads a messy address and places it to within about 200 meters.
Designing Delhivery is the logistics-network problem, which is different from the consumer apps. A parcel is picked up, moves through a hub-and-spoke network of sort centers and gateways, and is delivered to a doorstep, and the whole thing runs at the scale of e-commerce infrastructure for the country. The distinctive engineering, and what Delhivery has published, is address intelligence: Indian addresses are unstructured and often carry the wrong pin code, so Delhivery built machine learning that resolves a messy address to a precise location, which lets it route by geocode rather than by pin code. This walkthrough covers the network, the address-resolution system Delhivery published, tracking at scale, and peak handling, and is honest about which internals are the standard logistics pattern.
Asked at: Delhivery and other logistics and supply-chain teams. The general forms of the question (design a logistics or delivery network, a package-tracking system, or an address and geocoding system) show up at Amazon, Flipkart, and most product companies in SDE2 and SDE3 rounds. Delhivery makes a good question because it moves the focus from a consumer app to the physical network and the messy-address problem underneath e-commerce.
Why this question is asked
Logistics forces a set of problems most app designs skip. A parcel has to physically move through a multi-stage network (pickup, line-haul between sort centers, and last-mile delivery), and the design has to route it, track it, and forecast volume so capacity is in place. But the problem that makes Delhivery distinctive, and that it has published about, is address resolution: in India, a large share of addresses are unstructured and carry the wrong pin code, so simply sorting by pin code misroutes parcels, and the system needs to understand a messy address well enough to place it accurately. Interviewers use Delhivery to check whether you can design a hub-and-spoke network with tracking at scale, and whether you can reason about turning noisy, human-written addresses into precise, routable locations, which is a real machine-learning and data problem rather than a lookup.
Requirements
Always clarify these in the first 5 minutes of the interview. Do not start drawing boxes until both lists are agreed.
Functional requirements
- Accept a shipment from a seller, with a pickup and a destination address
- Resolve the destination address, which is often unstructured and may have a wrong pin code, to a precise location
- Route the parcel through the network: first mile to a sort center, line-haul between facilities, and last mile to the doorstep
- Sort parcels at automated sort centers by their true destination, not just the written pin code
- Track every parcel through each scan and give the customer live status
- Assign and optimize last-mile delivery routes for delivery agents
- Forecast volume and plan capacity, especially for festive-season peaks
- Handle returns back through the network
Non-functional requirements
- Resolve messy addresses accurately, since a wrong pin code otherwise sends a parcel to the wrong place
- Handle very high volume: on the order of 100 million shipments in a peak month and millions on a peak day
- Track parcels at the scale of billions of status scans, with fresh status for customers
- Route and sort correctly and efficiently to keep misroutes and lost shipments low
- Scale capacity up for festive peaks that are many times normal volume
- High reliability across the network, since a failure at a hub delays many parcels
- Efficient enough per parcel to work at logistics margins
Back-of-envelope scale estimates
Show your math. Pulling numbers from thin air signals you have not thought about the load.
Peak monthly shipments
107M+ (Oct 2025)
Delhivery reported handling more than 107 million shipments in October 2025, a festive-season peak. This is the volume the network and tracking must absorb at its busiest.
Peak single day
~7.2M shipments
Delhivery reported a single-day peak of about 7.2 million shipments in October 2025. The daily peak is what sort-center and last-mile capacity is sized for.
Annual shipments (historical anchor)
~100M/year, 12,000+ pin codes
Delhivery's own engineering writing cited close to 100 million shipments a year to more than 12,000 pin codes at the time of writing, an older but Delhivery-stated anchor for the network's scale.
Address problem
20-30% wrong pin codes
Delhivery stated that 20 to 30 percent of written Indian addresses carry an incorrect pin code, and that a pin code can span a median of about 90 square kilometers with up to a million households. This is why address resolution is a core system, not a lookup.
Address resolution accuracy
>90% of shipments at ~200m
Delhivery reported that a later version of its address-resolution system resolves more than 90 percent of shipments to about 200-meter median precision, up from 80 to 85 percent at 500 meters in the rule-based version. This precision is what enables geocode-based routing.
Fleet and location data
100,000+ vehicles, ~1B GPS pings/day
Delhivery reported a fleet of more than 100,000 vehicles emitting about 1 billion GPS pings a day, feeding its newer geospatial systems. This is the live telemetry behind routing and tracking.
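The reported figures above can be turned into per-second rates with simple arithmetic. The sketch below does only that; the one input that is not from Delhivery, the number of scans per parcel, is an assumption chosen purely to illustrate how a scan write rate would be derived.

```python
# Back-of-envelope rates derived from the reported figures above.
SECONDS_PER_DAY = 24 * 60 * 60            # 86,400

peak_month_shipments = 107_000_000        # reported, October 2025
days_in_october = 31
peak_day_shipments = 7_200_000            # reported single-day peak
gps_pings_per_day = 1_000_000_000         # reported, approximate
vehicles = 100_000                        # reported lower bound

avg_per_day = peak_month_shipments / days_in_october
peak_to_avg = peak_day_shipments / avg_per_day
peak_day_per_sec = peak_day_shipments / SECONDS_PER_DAY
pings_per_sec = gps_pings_per_day / SECONDS_PER_DAY
secs_between_pings = SECONDS_PER_DAY / (gps_pings_per_day / vehicles)

# ASSUMPTION (not a Delhivery figure): scans recorded per parcel.
assumed_scans_per_parcel = 10
scan_writes_per_sec = (
    peak_day_shipments * assumed_scans_per_parcel / SECONDS_PER_DAY
)

print(f"average shipments/day in peak month: {avg_per_day:,.0f}")
print(f"peak day vs peak-month average:      {peak_to_avg:.2f}x")
print(f"peak-day shipments/sec (averaged):   {peak_day_per_sec:.1f}")
print(f"GPS pings/sec (averaged):            {pings_per_sec:,.0f}")
print(f"seconds between pings per vehicle:   {secs_between_pings:.2f}")
print(f"scan writes/sec (assumed 10/parcel): {scan_writes_per_sec:.1f}")
```

The peak month averages about 3.45 million shipments a day, so the 7.2 million peak day is roughly 2.1 times that average. Averaged over 24 hours, the peak day is about 83 new shipments a second and the fleet emits about 11,600 pings a second, or one ping per vehicle every 8.6 seconds or so. Real traffic is burstier than a 24-hour average, so these are floors rather than sizing targets.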
High-level architecture
Design Delhivery in two layers: the physical network that moves parcels, and the address and data intelligence that makes it accurate. The network structure and the address-resolution system are grounded in what Delhivery has stated, while the specific routing algorithms and the streaming and database technology are the general logistics pattern and are described as such.
The physical network is hub and spoke. A parcel is picked up from a seller in the first mile and taken to a sort center. Sort centers, many of them automated, sort parcels by destination and hand them to line-haul, the middle mile, which moves them between facilities and gateways across the country. At the destination end, a parcel reaches a local delivery center, and the last mile takes it to the doorstep, with a delivery agent on an optimized route. Delhivery operates this structure at large scale, with tens of automated sort centers, dozens of hubs and gateways, and thousands of delivery centers, though the exact facility counts vary by source and period. The routing and sortation mechanics themselves are the standard logistics approach rather than a Delhivery-published algorithm.
The address and data intelligence is where Delhivery is distinctive. The core problem it published is that Indian addresses are unstructured and unreliable: a pin code can cover a median of about 90 square kilometers with up to a million households, and 20 to 30 percent of written addresses carry the wrong pin code, so sorting purely by the written pin code misroutes parcels. Delhivery built a system, which its data-science leaders described publicly, that learns to decode addresses using generative machine learning over both the text customers write and the GPS traces of delivery agents who actually went there, in an unsupervised way. It derives locality names, a hierarchy from state down to sublocality and rooftop, alternate spellings, and geographic polygon boundaries, and it uses phonetic fuzzy matching tuned for Indian languages. A later version resolves more than 90 percent of shipments to about 200-meter precision. The architectural payoff, which Delhivery states, is that it can move off pin-code-based sorting toward routing by resolved geocode and locality, which is more accurate than the postal code alone.
Behind both sits a data and tracking platform. Every parcel is scanned at each step, producing billions of status scans that feed live tracking, and Delhivery has reported using its data for demand forecasting and capacity planning as well as descriptive reporting, with measurable gains in delivery productivity and reductions in lost shipments and misroutes. The specific streaming and storage technology is not published, so a high-throughput event pipeline is described as the industry pattern rather than a stated fact. Delhivery's newer geospatial work adds vehicle-aware routing, which accounts for whether a heavy vehicle or a two-wheeler is traveling, built on its own map data.
In a real interview, sketch this on the whiteboard before diving into any single box.
Core components
Walk through each service. The interviewer wants to hear what each one owns, not just the names.
Address resolution (AddFix)
Delhivery's flagship published system that turns a messy, human-written address into a precise location. It learns from address text and delivery-agent GPS traces, unsupervised, deriving locality, hierarchy, alternate spellings, and polygon boundaries, with phonetic matching for Indian languages, resolving more than 90 percent of shipments to about 200-meter precision. This is what makes accurate routing possible.
First-mile pickup
Collects shipments from sellers and brings them into the network at a sort center. The entry point where a shipment and its destination address are registered and resolved.
Sort centers
Facilities, many automated, that sort parcels by their true destination. Delhivery runs tens of automated sort centers with a combined capacity of millions of parcels a day. Because of address resolution, sorting can key on a resolved geocode or locality rather than only the written pin code.
Line-haul (middle mile)
The network of trucks and gateways that move parcels between sort centers across the country. Load and route optimization here follow the standard logistics pattern; Delhivery operates the structure at national scale.
Last-mile delivery
Delivery centers and agents that take a parcel the final distance to the doorstep, on optimized routes. Delhivery's newer vehicle-aware routing accounts for whether a heavy vehicle or a two-wheeler is delivering, using its own map data.
Tracking and scan pipeline
Every parcel is scanned at each step, producing billions of status scans that drive live customer tracking and internal visibility. The high-throughput event pipeline behind this is the standard pattern; Delhivery has not published the specific technology.
Forecasting and capacity planning
The data systems that predict volume and plan capacity, especially for festive peaks that are many times normal load. Delhivery reported using predictive modeling for demand planning and descriptive analytics for operations, with gains in productivity and reductions in misroutes and lost shipments.
Geospatial and maps (Delhivery Maps)
Delhivery's own geospatial suite, the successor to its address work, offering geocoding, reverse geocoding, and vehicle-aware routing built for landmark-based navigation and incomplete addresses, tested on billions of shipments and backed by about a billion GPS pings a day from its fleet.
Data model
Pick the right store per table. Justify each choice with the access pattern, not by reflex.
shipments
Fields: shipment_id (PK), seller_id, raw_address, resolved_location, current_facility_id, state, created_at
The parcel and its journey. raw_address is what the customer wrote; resolved_location is the output of address resolution. state moves through the network stages. current_facility_id tracks where it is now.
resolved_addresses
Fields: address_hash (PK), hierarchy (state->sublocality), geocode, polygon, confidence
The output of the address-resolution system: a precise geocode, a locality hierarchy, and a polygon boundary, derived from address text and delivery GPS traces. Routing and sorting use this rather than the raw pin code.
facilities
Fields: facility_id (PK), type (sort_center|gateway|delivery_center), geo, capacity_per_day
The nodes of the hub-and-spoke network. Types and capacities differ, and the network moves parcels between them. Exact counts vary by period.
scans
Fields: scan_id (PK), shipment_id (FK), facility_id, scan_type, ts
The event stream of the system: a scan at each step, billions in total, driving live tracking and internal visibility. The append-only record of a parcel's path.
routes
Fields: route_id (PK), agent_id, vehicle_type, stops[], date
A last-mile delivery route: an ordered set of stops for an agent, optimized for the vehicle type. vehicle_type matters because a heavy vehicle and a two-wheeler travel differently.
fleet_locations
Fields: vehicle_id (PK), lat, lng, ts
Live vehicle positions, on the order of a billion pings a day from 100,000-plus vehicles. Feeds routing, tracking, and the map and address systems. Held in a fast store, not durably at full fidelity.
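The shipments table says that state "moves through the network stages", which suggests a small state machine. The sketch below is one way to write that down. The state names and allowed transitions are hypothetical, chosen to mirror the journey described in this walkthrough; they are not a published Delhivery schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class State(Enum):
    # Hypothetical stage names mirroring the first-mile to last-mile journey.
    CREATED = "created"
    PICKED_UP = "picked_up"
    AT_SORT_CENTER = "at_sort_center"
    IN_LINE_HAUL = "in_line_haul"
    AT_DELIVERY_CENTER = "at_delivery_center"
    OUT_FOR_DELIVERY = "out_for_delivery"
    DELIVERED = "delivered"
    RETURNING = "returning"
    RETURNED = "returned"


# Allowed forward moves. A parcel may pass through several sort centers,
# and a failed delivery attempt sends it back to the delivery center.
ALLOWED = {
    State.CREATED: {State.PICKED_UP},
    State.PICKED_UP: {State.AT_SORT_CENTER},
    State.AT_SORT_CENTER: {State.IN_LINE_HAUL, State.AT_DELIVERY_CENTER},
    State.IN_LINE_HAUL: {State.AT_SORT_CENTER, State.AT_DELIVERY_CENTER},
    State.AT_DELIVERY_CENTER: {State.OUT_FOR_DELIVERY, State.RETURNING},
    State.OUT_FOR_DELIVERY: {State.DELIVERED, State.AT_DELIVERY_CENTER},
    State.DELIVERED: set(),
    State.RETURNING: {State.RETURNED},
    State.RETURNED: set(),
}


@dataclass
class Shipment:
    shipment_id: str
    seller_id: str
    raw_address: str
    resolved_location: Optional[Tuple[float, float]] = None  # (lat, lng)
    current_facility_id: Optional[str] = None
    state: State = State.CREATED


def advance(current: State, nxt: State) -> State:
    """Return nxt if the transition is allowed, else raise ValueError."""
    if nxt not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {nxt.name}")
    return nxt
```

Rejecting illegal transitions at write time is a design choice: it keeps a stray or duplicated scan from moving a delivered parcel back into the network.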
Deep dives
These are the conversations the interviewer is steering you toward. Practice each one until you can talk through it without notes.
Why Indian address resolution is the core problem
The distinctive Delhivery insight is that in India, the address itself is unreliable, and that breaks naive logistics. A pin code, the postal code, can cover a median of about 90 square kilometers and up to a million households, so it is far too coarse to deliver to, and Delhivery stated that 20 to 30 percent of written addresses carry the wrong pin code entirely. Addresses also have spelling errors, missing localities, landmark-based directions rather than street numbers, and transliteration variants across languages. If you sort and route purely by the written pin code, you misroute a large fraction of parcels. So the address is not a field to look up, it is a noisy signal to interpret. Delhivery framed this explicitly and built a system to decode it, which is why address intelligence, rather than trucks, is the heart of its published engineering. For an interview, recognizing that the address is the hard, unsolved input, and that solving it unlocks everything downstream, is the key insight.
How Delhivery resolves a messy address
Delhivery's data-science leaders published how its address-resolution system, which they called AddFix, works. Rather than hand-writing rules for every address format, it learns from two sources together: the text of the addresses customers write, and the GPS traces of the delivery agents who actually went to those addresses. From this it learns, in an unsupervised way, locality names, a hierarchy from state down through city, locality, and sublocality to rooftop, alternate and misspelled forms of place names, and geographic polygon boundaries for localities, plus a phonetic fuzzy search tuned for Indian-language variants so that differently spelled versions of the same place match. The result is that a new, messy address can be placed into that learned structure and assigned a precise location. Delhivery reported that a rule-based first version resolved 80 to 85 percent of addresses to 500-meter precision, while a later, learned version resolves more than 90 percent of shipments to about 200 meters. The lesson is a general one: when the input is noisy human text tied to physical reality, learning from behavior, here the agents' GPS, is more powerful than rules, because the ground truth is where the parcels actually went.
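To make the phonetic fuzzy matching idea concrete, here is a minimal sketch that reduces a place name to a rough phonetic key and compares keys by string similarity. It is not AddFix: the folding rules, the threshold, and the four-entry locality list are illustrative assumptions, and the real system learns its variants from address text and GPS traces rather than from hand-written rules like these.

```python
import re
from difflib import SequenceMatcher

# Hypothetical gazetteer of canonical locality names (illustrative only).
LOCALITIES = ["Koramangala", "Indiranagar", "Whitefield", "Malviya Nagar"]

# Transliteration variants folded to one form. These rules are guesses
# for illustration, not Delhivery's tuned phonetic model.
_FOLDS = [("ph", "f"), ("w", "v"), ("ee", "i"), ("oo", "u"), ("y", "i")]


def phonetic_key(name: str) -> str:
    """Reduce a name to a rough consonant skeleton."""
    s = re.sub(r"[^a-z]", "", name.lower())   # letters only, no spaces
    for src, dst in _FOLDS:
        s = s.replace(src, dst)
    s = re.sub(r"(.)\1+", r"\1", s)           # collapse repeated letters
    # Keep the first letter, drop vowels from the rest.
    return s[:1] + re.sub(r"[aeiou]", "", s[1:])


def best_locality(raw: str, threshold: float = 0.75):
    """Return (canonical name or None, similarity score)."""
    key = phonetic_key(raw)
    scored = [
        (SequenceMatcher(None, key, phonetic_key(c)).ratio(), c)
        for c in LOCALITIES
    ]
    score, name = max(scored)
    if score >= threshold:
        return name, round(score, 2)
    return None, round(score, 2)


print(best_locality("Kormangla"))       # misspelling of Koramangala
print(best_locality("Malaviya Nagar"))  # alternate transliteration
print(best_locality("Jayanagar"))       # not in the gazetteer
```

The first two inputs reduce to the same key as their canonical names, so they match exactly; the third falls below the threshold and returns no match, which is the case a production system would route to a fallback or a human.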
Moving from pin-code sorting to geocode routing
The architectural payoff of address resolution, which Delhivery states, is that it can change how it sorts and routes. Traditional courier operations sort by pin code, which is simple but coarse and wrong a fraction of the time. Once a system can resolve an address to a precise geocode and locality with high accuracy, it can sort and route by that resolved location instead, sending a parcel to the correct delivery center and the correct beat within it, rather than to whatever the written pin code says. This is a concrete example of how an upstream data-quality improvement changes a downstream physical process: better address understanding means fewer misroutes, more first-attempt deliveries, and tighter last-mile routes. Delhivery reported large reductions in misroutes and lost shipments over the period it invested in this. The interview framing is that the routing system should be built on the resolved, trustworthy location, not on the raw, unreliable field the customer typed.
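A small sketch of that idea: choose the delivery center from the resolved geocode when the resolver is confident, and fall back to the written pin code otherwise. The fallback rule, the confidence threshold, and the facility names and coordinates are all hypothetical; Delhivery has not published its selection logic.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ResolvedAddress:
    lat: float
    lng: float
    confidence: float   # 0..1, from the address resolver
    written_pin: str    # the pin code the customer typed


# Hypothetical delivery centers and pin-code table.
DELIVERY_CENTERS = {"DC-NORTH": (12.99, 77.59), "DC-SOUTH": (12.91, 77.60)}
PIN_TO_CENTER = {"560001": "DC-NORTH", "560076": "DC-SOUTH"}


def haversine_km(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance between two (lat, lng) points in kilometers."""
    lat1, lng1, lat2, lng2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def pick_delivery_center(addr: ResolvedAddress,
                         min_confidence: float = 0.8) -> Tuple[str, str]:
    """Return (center id, which signal was used)."""
    if addr.confidence >= min_confidence:
        point = (addr.lat, addr.lng)
        center = min(DELIVERY_CENTERS,
                     key=lambda c: haversine_km(point, DELIVERY_CENTERS[c]))
        return center, "geocode"
    # Low confidence: fall back to the written pin code (an assumption).
    return PIN_TO_CENTER.get(addr.written_pin, "MANUAL_REVIEW"), "pin_code"


# The written pin points north, but the resolved location is in the south.
wrong_pin = ResolvedAddress(12.915, 77.61, confidence=0.95, written_pin="560001")
print(pick_delivery_center(wrong_pin))
```

A production version would more likely test the point against delivery-center polygons than pick the nearest center by straight-line distance; nearest-center is used here only to keep the sketch short.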
The hub-and-spoke network and the parcel's journey
The physical side is a hub-and-spoke network, and it is worth walking the journey. A parcel is picked up from a seller in the first mile and brought to a sort center. Sort centers, many automated, sort it by destination, and line-haul, the middle mile, carries it between sort centers and gateways across the country, consolidating loads so trucks run full. At the destination region it reaches a local delivery center, and the last mile takes it to the doorstep, with agents on optimized routes. Delhivery runs this at national scale, with tens of automated sort centers handling millions of parcels a day, dozens of gateways, and thousands of delivery centers. The design tension in any such network is consolidation versus speed: routing everything through big sort centers is efficient because trucks run full and sortation is automated, but it adds hops and time, so the network is tuned to balance cost against delivery speed. The specific routing and load-optimization algorithms are the standard logistics approach rather than a Delhivery-published design, but the structure and its scale are real.
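The consolidation-versus-speed tension can be expressed as a single weighted cost. The sketch below compares a direct route with a consolidated route through a hub. Every number in it is hypothetical, and the linear "rupees per hour of delay" weight is an assumed simplification, not a Delhivery model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RouteOption:
    name: str
    cost_per_parcel: float   # rupees, hypothetical
    transit_hours: float     # hypothetical


OPTIONS = [
    RouteOption("direct", cost_per_parcel=40.0, transit_hours=10.0),
    RouteOption("via_hub", cost_per_parcel=18.0, transit_hours=17.0),
]


def choose(options: List[RouteOption], rupees_per_hour: float) -> RouteOption:
    """Pick the option with the lowest cost plus time penalty."""
    return min(
        options,
        key=lambda o: o.cost_per_parcel + rupees_per_hour * o.transit_hours,
    )


def break_even_rupees_per_hour(fast: RouteOption, slow: RouteOption) -> float:
    """Value of an hour at which the fast and slow options cost the same."""
    return ((fast.cost_per_parcel - slow.cost_per_parcel)
            / (slow.transit_hours - fast.transit_hours))


direct, via_hub = OPTIONS
print(choose(OPTIONS, rupees_per_hour=1.0).name)   # time is cheap
print(choose(OPTIONS, rupees_per_hour=5.0).name)   # time is expensive
print(round(break_even_rupees_per_hour(direct, via_hub), 2))
```

With these made-up numbers the hub route wins while an hour of delay is valued below about 3.14 rupees per parcel, and the direct route wins above that. The point for an interview is the shape of the trade-off, not the values.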
Tracking and forecasting at the scale of e-commerce
Delhivery positions itself as infrastructure for e-commerce, and two data problems follow. First, tracking: every parcel is scanned at each step, which at this volume produces billions of status scans, and those feed both the live status a customer sees and the internal visibility operations needs. That is a high-throughput event pipeline, though Delhivery has not published the specific technology, so it is described as the standard pattern of a scan event stream feeding a tracking store and a data platform. Second, forecasting: volume swings enormously, especially in festive season, when a peak month can exceed 100 million shipments and a peak day can reach about 7 million. Delhivery reported using predictive modeling for demand planning so that capacity (staff, vehicles, and sort-center throughput) is in place ahead of the surge rather than assembled in reaction to it. It reported concrete gains from its data investment, including higher deliveries per agent shift and large reductions in misroutes and lost shipments. The lesson is that a logistics network is a data system as much as a physical one, and that tracking and forecasting are first-class.
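The tracking half of this can be sketched as a consumer that folds scan events into a latest-status view. Scanners retry and networks reorder, so the sketch treats duplicates and late arrivals explicitly. It is an in-memory stand-in for the standard pattern, with hypothetical class and scan-type names; Delhivery has not published its pipeline.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class Scan:
    scan_id: str
    shipment_id: str
    facility_id: str
    scan_type: str
    ts: int  # epoch seconds at the scanner


class TrackingStore:
    """In-memory stand-in for a tracking store fed by a scan event stream."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()          # scan_ids already applied
        self._latest: Dict[str, Scan] = {}    # shipment_id -> newest scan

    def apply(self, scan: Scan) -> bool:
        """Apply one scan; return True if the visible status changed."""
        if scan.scan_id in self._seen:
            return False                      # duplicate delivery of an event
        self._seen.add(scan.scan_id)
        current = self._latest.get(scan.shipment_id)
        if current is not None and scan.ts <= current.ts:
            return False                      # late arrival: do not regress
        self._latest[scan.shipment_id] = scan
        return True

    def status(self, shipment_id: str) -> Optional[Tuple[str, str]]:
        scan = self._latest.get(shipment_id)
        return None if scan is None else (scan.scan_type, scan.facility_id)


store = TrackingStore()
store.apply(Scan("e1", "S1", "SC-A", "arrived_sort_center", ts=100))
store.apply(Scan("e3", "S1", "DC-D", "out_for_delivery", ts=300))
store.apply(Scan("e2", "S1", "SC-A", "departed_sort_center", ts=200))  # late
store.apply(Scan("e3", "S1", "DC-D", "out_for_delivery", ts=300))      # dup
print(store.status("S1"))
```

In a real system the full scan history would still be appended durably, and the deduplication set would need a bound or a time window; only the customer-facing status is protected from regressing here.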
Vehicle-aware routing and building its own maps
Delhivery's newer work extends the address intelligence into a full geospatial system, including its own maps, and the distinctive piece is vehicle-aware routing. Generic consumer maps optimize for a car, but a logistics network runs heavy vehicles on line-haul and two-wheelers or small vehicles on last mile, which travel at different speeds and have different access, so routing that ignores the vehicle is wrong. Delhivery built routing that accounts for the vehicle type, along with geocoding and landmark-based navigation suited to incomplete Indian addresses, tested on billions of shipments and backed by about a billion GPS pings a day from its own fleet. Building this in-house, rather than relying on a third-party map, follows from the same logic as the address work: the company's own delivery data is the best source of truth for how its vehicles actually move through Indian roads and addresses. The interview point is that at this scale, the map and routing layer is itself a system worth owning, because it is tuned to the specific physical reality of the network.
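One way to picture vehicle-aware routing is a shortest-path search where both the edge weights and the set of usable edges depend on the vehicle. The sketch below runs Dijkstra's algorithm over a toy road graph. The road classes, speeds, and access rules are hypothetical, and this is a generic technique rather than Delhivery's published routing engine.

```python
import heapq

# Hypothetical speeds in km/h per (vehicle, road class).
# A missing entry means that vehicle cannot use that road class.
SPEED_KMPH = {
    ("truck", "main"): 30.0,
    ("two_wheeler", "main"): 40.0,
    ("two_wheeler", "lane"): 20.0,
}

# Hypothetical two-way roads: (from, to, length in km, road class).
ROADS = [
    ("A", "B", 6.0, "main"),
    ("A", "L", 1.0, "lane"),
    ("L", "B", 1.0, "lane"),
]


def fastest_route(src: str, dst: str, vehicle: str):
    """Return (path, minutes) for this vehicle, or None if unreachable."""
    graph = {}
    for u, v, km, road_class in ROADS:
        speed = SPEED_KMPH.get((vehicle, road_class))
        if speed is None:
            continue                      # this vehicle has no access
        minutes = km * 60 / speed
        graph.setdefault(u, []).append((v, minutes))
        graph.setdefault(v, []).append((u, minutes))

    heap = [(0.0, src, [src])]            # (minutes so far, node, path)
    settled = set()
    while heap:
        total, node, path = heapq.heappop(heap)
        if node == dst:
            return path, total
        if node in settled:
            continue
        settled.add(node)
        for nxt, minutes in graph.get(node, []):
            if nxt not in settled:
                heapq.heappush(heap, (total + minutes, nxt, path + [nxt]))
    return None


print(fastest_route("A", "B", "two_wheeler"))
print(fastest_route("A", "B", "truck"))
```

In this toy graph the two-wheeler cuts through the lanes in 6 minutes while the truck is forced onto the main road and takes 12, so the same origin and destination produce different routes and different arrival times depending on the vehicle.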
Trade-offs to discuss
Every senior interviewer expects you to surface at least 3 of these. Pick the decisions, state the alternatives, and justify your choice.
Learning address resolution from data versus trusting the written pin code
Sorting and routing by the written pin code is trivial, but it is coarse and wrong 20 to 30 percent of the time in India, causing misroutes and failed deliveries. Learning to resolve an address from text and delivery GPS traces is far more accurate (more than 90 percent of shipments resolved to about 200 meters), but it requires building and training a machine-learning system and the data pipeline behind it. Delhivery judged the accuracy worth the investment because misroutes and failed deliveries are expensive at its volume, and the resolved location improves everything downstream.
Geocode and locality routing versus pin-code sorting
Pin-code sorting is the industry default and simple to operate, but it inherits the pin code's coarseness and errors. Routing on a resolved geocode and locality sends parcels to the right delivery center and beat, cutting misroutes and improving first-attempt delivery. The cost is depending on the address-resolution system being accurate and available, which Delhivery accepted because the downstream gains, fewer misroutes and lost shipments, are large.
A tech-led, fully integrated network versus an asset-light aggregator
An asset-light model that brokers third-party couriers is cheaper to start and avoids running facilities, but it gives up control over quality, data, and the whole chain. Delhivery built and runs an integrated network, first mile through last mile, with its own software and machine learning, which gives control and a data advantage at the cost of heavy capital and operational complexity. At national e-commerce scale, the integration is what makes the address and routing intelligence possible, since the network generates the delivery data the models learn from.
Consolidating through big automated sort centers versus point-to-point
Point-to-point delivery is fastest for a single parcel but wasteful, since vehicles run empty and there is no sortation leverage. Consolidating through large automated sort centers keeps trucks full and sortation efficient, at the cost of extra hops and some added time per parcel. The network is tuned to balance this, using consolidation for efficiency while managing the number of hops so delivery speed stays acceptable.
Building its own vehicle-aware maps versus using a third-party map
Using a third-party consumer map is less work, but such maps optimize for a car and do not model heavy-vehicle constraints or the landmark-based, incomplete addresses common in India. Delhivery built its own geospatial and routing layer from its delivery data, which fits its vehicles and addresses far better, at the cost of building and maintaining a maps system. Its own billion-ping-a-day fleet data is exactly what makes an in-house map more accurate for its use than a generic one.
How Delhivery actually does it
Delhivery's address-resolution and data work are documented by its own data-science leaders on their engineering blog, its financials and peak volumes are reported because it is publicly listed, and the routing and streaming internals are the standard logistics pattern rather than published designs.
On addresses, Delhivery stated that Indian pin codes can span a median of about 90 square kilometers with up to a million households and that 20 to 30 percent of written addresses carry the wrong pin code. It described a system that resolves addresses using generative machine learning over address text and delivery-agent GPS traces, unsupervised, deriving locality, hierarchy, alternate spellings, and polygon boundaries with phonetic matching for Indian languages. The reported improvement was from 80 to 85 percent of addresses at 500-meter precision in a rule-based version to more than 90 percent of shipments at about 200 meters in a later version, and Delhivery stated that this lets it move off pin-code-based sorting toward geocode-based routing.
Its data-platform writing, from an earlier period, cited handling on the order of 75 to 100 million packages a year, generating around 3 billion status scans, and reported gains including higher deliveries per agent shift and reductions in misroutes and lost shipments, without naming a specific streaming or database technology.
On scale and financials, Delhivery reported revenue of about 8,931.9 crore rupees in FY25, its first full profitable year since its 2022 listing, and a festive peak of more than 107 million shipments in October 2025 with a single-day peak around 7.2 million. Its newer geospatial suite offers vehicle-aware routing and geocoding, tested on billions of shipments and backed by about a billion GPS pings a day from more than 100,000 vehicles.
Three accuracy notes for the interview. First, Delhivery's exact facility counts (sort centers, gateways, and delivery centers) vary across sources and periods, so use ranges rather than a single number. Second, no Delhivery source names its streaming or database stack, so a high-throughput scan pipeline is described as the industry pattern, not a stated fact. Third, the network routing and load-optimization algorithms are the standard logistics approach rather than published Delhivery designs.
Sources
- Delhivery Data Sciences, Learning to Decode Unstructured Indian Addresses: the AddFix machine-learning address-resolution system, by Delhivery's Head of Data Sciences
- Delhivered by Data, Delhivery's data platform and its algorithmic, predictive, and descriptive pillars, by a former Chief Data Scientist
- Entrackr, Delhivery FY25 results: about 8,931.9 crore rupees revenue and its first full profitable year
- The Indian Transport and Logistics News, Delhivery crosses 107 million shipments in October 2025 with a 7.2 million single-day peak
- IndianWeb2, Delhivery launches Delhivery Maps: vehicle-aware routing and geocoding tested on billions of shipments with a billion GPS pings a day
- Business Standard, Delhivery Q4 and FY25 results: revenue and profitability detail