biology computer-science library-and-museum-studies political-science

Sharding

Description

Sharding partitions a single logical dataset across multiple physical stores by applying a key function to each record. Each shard holds a disjoint slice of the data; reads and writes are routed to the owning shard via the same key function. The structural move is “split then route” — and the choice of key function is the load-bearing decision, because it determines both the routing logic and the failure modes (hot shards, cross-shard transactions, resharding pain). The diagnostic shape distinguishes sharding from related concepts: replication duplicates (same data, multiple stores, redundancy); sharding splits (different data, multiple stores, scale). The two co-occur in practice — most production systems are sharded and replicated within each shard — but solve different problems.
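A minimal sketch of the “split then route” move, assuming in-memory dicts stand in for physical stores and a hash-modulo key function. The class name `ShardedStore` and its methods are hypothetical, chosen only for illustration.

```python
import hashlib


class ShardedStore:
    """Toy sharded key-value store: each dict stands in for one physical store."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def shard_for(self, key: str) -> int:
        # Key function: a stable hash of the key, reduced modulo the shard count.
        # hashlib is used because Python's built-in hash() of str is salted per process.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key: str, value) -> None:
        # Writes are routed with the same key function as reads.
        self.shards[self.shard_for(key)][key] = value

    def get(self, key: str):
        return self.shards[self.shard_for(key)].get(key)


store = ShardedStore(num_shards=4)
for user in ("alice", "bob", "carol", "dave"):
    store.put(user, {"name": user})
```

Each record lives in exactly one shard, and `get` never consults the others. The modulo step is the simplest possible choice and also shows the resharding pain: changing `num_shards` changes the owner of most keys.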

Triggers

User-initiated: User describes a system that can’t fit on a single store (capacity, throughput, or fault-domain), or describes a per-tenant / per-region / per-customer split. Vocabulary cues: “shard,” “partition,” “horizontal scale,” “split by key,” “per-tenant DB.”
Agent-initiated: Engine notices a system whose capacity ceiling is the single-store ceiling, or whose latency profile shows tail latency dominated by one hot subset of data. Candidate inference: “this dataset wants to be sharded. What’s the key function, and does it distribute uniformly?”
Situation-shape signals: Single-store capacity exhausted; per-tenant isolation needed; geographic distribution required; need to scale writes (not just reads, which is what replication solves).

Exclusions

  • Single-store fits comfortably — sharding is premature; the operational complexity (resharding, cross-shard transactions, routing layer) doesn’t earn its keep.
  • Most queries cross shards — if the access pattern doesn’t align with the shard key, every query fans out to all shards; you’ve paid sharding’s cost without earning its benefit.
  • Strong cross-shard ACID required — distributed transactions across shards are expensive and operationally fragile; a single store with vertical scaling may be the cheaper answer until you’re forced off it.

Structure

Sharding = container (the shards) + key-function-as-grain (what cuts the data) + router (multi-hop-routing through the routing layer). The grain choice is upstream of everything else: get the key wrong (e.g., shard a multi-tenant system by user_id when most queries are per-tenant), and the routing layer pays the cost on every cross-shard query.
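The cost of a wrong grain can be made concrete with a small experiment. The sketch below assumes a hypothetical table of (tenant_id, user_id) rows and a hash-modulo key function, and counts how many shards a “all rows for one tenant” query must visit under each choice of shard key.

```python
import hashlib

NUM_SHARDS = 8  # assumed shard count for the illustration


def shard_of(value: str) -> int:
    # Stable hash of the shard-key value, reduced modulo the shard count.
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# Hypothetical rows: (tenant_id, user_id), 3 tenants with 50 users each.
rows = [(f"tenant-{t}", f"user-{t}-{u}") for t in range(3) for u in range(50)]


def shards_touched(tenant: str, key_index: int) -> set:
    """Shards a per-tenant query must visit when column `key_index` is the shard key."""
    return {shard_of(row[key_index]) for row in rows if row[0] == tenant}


by_tenant = shards_touched("tenant-1", key_index=0)  # shard key = tenant_id
by_user = shards_touched("tenant-1", key_index=1)    # shard key = user_id
```

With tenant_id as the key, the per-tenant query lands on a single shard. With user_id as the key, the same query spreads over several shards, which is the fan-out the routing layer then has to pay for.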

Relationships

  • replication — sharding + replication is the standard distributed-storage pattern: split for scale, replicate within each shard for redundancy.
  • grain — the shard key is a grain choice; wrong grain → hot shards or cross-shard pain.
  • load-balancing — uniform key distribution = uniform load; non-uniform key → load-balancing problem reappears at the shard level.
  • multi-hop-routing — the router is the always-traversed hub between client and shard.
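The load-balancing relationship can be illustrated by measuring per-shard request counts. This is a sketch under assumptions: the request streams are synthetic, the key function is hash-modulo, and `imbalance` (max load over mean load) is one possible skew metric among several.

```python
import hashlib
from collections import Counter


def shard_of(key: str, num_shards: int) -> int:
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_load(requests, num_shards: int) -> list:
    """Number of requests each shard receives, indexed by shard number."""
    counts = Counter(shard_of(key, num_shards) for key in requests)
    return [counts.get(i, 0) for i in range(num_shards)]


def imbalance(load) -> float:
    """Max shard load divided by mean shard load; 1.0 means perfectly even."""
    mean = sum(load) / len(load)
    return max(load) / mean if mean else 0.0


# Synthetic traffic: many distinct keys versus one dominant key.
uniform = [f"key-{i}" for i in range(10_000)]
skewed = ["celebrity"] * 9_000 + [f"key-{i}" for i in range(1_000)]

uniform_skew = imbalance(shard_load(uniform, 4))
hot_skew = imbalance(shard_load(skewed, 4))
```

A good hash spreads distinct keys evenly, but it cannot split a single hot key: every request for `"celebrity"` lands on the same shard, so the skew reappears at the shard level.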

Examples

Database partitioning · computer-science

Postgres declarative partitioning, MySQL with Vitess, MongoDB sharding, DynamoDB partition keys. The canonical engineering case.

Library card catalogs (pre-digital) · library-and-museum-studies

Alphabetical drawers are range-partitioned shards; the librarian routes via the key (first letter of surname).
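The card-catalog routing can be sketched as range partitioning with a sorted list of boundaries. The drawer boundaries below are hypothetical, and the sketch assumes every surname starts with a letter from A to Z.

```python
import bisect

# Hypothetical drawer boundaries: each entry is the first letter a drawer covers.
# Drawer 0 holds A-D, drawer 1 holds E-H, and so on; the last drawer holds V-Z.
DRAWER_STARTS = ["A", "E", "I", "M", "R", "V"]


def drawer_for(surname: str) -> int:
    """Index of the drawer whose letter range contains the surname's first letter."""
    first = surname[0].upper()
    # bisect_right counts the drawer starts <= first; subtract one to get the index.
    return bisect.bisect_right(DRAWER_STARTS, first) - 1


def drawers_for_range(low: str, high: str) -> list:
    """Drawers to open for every surname between `low` and `high`, inclusive."""
    return list(range(drawer_for(low), drawer_for(high) + 1))
```

For example, `drawer_for("Eliot")` returns 1 and `drawers_for_range("F", "N")` returns `[1, 2, 3]`. Range partitioning keeps neighboring keys together, so a range lookup touches only a contiguous run of shards, at the price of uneven drawers when surnames cluster on a few letters.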
A content delivery network (CDN) distributes cached content across geographically distributed points of presence (POPs); each POP holds a subset of the origin’s content, primarily the items recently requested by users near that POP. The sharding key is effectively geographic-region × content-popularity: each edge node holds the slice of content its region’s user base is currently interested in. The router is the DNS resolution layer (often combined with anycast routing): when a user requests a hostname, DNS resolves it to the POP closest to them, sending the request to the shard that owns the corresponding slice.
The structural insight is that the sharding key need not be a property of the data alone; it can be a joint property of data and the access pattern (where requesters are, what they ask for). CDNs effectively shard on expected demand topology, which is what allows the system to satisfy most requests with a small fraction of the origin’s total content at any single edge.
Inference: Sharding designs benefit from considering the request-routing topology, not only the data-distribution topology. A shard key chosen purely from intrinsic data properties (user-id, region, type) can create hot shards if the request pattern is skewed against it. A shard key that incorporates access locality (where the request comes from, what session it is part of, what other items are co-accessed) often gives better load distribution at the cost of more complex routing.
Multicellular organisms shard genome execution across cell types; the regulatory network is the router (which genes express in which tissue).
Amazon’s Dynamo paper described a partitioned, replicated key-value store built around consistent hashing. Each key is hashed onto a ring; the first node found walking clockwise from that point (and its successors, for replication) owns the partition. Adding or removing a node moves only the keys near the affected ring positions rather than re-partitioning the whole dataset.
The example instantiates sharding’s “split-then-route” shape with consistent hashing as the routing function. Choosing hash-on-key as the partition function trades range-query support for near-uniform distribution and minimal data movement under topology change, the canonical tradeoff that sharding designs surface.
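A minimal consistent-hashing sketch in the spirit of the ring described above, not the paper's implementation. The class `HashRing`, the node names, and the choice of 64 ring points per node are assumptions for illustration; replication to successor nodes is omitted, and the ring is assumed to be non-empty when `owner` is called.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    # Map any string to a point on the ring (here, a 64-bit integer).
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes: int = 64) -> None:
        self.vnodes = vnodes
        self._points = []  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # Place several points per node so load spreads more evenly.
        for i in range(self.vnodes):
            bisect.insort(self._points, (ring_position(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._points = [p for p in self._points if p[1] != node]

    def owner(self, key: str) -> str:
        # First ring point at or after the key's position, wrapping past the end.
        idx = bisect.bisect_left(self._points, (ring_position(key), ""))
        return self._points[idx % len(self._points)][1]


keys = [f"key-{i}" for i in range(1000)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.owner(k) for k in keys}
ring.add("node-d")
after = {k: ring.owner(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` is now owned by `node-d`: adding a node only takes keys from the ring segments it lands on, and no key changes hands between the existing nodes.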
States are shards of national policy; each state has the same schema (constitution-bounded) but different contents (state law). Cross-state transactions require routing through the federal layer.
Chapter 6 of Designing Data-Intensive Applications is the canonical engineering treatment of sharding (called “partitioning” in the text). Kleppmann lays out the two dominant partition strategies, range-based and hash-based, and the routing problem that follows from each: how a client finds the node holding a given key, and how the system rebalances when nodes join or leave.
The chapter is the cross-domain reference because it factors sharding into its constitutive parts (a partition function from key to partition; a routing layer that maps partition to node; a rebalancing strategy under topology change) in a way that survives translation to non-database settings: CDN edge sharding, microservice data ownership, and federated systems reuse the same decomposition.
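The three-part decomposition can be sketched with a fixed number of partitions, one of the rebalancing approaches used in practice. Everything below is illustrative: the partition count, the round-robin initial assignment, the greedy rebalancing rule, and all function names are assumptions rather than anything prescribed by the chapter.

```python
import hashlib

NUM_PARTITIONS = 16  # fixed up front; assumed to exceed the node count


def partition_of(key: str) -> int:
    """Partition function: key -> partition. Never changes as nodes come and go."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def assign(nodes) -> dict:
    """Routing table: partition -> node, filled round-robin."""
    return {p: nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)}


def add_node(assignment: dict, new_node: str) -> dict:
    """Rebalance by moving whole partitions to the new node until it has a fair share."""
    result = dict(assignment)
    nodes = set(result.values()) | {new_node}
    target = NUM_PARTITIONS // len(nodes)
    counts = {n: 0 for n in nodes}
    for owner in result.values():
        counts[owner] += 1
    for p in sorted(result):
        if counts[new_node] >= target:
            break
        donor = result[p]
        if counts[donor] > target:  # only take from nodes above their fair share
            result[p] = new_node
            counts[donor] -= 1
            counts[new_node] += 1
    return result


def node_for(key: str, assignment: dict) -> str:
    return assignment[partition_of(key)]


before = assign(["n1", "n2", "n3"])
after = add_node(before, "n4")
moved = [p for p in before if before[p] != after[p]]
```

Here 4 of the 16 partitions move, all of them to `n4`. Only the partition-to-node table changes; the key-to-partition mapping is untouched, which is what keeps the three parts independent.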
Service boundaries are shard boundaries on the data layer; each service owns its slice.
Vitess is the open-source sharding system that YouTube built and ran in production to scale MySQL past the point where a single instance could hold the metadata workload, and later donated to the CNCF. Its defining architectural move is making the router an explicit, separate component (vtgate) sitting between clients and the underlying MySQL shards (vttablet-fronted instances). Clients speak the standard MySQL wire protocol to vtgate, which parses queries, consults the schema-aware routing rules (which tables are sharded in which keyspace and by which vindex), fans queries out to the right shards, and recombines the results. Schema changes, resharding (splitting hot shards), and cross-shard transactions are all coordinated through this routing layer rather than baked into the application code. The architecture makes the shard-key choice and the routing logic first-class, debuggable artifacts rather than ad-hoc client-side conditionals scattered through the application.
Inference: When sharding becomes load-bearing for a system’s scale, making the router its own component pays for itself through what it makes visible: hot-shard diagnostics, resharding orchestration, cross-shard query planning, and per-tenant routing rules all become operations against a single addressable surface rather than archaeology across application code. The Vitess design also surfaces the inverse warning: if the access pattern doesn’t align with the shard key (most queries cross shards), the explicit router still serves them, but every query becomes a fan-out and the system pays sharding’s cost without earning its benefit. The router-as-layer move only pays off when most queries route to a single shard via the key; when they don’t, the diagnostic is to revisit the key function, not to optimize the router.
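The router-as-layer idea can be illustrated with a toy scatter-gather router. This is not Vitess code and does not use any Vitess API; the `Router` class, its `shards_visited` counter, and the sample rows are all hypothetical, with in-memory lists standing in for the shards.

```python
import hashlib


class Router:
    """Toy routing layer in front of in-memory shards, keyed by tenant_id."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [[] for _ in range(num_shards)]
        self.shards_visited = 0  # diagnostic: how many shard reads queries have cost

    def _shard_for(self, tenant_id: str) -> int:
        digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def insert(self, row: dict) -> None:
        self.shards[self._shard_for(row["tenant_id"])].append(row)

    def query(self, where, tenant_id=None) -> list:
        if tenant_id is not None:
            # The query names the shard key: route to the one owning shard.
            targets = [self._shard_for(tenant_id)]
        else:
            # No shard key in the query: scatter to every shard and gather.
            targets = range(len(self.shards))
        results = []
        for idx in targets:
            self.shards_visited += 1
            for row in self.shards[idx]:
                if (tenant_id is None or row["tenant_id"] == tenant_id) and where(row):
                    results.append(row)
        return results


router = Router(num_shards=4)
for t in range(5):
    for n in range(3):
        router.insert({"tenant_id": f"t{t}", "order": n, "total": 10 * n})

single = router.query(lambda r: True, tenant_id="t2")
visited_single = router.shards_visited

router.shards_visited = 0
fanout = router.query(lambda r: r["total"] >= 20)
visited_fanout = router.shards_visited
```

The per-tenant query reads one shard; the query on `total` reads all four. Because routing lives in one place, the counter exposes the fan-out directly, which is the kind of visibility the inference above attributes to an explicit router.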