Database Sharding Strategies: A Practical Guide to Horizontal Scaling
Your PostgreSQL database handles 10,000 queries per second beautifully. Then your application grows, the table hits 500 million rows, and suddenly queries that used to take 5ms take 500ms. Vertical scaling (bigger hardware) buys time, but database sharding is the long-term answer for horizontal scaling. This guide covers when to shard, how to choose a sharding strategy, and the tools that make it manageable.
When to Shard (And When Not To)
Sharding adds enormous complexity to your application. Before you shard, exhaust these alternatives:
- Optimize queries and add indexes
- Implement read replicas for read-heavy workloads
- Add caching with Redis or Valkey
- Archive old data to cold storage
- Partition tables (single database, multiple physical segments)
- Vertically scale your hardware
Many applications never need sharding: a properly optimized PostgreSQL instance on modern hardware can handle billions of rows and tens of thousands of queries per second.
Shard when:
- A single primary cannot handle your write throughput, and read replicas do not help because the bottleneck is writes, not reads
- Your dataset exceeds what fits on a single server's storage
- Regulatory requirements mandate data residency in specific regions
- You need sub-millisecond latency for geographically distributed users
Hash Sharding vs Range Sharding
The sharding key determines which shard holds each row. Choosing the right key and strategy is the most critical decision.
HASH SHARDING:
shard_id = hash(tenant_id) % num_shards
Pros:
- Even data distribution across shards
- No hotspots (assuming good hash function)
- Simple to implement
Cons:
- Range queries across shards are expensive
- Adding shards requires resharding (rehashing all data)
- Cannot colocate related data easily
Best for: Multi-tenant SaaS, user-partitioned data
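The hash formula above can be sketched in a few lines. CRC32 is an assumption here; any hash that is stable across processes and languages works, but avoid String.hashCode(), which is JVM-specific. Math.floorMod guards against negative intermediate values after the cast:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class HashShardRouter {

    private final int numShards;

    public HashShardRouter(int numShards) {
        this.numShards = numShards;
    }

    // shard_id = hash(tenant_id) % num_shards
    public int shardFor(String tenantId) {
        CRC32 crc = new CRC32();
        crc.update(tenantId.getBytes(StandardCharsets.UTF_8));
        // getValue() is an unsigned 32-bit value in a long; the cast to int
        // can go negative, so floorMod keeps the result in [0, numShards).
        return Math.floorMod((int) crc.getValue(), numShards);
    }
}
```

Note the con listed above: changing numShards changes almost every tenant's shard assignment, which is why adding shards under plain modulo hashing forces a full reshard.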
RANGE SHARDING:
shard_id = range_lookup(created_date)
Example: Jan-Mar → shard1, Apr-Jun → shard2, ...
Pros:
- Range queries hit a single shard
- Easy to add new shards (just extend the range)
- Natural data archiving (old shards become cold storage)
Cons:
- Hotspots on the latest shard (all new writes go there)
- Uneven data distribution over time
- Requires rebalancing as data grows
Best for: Time-series data, log storage, IoT data
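The range_lookup step maps naturally onto a sorted map. A minimal sketch, keyed by date as in the example above (the class and method names are illustrative, not from any library):

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.TreeMap;

public class RangeShardRouter {

    // Maps each range's start date (inclusive) to its shard id.
    private final TreeMap<LocalDate, Integer> ranges = new TreeMap<>();

    // Adding a new shard is just another entry extending the range.
    public void addRange(LocalDate startInclusive, int shardId) {
        ranges.put(startInclusive, shardId);
    }

    // floorEntry finds the latest range start <= createdDate,
    // i.e. the range the row falls into.
    public int shardFor(LocalDate createdDate) {
        Map.Entry<LocalDate, Integer> entry = ranges.floorEntry(createdDate);
        if (entry == null) {
            throw new IllegalArgumentException("no shard covers " + createdDate);
        }
        return entry.getValue();
    }
}
```

The hotspot con is visible here: every row with the newest created_date resolves to the last range, so the newest shard absorbs all write traffic.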
DIRECTORY/LOOKUP SHARDING:
shard_id = lookup_table[entity_id]
Pros:
- Complete control over placement
- Can move individual entities between shards
- Supports complex sharding logic
Cons:
- Lookup table is a single point of failure
- Extra network hop for every query
- Lookup table itself needs to be highly available
Best for: Complex multi-tenant systems with varying tenant sizes
For most applications, hash sharding by tenant_id or user_id is the right starting point. It distributes data evenly and ensures all data for a single tenant lives on one shard, eliminating cross-shard queries for tenant-scoped operations.
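For completeness, the directory approach can be sketched as an explicit lookup table. The in-memory map is an assumption for illustration only; as the cons above note, a real deployment keeps this table in a replicated, highly available store:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DirectoryShardRouter {

    // shard_id = lookup_table[entity_id]
    private final Map<String, Integer> lookup = new ConcurrentHashMap<>();
    private final int defaultShard;

    public DirectoryShardRouter(int defaultShard) {
        this.defaultShard = defaultShard;
    }

    public int shardFor(String entityId) {
        return lookup.getOrDefault(entityId, defaultShard);
    }

    // Moving an entity between shards is just a metadata update --
    // but only after its data has actually been copied to the target shard.
    public void assign(String entityId, int shardId) {
        lookup.put(entityId, shardId);
    }
}
```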
Cross-Shard Queries: The Hard Problem
The moment you shard, any query that spans multiple shards becomes expensive. A simple SELECT COUNT(*) FROM orders now requires querying every shard, collecting results, and aggregating them in your application layer. Joins across shards are even worse — they require fetching data from multiple shards and joining in memory.
// Cross-shard query example with a sharding middleware
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import javax.sql.DataSource;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.RowMapper;

public class ShardedOrderRepository {

    private final Map<Integer, DataSource> shards;
    private final ShardRouter router;
    private final RowMapper<Order> orderMapper;

    public ShardedOrderRepository(Map<Integer, DataSource> shards,
                                  ShardRouter router,
                                  RowMapper<Order> orderMapper) {
        this.shards = shards;
        this.router = router;
        this.orderMapper = orderMapper;
    }

    private JdbcTemplate jdbcTemplate(DataSource shard) {
        return new JdbcTemplate(shard);
    }

    // Single-shard query: fast, simple
    public Order findByTenantAndId(String tenantId, String orderId) {
        DataSource shard = router.getShard(tenantId);
        return jdbcTemplate(shard).queryForObject(
            "SELECT * FROM orders WHERE id = ?", orderMapper, orderId);
    }

    // Cross-shard query: scatter-gather pattern
    public long countAllOrders() {
        return shards.values().parallelStream()
            .mapToLong(shard -> jdbcTemplate(shard).queryForObject(
                "SELECT COUNT(*) FROM orders", Long.class))
            .sum();
    }

    // Cross-shard search: fan out, take each shard's top hits, merge, sort
    public List<Order> searchOrders(String query, int limit) {
        return shards.values().parallelStream()
            .flatMap(shard -> jdbcTemplate(shard).query(
                "SELECT * FROM orders WHERE description ILIKE ? " +
                "ORDER BY created_at DESC LIMIT ?",
                orderMapper, "%" + query + "%", limit).stream())
            .sorted(Comparator.comparing(Order::getCreatedAt).reversed())
            .limit(limit)
            .collect(Collectors.toList());
    }
}

The best strategy is to design your sharding key so that most queries hit a single shard. If you shard by tenant_id, then all queries within a tenant are single-shard queries. Cross-tenant analytics go through a separate data warehouse (such as ClickHouse or BigQuery) that receives data via CDC, not through the sharded operational database. The operational database stays fast, and analytics are handled by purpose-built tools.
Sharding Tools: Vitess and Citus
You do not need to build sharding infrastructure from scratch. Vitess (originally built by YouTube for MySQL) and Citus (PostgreSQL extension) handle the hard parts — routing queries, managing schema changes across shards, and rebalancing data.
Vitess sits between your application and MySQL shards. Your application connects to Vitess as if it were a single MySQL instance. Vitess parses every query, determines which shard(s) to hit, executes the query, and merges results. It handles online resharding (splitting one shard into two without downtime) and supports VReplication for cross-shard data movement.
Citus extends PostgreSQL with distributed tables. You call SELECT create_distributed_table('orders', 'tenant_id') and Citus transparently distributes the data across worker nodes. Standard PostgreSQL queries work — Citus rewrites them into distributed query plans. It supports colocated joins (joining two tables sharded by the same key) and reference tables (small tables replicated to every shard).
Migration Strategy: Monolith to Sharded
Never migrate to a sharded architecture in a single step. Use a phased approach:
- Phase 1: Add the sharding key to every table that will be sharded. Deploy application changes to always include this key in queries.
- Phase 2: Set up the sharding proxy (Vitess/Citus) with a single shard containing all data. Verify everything works identically.
- Phase 3: Split into two shards. Route a small percentage of tenants to the new shard. Monitor for correctness and performance.
- Phase 4: Gradually rebalance tenants across shards until load is even. Add more shards as needed.
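The percentage-based routing in Phase 3 can be sketched with stable hash bucketing (a hypothetical helper; real rollouts would read the percentage from a config service rather than a setter):

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class RolloutRouter {

    private volatile int percentOnNewShard; // 0..100, raised gradually

    public RolloutRouter(int percentOnNewShard) {
        this.percentOnNewShard = percentOnNewShard;
    }

    public void setPercentOnNewShard(int percent) {
        this.percentOnNewShard = percent;
    }

    // A tenant's bucket is stable across calls, so raising the percentage
    // only ever moves additional tenants to the new shard -- no tenant
    // flip-flops back and forth as the rollout progresses.
    public boolean usesNewShard(String tenantId) {
        CRC32 crc = new CRC32();
        crc.update(tenantId.getBytes(StandardCharsets.UTF_8));
        int bucket = Math.floorMod((int) crc.getValue(), 100);
        return bucket < percentOnNewShard;
    }
}
```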
Throughout this process, maintain the ability to roll back. Keep the original single database available until you are confident the sharded setup handles your workload correctly. Additionally, invest heavily in integration tests that exercise cross-shard scenarios — these are the queries most likely to break.
Related Reading:
- Zero-Downtime Database Migrations
- Connection Pooling with PgBouncer
- SQL Query Optimization for PostgreSQL
Database sharding enables horizontal scaling but comes with significant complexity. Exhaust simpler alternatives first: indexing, caching, read replicas, and partitioning. When you do shard, start with hash sharding by tenant or user ID, use Vitess or Citus to handle the infrastructure, and design your queries so most operations hit a single shard.