Database Sharding Strategies: A Practical Guide to Horizontal Scaling
Your PostgreSQL database handles 10,000 queries per second beautifully. Then your application grows, the table hits 500 million rows, and suddenly queries that used to take 5ms take 500ms. Vertical scaling (bigger hardware) buys time, but database sharding is the long-term answer for horizontal scaling. This guide covers when to shard, how to choose a sharding strategy, and the tools that make it manageable.
When to Shard (And When Not To)
Sharding adds enormous complexity to your application. Before you shard, exhaust these alternatives:
- Optimize queries and add indexes
- Implement read replicas for read-heavy workloads
- Add caching with Redis or Valkey
- Archive old data to cold storage
- Partition tables (single database, multiple physical segments)
- Vertically scale your hardware
Many applications never need sharding: a properly optimized PostgreSQL instance on modern hardware can handle billions of rows and tens of thousands of queries per second.
Shard when:
- A single primary cannot handle your write throughput, and read replicas do not help because the bottleneck is writes, not reads
- Your dataset exceeds what fits on a single server's storage
- Regulatory requirements mandate data residency in specific regions
- You need sub-millisecond latency for geographically distributed users
Hash Sharding vs Range Sharding
The sharding key determines which shard holds each row. Choosing the right key and strategy is the most critical decision.
HASH SHARDING:
shard_id = hash(tenant_id) % num_shards
Pros:
- Even data distribution across shards
- No hotspots (assuming good hash function)
- Simple to implement
Cons:
- Range queries across shards are expensive
- Adding shards requires resharding (rehashing all data)
- Cannot colocate related data easily
Best for: Multi-tenant SaaS, user-partitioned data
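The hash formula above can be sketched in a few lines. CRC32 is an assumption here; any hash that is stable across processes and languages works, but avoid String.hashCode(), which is JVM-specific. Math.floorMod guards against negative intermediate values after the cast:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class HashShardRouter {

    private final int numShards;

    public HashShardRouter(int numShards) {
        this.numShards = numShards;
    }

    // shard_id = hash(tenant_id) % num_shards
    public int shardFor(String tenantId) {
        CRC32 crc = new CRC32();
        crc.update(tenantId.getBytes(StandardCharsets.UTF_8));
        // getValue() is an unsigned 32-bit value in a long; the cast to int
        // can go negative, so floorMod keeps the result in [0, numShards).
        return Math.floorMod((int) crc.getValue(), numShards);
    }
}
```

Note the con listed above: changing numShards changes almost every tenant's shard assignment, which is why adding shards under plain modulo hashing forces a full reshard.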
RANGE SHARDING:
shard_id = range_lookup(created_date)
Example: Jan-Mar → shard1, Apr-Jun → shard2, ...
Pros:
- Range queries hit a single shard
- Easy to add new shards (just extend the range)
- Natural data archiving (old shards become cold storage)
Cons:
- Hotspots on the latest shard (all new writes go there)
- Uneven data distribution over time
- Requires rebalancing as data grows
Best for: Time-series data, log storage, IoT data
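The range_lookup step maps naturally onto a sorted map. A minimal sketch, keyed by date as in the example above (the class and method names are illustrative, not from any library):

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.TreeMap;

public class RangeShardRouter {

    // Maps each range's start date (inclusive) to its shard id.
    private final TreeMap<LocalDate, Integer> ranges = new TreeMap<>();

    // Adding a new shard is just another entry extending the range.
    public void addRange(LocalDate startInclusive, int shardId) {
        ranges.put(startInclusive, shardId);
    }

    // floorEntry finds the latest range start <= createdDate,
    // i.e. the range the row falls into.
    public int shardFor(LocalDate createdDate) {
        Map.Entry<LocalDate, Integer> entry = ranges.floorEntry(createdDate);
        if (entry == null) {
            throw new IllegalArgumentException("no shard covers " + createdDate);
        }
        return entry.getValue();
    }
}
```

The hotspot con is visible here: every row with the newest created_date resolves to the last range, so the newest shard absorbs all write traffic.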
DIRECTORY/LOOKUP SHARDING:
shard_id = lookup_table[entity_id]
Pros:
- Complete control over placement
- Can move individual entities between shards
- Supports complex sharding logic
Cons:
- Lookup table is a single point of failure
- Extra network hop for every query
- Lookup table itself needs to be highly available
Best for: Complex multi-tenant systems with varying tenant sizes
For most applications, hash sharding by tenant_id or user_id is the right starting point. It distributes data evenly and ensures all data for a single tenant lives on one shard, eliminating cross-shard queries for tenant-scoped operations.
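For completeness, the directory approach can be sketched as an explicit lookup table. The in-memory map is an assumption for illustration only; as the cons above note, a real deployment keeps this table in a replicated, highly available store:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DirectoryShardRouter {

    // shard_id = lookup_table[entity_id]
    private final Map<String, Integer> lookup = new ConcurrentHashMap<>();
    private final int defaultShard;

    public DirectoryShardRouter(int defaultShard) {
        this.defaultShard = defaultShard;
    }

    public int shardFor(String entityId) {
        return lookup.getOrDefault(entityId, defaultShard);
    }

    // Moving an entity between shards is just a metadata update --
    // but only after its data has actually been copied to the target shard.
    public void assign(String entityId, int shardId) {
        lookup.put(entityId, shardId);
    }
}
```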
Cross-Shard Queries: The Hard Problem
The moment you shard, any query that spans multiple shards becomes expensive. A simple SELECT COUNT(*) FROM orders now requires querying every shard, collecting results, and aggregating them in your application layer. Joins across shards are even worse — they require fetching data from multiple shards and joining in memory.
// Cross-shard query example with a sharding middleware
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import javax.sql.DataSource;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.RowMapper;

public class ShardedOrderRepository {

    private final Map<Integer, DataSource> shards;
    private final ShardRouter router;
    private final RowMapper<Order> orderMapper;

    public ShardedOrderRepository(Map<Integer, DataSource> shards,
                                  ShardRouter router,
                                  RowMapper<Order> orderMapper) {
        this.shards = shards;
        this.router = router;
        this.orderMapper = orderMapper;
    }

    private JdbcTemplate jdbcTemplate(DataSource shard) {
        return new JdbcTemplate(shard);
    }

    // Single-shard query: fast, simple
    public Order findByTenantAndId(String tenantId, String orderId) {
        DataSource shard = router.getShard(tenantId);
        return jdbcTemplate(shard).queryForObject(
            "SELECT * FROM orders WHERE id = ?", orderMapper, orderId);
    }

    // Cross-shard query: scatter-gather pattern
    public long countAllOrders() {
        return shards.values().parallelStream()
            .mapToLong(shard -> jdbcTemplate(shard).queryForObject(
                "SELECT COUNT(*) FROM orders", Long.class))
            .sum();
    }

    // Cross-shard search: fan out, take each shard's top hits, merge, sort
    public List<Order> searchOrders(String query, int limit) {
        return shards.values().parallelStream()
            .flatMap(shard -> jdbcTemplate(shard).query(
                "SELECT * FROM orders WHERE description ILIKE ? " +
                "ORDER BY created_at DESC LIMIT ?",
                orderMapper, "%" + query + "%", limit).stream())
            .sorted(Comparator.comparing(Order::getCreatedAt).reversed())
            .limit(limit)
            .collect(Collectors.toList());
    }
}

The best strategy is to design your sharding key so that most queries hit a single shard. If you shard by tenant_id, then all queries within a tenant are single-shard queries. Cross-tenant analytics go through a separate data warehouse (such as ClickHouse or BigQuery) that receives data via CDC, not through the sharded operational database. The operational database stays fast, and analytics are handled by purpose-built tools.
Sharding Tools: Vitess and Citus
You do not need to build sharding infrastructure from scratch. Vitess (originally built by YouTube for MySQL) and Citus (PostgreSQL extension) handle the hard parts — routing queries, managing schema changes across shards, and rebalancing data.
Vitess sits between your application and MySQL shards. Your application connects to Vitess as if it were a single MySQL instance. Vitess parses every query, determines which shard(s) to hit, executes the query, and merges results. It handles online resharding (splitting one shard into two without downtime) and supports VReplication for cross-shard data movement.
Citus extends PostgreSQL with distributed tables. You call SELECT create_distributed_table('orders', 'tenant_id') and Citus transparently distributes the data across worker nodes. Standard PostgreSQL queries work — Citus rewrites them into distributed query plans. It supports colocated joins (joining two tables sharded by the same key) and reference tables (small tables replicated to every shard).
Migration Strategy: Monolith to Sharded
Never migrate to a sharded architecture in a single step. Use a phased approach:
- Phase 1: Add the sharding key to every table that will be sharded. Deploy application changes to always include this key in queries.
- Phase 2: Set up the sharding proxy (Vitess/Citus) with a single shard containing all data. Verify everything works identically.
- Phase 3: Split into two shards. Route a small percentage of tenants to the new shard. Monitor for correctness and performance.
- Phase 4: Gradually rebalance tenants across shards until load is even. Add more shards as needed.
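The percentage-based routing in Phase 3 can be sketched with stable hash bucketing (a hypothetical helper; real rollouts would read the percentage from a config service rather than a setter):

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class RolloutRouter {

    private volatile int percentOnNewShard; // 0..100, raised gradually

    public RolloutRouter(int percentOnNewShard) {
        this.percentOnNewShard = percentOnNewShard;
    }

    public void setPercentOnNewShard(int percent) {
        this.percentOnNewShard = percent;
    }

    // A tenant's bucket is stable across calls, so raising the percentage
    // only ever moves additional tenants to the new shard -- no tenant
    // flip-flops back and forth as the rollout progresses.
    public boolean usesNewShard(String tenantId) {
        CRC32 crc = new CRC32();
        crc.update(tenantId.getBytes(StandardCharsets.UTF_8));
        int bucket = Math.floorMod((int) crc.getValue(), 100);
        return bucket < percentOnNewShard;
    }
}
```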
Throughout this process, maintain the ability to roll back. Keep the original single database available until you are confident the sharded setup handles your workload correctly. Additionally, invest heavily in integration tests that exercise cross-shard scenarios — these are the queries most likely to break.
Related Reading:
- Zero-Downtime Database Migrations
- Connection Pooling with PgBouncer
- SQL Query Optimization for PostgreSQL
Database sharding enables horizontal scaling but comes with significant complexity. Exhaust simpler alternatives first: indexing, caching, read replicas, and partitioning. When you do shard, start with hash sharding by tenant or user ID, use Vitess or Citus to handle the infrastructure, and design your queries so most operations hit a single shard.