PostgreSQL Partitioning: Managing Large Tables

There’s a particular kind of dread that comes with watching a PostgreSQL query’s EXPLAIN output. You see the index scan. You see the filter. And then you see Rows Removed by Filter: 847,293,881 and realize your “optimized” query is scanning nearly a billion rows because the index can’t save you when the table itself is the bottleneck.

That was our orders table. 800 million rows. Growing at 2 million per day. Queries by date range—our most common access pattern—were scanning the entire table because PostgreSQL had to check every row to find the ones in last week’s date range, even with an index on created_at.

Vacuum was taking 6 hours. Backups were a negotiation with the infra team. Adding an index required planning a maintenance window that never seemed to arrive.

Table partitioning fixed what indexing couldn’t. Instead of one massive table, we had dozens of smaller ones—queries touched only the partitions they needed, maintenance operated partition-by-partition, and old data dropped with a DROP TABLE instead of a DELETE that locked everything for hours.

When Partitioning Makes Sense

Partitioning isn’t free—it adds query planning complexity and requires thoughtful partition key selection. But it’s the right tool when:

Table exceeds 100GB and keeps growing
Queries consistently filter on a specific column (date, region, tenant ID)
You need to drop old data regularly (retention policies)
Maintenance windows can’t keep up with vacuum/analyze on the monolith
Different partitions have different access patterns (hot recent data vs. cold archive)

If your queries don’t filter on the partition key, partitioning makes things worse—PostgreSQL scans every partition. Don’t partition for the sake of partitioning.

Choosing a Partition Strategy

PostgreSQL supports three partitioning methods. Pick based on your query patterns:

Method	Partition Key Example	Best For
Range	Date, numeric ID ranges	Time-series, sequential data
List	Country code, status	Categorical with known values
Hash	User ID, UUID	Even distribution, no natural range

Our orders table screamed range partitioning by date. Analytics queries always included date ranges. Retention policy dropped data older than 7 years. Range partitioning made both patterns natural.

Range Partitioning: The Workhorse

Range partitioning divides data by contiguous ranges—most commonly time:

-- Parent table (doesn't store data directly)
CREATE TABLE orders (
    id          BIGSERIAL,
    user_id     INTEGER NOT NULL,
    total       DECIMAL(10,2) NOT NULL,
    status      VARCHAR(20) NOT NULL,
    created_at  TIMESTAMP NOT NULL,
    updated_at  TIMESTAMP NOT NULL
) PARTITION BY RANGE (created_at);

-- Create partitions (one per quarter)
CREATE TABLE orders_2022_q1 PARTITION OF orders
    FOR VALUES FROM ('2022-01-01') TO ('2022-04-01');

CREATE TABLE orders_2022_q2 PARTITION OF orders
    FOR VALUES FROM ('2022-04-01') TO ('2022-07-01');

CREATE TABLE orders_2022_q3 PARTITION OF orders
    FOR VALUES FROM ('2022-07-01') TO ('2022-10-01');

CREATE TABLE orders_2022_q4 PARTITION OF orders
    FOR VALUES FROM ('2022-10-01') TO ('2023-01-01');

Queries with date filters now hit only relevant partitions:

-- Only scans orders_2022_q2 — partition pruning in action
SELECT COUNT(*), SUM(total)
FROM orders
WHERE created_at >= '2022-04-01'
  AND created_at < '2022-07-01';

Verify partition pruning with EXPLAIN:

EXPLAIN SELECT * FROM orders WHERE created_at >= '2022-05-01';

-- Look for: "Partitions removed: 3" or specific partition names
-- Bad sign: scanning all partitions

Automating Partition Creation

Don’t create partitions manually forever. A scheduled job creates future partitions:

CREATE OR REPLACE FUNCTION create_orders_partition(partition_date DATE)
RETURNS void AS $$
DECLARE
    partition_name TEXT;
    start_date DATE;
    end_date DATE;
BEGIN
    start_date := date_trunc('quarter', partition_date);
    end_date := start_date + INTERVAL '3 months';
    partition_name := 'orders_' || to_char(start_date, 'YYYY_"q"Q');
    
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I PARTITION OF orders
         FOR VALUES FROM (%L) TO (%L)',
        partition_name, start_date, end_date
    );
    
    RAISE NOTICE 'Created partition: %', partition_name;
END;
$$ LANGUAGE plpgsql;

-- Create next quarter's partition (run monthly via cron)
SELECT create_orders_partition(CURRENT_DATE + INTERVAL '3 months');

We run this on the first of each month, always staying one quarter ahead. Missing a partition means inserts fail—don’t learn this the hard way.

Dropping Old Data

This is where partitioning pays for itself. Seven-year retention policy?

-- Without partitioning: DELETE FROM orders WHERE created_at < '2015-01-01'
-- Locks table, generates massive WAL, takes hours

-- With partitioning:
DROP TABLE orders_2015_q1;
DROP TABLE orders_2015_q2;
-- Instant. Metadata operation. No WAL explosion.

We automated retention too:

CREATE OR REPLACE FUNCTION drop_old_orders_partitions(retention_years INTEGER DEFAULT 7)
RETURNS void AS $$
DECLARE
    partition_record RECORD;
    cutoff_date DATE;
BEGIN
    cutoff_date := date_trunc('quarter', CURRENT_DATE - (retention_years || ' years')::INTERVAL);
    
    FOR partition_record IN
        SELECT inhrelid::regclass AS partition_name
        FROM pg_inherits
        JOIN pg_class parent ON pg_inherits.inhparent = parent.oid
        WHERE parent.relname = 'orders'
    LOOP
        -- Extract date from partition name and compare
        -- (Implementation depends on naming convention)
        IF partition_is_older_than(partition_record.partition_name, cutoff_date) THEN
            EXECUTE format('DROP TABLE %s', partition_record.partition_name);
            RAISE NOTICE 'Dropped old partition: %', partition_record.partition_name;
        END IF;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

List Partitioning: Categorical Data

When data divides naturally by category, list partitioning works well:

CREATE TABLE events (
    id          BIGSERIAL,
    event_type  VARCHAR(50) NOT NULL,
    user_id     INTEGER,
    payload     JSONB,
    created_at  TIMESTAMP NOT NULL
) PARTITION BY LIST (event_type);

CREATE TABLE events_pageview PARTITION OF events
    FOR VALUES IN ('pageview', 'page_view');

CREATE TABLE events_click PARTITION OF events
    FOR VALUES IN ('click', 'button_click', 'link_click');

CREATE TABLE events_purchase PARTITION OF events
    FOR VALUES IN ('purchase', 'add_to_cart', 'checkout_start');

-- Catch-all for new event types
CREATE TABLE events_default PARTITION OF events DEFAULT;

The DEFAULT partition catches anything not explicitly listed—critical for avoiding insert failures when marketing invent new event types on Friday afternoon.

List partitioning shines for multi-tenant scenarios:

CREATE TABLE tenant_data (
    id          BIGSERIAL,
    tenant_id   VARCHAR(50) NOT NULL,
    data        JSONB,
    created_at  TIMESTAMP
) PARTITION BY LIST (tenant_id);

CREATE TABLE tenant_data_enterprise PARTITION OF tenant_data
    FOR VALUES IN ('acme-corp', 'bigco-inc', 'megacorp');

CREATE TABLE tenant_data_smb PARTITION OF tenant_data
    FOR VALUES IN ('startup-a', 'startup-b', 'startup-c');

CREATE TABLE tenant_data_default PARTITION OF tenant_data DEFAULT;

Enterprise tenants get dedicated partitions with tailored indexes. SMB tenants share a partition. Query isolation without separate databases.

Hash Partitioning: Even Distribution

When there’s no natural range or category, hash partitioning distributes data evenly:

CREATE TABLE user_sessions (
    id          BIGSERIAL,
    user_id     INTEGER NOT NULL,
    session_token VARCHAR(255),
    data        JSONB,
    expires_at  TIMESTAMP
) PARTITION BY HASH (user_id);

CREATE TABLE user_sessions_0 PARTITION OF user_sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 0);

CREATE TABLE user_sessions_1 PARTITION OF user_sessions
    FOR VALUES WITH (MODULUS 8, REMAINDER 1);

-- ... through user_sessions_7

Eight partitions, data distributed by hash(user_id) % 8. Queries filtering by user_id hit exactly one partition. Queries without user_id scan all eight—avoid those.

Hash partitioning helps with:

Lock contention (writes spread across partitions)
Parallel vacuum/analyze
Index size (smaller indexes per partition)

It doesn’t help with data retention—you can’t drop a hash partition without dropping data across all time periods.

Indexes on Partitioned Tables

Indexes work differently on partitioned tables. You can:

Create indexes on the parent (propagates to all partitions):

CREATE INDEX idx_orders_user_id ON orders (user_id);
-- Creates idx_orders_user_id on every partition automatically

Create partition-specific indexes when access patterns differ:

-- Recent partitions: index everything
CREATE INDEX idx_orders_2022_q4_status ON orders_2022_q4 (status);

-- Old partitions: maybe skip indexes that aren't queried
-- (Archive partitions are scan-only)

Our strategy: full indexes on partitions newer than 1 year, minimal indexes on archive partitions. Reduced index maintenance on data nobody queries with filters.

Migrating an Existing Table

You can’t just ALTER TABLE ... PARTITION BY on an existing 800-million-row table. Migration requires a cutover:

Strategy 1: Create New, Copy, Swap

-- Step 1: Create partitioned table structure
CREATE TABLE orders_new (...) PARTITION BY RANGE (created_at);
-- Create all partitions

-- Step 2: Copy data in batches (avoid long locks)
INSERT INTO orders_new
SELECT * FROM orders
WHERE created_at >= '2022-01-01' AND created_at < '2022-04-01';

-- Repeat for each partition/date range

-- Step 3: Swap tables (brief lock)
BEGIN;
ALTER TABLE orders RENAME TO orders_old;
ALTER TABLE orders_new RENAME TO orders;
COMMIT;

-- Step 4: Verify, then drop old table
DROP TABLE orders_old;

Strategy 2: Logical Replication

For zero-downtime migration, use logical replication:

Create partitioned table
Set up logical replication from old → new
Wait for sync
Brief cutover window to switch application
Drop old table

We used Strategy 1 with batched copies during low-traffic hours. Took a weekend, but the query performance improvement was immediate.

Monitoring Partitioned Tables

New metrics to watch:

-- Partition sizes
SELECT
    child.relname AS partition_name,
    pg_size_pretty(pg_relation_size(child.oid)) AS size,
    pg_stat_get_live_tuples(child.oid) AS row_count
FROM pg_inherits
JOIN pg_class parent ON pg_inherits.inhparent = parent.oid
JOIN pg_class child ON pg_inherits.inhrelid = child.oid
WHERE parent.relname = 'orders'
ORDER BY pg_relation_size(child.oid) DESC;

-- Verify partition pruning is working
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE created_at >= '2022-06-01';
-- Check "Partitions removed" in output

-- Find queries NOT using partition pruning (scanning all partitions)
SELECT query, calls, mean_time
FROM pg_stat_statements
WHERE query LIKE '%orders%'
ORDER BY mean_time DESC;

Alert when:

A partition exceeds expected size (bad data distribution?)
Partition count grows unexpectedly (automation failure?)
Queries scan all partitions (missing partition key in WHERE?)

Common Mistakes

Wrong partition key. We initially considered partitioning by user_id. But 90% of queries filtered by date. Partitioning by user_id would have meant scanning all partitions for every date-range query.

Partitions too small. One partition per day on moderate-volume data creates hundreds of partitions. PostgreSQL handles many partitions, but query planning overhead increases. Quarterly or monthly partitions are usually the sweet spot.

Partitions too large. One partition per year defeats the purpose. If your retention policy drops quarterly data and queries target months, quarterly partitions match both patterns.

Forgetting the DEFAULT partition. Inserts fail hard when no partition matches. Always have a DEFAULT or automated partition creation.

No partition pruning verification. Assumed partitioning was working. Discovered six months later that OR conditions in queries defeated pruning:

-- BAD: Partition pruning fails with OR
SELECT * FROM orders
WHERE created_at >= '2022-01-01' OR status = 'pending';

-- GOOD: UNION ALL preserves pruning
SELECT * FROM orders WHERE created_at >= '2022-01-01'
UNION ALL
SELECT * FROM orders WHERE status = 'pending' AND created_at >= '2022-01-01';

Results After Partitioning

Our numbers after migrating the orders table:

Metric	Before	After
Date-range query p99	45s	120ms
Vacuum duration	6 hours	20 min/partition
Data retention drop	4-hour DELETE	Instant DROP
Backup size (incremental)	Full table scan	Changed partitions only
Index rebuild	Maintenance window	Per-partition, online

The 800-million-row problem didn’t disappear—we just made it manageable by not treating it as one problem.

Conclusion

PostgreSQL partitioning turns one overwhelming table into many manageable ones. Choose your partition key based on query patterns, not convenience. Automate partition creation and retention. Verify partition pruning with EXPLAIN. And start before you hit 800 million rows—migrating under pressure is no fun.

Range partitioning for time-series data. List partitioning for categorical splits. Hash partitioning for even distribution when nothing else fits. Match the strategy to your access patterns, and your future self won’t be staring at EXPLAIN output showing 847 million rows removed by filter.

Further Resources:

PostgreSQL Table Partitioning — Official documentation
pg_partman — Partition management extension
PostgreSQL Partition Pruning — How pruning works
Citus Data: Time-Series Best Practices — Advanced partitioning patterns

PostgreSQL partitioning from June 2022, covering range, list, and hash partitioning.