Time-Based Chunk Partitioning Strategies

Every TimescaleDB hypertable answers one question on the hot path: given a time predicate, which slices of data must the planner actually read? Time-based chunk partitioning is the mechanism that makes that answer cheap. By dividing a hypertable into contiguous, time-bound child tables — chunks — the engine turns a full-table scan over billions of rows into a targeted scan of a handful of recent segments. Get the chunk boundaries right and ingestion parallelizes, retention becomes an instant DROP TABLE, and compression runs in predictable batches. Get them wrong and you inherit catalog bloat, planner overhead, and vacuum storms. This page is the implementation guide for sizing, creating, and validating time partitioning under high-frequency IoT telemetry, and sits within the broader Core Hypertable Architecture & Partitioning Strategy work.

The core payoff is constraint exclusion: at plan time, PostgreSQL compares each chunk’s time constraints against the query’s WHERE clause and discards chunks that cannot contain matching rows before a single page is read.
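As a rough illustration of the idea, chunk exclusion can be modelled as a range-overlap test. The sketch below is a toy model, not TimescaleDB's planner code: the `Chunk` type, the `prune` helper, and the ten daily chunks are hypothetical names invented for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class Chunk(NamedTuple):
    name: str
    start: datetime  # inclusive lower bound of the chunk's time range
    end: datetime    # exclusive upper bound of the chunk's time range


def prune(chunks: list[Chunk], lo: datetime, hi: datetime) -> list[Chunk]:
    """Keep only chunks whose [start, end) range overlaps the query window [lo, hi)."""
    return [c for c in chunks if c.start < hi and c.end > lo]


# Ten hypothetical daily chunks starting 2024-01-01 UTC.
origin = datetime(2024, 1, 1, tzinfo=timezone.utc)
chunks = [
    Chunk(f"chunk_{i}", origin + timedelta(days=i), origin + timedelta(days=i + 1))
    for i in range(10)
]

# A query over the final 36 hours overlaps only the last two daily chunks.
hi = origin + timedelta(days=10)
survivors = prune(chunks, hi - timedelta(hours=36), hi)
print([c.name for c in survivors])  # ['chunk_8', 'chunk_9']
```

The other eight chunks are discarded by comparing range bounds alone, which is why the cost of a recent-window query tracks the window size rather than the table size.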

Prerequisites

This guide assumes a working TimescaleDB installation and a hypertable that ingests append-mostly time-series data. Before applying any of the partitioning changes below, confirm the following:

TimescaleDB 2.10 or later on PostgreSQL 14+ (SELECT extversion FROM pg_extension WHERE extname = 'timescaledb';)
The extension is loaded via shared_preload_libraries = 'timescaledb' and the server has been restarted
max_worker_processes leaves headroom for background jobs (retention, compression, aggregate refresh) beyond your application connections
A base table exists with a timestamptz (or timestamp) partitioning column and a NOT NULL constraint on it
You have an estimate of ingestion rate (rows/sec) and average row width (bytes) — needed to size the interval
The role running DDL owns the table or holds sufficient privileges (see role-based access boundaries for hypertables)

The functions used throughout — create_hypertable, set_chunk_time_interval, and the timescaledb_information catalog views — ship with the core extension; no additional modules are required.

sql

-- Minimal base table for device telemetry
CREATE TABLE sensor_readings (
    time        timestamptz   NOT NULL,
    device_id   bigint        NOT NULL,
    metric      text          NOT NULL,
    value       double precision,
    PRIMARY KEY (device_id, time)
);

Step-by-Step Implementation

The four steps below map directly to the lifecycle of a chunk: create the hypertable, size the interval, watch chunks form on ingest, and let the planner prune them on read.

Step 1 — Convert the table to a hypertable

create_hypertable promotes an ordinary table to a hypertable and registers the time dimension. Setting chunk_time_interval at creation time avoids a later resize migration.

sql

SELECT create_hypertable(
    'sensor_readings',
    by_range('time', INTERVAL '1 day'),   -- TimescaleDB 2.13+ dimension builder
    if_not_exists => true
);

On releases before the by_range dimension builder, pass the interval directly:

sql

SELECT create_hypertable(
    'sensor_readings', 'time',
    chunk_time_interval => INTERVAL '1 day',
    if_not_exists => true
);

Step 2 — Size the interval to the working set

The single most important rule: the most recent (uncompressed) chunk, plus its indexes, should fit comfortably in memory so that inserts and recent-window queries stay off disk. A practical target is that the active chunks of all hypertables, indexes included, occupy no more than about 25% of main memory (not 25% of shared_buffers, which is itself usually only a fraction of RAM), leaving room for the write-ahead log, other backends, and the OS page cache.

The estimated on-disk size of one chunk is:

$$S_{chunk} \approx R \times W \times I \times (1 + f_{idx})$$

where $R$ is ingestion rate in rows/sec, $W$ is average bytes per row, $I$ is the interval in seconds, and $f_{idx}$ is the index overhead fraction. Solving for the interval that fills a memory budget $B_{mem}$:

$$I \le \frac{B_{mem}}{R \times W \times (1 + f_{idx})}$$

Deriving that budget rigorously — including retention-tier alignment and background-worker capacity — is covered in depth in how to calculate the optimal chunk_interval for IoT sensor data. Apply the result with set_chunk_time_interval, which affects only chunks created after the call:

sql

SELECT set_chunk_time_interval('sensor_readings', INTERVAL '6 hours');
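To make the sizing inequality concrete, here is a minimal worked sketch. All inputs are assumed example values (10,000 rows/sec, 100-byte rows, 50% index overhead, a 16 GiB memory budget), and the helper names and the list of "convenient" intervals are hypothetical choices, not part of TimescaleDB.

```python
from datetime import timedelta


def max_chunk_interval(rows_per_sec: float, row_bytes: float,
                       index_overhead: float, mem_budget_bytes: float) -> timedelta:
    """Largest interval I satisfying R * W * I * (1 + f_idx) <= B_mem."""
    seconds = mem_budget_bytes / (rows_per_sec * row_bytes * (1.0 + index_overhead))
    return timedelta(seconds=seconds)


def round_down(limit: timedelta, candidates: list[timedelta]) -> timedelta:
    """Pick the largest candidate interval that does not exceed the limit."""
    fitting = [c for c in candidates if c <= limit]
    if not fitting:
        raise ValueError("no candidate interval fits the memory budget")
    return max(fitting)


# Hypothetical menu of operator-friendly intervals.
CANDIDATES = [timedelta(hours=h) for h in (1, 2, 3, 4, 6, 12, 24)]

# Assumed workload: 10,000 rows/s, 100 B/row, f_idx = 0.5, B_mem = 16 GiB.
limit = max_chunk_interval(10_000, 100, 0.5, 16 * 1024**3)
chosen = round_down(limit, CANDIDATES)
print(limit, chosen)  # roughly 3 h 10 min, rounded down to 3:00:00
```

Rounding down rather than to the nearest value keeps the chunk inside the budget; the chosen interval is what would be passed to set_chunk_time_interval.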

Step 3 — Let chunks form on ingest

You do not create chunks manually. As rows arrive, TimescaleDB routes each one to the chunk covering its time value, creating a new chunk on demand when a row falls outside every existing range. Each chunk is a real PostgreSQL table with its own indexes, statistics, and vacuum state, which is what lets retention drop it whole. Rows that arrive late still route correctly, but very old timestamps can spawn tiny back-dated chunks — see handling out-of-order data insertion for the ingestion patterns that keep the catalog tidy.
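The routing described above can be pictured as bucketing each timestamp into a fixed-width range. The sketch below is a simplified model that assumes chunk boundaries fall on whole multiples of the interval counted from the Unix epoch; actual boundary alignment in a given TimescaleDB version may differ, and `chunk_range` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_range(ts: datetime, interval: timedelta) -> tuple[datetime, datetime]:
    """Return the [start, end) bucket of width `interval` that contains `ts`.

    Assumption: buckets are aligned to multiples of `interval` since the Unix epoch.
    """
    n = (ts - EPOCH) // interval  # whole intervals elapsed since the epoch
    start = EPOCH + n * interval
    return start, start + interval


six_hours = timedelta(hours=6)

# A fresh reading lands in the current six-hour bucket.
fresh = chunk_range(datetime(2024, 3, 15, 14, 30, tzinfo=timezone.utc), six_hours)

# A reading that arrives months late maps to an old range, which is how a single
# stray timestamp can bring a back-dated chunk into existence.
late = chunk_range(datetime(2024, 1, 2, 3, 0, tzinfo=timezone.utc), six_hours)
print(fresh, late)
```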

Step 4 — Verify pruning on the read path

Confirm that a time-bounded query touches only the chunks it should. EXPLAIN shows the planner excluding non-matching chunks:

sql

EXPLAIN (COSTS OFF)
SELECT device_id, avg(value)
FROM sensor_readings
WHERE time > now() - INTERVAL '7 days'
GROUP BY device_id;

A correctly partitioned hypertable lists only the chunks intersecting the last seven days under the Append node (TimescaleDB may show it as a ChunkAppend custom scan); every older chunk is absent from the plan. If historical chunks still appear, the predicate is not sargable against the time dimension (for example, it wraps time in a function or cast, or compares it against a volatile expression), and pruning is defeated.
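For automated checks, the chunk list can be pulled out of the EXPLAIN text. This is a small sketch that assumes chunks carry the default `_hyper_<hypertable id>_<chunk id>_chunk` naming; the `chunks_in_plan` helper is hypothetical and the sample plan is hand-written for illustration, not captured from a real server.

```python
import re

# Assumption: default chunk table names look like _hyper_1_42_chunk.
CHUNK_RE = re.compile(r"_hyper_\d+_\d+_chunk")


def chunks_in_plan(plan_text: str) -> list[str]:
    """Return the distinct chunk names mentioned in a plan, in first-seen order."""
    seen: dict[str, None] = {}
    for name in CHUNK_RE.findall(plan_text):
        seen.setdefault(name, None)  # dict preserves insertion order and dedupes
    return list(seen)


# Illustrative plan text (hand-written for this example).
sample_plan = """
HashAggregate
  Group Key: device_id
  ->  Append
        ->  Index Scan using _hyper_1_41_chunk_sensor_readings_time_idx on _hyper_1_41_chunk
        ->  Seq Scan on _hyper_1_42_chunk
"""

print(chunks_in_plan(sample_plan))  # ['_hyper_1_41_chunk', '_hyper_1_42_chunk']
```

Comparing the length of this list against the number of chunks expected for the query window gives a simple regression check that a predicate still prunes.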

Configuration Parameters Reference

| Parameter | Type | Recommended value | Effect |
| --- | --- | --- | --- |
| `chunk_time_interval` | interval | One recent window's worth of data (e.g. `6 hours`–`1 day`) | Physical grouping on disk; drives chunk count, pruning granularity, and compression batch size |
| `by_range` dimension | dimension | `by_range('time', <interval>)` | Declares the primary time partitioning column on 2.13+ |
| `create_default_indexes` | boolean | `true` | Auto-creates a descending index on the time column for fast recent-window scans |
| `timescaledb.max_open_chunks_per_insert` | integer | Default (1024) unless writing to many chunks per statement | Caps chunk handles held open during a multi-chunk `INSERT`; raise for wide backfills |
| `timescaledb.max_cached_chunks_per_hypertable` | integer | Default; raise for very high chunk counts | Size of the per-hypertable chunk metadata cache used during planning |
| `enable_partitionwise_aggregate` | boolean | `on` | Lets the planner push aggregates down to individual chunks for parallelism |

Set the session/GUC parameters in postgresql.conf (or via ALTER SYSTEM) and the hypertable-scoped ones through create_hypertable / set_chunk_time_interval.

Integration With Adjacent Features

Time partitioning is the substrate the rest of the lifecycle stands on, and every downstream policy is expressed in terms of chunk boundaries.

Compression. Chunks become eligible for compression (see columnar compression models for high-frequency telemetry) only once they age past a compress_after threshold, and the compressor operates one chunk at a time. Aligning the interval with the compression window keeps batches uniform; the scheduling side is handled by automated chunk compression scheduling.

Retention. A retention policy calls drop_chunks under the hood, which detaches and drops whole chunks whose range falls entirely outside the retention window — no per-row DELETE, no VACUUM churn. That is why retention windows must line up with chunk boundaries; the mapping from business SLAs to drop_after intervals lives in TTL policy mapping and enforcement, part of the broader data retention and compression lifecycle automation work.
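The whole-chunk rule can be sketched in a few lines. This is a simplified model of the behaviour described above, not the drop_chunks implementation: `droppable_chunks` and the chunk names are hypothetical, and a chunk is treated as droppable only when its exclusive upper bound is at or before the cutoff.

```python
from datetime import datetime, timedelta, timezone


def droppable_chunks(chunks, now: datetime, drop_after: timedelta) -> list[str]:
    """Names of chunks whose entire [start, end) range is older than the cutoff."""
    cutoff = now - drop_after
    return [name for name, _start, end in chunks if end <= cutoff]


utc = timezone.utc
now = datetime(2024, 4, 10, 12, 0, tzinfo=utc)  # cutoff becomes 2024-01-11 12:00 UTC
chunks = [
    ("day_jan10", datetime(2024, 1, 10, tzinfo=utc), datetime(2024, 1, 11, tzinfo=utc)),
    ("day_jan11", datetime(2024, 1, 11, tzinfo=utc), datetime(2024, 1, 12, tzinfo=utc)),
]

# day_jan11 straddles the cutoff, so it survives even though its oldest rows
# are already more than 90 days old.
print(droppable_chunks(chunks, now, timedelta(days=90)))  # ['day_jan10']
```

In this model data can outlive its TTL by up to one chunk interval, which is the practical reason to keep the interval small relative to the retention window.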

Continuous aggregates. A continuous aggregate materializes rollups and records which time ranges of the source hypertable have changed since the last refresh, so its incremental refresh policy scheduling reprocesses only those invalidated ranges instead of re-scanning history.

Space partitioning. For multi-tenant fleets, a second dimension layered on top of time isolates devices or customers per chunk. See space partitioning for multi-tenant IoT and, for the indexing implications, chunk indexing on high-cardinality tags. If you are weighing native chunking against rolling your own, the chunk partitioning versus PostgreSQL table inheritance comparison lays out why the automated model wins operationally.

The following automation module ties these threads together: idempotent retention and compression policy management plus continuous-aggregate refresh, safe to run from CI/CD or a background worker.

python

import os
import logging
import psycopg

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


class TimescaleLifecycleManager:
    """Idempotent TimescaleDB lifecycle automation for retention, compression, and CA refresh."""

    def __init__(self, dsn: str):
        self.dsn = dsn

    def ensure_retention_policy(self, hypertable: str, drop_after: str) -> None:
        """Creates a retention policy if none exists; otherwise resets the existing job's schedule."""
        with psycopg.connect(self.dsn) as conn:
            with conn.cursor() as cur:
                cur.execute("""
                    SELECT job_id FROM timescaledb_information.jobs
                    WHERE hypertable_schema = 'public' AND hypertable_name = %s
                    AND proc_name = 'policy_retention'
                """, (hypertable,))
                job = cur.fetchone()

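                if job:
                    # Existing job: only the schedule is reset here. drop_after is
                    # NOT changed; updating it would need alter_job(..., config => ...).
                    cur.execute(
                        "SELECT alter_job(%s, schedule_interval => INTERVAL '1 hour')",
                        (job[0],),
                    )
                    logger.info("Reset schedule of existing retention policy for %s", hypertable)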
                else:
                    # add_retention_policy takes a regclass; bind the qualified name
                    # as text (regclass accepts the implicit cast) plus the interval.
                    cur.execute(
                        "SELECT add_retention_policy(%s, drop_after => %s::interval, if_not_exists => true)",
                        (f"public.{hypertable}", drop_after),
                    )
                    logger.info("Created new retention policy for %s", hypertable)
                conn.commit()

    def ensure_compression_policy(self, hypertable: str, compress_after: str) -> None:
        """Idempotently applies compression scheduling."""
        with psycopg.connect(self.dsn) as conn:
            with conn.cursor() as cur:
                cur.execute("""
                    SELECT job_id FROM timescaledb_information.jobs
                    WHERE hypertable_schema = 'public' AND hypertable_name = %s
                    AND proc_name = 'policy_compression'
                """, (hypertable,))
                job = cur.fetchone()

                if not job:
                    cur.execute(
                        "SELECT add_compression_policy(%s, compress_after => %s::interval, if_not_exists => true)",
                        (f"public.{hypertable}", compress_after),
                    )
                    logger.info("Created compression policy for %s", hypertable)
                conn.commit()

    def refresh_continuous_aggregate(self, ca_name: str, window_start: str | None = None, window_end: str | None = None) -> None:
        """Triggers incremental or full refresh of a continuous aggregate."""
        # refresh_continuous_aggregate() cannot run inside a transaction block,
        # so use an autocommit connection. All values are bound parameters.
        with psycopg.connect(self.dsn, autocommit=True) as conn:
            with conn.cursor() as cur:
                if window_start and window_end:
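                    cur.execute(
                        # Explicit casts: the window arguments are declared as "any",
                        # so untyped bound parameters may fail to resolve.
                        "CALL refresh_continuous_aggregate(%s, %s::timestamptz, %s::timestamptz)",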
                        (ca_name, window_start, window_end),
                    )
                    logger.info("Refreshed CA %s for window [%s, %s]", ca_name, window_start, window_end)
                else:
                    cur.execute(
                        "CALL refresh_continuous_aggregate(%s, NULL, NULL)",
                        (ca_name,),
                    )
                    logger.info("Full refresh triggered for CA %s", ca_name)


# Usage Example
if __name__ == "__main__":
    DSN = os.getenv("DATABASE_URL", "postgresql://user:pass@localhost:5432/iot_db")
    manager = TimescaleLifecycleManager(DSN)

    manager.ensure_retention_policy("sensor_readings", drop_after="90 days")
    manager.ensure_compression_policy("sensor_readings", compress_after="7 days")
    manager.refresh_continuous_aggregate("daily_sensor_rollups")

This implementation stays idempotent by checking timescaledb_information.jobs before adding a policy and by passing if_not_exists => true, which prevents duplicate background jobs. Note that for an existing retention job it only resets the schedule and leaves drop_after unchanged. It uses bound parameters throughout to eliminate injection risks and context managers for deterministic connection cleanup. In production, front it with a connection pooler (PgBouncer) and schedule it via cron, a Kubernetes CronJob, or Airflow.

Performance Validation

Once the interval is set and data is flowing, verify the partitioning is behaving as designed by querying the catalog views directly.

Chunk count and size distribution. A hypertable with tens of thousands of tiny chunks signals an undersized interval; a handful of multi-gigabyte chunks signals the opposite.

sql

SELECT
    'sensor_readings'                        AS hypertable_name,
    count(*)                                 AS chunk_count,
    pg_size_pretty(sum(total_bytes))         AS total_size,
    pg_size_pretty(avg(total_bytes)::bigint) AS avg_chunk_size
FROM chunks_detailed_size('sensor_readings');  -- returns one row per chunk

Interval and dimension settings. Confirm the active interval on the time dimension:

sql

SELECT hypertable_name, column_name, time_interval
FROM timescaledb_information.dimensions
WHERE hypertable_name = 'sensor_readings' AND dimension_type = 'Time';

Compression progress per chunk. Track how many chunks the compression policy has processed — a lagging count often traces back to a mismatch between chunk_time_interval and compress_after:

sql

SELECT
    count(*) FILTER (WHERE is_compressed)     AS compressed_chunks,
    count(*) FILTER (WHERE NOT is_compressed) AS uncompressed_chunks
FROM timescaledb_information.chunks
WHERE hypertable_name = 'sensor_readings';

Troubleshooting

Old chunks still scanned despite a time filter. The plan shows every chunk under the Append node. Cause: the predicate is not directly comparable against the time column, either because a function or cast wraps time or because the bound is a volatile expression that must be re-evaluated per row. Resolution: filter on the raw column with a constant or now()-relative bound (time > now() - INTERVAL '7 days'), and keep the time column free of casts and function calls.

ERROR: cannot change configuration on already existing chunks. Raised when you expect set_chunk_time_interval to resize existing data. It does not — the new interval applies only to chunks created afterward. Resolution: accept the mixed-interval history, or move data into a freshly sized hypertable if uniformity is required.

Explosion of tiny back-dated chunks. timescaledb_information.chunks shows many chunks with a handful of rows and old ranges. Cause: late or clock-skewed timestamps from reconnecting gateways spawning historical chunks. Resolution: validate timestamps at ingest and batch backfills; see the out-of-order insertion patterns linked above.

ERROR: tuple decompression limit exceeded by operation (seen on insert). Writes are landing in already-compressed chunks. Cause: late data targeting a chunk past compress_after. Resolution: widen the compression window so the active write range stays uncompressed, or route backfills through a staging path before compression runs.

Retention leaves partial data behind. A drop_after window that does not align with chunk boundaries keeps a chunk alive until its entire range clears the window, so data appears to linger past its TTL. Resolution: set the interval as an even divisor of the retention window (e.g. daily chunks under a 90-day TTL).

Frequently Asked Questions

Can I change chunk_time_interval after data already exists?

Yes, with set_chunk_time_interval, but the change is forward-only. Existing chunks keep their original size; only chunks created after the call use the new interval. A hypertable can safely carry a mix of interval sizes — the planner prunes each chunk by its own range regardless of how it was sized.

What happens to a chunk that spans the retention boundary?

Nothing until the chunk’s entire time range falls outside the retention window. drop_chunks operates on whole chunks and never truncates one partially, so a chunk straddling the boundary is retained until its newest edge ages out. This is why aligning the interval with the retention window matters for precise TTL enforcement.

Does a smaller interval make queries faster?

Only up to a point. Smaller chunks improve pruning granularity and keep the active chunk in memory, but past a threshold the planner pays for evaluating thousands of chunk constraints and the catalog grows. The sweet spot balances pruning precision against planning overhead — the sizing formula and worked numbers are in the dedicated interval-calculation guide.

How many chunks is too many?

There is no hard cap, but planning cost and catalog memory scale with chunk count per hypertable. Tens of thousands of chunks per hypertable is a practical warning sign; if you are there with modest data volume, the interval is almost certainly too small. Consolidating to a larger interval (going forward) or archiving old chunks via retention brings the count back down.
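A quick back-of-the-envelope estimate shows how fast the count grows as the interval shrinks. The sketch assumes a steady state in which retention drops whole chunks as described earlier; `max_live_chunks` is a hypothetical helper and the 90-day window is an example value.

```python
import math
from datetime import timedelta


def max_live_chunks(retention: timedelta, interval: timedelta,
                    space_partitions: int = 1) -> int:
    """Upper bound on live chunks: the retention window can overlap
    ceil(retention / interval) full chunks plus one straddling chunk."""
    per_partition = math.ceil(retention / interval) + 1
    return per_partition * space_partitions


ninety_days = timedelta(days=90)
for interval in (timedelta(days=1), timedelta(hours=6), timedelta(minutes=1)):
    print(interval, max_live_chunks(ninety_days, interval))
# 1 day -> 91 chunks, 6 hours -> 361 chunks, 1 minute -> 129601 chunks
```

Multiplying by the number of space partitions (the optional third argument) gives the figure to compare against the "tens of thousands" warning threshold.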

Should I set chunk_time_interval per space partition?

No. The time interval is a property of the time dimension and applies uniformly across all space partitions. To isolate tenants or devices, add a space dimension rather than trying to vary the time interval per partition.
