Setting Up Automatic Refresh Policies for 5-Minute Intervals

A five-minute refresh cadence keeps operational dashboards and alerting pipelines current without letting the background scheduler starve your ingestion path — this page derives the exact start_offset, end_offset, and schedule_interval that deliver it. It is a focused recipe inside the broader continuous aggregate refresh lifecycle and assumes you already have a materialized rollup in place; if you do not, start with the refresh policy design and scheduling guide first.

Input Profiling: Metrics to Gather First

The three policy offsets are not arbitrary — each is a function of measurable properties of your ingestion stream. Collect these numbers before writing any DDL:

Bucket width (b) — fixed at 5 minutes here, since the aggregate groups on time_bucket('5 minutes', ts). This is the smallest unit the policy can re-materialize.
Maximum ingestion lateness (L) — how far behind wall-clock a row’s timestamp can be when it finally commits. Gateway buffering, cellular backhaul, and batch flushes all push L up. Measure it as the 99th-percentile gap between now() and the event time of freshly inserted rows.
Refresh duration (t_refresh) — how long one incremental refresh takes. On a fresh aggregate you will not have this yet; estimate 5–30 s and correct it later from timescaledb_information.job_stats.
Ingestion rate — rows/sec per device and total. This drives whether a single refresh can finish inside the 5-minute window at all.
Retention horizon — how long raw chunks survive, because the policy must never reach back past data that has already been dropped.

Late-arriving data is the variable most engineers under-measure. If L exceeds your end_offset, the policy will finalize a bucket before its last rows land, and those rows will silently never appear in the rollup unless a later window re-covers them.
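One way to turn the lateness measurement above into a number is a nearest-rank percentile over sampled (event time, commit time) pairs. The sketch below is illustrative only: the function name `p99_lateness` and the input shape are assumptions, not part of TimescaleDB, and it presumes you have already pulled a sample of freshly inserted rows with both timestamps.

```python
from datetime import datetime, timedelta


def p99_lateness(samples):
    """Return the 99th-percentile gap between commit time and event time.

    `samples` is an iterable of (event_time, commit_time) datetime pairs.
    This is a hypothetical helper; collect the pairs however suits your
    ingestion path.
    """
    # Clamp negative gaps (clock skew) to zero so they cannot shrink L.
    gaps = sorted(
        max(commit - event, timedelta(0)) for event, commit in samples
    )
    if not gaps:
        raise ValueError("need at least one sample")
    # Nearest-rank percentile using integer math: ceil(0.99 * n).
    rank = (99 * len(gaps) + 99) // 100
    return gaps[rank - 1]
```

Feeding the result in as L keeps the end_offset decision tied to observed behaviour rather than to a guess.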

Calculating the Three Offsets

The refresh window on each run spans from now() - start_offset to now() - end_offset. Two correctness invariants govern the choice.

Invariant 1 — skip the open and still-settling buckets. The end_offset must exclude both the currently-filling bucket and any bucket that could still receive late rows:

\text{end\_offset} \geq \max(b,\ L)

Invariant 2 — consecutive windows must overlap. So that no bucket is skipped when a run is delayed, the window must be at least one schedule interval plus one refresh duration wide:

\text{start\_offset} - \text{end\_offset} \;\geq\; \text{schedule\_interval} + t_{\text{refresh}}

With schedule_interval = b = 5 min for a per-bucket cadence, and a modest L under a minute, a robust default is end_offset => 5 minutes, start_offset => 15 minutes. The values shipped in the diagram above (10 min / 1 min) are the aggressive-freshness variant: valid only when late data is negligible, because a 1-minute end_offset re-materializes the open bucket on every run.
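The two invariants can be sketched as a small calculator. This is a minimal illustration, not a TimescaleDB API: the name `derive_offsets` is hypothetical, and rounding start_offset up to a whole multiple of the bucket width is an assumption made here for convenience, not something the invariants require.

```python
from datetime import timedelta


def derive_offsets(bucket, lateness, schedule_interval, t_refresh):
    """Apply Invariant 1 and Invariant 2 to the measured inputs.

    Returns (end_offset, minimum_start_offset, rounded_start_offset).
    All arguments and results are datetime.timedelta values.
    """
    # Invariant 1: end_offset >= max(b, L).
    end_offset = max(bucket, lateness)
    # Invariant 2: start_offset - end_offset >= schedule_interval + t_refresh.
    min_start = end_offset + schedule_interval + t_refresh
    # Assumption: round up to a whole number of buckets (ceiling division).
    whole_buckets = -(-min_start // bucket)
    return end_offset, min_start, whole_buckets * bucket


if __name__ == "__main__":
    end, minimum, rounded = derive_offsets(
        bucket=timedelta(minutes=5),
        lateness=timedelta(seconds=45),
        schedule_interval=timedelta(minutes=5),
        t_refresh=timedelta(seconds=22),
    )
    print(end, minimum, rounded)
```

Returning both the strict minimum and the rounded value makes it visible how much headroom the rounding adds.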

The policy itself is idempotent — safe to run on every deploy:

sql

-- 1. Create the continuous aggregate with explicit 5-minute bucketing.
CREATE MATERIALIZED VIEW IF NOT EXISTS sensor_metrics_5m
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('5 minutes', ts) AS bucket,
  device_id,
  AVG(temperature) AS avg_temp,
  MAX(temperature) AS max_temp,
  COUNT(*)         AS reading_count
FROM raw_sensor_data
GROUP BY bucket, device_id
WITH NO DATA;

-- 2. Attach the 5-minute refresh policy using the calculated offsets.
SELECT add_continuous_aggregate_policy(
  'sensor_metrics_5m',
  start_offset      => INTERVAL '15 minutes',
  end_offset        => INTERVAL '5 minutes',
  schedule_interval => INTERVAL '5 minutes',
  if_not_exists     => TRUE
);

The schedule_interval sets the cadence; the background worker fires every five minutes. Overlapping executions are serialized, not parallelized — if one refresh runs long, the next queues rather than spawning a second worker, which keeps watermark progression deterministic. Sizing the offsets around time_bucket() grouping is the same discipline covered in creating continuous aggregates with time_bucket_gapfill, where bucket alignment also drives correctness.

Worked Example: A 5,000-Device Sensor Fleet

Take a realistic industrial deployment:

Fleet: 5,000 devices, each emitting one reading every 10 seconds → 500 rows/sec (~43 M rows/day).
Lateness: 99th-percentile L measured at 45 seconds (LTE gateways with a 30 s flush buffer).
Refresh duration: first week of job_stats shows last_run_duration averaging 8 s, peaking at 22 s during backfills.

Apply the formulas:

end_offset = max(b, L) = max(5 min, 45 s) = 5 min. The bucket width dominates lateness, so a 5-minute end_offset already covers the 45 s of late data with room to spare.
start_offset ≥ end_offset + schedule_interval + t_refresh = 5 min + 5 min + 22 s = 10 min 22 s ≈ 10.4 min. Round up to 15 minutes, the next whole multiple of the 5-minute bucket, which adds headroom beyond the 22 s peak and keeps the windows contiguous across one missed run.

The result is exactly the policy shown above. Every five minutes the worker refreshes the span from 15 minutes ago to 5 minutes ago: 10 minutes, or two bucket widths, per run, overlapping the previous run's span by one bucket width, so a delayed run, or a single skipped one, leaves no gap. At 500 rows/sec that span holds roughly 300,000 raw rows, about 150,000 of them new since the previous run, well inside the 5-minute budget. If you later add hourly rollups on top of this aggregate, coordinate their offsets the same way you would for hourly and daily cascading rollups.
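The window arithmetic for this fleet can be checked with a short simulation. The sketch below is an approximation under stated assumptions: it treats each refresh window as a raw time span from run time minus start_offset to run time minus end_offset and ignores any bucket-boundary alignment the database applies. The helper names `rows_in_span` and `coverage_gaps` are hypothetical.

```python
from datetime import datetime, timedelta


def rows_in_span(devices, seconds_between_readings, span):
    """Estimate raw rows emitted by the fleet during `span` (a timedelta)."""
    return devices * span.total_seconds() / seconds_between_readings


def coverage_gaps(run_times, start_offset, end_offset):
    """Return (gap_start, gap_end) pairs left uncovered between runs.

    Each run covers [run - start_offset, run - end_offset]. A gap appears
    when a window starts later than everything covered so far.
    """
    gaps = []
    covered_until = None
    for run_at in sorted(run_times):
        window_start = run_at - start_offset
        window_end = run_at - end_offset
        if covered_until is not None and window_start > covered_until:
            gaps.append((covered_until, window_start))
        if covered_until is None or window_end > covered_until:
            covered_until = window_end
    return gaps
```

With the 15-minute and 5-minute offsets, runs five minutes apart overlap, one skipped run leaves the windows exactly contiguous, and two consecutive skipped runs open a 5-minute gap.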

Edge Cases and When to Deviate

The formulas hold for steady-state streaming ingestion. These conditions break them:

Heavy late data (L > b). Field gateways that reconnect after hours of offline buffering push L into the tens of minutes. Raise end_offset to cover that horizon, or the rollup will under-count. See handling out-of-order data insertion for the ingestion-side fixes that keep L bounded.
Refresh outruns the schedule. If t_refresh consistently approaches 5 minutes, jobs queue and freshness degrades. Widen the schedule_interval, raise work_mem, or split the window — the tactics in incremental refresh performance tuning for large datasets apply directly.
You need sub-5-minute freshness. Enable the real-time aggregate (timescaledb.materialized_only = false) so queries union the materialized rollup with a live computation over the unmaterialized tail, instead of chasing an ever-tighter end_offset.
Backfilling history. A large historical INSERT will not be picked up by the forward-looking policy window. Run a one-time CALL refresh_continuous_aggregate('sensor_metrics_5m', '2026-01-01', '2026-02-01') to materialize the past explicitly.
Retention shorter than start_offset. If raw chunks are dropped while they still fall inside the refresh window, the next refresh recomputes those buckets from data that is no longer there, and the rollup loses them. Keep the raw TTL retention window wider than start_offset, and attach a separate add_retention_policy to the aggregate itself: dropping raw chunks that lie outside the refresh window does not remove already-materialized rollup rows.
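Three of the conditions above are numeric and can be checked before deploying a policy. The following sketch is a hypothetical pre-flight check, not part of any library; the flag names and the `slow_ratio` default of 0.8 (240 s against a 5-minute schedule) are assumptions chosen for illustration.

```python
from datetime import timedelta


def deviation_flags(lateness, t_refresh, schedule_interval,
                    start_offset, end_offset, raw_retention,
                    slow_ratio=0.8):
    """Return the names of edge-case conditions the inputs trigger."""
    flags = []
    # Heavy late data: rows can land after their bucket leaves end_offset.
    if lateness > end_offset:
        flags.append("late_data_exceeds_end_offset")
    # Refresh outruns the schedule: duration is close to the cadence.
    if t_refresh >= slow_ratio * schedule_interval:
        flags.append("refresh_near_schedule")
    # Retention shorter than start_offset: raw data dropped inside the window.
    if raw_retention <= start_offset:
        flags.append("retention_inside_window")
    return flags
```

An empty list means none of these three conditions applies; backfills and sub-5-minute freshness needs are operational decisions and are not covered by the check.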

Verification

Confirm the policy registered with the offsets you intended, then watch it actually run. First, read the policy configuration straight from the job catalog:

sql

-- Confirm the policy exists and carries the calculated offsets.
SELECT
  j.job_id,
  j.schedule_interval,
  j.config ->> 'start_offset' AS start_offset,
  j.config ->> 'end_offset'   AS end_offset
FROM timescaledb_information.jobs j
JOIN timescaledb_information.continuous_aggregates ca
  ON ca.materialization_hypertable_name = j.hypertable_name
WHERE ca.view_name = 'sensor_metrics_5m'
  AND j.proc_name  = 'policy_refresh_continuous_aggregate';

Then verify the worker is firing on cadence and finishing inside the budget:

sql

-- Health check: last run should be < 5 minutes old and succeed quickly.
SELECT
  js.job_id,
  js.last_run_status,
  js.last_successful_finish,
  js.last_run_duration,
  js.total_runs,
  js.total_failures
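FROM timescaledb_information.job_stats js
JOIN timescaledb_information.jobs j
  ON j.job_id = js.job_id
JOIN timescaledb_information.continuous_aggregates ca
  ON ca.materialization_hypertable_name = js.hypertable_name
WHERE ca.view_name = 'sensor_metrics_5m'
  -- Skip retention or compression jobs attached to the same aggregate.
  AND j.proc_name  = 'policy_refresh_continuous_aggregate';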

If last_run_duration trends toward the schedule_interval, or total_failures climbs, the policy needs tuning. The Python routine below wraps that check for orchestration pipelines — it reads the job stats and widens the cadence automatically when a run exceeds its duration threshold:

python

import logging
import psycopg
from psycopg.rows import dict_row

logging.basicConfig(level=logging.INFO, format="%(asctime)s | %(levelname)s | %(message)s")

def monitor_and_adjust_policy(conn_string: str, view_name: str, max_duration_seconds: int = 240) -> None:
    """Widen a 5-minute refresh policy to 10 minutes if the last run ran long."""
    with psycopg.connect(conn_string, row_factory=dict_row) as conn:
        with conn.cursor() as cur:
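            # A continuous aggregate's job is keyed to its materialization
            # hypertable, so resolve it through the continuous_aggregates view.
            # Filter on proc_name so a retention or compression job on the
            # same aggregate is never mistaken for the refresh policy.
            cur.execute(
                """
                SELECT js.job_id, js.last_run_status, js.last_run_duration,
                       js.total_runs, js.total_failures
                FROM timescaledb_information.job_stats js
                JOIN timescaledb_information.jobs j
                  ON j.job_id = js.job_id
                JOIN timescaledb_information.continuous_aggregates ca
                  ON ca.materialization_hypertable_name = js.hypertable_name
                WHERE ca.view_name = %s
                  AND j.proc_name = 'policy_refresh_continuous_aggregate'
                """,
                (view_name,),
            )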
            stats = cur.fetchone()
            if not stats:
                logging.warning("No job stats found for view: %s", view_name)
                return

            duration = stats["last_run_duration"]
            if duration and duration.total_seconds() > max_duration_seconds:
                logging.info("Run took %.1fs; widening schedule to 10m.", duration.total_seconds())
                # alter_job is a function, so invoke it with SELECT (not CALL).
                cur.execute(
                    "SELECT alter_job(%s, schedule_interval => INTERVAL '10 minutes')",
                    (stats["job_id"],),
                )
                conn.commit()
            else:
                secs = duration.total_seconds() if duration else 0
                logging.info("Policy healthy. Last duration: %.1fs", secs)

if __name__ == "__main__":
    monitor_and_adjust_policy("postgresql://user:pass@host:5432/iot_db", "sensor_metrics_5m")

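Run the monitor on its own five-minute schedule and the pair becomes self-correcting: the SQL policy keeps the rollup fresh, and the Python check backs it off before a slow run cascades into a queue backlog. Note that at a 10-minute schedule_interval the 15-minute and 5-minute offsets no longer satisfy Invariant 2 (a 10-minute window is narrower than 10 minutes plus t_refresh), so raise start_offset whenever the cadence is widened.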
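The routine above reacts to a single slow run. If you would rather act on a trend, one possible refinement is to require several consecutive slow runs before widening the cadence. The sketch below assumes the caller keeps its own list of recent last_run_duration values across polls; the function name `should_widen` and the default of three consecutive runs are hypothetical choices.

```python
from datetime import timedelta


def should_widen(recent_durations, threshold, min_consecutive=3):
    """True when the most recent `min_consecutive` runs all exceeded
    `threshold`. `recent_durations` is ordered oldest to newest."""
    tail = list(recent_durations)[-min_consecutive:]
    # Not enough history yet: do not act on partial evidence.
    if len(tail) < min_consecutive:
        return False
    return all(duration > threshold for duration in tail)
```

A single 22 s backfill spike then leaves the schedule alone, while a sustained run of slow refreshes still triggers the back-off.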

Related

Refresh Policy Design & Scheduling — the offset-tuning framework this recipe specializes.
Creating Continuous Aggregates with time_bucket_gapfill — bucket alignment and gap handling in the view definition.
Incremental Refresh Performance Tuning for Large Datasets — what to do when a refresh outruns its schedule.
Troubleshooting Stale Continuous Aggregates in Production — diagnosing watermark drift and missed windows.
TTL Policy Mapping & Enforcement — keeping raw retention wider than your start_offset.
