Deadlock when attempting to materialize multiple continuous aggregates #2308

nrlulz · 2020-09-03T00:42:03Z

Relevant system information:

OS: Ubuntu Server 18.04
PostgreSQL version (output of postgres --version): 12.4
TimescaleDB version (output of \dx in psql): 1.7.3
Installation method: 1.7.1 AMI

Describe the bug
I am unable to materialize more than one continuous aggregate view at a time due to deadlock

To Reproduce

CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;
CREATE TABLE readings (ts timestamptz NOT NULL, device int4 NOT NULL, value float8 NOT NULL);
SELECT * FROM create_hypertable('readings', 'ts');
INSERT INTO readings VALUES ('2020-01-01 00:00:00+00', 1, 1.234), ('2020-02-01 00:00:00+00', 1, 5.678);
CREATE VIEW readings_15m WITH (timescaledb.continuous) AS SELECT time_bucket('15m', ts) AS ts, device, AVG(value) AS value FROM readings GROUP BY device, time_bucket('15m', ts);
CREATE VIEW readings_60m WITH (timescaledb.continuous) AS SELECT time_bucket('60m', ts) AS ts, device, AVG(value) AS value FROM readings GROUP BY device, time_bucket('60m', ts);

Expected behavior
The two continuous aggregates would materialize concurrently. I feel like this used to be possible, but I am not sure if this is supposed to be a supported thing or if it's a bug.

Actual behavior
One begins materializing, the other fails with a deadlock detected error.

Additional context
Output of SELECT * FROM timescaledb_information.continuous_aggregate_stats;

view_name   |completed_threshold   |invalidation_threshold|job_id|last_run_started_at|last_successful_finish|last_run_status|job_status|last_run_duration|next_scheduled_run |total_runs|total_successes|total_failures|total_crashes|
------------|----------------------|----------------------|------|-------------------|----------------------|---------------|----------|-----------------|-------------------|----------|---------------|--------------|-------------|
readings_15m|2020-01-31 16:15:00-08|2020-01-31 16:15:00-08|  1008|2020-09-02 17:17:39|   2020-09-02 17:17:39|Success        |Scheduled |  00:00:00.019543|2020-09-02 17:47:39|       149|            149|             0|            0|
readings_60m|2020-01-01 12:00:00-08|2020-01-31 16:15:00-08|  1009|2020-09-02 17:17:36|   2020-09-02 17:17:36|Failed         |Scheduled |  00:00:01.005059|2020-09-02 19:15:45|         2|              1|             1|            0|

Relevant slice of postgresql log

2020-09-03 00:17:36.971 UTC [28395] LOG:  new materialization range for public.readings (time column ts) larger than allowed in one run, truncating 2020-02-01 00:15:00+00 to 2020-01-01 10:00:00+00
2020-09-03 00:17:36.971 UTC [28395] LOG:  materializing continuous aggregate public.readings_15m: nothing to invalidate, new range up to 2020-01-01 10:00:00+00
2020-09-03 00:17:37.972 UTC [28396] LOG:  job 1009 threw an error
2020-09-03 00:17:37.972 UTC [28396] ERROR:  deadlock detected
2020-09-03 00:17:37.972 UTC [28396] DETAIL:  Process 28396 waits for ShareLock on transaction 97084915; blocked by process 28395.
        Process 28395 waits for AccessExclusiveLock on relation 12398811 of database 12398506; blocked by process 28396.
        Process 28396: <command string not enabled>
        Process 28395: <command string not enabled>
2020-09-03 00:17:37.972 UTC [28396] HINT:  See server log for query details.
2020-09-03 00:17:37.972 UTC [28396] CONTEXT:  while locking tuple (0,42) in relation "continuous_aggs_invalidation_threshold"
2020-09-03 00:17:37.974 UTC [28395] LOG:  the materialize continuous aggregate job is scheduled to run again immediately
2020-09-03 00:17:37.974 UTC [19360] LOG:  background worker "Continuous Aggregate Background Job" (PID 28396) exited with exit code 1

The text was updated successfully, but these errors were encountered:

RedShift1 · 2020-09-04T14:21:48Z

Does the 2.0.0 milestone mean it won't be fixed in a next 1.7.x release? Seems like a pretty serious bug.

erimatnor · 2020-09-15T11:52:39Z

@RedShift1 no it doesn't mean we won't fix it for the 1.7.x branch.

erimatnor · 2020-09-15T11:55:12Z

@nrlulz is this issue happening continuously, i.e., does one of the background jobs always fail or does it eventually resolve itself?

nrlulz · 2020-09-15T16:00:33Z

It is no single job that fails. It eventually resolves itself, I just can only run one at a time.

erimatnor · 2020-10-21T06:30:35Z

We have yet to encounter this issue on 2.0 RC, so retargeting this for next 1.7.x release.

erimatnor · 2021-01-18T09:25:43Z

We're trying to verify whether this is still a problem in 2.0. If so, we'll try to target a fix for that branch instead.

NunoFilipeSantos · 2021-10-01T12:05:13Z

@nrlulz can you please try against our latest TimescaleDB version (2.4.2) and get back to us?

nrlulz · 2021-10-01T15:54:39Z

@NunoFilipeSantos I have not seen this happen in any of our databases since upgrading to 2.0. They are all running 2.4.2 now.

NunoFilipeSantos · 2021-10-01T16:13:54Z

That is wonderful news! 🥇
Closing this issue, if you encounter it again on 2.X, feel free to reopen it.

erimatnor added bug continuous_aggregate labels Sep 3, 2020

erimatnor added this to the 2.0.0 milestone Sep 3, 2020

erimatnor modified the milestones: 2.0.0, 1.7.5 Sep 21, 2020

erimatnor modified the milestones: 2.0.0, 1.7.5 Oct 21, 2020

erimatnor added the severity-p3 label Nov 16, 2020

erimatnor removed this from the 1.7.5 milestone Jan 18, 2021

phiresky mentioned this issue Sep 6, 2021

Continuous aggregate refresh policy locks out all interaction with the main hypertable if a pg_dump is running #3554

Closed

NunoFilipeSantos added 1.X waiting-for-author labels Oct 1, 2021

NunoFilipeSantos closed this as completed Oct 1, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deadlock when attempting to materialize multiple continuous aggregates #2308

Deadlock when attempting to materialize multiple continuous aggregates #2308

nrlulz commented Sep 3, 2020

RedShift1 commented Sep 4, 2020

erimatnor commented Sep 15, 2020

erimatnor commented Sep 15, 2020

nrlulz commented Sep 15, 2020

erimatnor commented Oct 21, 2020

erimatnor commented Jan 18, 2021

NunoFilipeSantos commented Oct 1, 2021

nrlulz commented Oct 1, 2021

NunoFilipeSantos commented Oct 1, 2021

Deadlock when attempting to materialize multiple continuous aggregates #2308

Deadlock when attempting to materialize multiple continuous aggregates #2308

Comments

nrlulz commented Sep 3, 2020

RedShift1 commented Sep 4, 2020

erimatnor commented Sep 15, 2020

erimatnor commented Sep 15, 2020

nrlulz commented Sep 15, 2020

erimatnor commented Oct 21, 2020

erimatnor commented Jan 18, 2021

NunoFilipeSantos commented Oct 1, 2021

nrlulz commented Oct 1, 2021

NunoFilipeSantos commented Oct 1, 2021